Transcript dama-jun00

High Performance
Enterprise
Data Propagation
Russell Donovan
BMC Company Profile

Established in 1980

Leader in Application Management

Estimated FY2000 Revenues of $1.8B

Over 6,000 Employees

Development Labs in Austin (TX), Conyers (GA),
Houston, San Jose, Sunnyvale (CA), Waltham (MA),
Germany, Israel, Singapore

Market Coverage in Over 50 Countries

Member of the S&P 500
BMC Software e-Business Availability

Provides application management solutions that ensure
the availability, performance, and recovery of business-critical applications.

We call this application service assurance and it
means that the applications companies and their
customers rely on will be there when they need
them.
e-vailability - We Guarantee Our Solutions!
Enterprise Data Propagation (EDP)
Requirement For All Enterprises
Need to synchronize data between legacy systems
and distributed relational databases for:

Data warehousing, operational data stores, data mining

e-Business applications access to legacy data

Enterprise application integration

Distributed enterprises, ERP solutions, Acquisitions
70% of corporate data in IMS, VSAM, DB2
Need high performance solutions
Need near real time solutions
[Chart: corporate data by source: IMS, DB2, VSAM, Other]
Data Propagation - Strategies For
Synchronizing Multiple Copies of Data
Copy
Unload/Load
SQL Query
Distributed Database
2 Phase Commit
Change Capture with Asynchronous Propagation
[Diagram: Source to Target]
Key Challenges Implementing a Data
Warehouse
Data Management Review Survey

Business rule analysis

Managing End User Expectation

Business data modeling

Reliability and integrity of data

Data acquisition

Meta Data management

Managing Management Expectation

Database performance
Data Warehouse Implementations
For Customers With



Large operational databases
High transaction rates
24x7 operations requirements
Critical Management Issues






Availability of operational systems
Performance of operational transactions
Maintaining service levels
Increasing volumes of data
Time required to load and refresh data warehouse
Quality, currency & accuracy of decision making data
Building Data Warehouses:
A Perspective
[Diagram labels: VSAM Files; Operational Databases; Operational Database Images; Subject Oriented Operational Images; Time Variant Subject Oriented Data Warehouses; Data Marts]
Building Data Warehouses:
A Perspective
[Diagram labels: DB2, IMS and VSAM sources; Mainframe Tools (Prism, ETI, Carleton, SAS, Platinum); Data Warehouse; Oper. Data Store; two Data Marts; End-Users with Query Tools (Brio, Bus. Objects, COGNOS, Microstrategy)]
Building Data Warehouses:
A Perspective
[Diagram labels: DB2, IMS and VSAM sources; Mainframe Tools (Prism, ETI, Carleton, SAS, Platinum); Data Warehouse Integration Area; Data Warehouse; Oper. Data Store; Dist. Systems Tools (Informatica, Constellar, D2K, Sagent, Ardent); two Data Marts; End-Users with Query Tools (Brio, Bus. Objects, COGNOS, Microstrategy)]
Building Data Warehouses:
A Perspective
[Diagram labels: DB2, IMS and VSAM sources; Mainframe Tools (Prism, ETI, Carleton, SAS, Platinum); BMC Solution (Bulk DataMove, ChangeDataMove); Data Warehouse Integration Area; Data Warehouse; Oper. Data Store; Dist. Systems Tools (Informatica, Constellar, D2K, Sagent, Ardent); two Data Marts; End-Users with Query Tools (Brio, Bus. Objects, COGNOS, Microstrategy)]
Change Data Propagation:
A Perspective
Change Data Propagation Is Preferred When:
Databases are large and bulk move would take too long
 Batch window limitations
 Database availability limitations
Support for 24 x 7 is a requirement of operational application
Minimum latency (“near-real-time”) is required in target database
Currency of information in target database is important
Small percentage of a large database has changed
Need to reduce network traffic by transmitting only data changes
[Diagram: Source to Target]
Transaction Based
Change Data Propagation
Synchronous Data Propagation
Original update waits until all targets are updated
Single, global transaction with multi-site, coordinated commit processing
[Diagram: Source to Target]
Asynchronous Data Propagation
Propagation of updates occurs asynchronously to the originating transaction
Minimizes resource consumption at source
Minimizes impact on source transaction response times
[Diagram: Source to Target]
Synchronous vs Asynchronous
Change Data Propagation
Synchronous 2 Phase Commit
Source transaction completes when all databases are updated
Advantages:
 Real time propagation
 All sites always synchronized
Disadvantages:
 Transaction response time
 Data availability impact
 System resiliency
Usually not practical
Asynchronous Data Propagation
Source transaction does not wait for target databases to be updated
Advantages:
 Minimum performance impact
 Availability
 Autonomy
 Recoverability
Disadvantages:
 Target location updates may be delayed
 All sites not always synchronized
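The trade-off on this slide can be shown with a small, deterministic sketch. It is illustrative only and is not how any BMC product is implemented; the class name `AsyncPropagator` and its methods are hypothetical. The source commit returns immediately and only records the change, so the target lags until a separate propagation step runs.

```python
from collections import deque


class AsyncPropagator:
    """Toy model of asynchronous propagation (all names are hypothetical)."""

    def __init__(self):
        self.source = {}        # stands in for the operational database
        self.target = {}        # stands in for the remote copy
        self.pending = deque()  # captured changes not yet applied

    def commit(self, key, value):
        # The source transaction updates locally and records the change.
        # It does not wait for the target, so response time is unaffected.
        self.source[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Runs later, independently of the source transaction.
        applied = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.target[key] = value
            applied += 1
        return applied


if __name__ == "__main__":
    p = AsyncPropagator()
    p.commit("acct1", 100)
    print(p.target)   # still empty: sites are not always synchronized
    p.propagate()
    print(p.target)   # now matches the source
```

A synchronous two-phase commit would instead update both dictionaries inside `commit` and fail the transaction if the target were unavailable, which is the availability cost the slide lists.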
Asynchronous Change Capture:
Implementation Considerations
Trigger Based




Triggers used to capture changes to database records
Incremental updates collected in staging tables
Significant resource consumption for triggers and logging
Typically low volume applications (< 20 transactions/second)
Log Exit Based




Increased logging in operational environment
Increased response times for source transactions
Increased resource consumption
Log management issues
Log Post Process Based



Increased logging in operational environment
Log management issues
Long latency interval cannot support near real time
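As an illustration of the trigger-based option, the sketch below uses SQLite (through Python's standard `sqlite3` module) as a stand-in for any relational source. The table names `account` and `account_changes` are assumptions made for the example. It shows why this approach costs resources: every source write causes a second write into the staging table.

```python
import sqlite3


def trigger_capture_demo():
    """Capture inserts, updates and deletes into a staging table via triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
        -- Staging table that collects the incremental updates.
        CREATE TABLE account_changes (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            op TEXT, id INTEGER, balance INTEGER
        );
        CREATE TRIGGER account_ins AFTER INSERT ON account
        BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('I', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER account_upd AFTER UPDATE ON account
        BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('U', NEW.id, NEW.balance);
        END;
        CREATE TRIGGER account_del AFTER DELETE ON account
        BEGIN
            INSERT INTO account_changes (op, id, balance)
            VALUES ('D', OLD.id, OLD.balance);
        END;
    """)
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
    conn.execute("DELETE FROM account WHERE id = 1")
    conn.commit()
    rows = conn.execute(
        "SELECT op, id, balance FROM account_changes ORDER BY seq"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in trigger_capture_demo():
        print(row)
```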
Enterprise Data Propagation (EDP)
The BMC Solution
A Data Propagation Management System
A single point of access for managing legacy data propagation across the enterprise
Bulk Data and Change Data
 Efficient change capture
 Basic data transformation
 High performance data movement
 High performance utilities
Common look and feel
Integrated transformations and mappings
Integrated recovery/restart
[Diagram labels: DB2; IMS; Fast Path; VSAM; Operational Data Store]

ChangeDataMove:
Product Positioning
Positioning
ChangeDataMove is a high performance, efficient, change data
propagation solution, which captures changes made to IMS, Fast Path,
VSAM, and DB2 databases, and propagates those changes to the most
prevalent relational databases.
What It Does
Transaction-based data propagation
Supports high volume production applications with hundreds of transactions per second
Supports ‘near real-time’ as well as scheduled data propagation
Advantages
A data propagation system (complete solution vs a point product)
Highly efficient change capture does not impact applications
Only solution for IMS, Fast Path and VSAM that does not require logging
Optionally integrated with DataMove for bulk data movement
Change Data Propagation
for IMS and VSAM
Synchronous Change Capture
 Transparent high performance change capture
 Minimum impact on source system logging, CPU & user response time
 Data is available immediately for asynchronous propagation
Asynchronous Data Propagation
 Data propagated within context of original transaction
 Updates applied in proper sequence
 Inter and intra-table consistency
 Source and target(s) consistent within transaction boundaries
Not affected by network delays or slow remote processors
Supports “Near Real Time” and/or Scheduled Propagation
[Diagram: changes 1, 2, 3 captured at the source pass through the EDP Log to EDP Apply and arrive as 1, 2, 3]
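To make "within transaction boundaries" concrete, here is a minimal sketch of an apply step that buffers captured changes per unit of work (UOW) and touches the target only when that UOW's commit record arrives. The record layout (`type`, `uow`, `op`, `key`, `value`) is invented for the example and does not describe the EDP Log format.

```python
def apply_committed(log_records, target):
    """Apply captured changes to `target` one committed unit of work at a time.

    Changes are buffered per UOW and applied in captured order only when the
    matching commit record is seen; rolled-back or unfinished UOWs never
    reach the target.
    """
    open_uows = {}
    for rec in log_records:
        uow = rec["uow"]
        if rec["type"] == "change":
            open_uows.setdefault(uow, []).append(rec)
        elif rec["type"] == "commit":
            for change in open_uows.pop(uow, []):
                if change["op"] == "delete":
                    target.pop(change["key"], None)
                else:
                    target[change["key"]] = change["value"]
        elif rec["type"] == "rollback":
            open_uows.pop(uow, None)
    return target


if __name__ == "__main__":
    log = [
        {"type": "change", "uow": 1, "op": "upsert", "key": "A", "value": 1},
        {"type": "change", "uow": 2, "op": "upsert", "key": "B", "value": 2},
        {"type": "change", "uow": 1, "op": "upsert", "key": "C", "value": 3},
        {"type": "commit", "uow": 1},
    ]
    print(apply_committed(log, {}))  # UOW 2 is still open, so B is not applied
```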
IMS Change Capture
Resides within the IMS environment
Captures DL/I calls as they occur
Supports IMS/TM (MPP, BMP), Fast Path, CICS DBCTL, Batch DL/I
Commits updates at transaction or job (batch) end
Based on BMC Software’s CHANGE RECORDING FACILITY
[Diagram labels: User Application; IMS Subsystem; IMS Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]
CICS/VSAM Change Capture
Captures changes at each Get, Put & Erase request
Utilizes CICS TRUE, File, and Re-sync exits
Resides as functional part of CICS address space
Participates in two phase commit with CICS transaction
Updates are committed when transaction commits
[Diagram labels: User Application; CICS Subsystem; VSAM Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]
VSAM Batch Change Capture
JRNAD (journal) exit dynamically activated
 ECCR resides within the batch address space
UOW is complete when application closes VSAM file
Based on BMC Software’s RECOVERY PLUS for CICS/VSAM product.
[Diagram labels: User Application; Batch Address Space; VSAM Database; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]
DB2 MVS Change Capture
Requires DB2 change data capture be activated
Reads log records via DB2 IFI, external decompression
Maintains multiple versions of schema
[Diagram labels: User Application; DB2 (MVS); DB2 ECCR, which uses the DB2 IFI Facility; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets Oracle, SQL Server, Sybase, UDB]
The Transformation Process
 Transforms IMS, Fast Path and VSAM data to relational formats
Hierarchical structures to relational structures
 Converts non-relational data types to relational
 Uses relational DBMS catalog information
 Uses copy libraries and IMS database descriptors
 Automatically handles Dates, Times, Data Types
 Repeating groups, Redefined records
 Customizable through user exits
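A small sketch of the kind of data type conversion this slide refers to. It is not BMC's transformation code. It assumes the source record uses EBCDIC code page 037 for character data, COBOL COMP-3 (packed decimal) for numbers, and YYYYMMDD text for dates; the function names are hypothetical.

```python
from datetime import date
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a COBOL COMP-3 (packed decimal) field.

    Each byte holds two 4-bit digits; the final nibble is the sign
    (0xD means negative, 0xC or 0xF means positive or unsigned).
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid packed decimal digit")
    value = int("".join(str(n) for n in nibbles))
    if sign == 0x0D:
        value = -value
    return Decimal(value).scaleb(-scale)


def ebcdic_to_str(raw: bytes, codepage: str = "cp037") -> str:
    """Decode fixed-width EBCDIC text and drop trailing pad spaces."""
    return raw.decode(codepage).rstrip()


def yyyymmdd_to_date(text: str) -> date:
    """Convert an 8-character YYYYMMDD field to a date value."""
    return date(int(text[0:4]), int(text[4:6]), int(text[6:8]))


if __name__ == "__main__":
    print(unpack_comp3(bytes([0x12, 0x34, 0x5C]), scale=2))
    print(ebcdic_to_str(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x40])))
    print(yyyymmdd_to_date("20000615"))
```

In practice the field offsets, lengths and scales would come from the copy libraries or database descriptors mentioned above rather than being passed by hand.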

Transformation
Possible Target Keys
To allow resulting target rows to be unique
Replication Key (REPKEY)
 This key will make the target row unique
 For IMS it is the full concatenated key or the segment's RBA
Ancestor Keys
 If REPKEY is a composite key (i.e. IMS concatenated key), each level is available to be used as the key of the target row
Sequential number
 If a single input segment or record creates multiple output rows, a sequential numeric column can be generated
Any field in the input segment or record
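The key options can be pictured with a short sketch. Everything here is hypothetical, including the function name, the column names `REPKEY` and `SEQ_NO`, and the use of `|` to join key levels. It assumes the concatenated key is already split into one value per hierarchy level.

```python
def build_target_rows(level_names, concatenated_key, outputs):
    """Build target rows keyed by ancestor keys, REPKEY and a sequence number.

    level_names      -- one column name per level of the concatenated key
    concatenated_key -- one key value per level, root first
    outputs          -- field dictionaries, one per output row
    """
    if len(level_names) != len(concatenated_key):
        raise ValueError("one column name is required per key level")
    rows = []
    for seq_no, fields in enumerate(outputs, start=1):
        # Ancestor keys: each level of the composite key becomes a column.
        row = dict(zip(level_names, concatenated_key))
        # Replication key: the full concatenated key.
        row["REPKEY"] = "|".join(concatenated_key)
        # Sequential number: keeps rows unique when one input yields many rows.
        row["SEQ_NO"] = seq_no
        row.update(fields)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in build_target_rows(
        ["CUST_ID", "ORDER_ID"], ["C001", "O17"],
        [{"ITEM": "bolt"}, {"ITEM": "nut"}],
    ):
        print(row)
```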
Transforming COBOL Structures
Repeating Groups
 All repeated fields to a single target column
 As individual rows in the same or a different table
 Update results in set of deletes and inserts for target rows
Redefined Records
 Assigned unique names and schema definitions
 Record identification exit identifies record types
 Schema applied to segment or record based on redefined record type
 Redefined records can be propagated to same or different targets
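The "individual rows" option for repeating groups, and the reason an update becomes deletes plus inserts, can be sketched as follows. The row layout (`parent_key`, `seq_no`, `value`) and both function names are assumptions for illustration.

```python
def explode_repeating_group(parent_key, items):
    """Turn one repeating group (e.g. a COBOL OCCURS table) into child rows."""
    return [
        {"parent_key": parent_key, "seq_no": i, "value": v}
        for i, v in enumerate(items, start=1)
    ]


def update_as_delete_insert(target_rows, parent_key, new_items):
    """Apply a source update to the child rows of one parent.

    The number of occurrences may have changed, so the simplest correct
    approach is to delete all existing child rows for the parent and
    insert the new set.
    """
    kept = [r for r in target_rows if r["parent_key"] != parent_key]
    return kept + explode_repeating_group(parent_key, new_items)


if __name__ == "__main__":
    table = explode_repeating_group("P1", ["a", "b", "c"])
    table = update_as_delete_insert(table, "P1", ["z"])
    print(table)
```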
High Performance Transport & Apply
Data is blocked, compressed and encrypted
Multi-threaded apply tasks for increased performance
[Diagram labels: Send; Transport; TCP/IP; Transport; Receive; Dynamic Memory Staging Queue; Dynamic Memory Queue; multiple EDP Apply tasks; targets DB2 and Oracle]
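A rough sketch of blocking and compressing records before transport, then unpacking blocks with several worker threads. It uses only the Python standard library (`zlib`, `json`, `concurrent.futures`), omits encryption, and is not the EDP wire format; the function names are hypothetical.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor


def pack_blocks(records, block_size):
    """Group records into blocks and compress each block for sending."""
    for start in range(0, len(records), block_size):
        block = records[start:start + block_size]
        payload = "\n".join(json.dumps(r) for r in block)
        yield zlib.compress(payload.encode("utf-8"))


def unpack_block(blob):
    """Decompress one received block back into its records."""
    text = zlib.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


def receive_all(blobs, workers=4):
    """Unpack blocks on several threads; map() returns results in send order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        unpacked = list(pool.map(unpack_block, blobs))
    return [record for block in unpacked for record in block]


if __name__ == "__main__":
    rows = [{"id": i, "name": f"row{i}"} for i in range(5)]
    blobs = list(pack_blocks(rows, block_size=2))
    print(len(blobs), receive_all(blobs) == rows)
```

Blocking amortizes per-message overhead across many records, and keeping results in send order matters because changes must still be applied in sequence.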
Automated Schema Replication
Reduce administration costs by automating the creation of target tables from IMS, VSAM, and DB2 source schema
[Diagram labels: DB2 Catalog; DBD; Copybook; VSAM Files; Copybook; SchemaMove; targets Oracle, DB2, MS SQL Server]
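To show what generating target tables from a source schema could involve, here is a deliberately small sketch that maps a few COBOL PIC clauses to SQL column types and emits a CREATE TABLE statement. The mapping rules (`X(n)` to `CHAR(n)`, `9(n)` and `S9(n)V9(m)` to `DECIMAL`) and the function names are assumptions for the example and do not describe SchemaMove.

```python
import re

# Supports only X(n), 9(n), S9(n) and S9(n)V9(m); real copybooks need more.
PIC_RE = re.compile(r"^(?:X\((\d+)\)|S?9\((\d+)\)(?:V9\((\d+)\))?)$")


def sql_type(pic):
    """Map a simple PIC clause to a SQL column type."""
    match = PIC_RE.match(pic.upper().replace(" ", ""))
    if not match:
        raise ValueError(f"unsupported PIC clause: {pic}")
    if match.group(1):
        return f"CHAR({match.group(1)})"
    int_digits = int(match.group(2))
    frac_digits = int(match.group(3) or 0)
    return f"DECIMAL({int_digits + frac_digits},{frac_digits})"


def create_table_ddl(table, fields):
    """Build CREATE TABLE text from (field name, PIC clause) pairs."""
    columns = ",\n".join(
        f"  {name.replace('-', '_')} {sql_type(pic)}" for name, pic in fields
    )
    return f"CREATE TABLE {table} (\n{columns}\n);"


if __name__ == "__main__":
    print(create_table_ddl("CUSTOMER", [
        ("CUST-ID", "9(6)"),
        ("CUST-NAME", "X(30)"),
        ("BALANCE", "S9(7)V9(2)"),
    ]))
```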
Bulk Data Propagation
Bulk move is usually simpler and easier to implement
Needed to initially create or to refresh a target database
Bulk move is the preferred solution when:
 Data volumes are not large and the move can be performed within time
constraints
 Database availability is not a concern (source/target)
 Network volumes and network overhead are not issues
 Currency of information in target database is not a concern
 Change data propagation cannot handle the volumes
[Diagram: Source to Target]
Bulk Data Movement DB2 to Oracle
The Traditional Approach
[Diagram labels: MVS Host (DB2, DB2 Extract, File, Gateway); TCP/IP; UNIX Server (Gateway, File, Oracle Loader, Oracle)]
Time
 DB2 Unload 20 min. (35%)
 File Transfer 7 min. (13%)
 Oracle SQL Load 28 min. (52%)
 Total Time 55 min.
Bulk Data Movement DB2 to Oracle
Parallel Unload, Parallel Load
[Diagram labels: MVS Host (DB2, DB2 Extract, File, Gateway); TCP/IP; UNIX Server (Gateway, File, Oracle Loader, Oracle)]
Time
 Parallel Unload 7 min.
 File Transfer 7 min.
 Oracle SQL Load 16 min.
 Total Time 30 min.
Bulk Data Movement DB2 to Oracle
Parallel Unload/Load & Piping
[Diagram labels: MVS Host (DB2, DB2 Extract, Parallel Unload, File, Gateway); PIPING; TCP/IP; UNIX Server (Gateway, File, Oracle Loader, Parallel load, Oracle)]
Oracle load starts as first record is read from DB2
Total Time 17 min.
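The effect of piping can be imitated with Python generators. In the sketch below, which is only an analogy and not the product's mechanism, `staged_move` finishes the whole unload before loading begins, while `piped_move` hands each record to the loader as soon as it is read. The function names and the `events` trace are hypothetical.

```python
def unload(rows, events):
    """Yield rows one at a time, recording when each is read."""
    for row in rows:
        events.append(("unload", row))
        yield row


def load(stream, events):
    """Consume rows from any iterable, recording when each is loaded."""
    loaded = []
    for row in stream:
        events.append(("load", row))
        loaded.append(row)
    return loaded


def staged_move(rows):
    """Traditional approach: finish the unload into a file, then load."""
    events = []
    staged = list(unload(rows, events))  # the intermediate file
    return load(staged, events), events


def piped_move(rows):
    """Piping: the load starts as soon as the first row is read."""
    events = []
    return load(unload(rows, events), events), events


if __name__ == "__main__":
    print(staged_move([1, 2, 3])[1])
    print(piped_move([1, 2, 3])[1])
```

In the piped trace, unload and load events alternate, which is why elapsed time approaches that of the slowest stage rather than the sum of all stages.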
DataReach: Product positioning
Positioning
DataReach is a high performance, high availability data movement solution
for extracting MVS/ESA DB2 data and loading it into Informix, Oracle or
Sybase database on Unix.
A joint development effort of EMC & BMC - Not A Product We Sell Today
What It Does
Uses EMC Storage to move data at channel speeds vs network speeds
Moves the work of extracting DB2 MVS data from MVS to Unix
Advantages
Moves data 10 to 100 times faster than network solutions
Completely eliminates mainframe processing
Completely eliminates network traffic and network overhead
Allows nearly 100% availability of the source DB2 database
Enables customers to more frequently refresh data warehouses
Bulk Data Movement DB2 to Oracle
The DataReach Approach
DataReach Directly Extracts DB2 Data
Eliminates network traffic & network overhead
Familiar SQL-based SELECT syntax
Subset of data via WHERE predicate
Optional parallel extraction capability
Optional access via DB2 Index structures
Data conversion
 EBCDIC to ASCII
 DB2 to generic format
Direct load of Oracle, Sybase, Informix
Optional parallel load capability
Distributed capabilities
[Diagram labels: MVS Host (DB2); UNIX Host (DB2 Extract, Intermediate File, Oracle Loader, Oracle)]
DataReach: How It Works
[Diagram labels: MVS System (DB2); ESCON Channels; SYMMETRIX ESP (CKD Volumes holding DB2 source, FBA Volumes); SCSI Channels; UNIX (Extractor, Translation Module, Flat File, target DBMS native load utility, Target RDBMS)]
DataReach: Performance Benchmark
DB2 to Oracle on HP/UX
[Chart: elapsed minutes (0 to 1,000) by data size (10 Mbytes, 100 Mbytes, 1 Gbyte, 5 Gbytes, 10 Gbytes), Traditional vs DataReach]
Traditional Process vs DataReach
Elapsed Time Components
1 GB of Data
DB2 to Oracle on HP/UX
[Chart: stacked elapsed time (h:mm:ss) for the Traditional Process (DB2 Unload Time, File Transfer Time, Oracle SQL Load Time) and the DataReach Process (DataReach Process Time); data labels 0:20:06, 0:07:10, 0:27:53, 0:16:35]
DataReach: Operational Considerations
Data Consistency: Quiesce DB2
High Availability: Use a mirror copy in Symmetrix
Security: DataReach Authorization Table in DB2
DB2 Read access
Unix Login
Target RDBMS authorizations
Extract, Transform, Move & Load Options
A Performance Perspective
Making The Right Choice
[Chart: M Bytes per Hour (0 to 4000) for Change Data Propagation, RYO Bulk Move Solutions, Parallel Unload/Load, Piping, DataReach]
High Performance Data Propagation
Strategy for Supporting Data Warehouse
[Diagram labels: Operational Applications (IMS, Fast Path, DB2, VSAM, Other); High Performance Data Propagation; Integration Area (Operational Data Store, Change History); Data Warehouse Refresh; Data Warehouse; Data Marts; Business Intelligence Systems]
High Performance Data Propagation
Strategy for Supporting DW & e-Business
[Diagram labels: Web Server; App. Server; Updates; Operational Applications (IMS, Fast Path, DB2, VSAM, Other); High Performance Data Propagation; Integration Area (Operational Data Store, Change History); Data Warehouse Refresh; Data Warehouse; Data Marts; Business Intelligence Systems]
High Performance Data Propagation
Strategy for Enterprise Application Integration
Note: This is a BMC Services Offering
[Diagram labels: Operational Applications (IMS, DB2, VSAM, Oracle, Other); High Performance Data Propagation; Messaging (Bulk Message Queue, Change Message Queue); ERP Tools (PeopleSoft, Baan, SAP); Data Warehouse; Data Mart; e-business Applications]
Major U.S. Brokerage Firm
Application Integration example
Global corporation headquartered in New York
City providing:
 Securities
 Asset Management
 Credit and transaction services
The Problem
Business challenge
 Migration to new strategic DBMS could not
impact business operations
Technical challenge
 Keep current ADABAS DBMS synchronized
with new strategic DB2 DBMS
 The solution had to be sustainable for the long term and also be scalable
The Solution



Client already had an ADABAS log capture mechanism and MQSeries.
A “Custom Adapter for Source MQSeries” to Change Data Move
 written in ASM
 runs as a started task
Primarily batch with over 700 files (as sources).
[Diagram labels: ADABAS; Batch Address Space; Log Capture; MQSeries Queue; MQGET; Custom Adapter; EDM Logger; EDM Log]
Major U.S. Bank
e-Business example
Provides anytime, anywhere access to
products and services through:
 Walk up services
 Automated Teller Machines (ATM)
 24-Hour Phone Banking
 Internet banking
Offices in 17 Midwestern and Western states
The Problem
Business Challenge
 Multiple access methods drive a need to
provide a common method to authenticate
an account owner
Technical Challenge
 Account verification information is
maintained in purchased IMS application
 Move to leading edge Storage Area Network
technology and required integration.
The Solution
Target is not a “conventional DBMS” but a storage area network.
High data volumes
Target data written to MQSeries
[Diagram labels: LRP; Process Action Controller; TNR; End UOW; Data; DB2; Custom Adapter; MQPUT; MQSeries Queue]
High Performance Data Propagation
Facilitating DBMS Migrations

Change target DBMS without impacting operational applications

Move target DB from Sybase to Oracle to SQL Server to UDB to ??
[Diagram labels: User Application; sources DB2, IMS, Fast Path, VSAM; ECCR; EDP Logger; EDP Log; LRP; TNR; BMC Apply; OEM Apply; targets DB2, Oracle, SQL Server, Sybase, UDB]
BMC’s Data Propagation is Different?
Transaction based data propagation supports applications executing hundreds of transactions/second
For IMS, Fast Path, CICS VSAM and VSAM Batch
 Does not use IBM* capture exits, logs, or require any additional logging
 Automatically transforms non-relational data structures to relational
 Supports “Near-Real-Time” with minimum latency for target updates
 No requirement for DB2 staging tables and associated logging
 Captures changes from VSAM batch applications even when no logs are used
For DB2
 No requirement for DB2 staging tables and associated logging
 Transaction consistent propagation
 Supports “Near-Real-Time” with minimum latency for target updates
Component of a Complete Enterprise Data Movement Solution
 Common management console - Easy to administer
 Integrated restart/recovery of the propagation process
 Shared data transformations