Transcript dama-jun00
High Performance
Enterprise
Data Propagation
Russell Donovan
BMC Company Profile
Established in 1980
Leader in Application Management
Estimated FY2000 Revenues of $1.8B
Over 6,000 Employees
Development Labs in Austin (TX), Conyers (GA),
Houston, San Jose, Sunnyvale (CA), Waltham (MA)
Germany, Israel, Singapore
Market Coverage in Over 50 Countries
Member of the S&P 500
BMC Software e-Business Availability
Provides application management solutions that ensure
the availability, performance, and recovery of business-critical applications.
We call this application service assurance and it
means that the applications companies and their
customers rely on will be there when they need
them.
e-vailability - We Guarantee Our Solutions!
Enterprise Data Propagation (EDP)
Requirement For All Enterprises
Need to synchronize data between legacy systems
and distributed relational databases for:
Data warehousing, operational data stores, data mining
e-Business applications access to legacy data
Enterprise application integration
Distributed enterprises, ERP solutions, Acquisitions
70% of corporate data in IMS, VSAM, DB2
Need high performance solutions
Need near real time solutions
[Diagram: IMS, DB2, VSAM, Other]
Data Propagation - Strategies For
Synchronizing Multiple Copies of Data
Copy
Unload/Load
SQL Query
Distributed Database
2 Phase Commit
Change Capture with Asynchronous Propagation
[Diagram: Source → Target]
Key Challenges Implementing a Data
Warehouse
Data Management Review Survey
Business rule analysis
Managing End User Expectation
Business data modeling
Reliability and integrity of data
Data acquisition
Meta Data management
Managing Management Expectation
Database performance
Data Warehouse Implementations
For Customers With
Large operational databases
High transaction rates
24x7 operations requirements
Critical Management Issues
Availability of operational systems
Performance of operational transactions
Maintaining service levels
Increasing volumes of data
Time required to load and refresh data warehouse
Quality, currency & accuracy of decision making data
Building Data Warehouses:
A Perspective
[Diagram labels: VSAM Files; Operational Databases; Operational Database Images; Subject Oriented Operational Images; Time Variant Subject Oriented Data Warehouses; Data Marts]
Building Data Warehouses:
A Perspective
Mainframe Tools
Prism, ETI
Carleton, SAS
Platinum
Data Warehouse
Data
WareHouse
DB2
IMS
VSAM
Oper.
Data
Store
Data
Mart
Data
Mart
End-Users
Query Tools
Brio, Bus. Objects
COGNOS
Microstrategy
Building Data Warehouses:
A Perspective
Mainframe Tools
Prism, ETI
Carleton, SAS
Platinum
Data Warehouse
Integration
DB2
IMS
Area
Data
WareHouse
Oper.
Data
Store
VSAM
Dist. Systems Tools
Informatica, Constellar
D2K, Sagent, Ardent
Data
Mart
Data
Mart
End-Users
Query Tools
Brio, Bus. Objects
COGNOS
Microstrategy
Building Data Warehouses:
A Perspective
Mainframe Tools
Prism, ETI
Carleton, SAS
Platinum
BMC
Solution
Data Warehouse
Integration
Area
DB2
Bulk DataMove
IMS
Data
WareHouse
Oper.
Data
Store
ChangeDataMove
VSAM
Dist. Systems Tools
Informatica, Constellar
D2K, Sagent, Ardent
Data
Mart
End-Users
Query Tools
Data
Mart
Brio, Bus. Objects
COGNOS
Microstrategy
Change Data Propagation:
A Perspective
Change Data Propagation Is Preferred When:
Databases are large and bulk move would take too long
Batch window limitations
Database availability limitations
Support for 24 x 7 is a requirement of operational application
Minimum latency “Near-real-time” is required in target database!
Currency of information in target database is important
Small percentage of a large database has changed
Need to reduce network traffic by transmitting only data changes
Source
Target
Transaction Based
Change Data Propagation
Synchronous Data Propagation
Original update waits until all targets are updated
Single, global transaction with multi-site, coordinated commit processing
Asynchronous Data Propagation
Propagation of updates occurs asynchronously to the originating transaction
Minimizes resource consumption at source
Minimizes impact on source transaction response times
[Diagrams: Source → Target]
Synchronous vs Asynchronous
Change Data Propagation
Synchronous 2 Phased Commit
Source transaction completes
when all databases updated
Advantages:
Real time propagation
All sites always synchronized
Disadvantages:
Transaction response time
Data availability impact
System resiliency
Usually not practical
Asynchronous Data Propagation
Source transaction does not wait
for target databases to be
updated
Advantages:
Minimum performance impact
Availability
Autonomy
Recoverability
Disadvantages:
Target location updates may be delayed
All sites not always synchronized
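The trade-off on this slide can be sketched in a few lines of Python. This is an illustrative simulation only, not BMC's implementation: `Target`, `sync_update`, and `AsyncPropagator` are hypothetical names, and the "databases" are in-memory dicts.

```python
import queue
import threading


class Target:
    """Hypothetical target database, modeled as an in-memory dict."""

    def __init__(self):
        self.rows = {}

    def apply(self, key, value):
        self.rows[key] = value


def sync_update(source, targets, key, value):
    """Synchronous style: the source update returns only after every target is updated."""
    source[key] = value
    for t in targets:
        t.apply(key, value)  # the source transaction waits here


class AsyncPropagator:
    """Asynchronous style: the source records the change and returns at once."""

    def __init__(self, targets):
        self.targets = targets
        self.changes = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def update(self, source, key, value):
        source[key] = value
        self.changes.put((key, value))  # no wait on the targets

    def _run(self):
        # Background worker: applies queued changes in arrival order.
        while True:
            key, value = self.changes.get()
            for t in self.targets:
                t.apply(key, value)
            self.changes.task_done()

    def drain(self):
        """Block until all queued changes have been applied (for demonstration)."""
        self.changes.join()
```

Between `update` and `drain`, the source and target may differ, which is the "all sites not always synchronized" disadvantage listed above.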
Asynchronous Change Capture:
Implementation Considerations
Trigger Based
Triggers used to capture changes to database records
Incremental updates collected in staging tables
Significant resource consumption for triggers and logging
Typically low volume applications (< 20 transactions/second)
Log Exit Based
Increased logging in operational environment
Increased response times for source transactions
Increased resource consumption
Log management issues
Log Post Process Based
Increased logging in operational environment
Log management issues
Long latency interval cannot support near real time
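As a small illustration of the trigger-based option, the following sketch uses SQLite through Python's standard `sqlite3` module. The `account` table, the `account_changes` staging table, and the trigger names are hypothetical; mainframe DBMS trigger syntax will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);

-- Staging table that collects incremental changes (hypothetical layout).
CREATE TABLE account_changes (
    seq     INTEGER PRIMARY KEY AUTOINCREMENT,
    op      TEXT,
    id      INTEGER,
    balance INTEGER
);

CREATE TRIGGER account_ins AFTER INSERT ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('I', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_upd AFTER UPDATE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('U', NEW.id, NEW.balance);
END;
CREATE TRIGGER account_del AFTER DELETE ON account BEGIN
    INSERT INTO account_changes (op, id, balance) VALUES ('D', OLD.id, OLD.balance);
END;
""")

# Every source write now also writes a staging row.
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")
conn.execute("DELETE FROM account WHERE id = 1")
conn.commit()

changes = conn.execute(
    "SELECT op, id, balance FROM account_changes ORDER BY seq"
).fetchall()
```

Each source statement performs a second write inside the same transaction, which is one way to read the resource consumption concern noted for triggers.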
Enterprise Data Propagation (EDP)
The BMC Solution
A Data Propagation Management System
A single point of access for managing legacy data propagation across the enterprise
DB2
IMS
Fast
Path
VSAM
Bulk Data
Change Data
Efficient change capture
Basic data transformation
High performance data movement
High performance utilities
Operational
Data
Store
Common look and feel
Integrated transformations and mappings
Integrated recovery/restart
ChangeDataMove:
Product Positioning
Positioning
ChangeDataMove is a high performance, efficient, change data
propagation solution, which captures changes made to IMS, Fast Path,
VSAM, and DB2 databases, and propagates those changes to the most
prevalent relational databases.
What It Does
Transaction-based data propagation
Supports high volume production applications with hundreds of transactions per second
Supports ‘near real-time’ as well as scheduled data propagation
Advantages
A data propagation system (complete solution vs a point product)
Highly efficient change capture does not impact applications
Only solution for IMS, FastPath and VSAM that does not require logging
Optionally integrated with DataMove for bulk data movement
Change Data Propagation
for IMS and VSAM
Synchronous Change Capture
Transparent high performance change capture
Minimum impact on source system logging, CPU & user response time
Data is available immediately for asynchronous propagation
Asynchronous Data Propagation
Data Propagated Within Context of Original Transaction
Updates applied in proper sequence
Inter and intra-table consistency
Source and target(s) consistent within transaction boundaries
Not affected by network delays or slow remote processors
Supports “Near Real Time” and/or Scheduled Propagation
[Diagram: updates 1, 2, 3 → EDP Log → EDP Apply → updates 1, 2, 3]
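One way to picture "updates applied in proper sequence" and "consistent within transaction boundaries" is an apply step that holds back each unit of work until its commit record arrives. The sketch below is an assumption-laden illustration: the record layout `(seq, uow_id, op, key, value)` and the function name `apply_committed` are invented for the example and do not describe the EDP log format.

```python
from collections import defaultdict


def apply_committed(log_records, target):
    """Apply change records to `target` one unit of work (UOW) at a time.

    Each record is a hypothetical tuple (seq, uow_id, op, key, value), where
    op is 'PUT', 'DEL', or 'COMMIT'. Changes belonging to a UOW are buffered
    until its COMMIT record is seen, then applied in their original order.
    Returns the ids of UOWs that are still uncommitted.
    """
    pending = defaultdict(list)
    for seq, uow, op, key, value in sorted(log_records, key=lambda r: r[0]):
        if op == "COMMIT":
            for p_op, p_key, p_value in pending.pop(uow, []):
                if p_op == "PUT":
                    target[p_key] = p_value
                else:
                    target.pop(p_key, None)
        else:
            pending[uow].append((op, key, value))
    return sorted(pending)
```

For example, if UOW "A" commits and UOW "B" has not yet committed, only A's changes reach the target, so the target never shows a partial transaction.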
IMS Change Capture
Resides within the IMS environment
Captures DL/I calls as they occur
Supports IMS/TM (MPP,BMP), Fast Path, CICS DBCTL, Batch DL/I
Commits updates at transaction or job (batch) end
User
Application
IMS
Subsystem
Based on BMC Software’s CHANGE RECORDING FACILITY
EDP
Logger
IMS
Database
ECCR
EDP
Log
LRP
BMC
Apply
DB2
Oracle
SQL
Server
TNR
OEM
Apply
Sybase
UDB
CICS/VSAM Change Capture
Captures changes at each Get, Put & Erase request
Utilizes CICS TRUE, File, and Re-sync exits
Resides as functional part of CICS address space
Participates in two phase commit with CICS transaction
Updates are committed when transaction commits
User
Application
BMC
Apply
CICS
Subsystem
EDP
Logger
VSAM
Database
LRP
Oracle
SQL
Server
TNR
ECCR
EDP
Log
DB2
OEM
Apply
Sybase
UDB
VSAM Batch Change Capture
JRNAD exit dynamically activated
ECCR resides within the batch address space
UOW is complete when application closes VSAM file
DB2
User
Application
Batch
Address
Space
Based on BMC Software’s RECOVERY PLUS for CICS/VSAM product.
BMC
Apply
EDP
Logger
VSAM
Database
ECCR
EDP
Log
LRP
Oracle
SQL
Server
TNR
OEM
Apply
Sybase
UDB
DB2 MVS Change Capture
Requires DB2 change data capture be activated
Reads log records via DB2 IFI, external decompression
Maintains multiple versions of schema
DB2
MVS
DB2
Uses DB2 IFI Facility
User
Application
DB2
ECCR
EDP
Logger
BMC
Apply
LRP
Oracle
SQL
Server
TNR
Sybase
EDP
Log
OEM
Apply
UDB
The Transformation Process
Transforms IMS, Fast Path and VSAM data to relational formats
Hierarchical structures to relational structures
Converts non-relational data types to relational
Uses relational DBMS catalog information
Uses copy libraries and IMS database descriptors
Automatically handles Dates, Times, Data Types
Repeating groups, Redefined records
Customizable through user exits
VSAM Files
Transformation
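As one concrete example of converting a non-relational data type, the sketch below decodes an IBM packed-decimal (COBOL COMP-3) field into a Python `Decimal`. Packed decimal is chosen here as an assumption; the slide does not name the specific data types handled, and `unpack_comp3` is a hypothetical helper, not part of any BMC product.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field.

    Each byte holds two 4-bit digits; the final nibble is the sign
    (0xD or 0xB means negative, other sign nibbles are treated as positive).
    `scale` is the number of implied decimal places from the copybook.
    """
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()
    if any(d > 9 for d in nibbles):
        raise ValueError("invalid packed-decimal digit")
    value = int("".join(str(d) for d in nibbles) or "0")
    if sign in (0x0B, 0x0D):
        value = -value
    return Decimal(value).scaleb(-scale)
```

A field defined as `PIC S9(3)V99 COMP-3` holding the bytes `12 34 5C` would decode with `scale=2`.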
Possible Target Keys
To allow resulting target rows to be unique
Replication Key (REPKEY)
This key will make the target row unique
For IMS it is the full concatenated key or the segment's RBA
Ancestor Keys
If REPKEY is a composite key (i.e., IMS concatenated key) each level is available to be used as the key of the target row
Sequential number
If a single input segment or record creates multiple output rows, a sequential numeric column can be generated.
Any field in the input segment or record
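The key options above can be combined when one input segment produces several output rows. The following sketch is illustrative only: `target_rows`, the `key_level_N` column names, and `seq_no` are hypothetical and do not reflect actual product column naming.

```python
def target_rows(concatenated_key, occurrences):
    """Build target rows for one input segment that yields several output rows.

    concatenated_key: tuple of key values from the root down to the segment
        (a stand-in for an IMS concatenated key used as the REPKEY).
    occurrences: the values that each become one output row.
    """
    rows = []
    for seq_no, value in enumerate(occurrences, start=1):
        # Ancestor keys: each level of the composite key becomes its own column.
        row = {f"key_level_{i}": k for i, k in enumerate(concatenated_key, start=1)}
        row["seq_no"] = seq_no  # generated sequential number keeps rows unique
        row["value"] = value
        rows.append(row)
    return rows
```

With the key `("CUST01", "ORD7")` and two occurrences, both rows share the ancestor key columns and differ only in `seq_no` and `value`.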
Transforming Cobol Structures
Repeating Groups
All repeated fields mapped to a single target column
As individual rows in the same or a different table
Update results in set of deletes and inserts for target rows
Redefined Records
Are assigned unique names and schema definitions
Record identification exit identifies record types
Schema applied to segment or record based on redefined record type
Redefined records can be propagated to same or different targets
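Both ideas on this slide can be sketched briefly. The code below is a hedged illustration: the fixed-width layouts in `SCHEMAS`, the rule that the first byte identifies the record type, and all function names are assumptions made for the example.

```python
def repeating_group_changes(parent_key, old_items, new_items):
    """Express an update to a repeating group as deletes followed by inserts.

    Returns (op, parent_key, position, value) tuples for the target rows.
    """
    ops = [("DELETE", parent_key, pos, val) for pos, val in enumerate(old_items, start=1)]
    ops += [("INSERT", parent_key, pos, val) for pos, val in enumerate(new_items, start=1)]
    return ops


# Hypothetical layouts for two redefinitions: (column, start, end) offsets.
SCHEMAS = {
    "H": [("order_id", 1, 7), ("customer", 7, 15)],
    "D": [("order_id", 1, 7), ("item", 7, 12), ("qty", 12, 15)],
}


def identify_record_type(record: str) -> str:
    """Stand-in for a record identification exit: here the first byte is the type."""
    return record[0]


def apply_schema(record: str):
    """Pick the schema for the record's type and slice out its columns."""
    rtype = identify_record_type(record)
    return rtype, {name: record[a:b].strip() for name, a, b in SCHEMAS[rtype]}
```

Because each record type carries its own schema, the two types could be routed to the same target table or to different ones.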
High Performance Transport & Apply
Data is blocked, compressed and encrypted
Multi-threaded apply tasks for increased performance
EDP
Apply
Send
TRANSPORT
Receive
TCP/IP
TRANSPORT
DB2
Dynamic
Memory
Staging
Queue
Oracle
Dynamic
Memory
Queue
EDP
Apply
EDP
Apply
EDP
Apply
EDP
Apply
EDP
Apply
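A minimal sketch of blocking, compression, and multi-threaded apply follows, using only the Python standard library. It is not the EDP transport: the JSON block format, the block size, and the partition-by-key scheme are assumptions, and encryption is omitted because the standard library offers no cipher suitable for it.

```python
import json
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_blocks(changes, block_size=3):
    """Group change records into blocks and compress each block before sending."""
    for i in range(0, len(changes), block_size):
        payload = json.dumps(changes[i:i + block_size]).encode("utf-8")
        yield zlib.compress(payload)


def read_blocks(blocks):
    """Receiver side: decompress each block and yield records in original order."""
    for blob in blocks:
        yield from json.loads(zlib.decompress(blob).decode("utf-8"))


def apply_parallel(changes, workers=2):
    """Apply [key, value] changes with several threads.

    Changes are partitioned by key so that all changes to the same key stay
    on one thread and keep their relative order.
    """
    partitions = [[] for _ in range(workers)]
    for key, value in changes:
        partitions[zlib.crc32(key.encode("utf-8")) % workers].append((key, value))
    target = {}

    def apply(part):
        for key, value in part:
            target[key] = value

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(apply, partitions))
    return target
```

Partitioning by key is one simple way to parallelize without reordering updates to the same row; it does not by itself preserve cross-row transaction boundaries.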
Automated Schema Replication
Reduce administration costs by automating the creation of target tables from IMS, VSAM, and DB2 source schema
DB2
DBD
Copybook
Copybook
VSAM Files
SchemaMove
Oracle
DB2
Catalog
MS SQL
Server
Bulk Data Propagation
Bulk move is usually simpler and easier to implement
Needed to initially create or to refresh a target database
Bulk move is the preferred solution when:
Data volumes are not large and the move can be performed within time constraints
Database availability is not a concern (source/target)
Network volumes and network overhead are not issues
Currency of information in target database is not a concern
Change data propagation cannot handle the volumes
Source
Target
Bulk Data Movement DB2 to Oracle
The Traditional Approach
[Diagram: MVS Host (DB2 → DB2 Extract → File → Gateway) → TCP/IP → UNIX Server (Gateway → File → Oracle Loader → Oracle)]
Time
DB2 Unload 20 min. (35%)
File Transfer 7 min. (13%)
Oracle SQL Load 28 min. (52%)
Total Time 55 min.
Bulk Data Movement DB2 to Oracle
Parallel Unload Parallel load
Time
MVS
Host
File
DB2
Extract
DB2
DB2
Parallel Unload 7 min
Gateway
TCP/IP
File Transfer 7 min.
UNIX
Server
Gateway
File
Oracle
Oracle
Loader
Oracle SQL Load 16 min.
Oracle
Total Time 30 min.
Bulk Data Movement DB2 to Oracle
Parallel Unload/load & Piping
MVS
Host
File
DB2
Extract
DB2
DB2
Parallel Unload
Gateway
PIPING
TCP/IP
UNIX
Server
Parallel load
Oracle load starts as first
record is read from DB2
Gateway
File
Oracle
Oracle
Loader
Oracle
Total Time 17 min.
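The three totals on these slides follow from simple arithmetic, worked below. The sketch reads the parallel load figure as 16 minutes (so that 7 + 7 + 16 gives the stated 30), and the function names are made up for the example.

```python
def serial_total(stage_minutes):
    """Stages run one after another, so elapsed time is the sum."""
    return sum(stage_minutes)


def piped_lower_bound(stage_minutes):
    """With piping the stages overlap, so elapsed time cannot be less
    than the slowest single stage."""
    return max(stage_minutes)


traditional = serial_total([20, 7, 28])      # unload, transfer, load
parallel = serial_total([7, 7, 16])          # parallel unload and load
piped_floor = piped_lower_bound([7, 7, 16])  # stages overlapped

print(traditional, parallel, piped_floor)
```

The piped floor of 16 minutes sits just under the 17 minutes quoted on the slide; the difference is plausibly pipeline start-up and drain time, although the slide does not break it down.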
DataReach: Product positioning
Positioning
DataReach is a high performance, high availability data movement solution
for extracting MVS/ESA DB2 data and loading it into Informix, Oracle or
Sybase database on Unix.
A joint development effort of EMC & BMC - Not A Product We Sell Today
What It Does
Uses EMC Storage to move data at channel speeds vs network speeds
Moves the work of extracting DB2 MVS data from MVS to Unix
Advantages
Moves data 10 to 100 times faster than network solutions
Completely eliminates mainframe processing
Completely eliminates network traffic and network overhead
Allows nearly 100% availability of the source DB2 database
Enables customers to more frequently refresh data warehouses
Bulk Data Movement DB2 to Oracle
The DataReach Approach
DataReach Directly Extracts DB2 Data
Eliminates network traffic & network overhead
Familiar SQL-based SELECT syntax
Subset of data via WHERE predicate
Optional parallel extraction capability
Optional access via DB2 Index structures
Data conversion
EBCDIC to ASCII
DB2 to generic format
Direct load of Oracle, Sybase, Informix
Optional parallel load capability
Distributed capabilities
Intermediate
File
MVS
Host
DB2
DB2
UNIX
Host
DB2
Extract
Oracle
Loader
Oracle
Oracle
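The EBCDIC to ASCII step can be illustrated with Python's built-in codecs. The code page `cp037` (US/Canada EBCDIC) is an assumption; the slide does not say which code pages DataReach handles, and `ebcdic_to_ascii` is a hypothetical helper.

```python
def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> bytes:
    """Convert EBCDIC character data to ASCII bytes.

    Raises UnicodeEncodeError if a character has no ASCII equivalent.
    Only character columns should pass through this; binary and
    packed-decimal columns need their own conversions.
    """
    return raw.decode(codepage).encode("ascii")
```

For instance, the EBCDIC bytes `C4 C2 F2` convert to the ASCII text `DB2`.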
DataReach: How It Works
MVS System
DB2
Escon Channels
CKD Volumes
FBA Volumes
DB2
Source
Target
DBMS
SYMMETRIX
ESP
SCSI Channels
Extractor
UNIX
Translation
Module
Target
DBMS
Native load utility
Target RDBMS
Flat
File
DataReach: Performance Benchmark
DB2 to Oracle on HP/UX
[Chart: elapsed time in minutes (0 to 1,000) for Traditional vs DataReach at data sizes of 10 Mbytes, 100 Mbytes, 1 Gbyte, 5 Gbytes, and 10 Gbytes]
Traditional Process vs DataReach
Elapsed Time Components
1 GB of Data
DB2 to Oracle on HP/UX
[Chart: stacked elapsed time (h:mm:ss), axis 0:00:00 to 1:04:48]
Traditional Process: DB2 Unload Time 0:20:06, File Transfer Time 0:07:10, Oracle SQL Load Time 0:27:53
DataReach Process: DataReach Process Time 0:16:35
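The speedup implied by this chart can be worked out from its data labels. The assignment of the four labels to components is an inference: 0:20:06, 0:07:10, and 0:27:53 are taken as the unload, transfer, and load times because they match the 20, 7, and 28 minute figures on the earlier traditional-approach slide, and 0:16:35 is taken as the DataReach time.

```python
def seconds(hms: str) -> int:
    """Convert an h:mm:ss label to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# Assumed mapping of chart labels to components (see note above).
traditional = sum(seconds(t) for t in ("0:20:06", "0:07:10", "0:27:53"))
datareach = seconds("0:16:35")
speedup = traditional / datareach

print(traditional, datareach, round(speedup, 2))
```

Under that reading, the traditional process takes 3,309 seconds (0:55:09) against 995 seconds for DataReach, a ratio of roughly 3.3 for this 1 GB case.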
DataReach: Operational Considerations
Data Consistency: Quiesce DB2
High Availability: Use A mirror copy in Symmetrix
Security: DataReach Authorization Table in DB2
DB2 Read access
Unix Login
Target RDBMS authorizations
Extract, Transform, Move & Load Options
A Performance Perspective
Making The Right Choice
[Chart: M Bytes per Hour (0 to 4000) for Change Data Propagation, RYO Bulk Move Solutions, DataReach, Parallel Unload/Load, and Piping]
High Performance Data Propagation
Strategy for Supporting Data Warehouse
Operational
Applications
IMS
Fast Path
DB2
VSAM
Other
Integration Area
Operational
Data
Store
Change
History
High
Performance
Data Propagation
Data
Warehouse
Data Warehouse
Refresh
Data
Mart
Business Intelligence Systems
Data
Mart
High Performance Data Propagation
Strategy for Supporting DW & e-Business
Operational
Applications
Web
Server
Updates
App.
Server
IMS
Fast Path
DB2
Integration Area
VSAM
Operational
Data
Store
Other
Change
History
High
Performance
Data Propagation
Data
Warehouse
Data Warehouse
Refresh
Data
Mart
Business Intelligence Systems
Data
Mart
High Performance Data Propagation
Strategy for Enterprise Application Integration
Operational
Applications
Note: This is a BMC Services Offering
IMS
DB2
PeopleSoft
Baan
VSAM
Other
Oracle
Messaging
Bulk
Message Queue
High
Performance
Data Propagation
SAP
ERP
Tools
Change
Message Queue
Data
Warehouse
Data
Mart
e-business
Applications
Major U.S. Brokerage Firm
Application Integration example
Global corporation headquartered in New York
City providing:
Securities
Asset Management
Credit and transaction services
The Problem
Business challenge
Migration to new strategic DBMS could not
impact business operations
Technical challenge
Keep current ADABAS DBMS synchronized
with new strategic DB2 DBMS
The solution had to be sustainable for the long term and also be scalable
The Solution
Client already had an ADABAS log capture mechanism and MQSeries.
A “Custom Adapter for Source MQSeries” to Change Data Move
written in ASM
runs as a started task
Primarily batch with over 700 files (as sources).
ADABAS
Batch
Log Capture Address
Space
MQSeries
Queue
MQGET
EDM
Logger
Custom
Adapter
EDM
Log
Major U.S. Bank
e-Business example
Provides anytime, anywhere access to
products and services through:
Walk up services
Automated Teller Machines (ATM)
24-Hour Phone Banking
Internet banking
Offices in 17 Midwestern and Western states
The Problem
Business Challenge
Multiple access methods drive a need to
provide a common method to authenticate
an account owner
Technical Challenge
Account verification information is
maintained in purchased IMS application
Move to leading edge Storage Area Network
technology and required integration.
The Solution
Process Action Controller
LRP
TNR
Target is not a “conventional DBMS” but
a storage area network.
High data volumes
Target data written to MQSeries
End UOW
Data
DB2
Custom
Adapter
MQSeries
Queue
MQPUT
High Performance Data Propagation
Facilitating DBMS Migrations
Change target DBMS without impacting operational applications
Move target DB from Sybase to Oracle to SQL Server to UDB to ??
DB2
User
Application
DB2
IMS
Fast Path
VSAM
BMC
Apply
EDP
Logger
ECCR
EDP
Log
LRP
Oracle
SQL
Server
TNR
OEM
Apply
Sybase
UDB
BMC’s Data Propagation is Different?
Transaction based data propagation supports applications executing hundreds of transactions/second
For IMS, Fast Path, CICS VSAM and VSAM Batch
Does not use IBM* capture exits, logs, or require any additional logging
Automatically transforms non-relational data structures to relational
Supports “Near-Real-Time” with minimum latency for target updates
No requirement for DB2 staging tables and associated logging
Captures changes from VSAM batch applications even when no logs are used
For DB2
No requirement for DB2 staging tables and associated logging
Transaction consistent propagation
Supports “Near-Real-Time” with minimum latency for target updates
Component of a Complete Enterprise Data Movement Solution
Common management console - Easy to administer
Integrated restart/recovery of the propagation process
Shared data transformations