A Year in the Life: Managing a CASE Project

Application Integration in an E-Commerce World
Leslie M. Tierstein
STR LLC

Application Integration

Overview
– Acquire data from one or more sources
– Transform its meaning and/or format
– Deliver it to one or more targets

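The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function `integrate`, the in-memory `orders` source, and the `warehouse` and `audit_log` targets are hypothetical names, not part of any product discussed in this talk.

```python
from typing import Callable, Iterable

Record = dict

def integrate(
    sources: Iterable[Callable[[], Iterable[Record]]],
    transform: Callable[[Record], Record],
    targets: Iterable[Callable[[Record], None]],
) -> int:
    """Acquire from every source, transform each record, deliver to every target."""
    targets = list(targets)
    delivered = 0
    for acquire in sources:
        for record in acquire():
            out = transform(record)
            for deliver in targets:
                deliver(out)
            delivered += 1
    return delivered

# Hypothetical in-memory source and targets, for illustration only.
orders = [{"id": 1, "amt": "12.50"}, {"id": 2, "amt": "7.25"}]
warehouse, audit_log = [], []

count = integrate(
    sources=[lambda: orders],
    # Change both the format (string to integer) and the meaning (dollars to cents).
    transform=lambda r: {"order_id": r["id"], "amount_cents": round(float(r["amt"]) * 100)},
    targets=[warehouse.append, audit_log.append],
)
```

Keeping the sources, the transform, and the targets as separate arguments mirrors the "one or more sources" and "one or more targets" wording above.
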
Application Integration

Processing scenarios:
– Data warehouse loads
– Conversions from legacy systems or application interfaces performed in a “batch window”
– Ongoing “real-time” interfaces
What are the properties of each scenario?
How are the scenarios different/the same?

Warehouse Loads (1)

Repeatable, regularly scheduled
– Data is initially loaded (see Conversions)
– Then it is “refreshed”, typically via Change Data Propagation
Load must ensure consistent user views
– “Checkpoint” OLAP environment

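A minimal sketch of an initial load followed by a refresh driven by captured changes, using Python's built-in sqlite3 module as a stand-in for the real source and warehouse databases. The table names, trigger names, and the `refresh` function are assumptions made for illustration. Deletes are not propagated in this sketch.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);      -- OLTP source
    CREATE TABLE customer_dim (id INTEGER PRIMARY KEY, name TEXT);  -- warehouse target
    CREATE TABLE change_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, id INTEGER);

    -- Triggers record which source rows changed since the last refresh.
    CREATE TRIGGER customer_ins AFTER INSERT ON customer
    BEGIN INSERT INTO change_log (id) VALUES (NEW.id); END;
    CREATE TRIGGER customer_upd AFTER UPDATE ON customer
    BEGIN INSERT INTO change_log (id) VALUES (NEW.id); END;
""")

def refresh(conn):
    """Apply logged changes to the target in one transaction, then clear the log."""
    with conn:  # commits on success, rolls back on error
        ids = [row[0] for row in conn.execute("SELECT DISTINCT id FROM change_log")]
        conn.execute(
            "INSERT OR REPLACE INTO customer_dim (id, name) "
            "SELECT id, name FROM customer "
            "WHERE id IN (SELECT id FROM change_log)"
        )
        conn.execute("DELETE FROM change_log")
    return len(ids)

db.execute("INSERT INTO customer VALUES (1, 'Acme')")
db.execute("INSERT INTO customer VALUES (2, 'Bolt')")
first = refresh(db)    # initial load: both rows
db.execute("UPDATE customer SET name = 'Acme Corp' WHERE id = 1")
second = refresh(db)   # refresh: only the changed row
```

Doing the whole refresh inside one transaction is one simple way to keep user views consistent: readers see the target either before or after the refresh, never halfway through.
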
Warehouse Loads (2)

Data must be “transformed”
– From third-normal form OLTP source systems to star-schema OLAP target systems
– Possibly to an Operational Data Store (ODS)
– Usually from multiple, heterogeneous sources
– Summarization may also be required to the desired level of detail

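Summarization can be illustrated with a short roll-up. The sketch below assumes hypothetical order-line tuples and a target grain of one fact row per date and product; both are invented for the example.

```python
from collections import defaultdict

# Hypothetical OLTP order lines: (order_date, product, quantity, unit_price_cents)
order_lines = [
    ("2000-03-01", "widget", 2, 500),
    ("2000-03-01", "widget", 1, 500),
    ("2000-03-01", "gadget", 4, 250),
    ("2000-03-02", "widget", 3, 500),
]

def summarize_daily(lines):
    """Roll order lines up to one fact row per (date, product)."""
    facts = defaultdict(lambda: {"quantity": 0, "revenue_cents": 0})
    for order_date, product, qty, price in lines:
        fact = facts[(order_date, product)]
        fact["quantity"] += qty
        fact["revenue_cents"] += qty * price
    return dict(facts)

daily_facts = summarize_daily(order_lines)
```

Changing the key tuple changes the level of detail, for example keying on product alone for a coarser summary.
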
Warehouse Loads (3)

Vast amounts of operational data
Importance of metadata
– Oracle’s Common Warehouse Metadata (CWM)

Legacy Conversions

“One-time” task
– In “big-bang” implementation
– Some phased implementations need conversions repeated numerous times
– Scheduled “cut over” to the new system
Data in the source system is expendable after it is converted -- “quick and dirty” is an option

Application Interfaces (1)

Repeatable
– Regularly scheduled (“batch”)
– Event-driven (“near” real-time)
Small to large volumes of data
Operational data at both ends
– Source and target
– Custom, COTS, external applications (owned by another entity/business)

Old Terminology

E(T)TL
– Extract, (Transport,) Transform, Load
    – Extract source data
    – (Transport data to new platform)
    – Transform data to new format
    – Load data into new database
– Typically applied to batch application integration or warehouse loads

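As a rough sketch of a batch extract, transform, and load run, the example below reads a hypothetical legacy CSV export and loads it into an in-memory SQLite table. The column names, the `customer` table, and the cleanup choices are assumptions for illustration; the optional transport step is omitted because everything runs in one process.

```python
import csv
import io
import sqlite3

# Hypothetical legacy export, as it might arrive in a flat file.
LEGACY_EXPORT = "CUST_NO,CUST_NAME,BALANCE\n001, ACME SUPPLY ,12.50\n002,BOLT & NUT CO,0.00\n"

def extract(text):
    """Extract: read source rows from the export."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(row):
    """Transform: new types, trimmed and re-cased names, balance in cents."""
    return (
        int(row["CUST_NO"]),
        row["CUST_NAME"].strip().title(),
        round(float(row["BALANCE"]) * 100),
    )

def load(conn, rows):
    """Load: insert all rows into the new database in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, balance_cents INTEGER)"
    )
    with conn:
        conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, [transform(r) for r in extract(LEGACY_EXPORT)])
loaded = conn.execute("SELECT id, name, balance_cents FROM customer ORDER BY id").fetchall()
```
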
Newer Terminology

EAI
– Enterprise Application Integration
    – Acquire data from source application(s)
    – Transform data
    – Deliver data to target application(s)
– Exchange of data between two or more applications

Newest Terminology (1)

A2A: Application to Application Integration
– Exchange of data between two or more applications, typically without a web interface
– May be “real-time” or batch
– “Interfaces” between systems/applications (cf: Oracle Applications Interface tables)

Newest Terminology (2)

B2C: Business to Consumer Integration
– A consumer, via a web site, interacts with software owned by one business
– The business’s corporate database(s) is (are) queried in the transaction
– The business’s corporate database(s) is (are) updated as a result of the transaction

Newest Terminology (3)

B2B: Business to Business Integration
– “I’ll have my computer call your computer”
– A transaction in one business’s computer automatically triggers a transaction in another business’s computer
– B2B integration may be under the covers in B2C scenarios or performed independent of B2C transactions

Newest Terminology (4)

B2B:
“I took my notepad from my shirt pocket and displayed a
standard contract … She glanced at it, then had her own
computer scrutinize the document. Conversing in
modulated infrared, the machines rapidly negotiated the
fine details. My notepad signed the agreement on my
behalf, and Lansing’s did the same, and they both chimed
happily in unison to let us know that the deal had been
concluded.”
Greg Egan, “Cocoon”, ©1994
Extract/Acquire (1)

Online, real-time database access
– Native Oracle access
– ODBC/JDBC
– Oracle gateways ($$$)
– Heterogeneous replication packages (such as DataBridge)
– APIs (COTS packages such as SAP)

Extract/Acquire (2)

Alternate character sets
– EBCDIC, ASCII, Unicode
– 7-bit, 8-bit, 16-bit
Change Data Propagation (CDP)
– Triggers
– Event Logs

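A small sketch of character-set conversion during acquisition, using Python's standard codecs. It assumes the mainframe data is in EBCDIC code page 037, which is only one of many EBCDIC variants; a real extract would need the code page confirmed with the source system's owners.

```python
# "HELLO" as it might arrive from a mainframe extract, encoded in EBCDIC (code page 037).
ebcdic_bytes = b"\xc8\xc5\xd3\xd3\xd6"

text = ebcdic_bytes.decode("cp037")      # EBCDIC bytes -> Unicode text
ascii_bytes = text.encode("ascii")       # 7-bit ASCII, one byte per character
utf16_bytes = text.encode("utf-16-be")   # 16-bit code units, big-endian, no byte-order mark
```

Decoding to Unicode text first and encoding for the target second keeps the source and target character sets independent of each other.
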
Load/Deliver

Same access issues as Extract/Acquire

Transport (1)

Files
– Connectivity
    – LAN/WAN, Internet, Sneaker-Net
– Transfer protocols
    – ftp, proprietary, http, https
    – WAP

Transport (2)

Messages
– via queues
    – IBM MQ Series, Oracle AQ
    – Microsoft MSMQ, Java JMS
– via email
    – POP3, attachments

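The idea of transporting messages through a queue can be sketched in a single process with Python's standard `queue` and `threading` modules. This is only an in-process stand-in, not the API of any of the products named above; the `consumer` function and the message strings are hypothetical.

```python
import queue
import threading

def consumer(inbox, delivered):
    """Receive messages until the sentinel arrives, delivering each one."""
    while True:
        message = inbox.get()
        if message is None:                  # sentinel: no more messages
            break
        delivered.append(message.upper())    # stand-in for "deliver to target"

inbox = queue.Queue()
delivered = []
worker = threading.Thread(target=consumer, args=(inbox, delivered))
worker.start()

for message in ["order 1 created", "order 2 shipped"]:
    inbox.put(message)   # the sender does not wait for the receiver to process
inbox.put(None)
worker.join()
```

The point of the queue is decoupling: the sending application hands off the message and continues, and the receiving application processes it when it is ready.
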
Transform – Data Mapping (1)

Potential many-to-many mapping between sources and targets
– “Point-to-point” mappings
– vs. hub-and-spoke transformation engines
Algorithms to change the format and semantics of the data

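To make the trade-off between the two mapping styles concrete, the sketch below counts mappings under a worst-case assumption: every application exchanges data with every other application in both directions. The function names are hypothetical, and real landscapes are usually sparser than this.

```python
def point_to_point_mappings(n_apps: int) -> int:
    """One directed mapping for every ordered pair of distinct applications."""
    return n_apps * (n_apps - 1)

def hub_and_spoke_mappings(n_apps: int) -> int:
    """Each application maps once to and once from the hub's common format."""
    return 2 * n_apps

for n in (2, 5, 10):
    print(n, point_to_point_mappings(n), hub_and_spoke_mappings(n))
```

Under that assumption, ten applications need 90 point-to-point mappings but only 20 hub-and-spoke mappings, while for two applications the hub adds work rather than saving it.
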
Transform – Data Mapping (2)

Relationships
– 1:1, many:many - Facts of life
– 1:many
    – normalization - conversions, semantically overloaded attributes
– Many:1
    – mergers/acquisitions; multi-line text to LOB
Repository for impact analysis

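Two of the relationships above can be sketched in a few lines. The legacy code layout (two-character region, one-character customer type, numeric sequence) is invented for the example, as are the function names.

```python
def split_overloaded(code):
    """1:many - split a semantically overloaded legacy attribute into separate columns.

    Hypothetical layout: 2-char region + 1-char customer type + numeric sequence.
    """
    return {"region": code[:2], "customer_type": code[2], "sequence": int(code[3:])}

def merge_text_lines(lines):
    """Many:1 - collapse multi-line text rows into a single LOB-style value."""
    return "\n".join(line.rstrip() for line in lines)

parts = split_overloaded("NEW0042")
note = merge_text_lines(["Call before  ", "delivery."])
```
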
Data Transformation (1)

Data type translation
– Should be transparent (a la Oracle)
– Except for rare types (e.g., bit maps)
Mutually intelligible data
– XML
– Emerging XML standards

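A minimal sketch of XML as a mutually intelligible exchange format, using Python's standard `xml.etree.ElementTree`. The `order` element and its fields are hypothetical and do not follow any particular XML standard.

```python
import xml.etree.ElementTree as ET

def to_xml(order):
    """Sender side: serialize a record as a small XML document."""
    root = ET.Element("order", id=str(order["id"]))
    for name in ("customer", "amount"):
        ET.SubElement(root, name).text = str(order[name])
    return ET.tostring(root, encoding="unicode")

def from_xml(document):
    """Receiver side: parse the document back into a record."""
    root = ET.fromstring(document)
    return {
        "id": int(root.get("id")),
        "customer": root.findtext("customer"),
        "amount": root.findtext("amount"),
    }

message = to_xml({"id": 7, "customer": "Acme & Sons", "amount": "12.50"})
received = from_xml(message)
```

The library escapes the ampersand on the way out and restores it on the way in, so neither application has to know how the other stores the value.
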
Data Transformation (2)

Algorithms are often referred to as “business rules”
– Rules may range from simple assignments
– To complex lookups/translations on multiple columns, with referential integrity checks, data cleansing, functions, etc.
Rules and/or their components should be reusable

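One way to picture reusable rules is as small functions that are combined per target. The sketch below is an assumption-laden illustration: `assign`, `lookup`, `apply_rules`, the column names, and the country-code table are all hypothetical.

```python
def assign(target, source):
    """Simple assignment rule: copy one source column to one target column."""
    def rule(src, out):
        out[target] = src[source]
    return rule

def lookup(target, source, table):
    """Translation rule: map a source code through a lookup table, rejecting unknowns."""
    def rule(src, out):
        key = src[source]
        if key not in table:
            raise ValueError(f"no translation for {source}={key!r}")
        out[target] = table[key]
    return rule

def apply_rules(rules, src):
    """Build one target record by applying each rule in order."""
    out = {}
    for rule in rules:
        rule(src, out)
    return out

COUNTRY_CODES = {"USA": "US", "CAN": "CA"}  # hypothetical translation table

# The same lookup component is reused for two different source layouts.
customer_rules = [assign("name", "CUST_NAME"), lookup("country", "CTRY", COUNTRY_CODES)]
supplier_rules = [assign("name", "SUPP_NAME"), lookup("country", "CTRY", COUNTRY_CODES)]

customer = apply_rules(customer_rules, {"CUST_NAME": "Acme", "CTRY": "USA"})
```
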
Data Transformation (3)

Algorithms/business rules
– Ability to INSERT/UPDATE/DELETE
– Ability to produce multiple target records per one source, or one target per multiple sources
– Ability to track (and potentially reprocess) exceptions (not part of transform per se)

Data Transformation (4)

Data Cleansing - specialized transform process, applied to “dirty” legacy data
– Report on fixes, exceptions
– Ability to resubmit failed rows
– Third-party products
    – Merge-purge software (typically for addresses)

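The cleansing loop with exception reporting and resubmission might look like the sketch below. The validation rule (a five-digit ZIP code), the field names, and the functions `cleanse` and `run` are assumptions for illustration, not a description of any cleansing product.

```python
def cleanse(row):
    """Return a cleaned copy of the row, or raise ValueError if it cannot be fixed."""
    name = " ".join(row["name"].split())          # collapse stray whitespace
    zip_code = row["zip"].strip()
    if not (len(zip_code) == 5 and zip_code.isdigit()):
        raise ValueError(f"bad zip {zip_code!r}")
    return {"name": name, "zip": zip_code}

def run(rows):
    """Cleanse every row; collect failures with a reason instead of stopping."""
    clean, exceptions = [], []
    for row in rows:
        try:
            clean.append(cleanse(row))
        except ValueError as err:
            exceptions.append({"row": row, "reason": str(err)})
    return clean, exceptions

# Hypothetical dirty legacy rows; the second has a letter O in place of a zero.
legacy = [{"name": "  Acme   Supply ", "zip": "22030"}, {"name": "Bolt Co", "zip": "2203O"}]
clean, exceptions = run(legacy)

# After someone corrects the source data, resubmit only the failed rows.
exceptions[0]["row"]["zip"] = "22030"
reclean, still_failing = run([e["row"] for e in exceptions])
```

Keeping the rejected row together with its reason gives both the exception report and the input for the resubmission run.
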
Technology (1)

Selection criteria
– Runs on your hardware and software
– Support for physical data types
    – Files, databases, message queues
    – Internet and wireless protocols
– Support for logical data types
    – Adapters, connectors, pre-built interfaces
    – Especially for COTS packages (Oracle Apps, PeopleSoft, SAP, Siebel CRM)

Technology (2)

Selection criteria
– Ability to write business rules
    – Language, point-and-click, combination
    – Ability to use external code (custom or bought)
    – Ability to reuse components
– Maintainability
    – Cost-benefit over the system development life cycle

Technology (3)

Selection criteria
– Real-time, “near real-time”, batch
– Metadata
    – Operational
    – Programmer-oriented (business rules)
– Scalability (maintenance windows?)
– Integration with other tools, skill sets
– Support for corporate standards
– Infrastructure required (middleware)

Oracle Technology (1)

PL/SQL and SQL*Loader
Data Mart Suite (RIP)
Oracle Warehouse Builder
Oracle Integration Server (MIA)
XML services

Oracle Technology (2)

Database services
– Scheduling (DBMS_JOB)
– Advanced Queuing (AQ)
– Replication

Third-Party Tools

Specialized for a processing scenario
– Conversions
– Warehouse loads
– Data Integration (A2A, B2B, B2C)
General purpose
– Mainframe gateways
– Heterogeneous replication

Summary

Select a methodology and tool to fit your processing scenario -- more than one tool if necessary
Integrate the tool(s) into your development and maintenance methodology

About the Author

Leslie Tierstein is a Technical Project Manager at STR LLC in Fairfax, VA.
She can be reached at: [email protected]