Agenda - NEDB2UG
Informatica Application ILM
Streamline & Secure Nonproduction Mainframe Environments
September 16, 2010
Scott Hagan, Data Integration Sr. Product Manager
Jay Hill, ILM Director of Product Management and Marketing
2
Informatica Confidential & Proprietary
Agenda
• Business Drivers
• Building Better Test Environments
• Identifying and Masking Data Automatically
• Enabling Seamless Database Connectivity
• Products in Action
3
Market Driver: Data Proliferation
Customers are Drowning in their Own Data
• Escalating storage, server, and database cost
• Diminishing application and data warehouse performance
• Inability to retire redundant or obsolete applications
• Increasing effort spent on maintenance & compliance
• More data in more places = greater risk of data breach
Informatica Confidential. Do Not Distribute.
4
Test Data Management
Developers & QA Struggle With Data
Why is this such a big problem?
• Creating data = time consuming, laborious, costly
• Gaining access = data protection legislation. More data in more places = more risk
• Ensuring integrity = complex, especially if you’re federating across systems
• Getting enough = load, stress and performance testing
• Storage space = expensive to maintain lots of full production copies
• Getting the right quality = you need maximum code coverage
5
Informatica Overview
Critical Infrastructure for Data Driven Enterprises
6
The Informatica Approach
Comprehensive, Unified, Open and Economical Platform
[Diagram: the Informatica platform. Use cases: Data Warehouse, Data Migration, Test Data Management & Archiving, Data Consolidation, Master Data Management, Data Synchronization, B2B Data Exchange. Data sources: Cloud Computing, Application, Database, Unstructured, Partner Data. Standards: SWIFT, NACHA, HIPAA, …]
7
Application ILM Products & Use Cases
Improving Operational Efficiency & Compliance
• Reduce storage, RDBMS license, personnel costs
• Increase performance
• Reduce effort spent on maintenance & compliance
• Reduce data privacy risk
[Chart: database size and performance over time for Production and for Development/Testing/Training copies (Copy 1, Copy 2, Copy 3), split into active and inactive data. Labeled products: Informatica Data Archive, Informatica Data Subset, Informatica Data Masking.]
8
Informatica Application ILM
• Application ILM Enables Customers To:
• Data Archive – Relocate older/inactive data out of production for performance, compliance and application retirement
• Data Subset – Create smaller copies of production databases for test and development purposes
• Data Masking – Protect sensitive information in nonproduction
• ILM Value Proposition:
• Lower storage and server costs
• Improve application and query performance
• Less time and cost for back-up & batch processes
• Eliminate cost and complexity by retiring legacy applications
• Reduce compliance and eDiscovery expense
• Prevent data breaches in nonproduction environments
9
Building Better Test Environments
Informatica Data Subset
10
Informatica Data Subset
Product Objectives
• Objective: Smaller nonproduction footprint
• Method: Retaining only required data
• Primary Challenge: Enabling target application usability
• Solution: Informatica Data Subset
11
Informatica Data Subset
Benefits of Subsetting (What? / How?)
Reduce current costs
• Shrink footprint of nonproduction environments
• Reduce time to copy nonproduction environments
Avoid future costs
• Eliminate future spending on additional capacity
Speed delivery cycles
• Decrease test and development cycle times
• Enable more cycles in testing time frame
Promote efficiency
• Reduce or eliminate environment sharing
• Reduce wait time for new environments
12
Informatica Data Subset
Lean Copies for Nonproduction Use
[Figure: a 5 TB production database is cut down to 300 GB subsets, taken either as a time slice or as a functional slice. Callouts mark where the time savings and the space savings occur.]
13
Informatica Data Subset
Entity Concept
Entity Definition
• Logical unit to subset
• Database and application level relationships
• Policy scoping criteria
14
Entities
• Data Subset uses metadata-based Entities
• Entities typically represent the transactions with which your application specialists interact, such as purchase orders, sales orders or financial documents
• Selection screens are also metadata-driven to allow for easy customization
[Diagram: a Sales Order entity and its related data: Billing Plan, Billing Plan Dates, Item Data, Schedule Line, Schedule Line History, Partner Data, Address Data, Payment Cards, Status, Incompletion Log, Individual Records. Selection fields shown: Sales Org, Creation Date, Z-Field, Customer. Output: Subset.]
15
Identifying and Masking Data Automatically
Informatica Data Masking
16
Informatica Data Masking
Product Objective
• Objective: Protect sensitive information in nonproduction
• Method: Data masking
• Primary Challenge: Creating meaningful yet de-identified data
• Solution: Informatica Data Masking
17
Informatica Data Masking
Privacy Regulations Driving Masking Initiatives
Regulation: Example Text
• HIPAA: “Under the Privacy Rule, health plans, health care clearinghouses, and certain health care providers must guard against misuse of individuals' identifiable health information and limit the sharing of such information.”
• Gramm-Leach-Bliley Act: “The law requires that financial institutions protect information collected about individuals”
• Data Protection Act (UK): “Appropriate technical and organizational measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.”
• PCI: “…keep cardholder data storage to a minimum. Develop a data retention and disposal policy. Limit storage amount and retention time to that which is required for business, legal, and/or regulatory purposes, as documented in the data retention policy.”
18
Informatica Data Masking
Realistic, Masked Data to Prevent Data Breach
[Diagram: production databases (CRM 20 TB, FIN 15 TB, HR 12 TB) feed nonproduction environments (QA 01 to QA 03, DEV 01 to DEV 05) through Mask & Clone, Subset & Mask, and Subset & Clone operations. Masking techniques shown: Substitute, Blur, Nullify, Randomize, Key-Masking, Expression, SSN Special Masking, Credit Card Special Masking. Legend: Protected Data, Production, Nonproduction.]
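Four of the masking techniques named on this slide (substitute, blur, nullify, randomize) can be illustrated with a minimal Python sketch. The function names, the substitution list, and the 10% blur range are assumptions made for illustration only; this is not Informatica Data Masking's implementation or API.

```python
import random

# Hypothetical substitution list; a real one would be far larger.
SUBSTITUTE_LAST_NAMES = ["Alder", "Birch", "Cedar", "Dogwood"]


def nullify(value):
    """Nullify: discard the value entirely."""
    return None


def substitute(value, rng):
    """Substitute: swap the real value for one drawn from a lookup list."""
    return rng.choice(SUBSTITUTE_LAST_NAMES)


def blur(value, rng, pct=0.10):
    """Blur: shift a number by a random amount within +/- pct of itself."""
    return round(value * (1 + rng.uniform(-pct, pct)), 2)


def randomize_digits(value, rng):
    """Randomize: replace each digit with a random digit, keeping punctuation."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)


rng = random.Random(2010)  # seeded only so the example is repeatable
row = {"last_name": "Smith", "salary": 85000.0, "ssn": "123-45-6789", "notes": "VIP"}
masked = {
    "last_name": substitute(row["last_name"], rng),
    "salary": blur(row["salary"], rng),
    "ssn": randomize_digits(row["ssn"], rng),
    "notes": nullify(row["notes"]),
}
print(masked)
```

The masked row keeps the shape of the original (a plausible name, a salary in the same range, an SSN with the same punctuation), which is what makes it usable for testing.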
19
Informatica Data Masking
Contextually Correct, Referentially Intact Data Masking
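One common way to keep masked data referentially intact is deterministic masking: the same input always maps to the same masked value, so keys still join across tables. The sketch below uses a keyed hash (HMAC) for that purpose. The slides do not say how Informatica's key masking works internally, so the approach, the key, and the table layouts here are all assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b"example-only-key"  # assumption: kept outside the test environment


def mask_customer_id(customer_id: str) -> str:
    """Map an ID to a stable numeric token of the same length."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
    number = int(digest, 16) % (10 ** len(customer_id))
    return str(number).zfill(len(customer_id))


# Hypothetical parent and child tables that share a key.
customers = [{"customer_id": "000123"}, {"customer_id": "000456"}]
orders = [
    {"order_id": 1, "customer_id": "000123"},
    {"order_id": 2, "customer_id": "000456"},
    {"order_id": 3, "customer_id": "000123"},
]

masked_customers = [{"customer_id": mask_customer_id(c["customer_id"])} for c in customers]
masked_orders = [dict(o, customer_id=mask_customer_id(o["customer_id"])) for o in orders]

# Every masked order still points at a masked customer.
known_ids = {c["customer_id"] for c in masked_customers}
print(all(o["customer_id"] in known_ids for o in masked_orders))
```

Reducing the hash to a fixed number of digits means two different IDs could collide; a production scheme would need to handle that, which this sketch does not.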
20
Informatica ILM
Broad Application and Database Support
• Informatica ILM Solutions: Data Archive, Data Subset, Data Masking
• Application Aware Accelerators: Oracle e-Business, SAP, PeopleSoft, Siebel, Custom/3rd Party
• Universal Connectivity: Oracle, SQL Server, DB2 UDB, Teradata, Sybase, DB2 z/OS, VSAM, Other
21
Informatica ILM: An Enterprise Solution
Platform & Vendor Independent
[Diagram: data stores spread across an enterprise. Organizational units: Acquired Division, Shared Service Center, Call Center, Corporate HQ. Applications: Custom Billing Application, Reservation Applications, Logistics Applications, Siebel 7.8. Data: Invoices, Service Requests, Contracts, Benefits. Platforms and volumes: Oracle 9i on HP-UX (10 years = 600 GB), DB2 for z (5 years = 350 GB), IMS (7 years = 1.4 TB), VSAM (600 KSDS files).]
22
Informatica PowerExchange
Fast and Easy Access to Mainframe Sources!
September 16, 2010
Scott Hagan, Data Integration Sr. Product Manager
23
Informatica PowerExchange
What’s the Problem?
You need access to mainframe data, quickly! No
time or expertise to code extracts? FTP’s?
Queries? What about Security? Speed?
Recoverability? Integration Support?
Oh yes, and I need it yesterday!
24
Informatica PowerExchange
Informatica PowerExchange helps you to…
Unlock difficult-to-access data (mainframe, legacy, etc.) and make it available when you need it: batch, regular updates or real-time
25
Data Integration
Traditional Methods
• Source Data
• Extract: program extracts from one or more sources
• Translate: filter, ASCII/EBCDIC conversion
• Move: transport data across platforms
• Load: load data to target database
• Target Database
26
Data Integration
PowerExchange Approach
Target
Database
Source
Data
NO PROGRAMMING,
NO INTERMEDIATE FILES
Data is extracted using SQL, converted (EBCDIC/ASCII), filtered
and available to the target database in memory, without any
program code or FTP.
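To make the EBCDIC/ASCII conversion and in-memory filtering concrete, here is a minimal Python sketch that decodes fixed-width EBCDIC records from a byte stream and filters them without writing an intermediate file. It only illustrates the idea; it is not how PowerExchange is implemented. The record layout, field names, and the choice of code page `cp037` (EBCDIC US/Canada, shipped with Python) are assumptions.

```python
import io

# Assumed record layout (hypothetical): 10-byte name, 6-byte account number.
LAYOUT = [("name", 10), ("account", 6)]
RECORD_LENGTH = sum(width for _, width in LAYOUT)


def read_records(stream, encoding="cp037"):
    """Yield dicts decoded from fixed-width EBCDIC records in a byte stream."""
    while True:
        raw = stream.read(RECORD_LENGTH)
        if len(raw) < RECORD_LENGTH:
            break  # end of stream (or a truncated trailing record)
        text = raw.decode(encoding)  # EBCDIC bytes -> Python str
        record, offset = {}, 0
        for field, width in LAYOUT:
            record[field] = text[offset:offset + width].strip()
            offset += width
        yield record


# Two EBCDIC records built in memory stand in for a mainframe source.
source = io.BytesIO("JONES     000123SMITH     000456".encode("cp037"))

# Filter while reading; nothing is written to disk along the way.
rows = [r for r in read_records(source) if r["account"] != "000456"]
print(rows)
```

A real mainframe file would also need handling for packed-decimal and binary fields, which cannot be decoded as text; that is out of scope for this sketch.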
27
PowerExchange - Batch
Highly Scalable Bulk Access to Data
SOURCES/TARGETS
• Databases
• Data warehouses
• Packaged applications
• Mainframe, midrange
• Message-oriented middleware
• Collaboration
• Technology standards
PROJECTS
• Data warehousing
• Data migration
• Data consolidation
• Application implementation
• Application migration
• ILM
  • Test Data Sources
[Diagram labels: Informatica Data Integration Platform; PowerExchange; PowerCenter; PowerExchange]
28
PowerExchange - Real-time
Immediate Access to Data, Events, and Web Services
SOURCES
• Message-oriented middleware
• Web services
• Packaged applications
• Multiple modes
  • Batch
  • Continuous
PROJECTS
• Straight-through processing
• Real-time analytics
• Real-time warehousing
• Application integration
[Diagram labels: Informatica Data Integration Platform; PowerExchange; PowerCenter Real Time Edition; PowerExchange]
29
PowerExchange - Change Capture
Creation and Detection of Business Events
SOURCES
• Relational, mainframe, midrange databases
• Multiple modes
  • Batch (for initial materialization)
  • Net change
  • Continuous capture
PROJECTS
• Create business events from database updates
• Operational data integration (ODI)
• Master data management (MDM)
• Trickle-feed data warehousing
• Data replication/synchronization
[Diagram labels: Informatica Data Integration Platform; PowerExchange CDC Option; PowerCenter Real Time Edition; PowerExchange]
30
PowerExchange Run-Time
Batch Data Movement (Test Data Creation?)
[Diagram: PowerExchange run-time architecture. Labels: Operating Environment; User Applications; Tools (ETL, EAI, BI); PowerCenter; PowerExchange; Listener; SQL; Data Records; Data Maps for Non-Relational Access; Remote Data; Targets. Source types: Mainframe and Mid-Range, Packaged Applications, Relational and Flat Files, Standards and Messaging.]
31
34
Test Data Management Concepts
• Privacy Policies at logical level
• Define once use multiple times
• Plans define a data subset with entities, filter criteria and privacy policies
• Policy Assignment at physical level
• Reuse policy for multiple applications
[Diagram: example privacy policies (Subst Last Names, Skew Salary, Subst Credit Cards, Nullify SSNs) are linked through Policy Assignment to physical targets: DBMS, Mainframe, ERP, Files.]
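The split between logical policies and physical assignments can be sketched in a few lines of Python. A policy is defined once and then assigned to any number of physical columns. The class, the two rules, and the system, table, and column names are all hypothetical; they are not the ILM Workbench data model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Policy:
    """A logical privacy policy: a business name plus a masking rule."""
    name: str
    rule: Callable[[str], object]


# Logical level: each policy is defined once.
SSN_POLICY = Policy("SSN", lambda value: None)  # nullify
CARD_POLICY = Policy("Credit Card", lambda value: "X" * (len(value) - 4) + value[-4:])

# Physical level: the same policy is assigned to columns in several systems.
ASSIGNMENTS = {
    ("DBMS", "customers", "ssn"): SSN_POLICY,
    ("Mainframe", "ACCTMAST", "SSN-FIELD"): SSN_POLICY,
    ("Files", "cards.csv", "card_number"): CARD_POLICY,
}


def apply_policies(system, table, row):
    """Return a copy of row with every assigned column masked."""
    masked = dict(row)
    for column, value in row.items():
        policy = ASSIGNMENTS.get((system, table, column))
        if policy is not None:
            masked[column] = policy.rule(value)
    return masked


print(apply_policies("DBMS", "customers", {"id": 1, "ssn": "123-45-6789"}))
print(apply_policies("Files", "cards.csv", {"card_number": "4111111111111111"}))
```

Changing the SSN rule in one place changes it for every assigned column, which is the practical payoff of "define once, use multiple times".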
35
Financial Services Co. and New Regulations
• Financial holding company approval
• In October 2008, Financial Services Co. was approved by the US Federal Reserve Board to operate as a financial holding company, allowing Financial Services Co. to offer additional retail banking services to its customers
• New regulations
• Financial Services Co. is now subject to supervision by the Federal Reserve and regulated by the FDIC
• Redundant work
• To comply with new regulations, many business units within Financial Services Co. are performing redundant work such as complying with data privacy regulations
• Self-service solution
• Financial Services Co. wanted to pursue a holistic approach and build a self-service data masking solution
36
Self Service Data Masking Solution
• Corporate IT Compliance Team
• Reviewed regulations and determined what constitutes sensitive or private data
• Built a finite list of sensitive fields that must be masked throughout the organization and the masking rule that should be used
• Action item 1: update Business Glossary with the corporate IT privacy policies for each sensitive field
• Action item 2: build company-wide data masking policies for each sensitive field with the associated masking rule
• Online banking application owner
• Apply corporate IT compliance team’s privacy policies to my online banking application
37
Business Glossary
Open the Business Glossary to define the masking policies in business terms
38
ILM Workbench – Policies
I need to create a new policy for credit cards
I’ll enter a clear name and description for credit cards
I’ll locate available masking rules and choose the rule I want to assign
Rules can also be defined as reusable mapplets
39
ILM Workbench – Policy Assignment
I need to apply the corporate privacy policy to my online banking application
I just profiled my source database. Now I can look for data patterns that represent credit cards
I’ll locate sensitive columns and assign the appropriate policy
I’ll start with assigning a policy to credit card columns
40
ILM Workbench – Entities
I initially want to test my masking policies on a subset of data
I’ll create a subset of data based on the Customer entity
Entities are a set of related tables with a filter criteria definition
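An entity in this sense (a set of related tables plus a filter) can be sketched with Python's built-in `sqlite3` module. The filter selects rows from a driving table, and child rows are pulled in through the foreign key so the subset stays consistent. The schema, the `ENTITY` dictionary, and the `subset` function are hypothetical illustrations, not Informatica Data Subset's metadata format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales_order (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id)
    );
    INSERT INTO customer VALUES (1, 'EAST'), (2, 'WEST'), (3, 'EAST');
    INSERT INTO sales_order VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

# Hypothetical entity definition: driving table, filter criteria, and
# child tables reached through a foreign key to the driving table.
ENTITY = {
    "driving_table": "customer",
    "filter": "region = 'EAST'",
    "children": [("sales_order", "customer_id")],
}


def subset(conn, entity):
    """Return the filtered parent rows plus only the child rows they own."""
    parent = entity["driving_table"]
    parent_rows = conn.execute(
        f"SELECT * FROM {parent} WHERE {entity['filter']} ORDER BY id"
    ).fetchall()
    result = {parent: parent_rows}
    ids = [row[0] for row in parent_rows]
    placeholders = ",".join("?" * len(ids))
    for child, fk in entity["children"]:
        result[child] = conn.execute(
            f"SELECT * FROM {child} WHERE {fk} IN ({placeholders}) ORDER BY id", ids
        ).fetchall()
    return result


extracted = subset(conn, ENTITY)
print(extracted)
```

Orders belonging to the excluded WEST customer are left behind, so the subset contains no orphaned child rows.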
41
ILM Workbench – Plans
Now that I reviewed my list of entities, I’m ready to create an integrated Data Subset and Data Masking plan
I’ll give the plan a good name to show this is a Subset and Masking plan
I’ll add all the Policy Assignments I created earlier to the plan
I’ll search for and add the Customer entity to the plan to mask only a subset of the data
42
ILM Workbench – Plans
The plan is now complete and being generated
Before I process the plan, I’m going to launch Metadata Manager and look at the data lineage from my source to target system to validate the end-to-end definition
If there were any additional sensitive fields they would have been highlighted
Now I’m ready to process the plan. I’ll switch to PowerCenter workflow monitor for detailed monitoring information
43
ILM Workbench – Masking Validation
Once the plan completes, I’d like to validate the results to ensure masking was performed as intended
I can use rules such as these to validate the results
• SSN: All values have changed
• SSN: All values came from the dataset
• First Name: All values have changed
• Credit cards: All values have proper format
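The validation rules listed above can be expressed as simple checks over paired source and target rows. The Python sketch below is one way to do that; the rule functions, sample rows, substitution dataset, and card format are assumptions, not the ILM Workbench's rule syntax. The sample data deliberately leaves one SSN unmasked so that a rule fails.

```python
import re

# Hypothetical substitution dataset and card format.
SSN_DATASET = {"900-00-0001", "900-00-0002", "900-00-0003"}
CARD_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{4}")


def all_changed(source, target, column):
    """Every target value differs from its source value (rows in the same order)."""
    return all(s[column] != t[column] for s, t in zip(source, target))


def all_from_dataset(target, column, dataset):
    """Every target value was drawn from the approved dataset."""
    return all(t[column] in dataset for t in target)


def all_match_format(target, column, pattern):
    """Every target value matches the expected format."""
    return all(pattern.fullmatch(t[column]) is not None for t in target)


source = [
    {"ssn": "123-45-6789", "first_name": "Ann", "card": "4111-1111-1111-1111"},
    {"ssn": "900-00-0002", "first_name": "Bob", "card": "5500-0000-0000-0004"},
]
target = [
    {"ssn": "900-00-0001", "first_name": "Cal", "card": "4000-1234-5678-9010"},
    {"ssn": "900-00-0002", "first_name": "Dee", "card": "5100-9876-5432-1098"},
]

scorecard = {
    "SSN: all values have changed": all_changed(source, target, "ssn"),
    "SSN: all values came from the dataset": all_from_dataset(target, "ssn", SSN_DATASET),
    "First Name: all values have changed": all_changed(source, target, "first_name"),
    "Credit cards: all values have proper format": all_match_format(target, "card", CARD_PATTERN),
}
for rule, passed in scorecard.items():
    print("PASS" if passed else "FAIL", rule)
```

Here the first rule fails because the second row's SSN is identical in source and target, even though it happens to be a valid dataset value; the two SSN rules catch different problems.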
44
ILM Workbench – Masking Validation
After running the validation, I can see a simple scorecard of the rules that passed or failed
Here are a few of the validation rules I created earlier
I set up the rules earlier with simple operators like this one
I can see that one SSN value didn’t pass the validation rule
The value in the source is the same as the target
45
Data Masking and Data Subset Check-List
• Built reusable masking policies
• Reduced redundant work
• Complied with data privacy regulations
• Integrated subset with privacy rules
• Validated masking results