Transcript Document

Integration Services
Xx
xx
Microsoft
Agenda
Microsoft BI Platform
Challenges
Vision
SSIS Differentiators
Enterprise Data Integration
Enhanced Productivity
Extensibility
References
Why Is It So Difficult To Achieve A
Higher Return Of Information Today…
Most companies today have a tangled web of IT and human systems.
This tangled web has a high degree of inefficiency, complexity, and risks.
High degree of
data cleansing and
re-entry…labor-intensive
Source
Systems
Many points of data integration…
poor data integrity and reliability
Data Marts
Data
Warehouses
High IT
involvement…
longer time-tovalue
Text
Mining
ERP
CRM
Hand
coding
Multiple
solutions…
more costly,
and frustrates
users
LOB
ETL
Data Analysis
(OLAP, Data Mining)
Human input…
prone to error
Many disconnected systems…
incomplete data…
multiple versions of truth
Improving organizations by
providing business insights to
all employees leading to better,
faster, more relevant decisions
Delivered through a familiar
environment
Integrated into a business
productivity infrastructure
Built on a trusted & extensible
platform
Get more out of products you already own
BUSINESS USER EXPERIENCE
BUSINESS COLLABORATION PLATFORM
DATA INFRASTRUCTURE & BI PLATFORM
Self-Service access & insight
Data exploration & analysis
Predictive analysis
Data
visualization
Dashboards
& Scorecards
Contextual
visualization
Excel Services
Web based forms &
workflow
Collaboration
Analysis
Services
Search Services
Reporting
Content Management
Integration
Services
LOB data
integration
Master
Data
Services
Data Mining
Data Warehousing
Empower Your People with Business Insight
Increase individual productivity through familiar & intuitive tools
Enable your end users to create their own BI solutions
Improve end user sharing and discovery of insights
Improve Organizational Effectiveness
Increase team & organizational productivity through dashboards
Improve visibility into key team and organizational objectives and metrics
Increase business user efficiency with a unified platform
Enable IT Efficiency
Cut costs and complexity by leveraging existing IT investments
Reduce reporting backlog and bottlenecks with managed Self-Service
capabilities
Scale-out to support the growing needs of your business
Gartner
Forrester
• SQL Server is a Leader in Data Warehousing
• Microsoft is the most aggressive DBMS
vendor with a strong road map
IDC
• SQL Server ships more units than Oracle and
IBM combined
• SQL Server is the fastest growing of the top 5
Data Warehouse Vendors
The Magic Quadrant is copyrighted February, 2008 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner’s analysis of how certain vendors measure
against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the “Leaders” quadrant. The
Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims al l warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular
purpose.
Empowered IT
Pervasive Insight
Dynamic Development
Mission Critical Platform
Desktop & Mobile
Server & Datacenter
Cloud
Integration Services
Analysis Services
Integrate
Analyze
Data acquisition
from source
systems and
integration
Data
transformation
and synthesis
Data enrichment,
with business
logic, hierarchical
views
Data discovery
via data mining
Reporting Services
Report
Data
presentation and
distribution
Data access for
the masses
Enterprise Data Integration Platform
Scalable Platform
–
–
–
–
–
Connect to data
Multi-threaded architecture
Comprehensive transformations
Profile your data
Cleanse your data
Enhanced Productivity
− Visual Development Environment
− Powerful debugging capabilities
− Integrated with SQL BI Platform
Extensible Platform
–
–
–
C# and VB.Net scripting
Reuse business logic
Embeddable tasks
Data Integration Architecture:
Before Integration Services
Alerts and escalation
Data mining
ETL
Call centre data: semi structured
Text Mining
Staging
Staging
Legacy data: binary files
ETL
Hand
coding
Cleansing
&
ETL
ETL
Application database




Staging
Warehouse
Reports
Mobile
data
Integration and warehousing require separate, staged, operations.
Preparation of data to ETL tools requires work to provide flat files
Proliferation of custom coding as developers are unable to leverage expensive ETL products
Developers forced to learn proprietary expressions and languages
SQL Server Integration Services – In Action
Integration Services
Alerts and escalation
Text mining
components
Data mining
components
Custom
Adapter
Merges
Mobile
data
Call centre:
Semi-structured data
Warehouse
Legacy data: Binary files
Standard
sources
Application database





Custom
components
SQL Server Integration Services
Reports
Integration and warehousing are a seamless, manageable, operation.
Connect, Clean, Transform and Load in a single, auditable process.
Scales to handle heavy and complex data requirements.
Cube
Data is routed by rules or error conditions for cleansing and conforming.
Load data directly into any destination, from RDBMS to in memory ADO.Net recordsets
Heterogeneous Sources and Targets
 Extensive Connectivity
–
Unstructured data
–
–
–
–
Legacy data: Binary files
Application database
–
OLTP
DW
Change
Tables
High speed connectors for Oracle,
Teradata and SAP
Standards-based support
XML, flat files, and Excel
Binary files
Connection to Applications via BizTalk,
MS Message Queues
Partner ecosystem
 Change Data Capture
–
–
Transparently capture changes
Real time integration
Development Productivity
Visual Development Environment
Drag and Drop Package Designer
Edit and Debug in Visual Studio
Environment
Breakpoints, watches, variable inspection
Data viewers enable visualization of data
flow
Full Source Control management
Build and deploy
Impact analysis and with build and
validate
Deployment utility moves packages from
development, and test to production
Custom code integration
Script objects using VB.NET
Custom components using C# .NET
Integrate existing code, or develop new
solutions
Intelligent Data Handling
Intelligent Data Handling
Data Profiling
Train mining models on incoming data
Use mining models to predict values for
missing data
Identify outlying rows from patterns of known
good data
Conditionally route outlying rows for smart
escalation of potential anomalies
Intelligent Data Handling
Data Cleansing
Extract key terms from text fields
Lookup from text to reference tables for
matching terms
Build processes which manage structured and
semi-structured data together
Intelligent Data Handling
Data Auditing
Identify exceptions to business rules
with expressions
Route type conversion or other data errors to
error outputs of components
Save exceptions and errors to files,
databases…
or transform exceptions and errors, to merge
back into the main process
Pluggable Components
Components can work with
data set
Aggregation
Sorting
Fuzzy (best match) Lookups
and De-duplication
Components work with
data row by row
Calculating new columns
Converting data
Character conversions
Look-up joins to reference
tables
Pluggable Components
Components can work with
data set
Aggregation
Sorting
Fuzzy (best match) Lookups
and De-duplication
Components work with
data row by row
Calculating new columns
Converting data
Character conversions
Look-up joins to reference
tables
SSIS Customer References
“SSIS is delivering sizable performance improvements even with a
marked increase in the amount of data it is handling”
“Creation of dynamic packages and data marts is automatic….
Scalability of Integration Services was essential to managing
massive amounts of data that we work with. “
“Now we can build our business logic into ETL processes, we can
make a change to business rule in one place”
Using SSIS, data warehouse has grown to over 4 TB, handling 12fold increase of data loads without increasing loading window
Provide real time store performance data to address business
situations
Source real time messages from Microsoft Message Queue
Process 14 million rows in 20 minutes
Minimize coding, resulting in faster turnaround for data cleansing and
transformation
© 2006 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
“We’re in a business where some of the most important
questions asked by upper management begin with ‘What
if.’ SQL Server Analysis Services is a tremendous tool for
answering those questions.”
Dan Zerfas, Vice President of Software Development,
PREMIER Bankcard
Credit Card Company Runs its Business with 17-Terabyte
Mission Critical BI Solution




Needed to enhance its
BI infrastructure for its
17-terabyte data
warehouse


Required greater
scalability
Sought easier db
maintenance
Reduce scheduled
downtime

Microsoft® SQL Server®
2008 Enterprise (64-bit)
Microsoft Windows
Server 2003 Datacenter
Edition for Itaniumbased Systems
Visual Studio® 2008





Better view of the
business
Enterprise-grade
scalability
Easier database
maintenance
Efficient ETL with
Integration Services
Data compression
New in SQL Server 2008
A few SSIS and data warehousing improvements
SAP-BW Adapter
Data Compression
Star Join Query Optimization
Parallel Query Enhancements
Teradata Adapter
Oracle Adapter
Backup Compression
Resource Governor
Policy Based Administration
Change Data Capture (CDC)
Reference Architectures
Persistent Lookups
…………
©2009 Microsoft Corporation
Enhanced Data Visualization
Rendering for Word & Excel
MERGE SQL Statement
Data Profiling
New - Report Builder 2.0
Partition-Aligned Indexed
Views
…………
IIS Agnostic Report Deployments
Data Mining Engine Improvements
MDX Query , Writeback
Optimizations
Best Practice Design Alerts
Scale-out AS engine & backup
…………
26
SSRS on BW
BW and SQL
Enterprise reporting on existing business data
Build SSRS Reports directly on SAP BW
infocubes
Certified Connectivity through “Microsoft .Net
Data Provider for SAP NetWeaver BI”
SQL Server DB
Fastest Growing platform for SAP installations
Investment in product team to optimize SQL for
SAP with co-located product team members in
SAP Germany
SAP Application
Servers
SAN
R/3
SQL Server
SAP BW 3.5B ON
SQL SERVER 2005
Persistent
Staging Area
BW
Object
SSRS BW
Provider
SQL Server
Reporting
Services
BW DB
Connect
Staging
Area
`
End User
BW
Open Hub
Sharepoint BI
portal
SSIS ETL
Platform
Excel
SharePoint
Non-SAP
Applications
SSAS Cube