Package - Microsoft
Download
Report
Transcript Package - Microsoft
Building Data Integration
Solutions with Integration
Services
Donald Farmer
Group Program Manager
Microsoft Corporation
Agenda
Integration Services Overview
Building Packages Demo
SSIS Lifecycle
Tools, management, security, deployment
Managing Packages Demo
Troubleshooting
Log data, error flows,
Summary
Integration Services
Breakthrough ETL Capabilities
Enterprise ETL platform
High performance
High scale
Best in class usability
Rich development
environment
Source control
Visual debugging of control
flow and data
Great range of transforms
out-of-the-box
Highly extensible
Custom tasks
Custom enumerations
Custom transformations
Custom data sources
Data Integration Architecture
Before Integration Services
Alerts and escalation
Data mining
ETL
Call centre data: semi structured
Text Mining
Staging
Staging
Legacy data: binary files
ETL
Hand
coding
Cleansing
&
ETL
ETL
Warehouse
Reports
Staging
Application database
Integration and warehousing require separate, staged, operations.
Preparation of data requires different, often incompatible, tools.
Reporting and escalation is a slow process, delaying smart responses.
Heavy data volumes make this scenario increasingly unworkable.
Mobile
data
Data Integration Architecture
With Integration Services
Alerts and escalation
Text mining
components
Data mining
components
Custom
source
Merges
Mobile
data
Call centre:
Semi-structured data
Warehouse
Legacy data: Binary files
Standard
sources
Application database
Data cleansing
components
SQL Server Integration Services
Reports
Integration and warehousing are a seamless, manageable, operation.
Sourced, prepare and load data in a single, auditable process.
Reporting and escalation can be parallelized with the warehouse load.
Scales to handle heavy and complex data requirements.
How SQL Server Integration
Services Works
Data sources can be diverse, including custom or scripted adapters
Transformation components shape and modify data in many ways.
Data is routed by rules or error conditions for cleansing and conforming.
Flows can be as complex as your business rules, but highly concurrent.
And finally data can be loaded in parallel to many varied destinations.
SQL Server Integration Services
New Paradigm for the ETL Platform
Data Cleansing
Provides data mining and AI expertise
Domain-independent data cleansing
Fuzzy lookup
Lookup on approximate matches
Tune for best match
De-duplication
Eliminate approximate duplicates
“Windows XP”, “WinXP”, etc.
Tune for confidence
Managing Slowly Changing Dimensions
E.g. Sales organization changes
E.g. Customer movement
E.g. Product category changes
Building Packages
SSIS Life Cycle tools
Feature summary
Design the SSIS ‘Package’
Business Intelligence Studio (visual Studio)
Migration wizard for pre SQL 2005 packages
Visual Source Safe Integration
Deployment/Execution
Deployment Utility to copy packages
Command Line execution (dtexec.exe and dtexecui.exe)
Flexible Configuration Options
Supportability
Rich per package Logging (Log Providers)
SQL Management Studio for monitoring running packages and
organizing stored packages (using SSIS windows service).
Checkpoint - Restart ability
Sample Server Layout
Packages
on file
system
Source
data
SSIS
packages
stored in
SQL
SSIS
Package
Execution
SSIS
package
Logging
SSIS
package
error rows
SSIS
support
Servers
Destination
data
Source Flat
Files
Logging and Log Providers
Log entries are a blend of status and result
messages.
Can select what ‘details’ per control flow object
within each package (e.g. OnError, OnWarning,
OnPreExecute)
Can select what fields (e.g. computer, operator,
ExecutionID…)
Can define multiple log providers (SQL, text file,
Windows Event..) per package
BIDS has a Log Events window to see the log
entries that are headed for the log provider
SSIS Windows Service
Monitors and manages running as well as stored
packages, via SQL Management Studio
Service Installed when you install SSIS
Service is not required to design or execute
Packages
Stored Packages tree is based on XML
configuration file
You can customize the file contents, name, and
location
Windows events for service (service start, service
failed to start, package started, package
stopped…)
Overview of SSIS security
‘Security’ is comprised of several layers to
support both SQL and File System Based
scenarios.
Packages can be Encrypted
Packages can be Digitally Signed
Packages can be stored in SQL DB and
protected with SQL Roles
Overview of SSIS
security
Package
SQL DB
Roles
•Reader and Writer
roles for packages
stored in MSDB
Protection
Level
Property
Encrypt or clear
sensitive properties
A
Package
Signature
B
OS
Permissions
•File and folder
access control for file
based packages
•View/stop running
packages
(A) A package can be encrypted via the Package Protection level
property as well as stored inside MSDB, where access to the package
is limited by SQL Database roles
(B) A package can be encrypted via the Package Protection level property
as well as access to the package file is controlled by folder/file
permissions at the operating system level
Deployment
Flow
•Design Package
Bi Studio
•Add Configurations
•Add Miscellaneous files
•Set Project Deployment properties
•Build Project
Tools to
organize and
‘copy’ packages
and supporting
files
•Copy/Move Deployment folder\files You
•Execute manifest file Installation Wizard
•Choose Destination (SQL File System)
•Modify protection level
•Choose location of supporting files
•Change configurations
•Create desired agent jobs SQL Agent
Deploying SSIS packages
Designer can build a deployment file set which
includes a project’s package(s), configuration
files, and installer files
You move the installer file set to another
server\environment and execute, to install
packages to SQL or the File system.
Deployment is not a version sync\check tool
such as SMS
Installer can ‘copy’ miscellaneous files but will
not ‘install’ them, such as custom component.
SQL Management Studio
Requires the SSIS service
Allows Monitoring and Stopping of currently
Executing packages
Maintain stored package structure and set
roles for SQL stored packages
You can connect\view multiple SSIS servers
at one time
Adhoc Package execution from folder tree
Log Data for Troubleshooting
Logging and Error Flow data are core for
troubleshooting
Can save\load logging detail templates
Children packages bubble entries to parent
Package IDs need to be unique. You can regenerate the ID via BIDS or dtutil.exe
Example: This will regenerate package IDs for all
packages in a folder…be it 2 or 200:
for %%f in
(C:\_work\SSISPackages\_quick\Notepad\*.dt
sx) do dtutil.exe /i /File %%f
Error Flows in Data Flow Task
Error flows can be controlled per field
per row (fail component, redirect,
ignore)
Just another flow…to your destination
Error flows can all be directed to a
central location for centralized
operations
Includes error code and column id
Can couple with a Row Sampling
transform
Other Troubleshooting ideas
Performance Counters (SQLServer:SSISPipeline
and SQLServer:SSISService:)
Integration with Microsoft® Operations Manager
SQL Agent has log data as well and proxies to limit
package execution
General package design to log row counts,
multicast, and save variables
Webcast: Scalability, Performance and
Optimization in SSIS
Summary
SQL Server Integration Services is an
exceptionally high performance integration
and transformation tool
Some processes benefit more from
parallelism, some from memory
32-bit performance and scale is best
increased by parallelism
64-bit enables highly scalable
memory operations
For More Information
Analysis Services TechCenter
http://www.microsoft.com/technet/prodtechnol/sql/200
5/technologies/ssisvcs.mspx
Developer Center
http://msdn.microsoft.com/sql/bi/integration/default.as
px
Great information available at www.sqlis.com
Project Real
http://www.microsoft.com/sql/solutions/bi/projectreal.m
spx
On-demand Webcasts
http://www.microsoft.com/events/series/sqlserverbi.ms
px