Package - Microsoft

Download Report

Transcript Package - Microsoft

Building Data Integration
Solutions with Integration
Services
Donald Farmer
Group Program Manager
Microsoft Corporation
Agenda
 Integration Services Overview
 Building Packages Demo
 SSIS Lifecycle
 Tools, management, security, deployment
 Managing Packages Demo
 Troubleshooting
 Log data, error flows,
 Summary
Integration Services
Breakthrough ETL Capabilities
 Enterprise ETL platform
 High performance
 High scale
 Best in class usability
 Rich development
environment
 Source control
 Visual debugging of control
flow and data
 Great range of transforms
out-of-the-box
 Highly extensible
 Custom tasks
 Custom enumerations
 Custom transformations
 Custom data sources
Data Integration Architecture
Before Integration Services
Alerts and escalation
Data mining
ETL
Call centre data: semi structured
Text Mining
Staging
Staging
Legacy data: binary files
ETL
Hand
coding
Cleansing
&
ETL
ETL
Warehouse
Reports
Staging
Application database




Integration and warehousing require separate, staged, operations.
Preparation of data requires different, often incompatible, tools.
Reporting and escalation is a slow process, delaying smart responses.
Heavy data volumes make this scenario increasingly unworkable.
Mobile
data
Data Integration Architecture
With Integration Services
Alerts and escalation
Text mining
components
Data mining
components
Custom
source
Merges
Mobile
data
Call centre:
Semi-structured data
Warehouse
Legacy data: Binary files
Standard
sources
Application database




Data cleansing
components
SQL Server Integration Services
Reports
Integration and warehousing are a seamless, manageable, operation.
Sourced, prepare and load data in a single, auditable process.
Reporting and escalation can be parallelized with the warehouse load.
Scales to handle heavy and complex data requirements.
How SQL Server Integration
Services Works





Data sources can be diverse, including custom or scripted adapters
Transformation components shape and modify data in many ways.
Data is routed by rules or error conditions for cleansing and conforming.
Flows can be as complex as your business rules, but highly concurrent.
And finally data can be loaded in parallel to many varied destinations.
SQL Server Integration Services
New Paradigm for the ETL Platform
 Data Cleansing
 Provides data mining and AI expertise
 Domain-independent data cleansing
 Fuzzy lookup
 Lookup on approximate matches
 Tune for best match
 De-duplication
 Eliminate approximate duplicates
 “Windows XP”, “WinXP”, etc.
 Tune for confidence
 Managing Slowly Changing Dimensions
 E.g. Sales organization changes
 E.g. Customer movement
 E.g. Product category changes
Building Packages
SSIS Life Cycle tools
Feature summary
 Design the SSIS ‘Package’
 Business Intelligence Studio (visual Studio)
 Migration wizard for pre SQL 2005 packages
 Visual Source Safe Integration
 Deployment/Execution
 Deployment Utility to copy packages
 Command Line execution (dtexec.exe and dtexecui.exe)
 Flexible Configuration Options
 Supportability
 Rich per package Logging (Log Providers)
 SQL Management Studio for monitoring running packages and
organizing stored packages (using SSIS windows service).
 Checkpoint - Restart ability
Sample Server Layout
Packages
on file
system
Source
data
SSIS
packages
stored in
SQL
SSIS
Package
Execution
SSIS
package
Logging
SSIS
package
error rows
SSIS
support
Servers
Destination
data
Source Flat
Files
Logging and Log Providers
 Log entries are a blend of status and result
messages.
 Can select what ‘details’ per control flow object
within each package (e.g. OnError, OnWarning,
OnPreExecute)
 Can select what fields (e.g. computer, operator,
ExecutionID…)
 Can define multiple log providers (SQL, text file,
Windows Event..) per package
 BIDS has a Log Events window to see the log
entries that are headed for the log provider
SSIS Windows Service
 Monitors and manages running as well as stored
packages, via SQL Management Studio
 Service Installed when you install SSIS
 Service is not required to design or execute
Packages
 Stored Packages tree is based on XML
configuration file
 You can customize the file contents, name, and
location
 Windows events for service (service start, service
failed to start, package started, package
stopped…)
Overview of SSIS security
 ‘Security’ is comprised of several layers to
support both SQL and File System Based
scenarios.
 Packages can be Encrypted
 Packages can be Digitally Signed
 Packages can be stored in SQL DB and
protected with SQL Roles
Overview of SSIS
security
Package
SQL DB
Roles
•Reader and Writer
roles for packages
stored in MSDB
Protection
Level
Property
Encrypt or clear
sensitive properties
A
Package
Signature
B
OS
Permissions
•File and folder
access control for file
based packages
•View/stop running
packages
 (A) A package can be encrypted via the Package Protection level
property as well as stored inside MSDB, where access to the package
is limited by SQL Database roles
 (B) A package can be encrypted via the Package Protection level property
as well as access to the package file is controlled by folder/file
permissions at the operating system level
Deployment
Flow
•Design Package
Bi Studio
•Add Configurations
•Add Miscellaneous files
•Set Project Deployment properties
•Build Project
 Tools to
organize and
‘copy’ packages
and supporting
files
•Copy/Move Deployment folder\files You
•Execute manifest file Installation Wizard
•Choose Destination (SQL File System)
•Modify protection level
•Choose location of supporting files
•Change configurations
•Create desired agent jobs SQL Agent
Deploying SSIS packages
 Designer can build a deployment file set which
includes a project’s package(s), configuration
files, and installer files
 You move the installer file set to another
server\environment and execute, to install
packages to SQL or the File system.
 Deployment is not a version sync\check tool
such as SMS
 Installer can ‘copy’ miscellaneous files but will
not ‘install’ them, such as custom component.
SQL Management Studio
 Requires the SSIS service
 Allows Monitoring and Stopping of currently
Executing packages
 Maintain stored package structure and set
roles for SQL stored packages
 You can connect\view multiple SSIS servers
at one time
 Adhoc Package execution from folder tree
Log Data for Troubleshooting
 Logging and Error Flow data are core for
troubleshooting
 Can save\load logging detail templates
 Children packages bubble entries to parent
 Package IDs need to be unique. You can regenerate the ID via BIDS or dtutil.exe
 Example: This will regenerate package IDs for all
packages in a folder…be it 2 or 200:
 for %%f in
(C:\_work\SSISPackages\_quick\Notepad\*.dt
sx) do dtutil.exe /i /File %%f
Error Flows in Data Flow Task
 Error flows can be controlled per field
per row (fail component, redirect,
ignore)
 Just another flow…to your destination
 Error flows can all be directed to a
central location for centralized
operations
 Includes error code and column id
 Can couple with a Row Sampling
transform
Other Troubleshooting ideas
 Performance Counters (SQLServer:SSISPipeline
and SQLServer:SSISService:)
 Integration with Microsoft® Operations Manager
 SQL Agent has log data as well and proxies to limit
package execution
 General package design to log row counts,
multicast, and save variables
 Webcast: Scalability, Performance and
Optimization in SSIS
Summary
 SQL Server Integration Services is an
exceptionally high performance integration
and transformation tool
 Some processes benefit more from
parallelism, some from memory
 32-bit performance and scale is best
increased by parallelism
 64-bit enables highly scalable
memory operations
For More Information
 Analysis Services TechCenter
 http://www.microsoft.com/technet/prodtechnol/sql/200
5/technologies/ssisvcs.mspx
 Developer Center
 http://msdn.microsoft.com/sql/bi/integration/default.as
px
 Great information available at www.sqlis.com
 Project Real
 http://www.microsoft.com/sql/solutions/bi/projectreal.m
spx
 On-demand Webcasts
 http://www.microsoft.com/events/series/sqlserverbi.ms
px