Easy_ETL_with_Bimlx
Download
Report
Transcript Easy_ETL_with_Bimlx
Easy ETL with
12.12.2015
12.12.2015
Thank you to our AWESOME sponsors!
12.12.2015
Andrzej Kukuła
20 years of professional experience in IT
Microsoft Certified Solutions Expert
Biml Expert
System Architect
C#, F# Developer by heart
Leader of local PLSSUG/PASS chapter
Long-time Contributor to pymssql project
Passionate about many aspects of IT world
[email protected]
12.12.2015
Marcin Szeliga
Data Philosopher
BI Expert and Consultant
Data Platform Architect
20 years of experience with SQL Server
Ph.D. Candidate at Politechnika Śląska
[email protected]
12.12.2015
Agenda
The ETL
SSIS and its problems
Introduction to Biml
BimlScript and its features
Code generation described
Interesting functionalities
Digressions
Demos
Summary
12.12.2015
The ETL
ETL is a significant cost in DW project
But it‘s not going to disappear tomorrow
It is still the foundation of many big Enterprise
Data Warehouses
Thus the goal: simplify ETL to lower cost
SSIS is very good ETL solution
but...
12.12.2015
SSIS and its problems
Manual package creation using visual editor + drag & drop
Slow, boring, error prone
Not adaptable to requirements
(think about changing logic in 10/100/… packages)
Not developer-friendly
Not generic
Version control is hardly possible
No easy API
No code templates/design patterns/DRY
No CI/CD
Should we really spend
our time fighting this?
12.12.2015
Example Package
12.12.2015
Example DTSX
Is this reasonable
amount of code to
accomplish the task?…
12.12.2015
Enter
<Package Name="Load Person">
<Tasks>
<Dataflow Name="Copy to Stage">
<Transformations>
<OleDbSource Name="Person" ConnectionName="Source">
<TableInput TableName="AdventureWorks2012.Person.Person" />
</OleDbSource>
<DerivedColumns Name="Add LoadDateTime">
<Columns>
<Column Name="LoadDateTime" DataType="DateTime2" Scale="7">
@[System::ContainerStartTime]</Column>
</Columns>
</DerivedColumns>
<OleDbDestination Name="StgPerson" ConnectionName="Stage">
<TableOutput TableName="Stage.AW.Person" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
12.12.2015
What’s Biml?
Business Intelligence Markup Language
A really easy XML-based language to describe BI assets
Connections, Tables, Views
SSIS Packages, SSIS Projects
Dimensions, Measure Groups, Cubes
and more…
Smart default values and behavior
Don’t bother with SSIS metadata (most of the time)
Easy to integrate with VC (TFS, Git, and more)
Easy to organize code into reusable libraries
Available for free in SSDT-BI with BIDSHelper
Demo!
12.12.2015
Example Biml elements
Familiar and pretty easy to remember, aren’t they?
12.12.2015
Biml and BimlScript
Biml = XML Domain Specific Language
BimlScript = Biml + C# code nuggets
called template
Demo!
12.12.2015
BimlScript
We have full power of .NET at our disposal
Execute code during compilation
Read files
Access remote data
Even create database objects (e.g. semi-temporary tables)
Reference external .NET assemblies
Supports including and calling other Biml files
Can reference external .NET assemblies
Extensive .NET Biml API built-in
12.12.2015
BimlScript Expansion
First, BimlScript is expanded to Biml
Then Biml is compiled to DTSX
12.12.2015
Biml Compilation
Biml is compiled to DTSX + DTProj
The process is repeatable
Can be directly opened in SSDT-BI (BIDS)
Execution using DTExec / SSISDB / SSDT
No additional runtime or runtime license required
to execute package
Compilation is just automation of SSIS
The same rules apply
Objects must exist in database (so that code generator can
create SSIS metadata and mappings)
12.12.2015
The process
• Biml + BimlScript source files
• C#/VB compiler
• BimlScript expansion to Biml
• Population of in-memory Object Model
• Single, in-memory, expanded,
compiled and merged Biml
Code generator
…
12.12.2015
RootNode
The .NET object which represents all assets and metadata
Populated during parsing of Biml files
Useful in BimlScript
Allows creation of very dynamic solution
RootNode.Connections
RootNode.Databases
RootNode.Tables
RootNode.Packages
and many many more
12.12.2015
Code Reuse
Can be used anywhere in Biml file
Expands file given in argument and return Biml string
Can specify additional arguments
Include (expand) specified Biml file (or .NET resource)
before current file
12.12.2015
Interesting features
<Annotation></Annotation>
Lightweight metadata, can be associated with any asset
Documentation annotations
Interesting use case: SQL extended properties
ObjectTag
Property of all BimlScript .NET objects
Accessed programmatically from BimlScript
Very convenient and extensible object storage
12.12.2015
Extending BimlScript object model
Contains plain C# code (no templating)
Can interact with Biml object model
Can use extended methods
Can define new classes
We can subclass original Biml classes
to change/extend their behavior
12.12.2015
More features
Transformers and „Frameworks”
Ability to change Biml after it is already expanded
Can „reverse engineer” DTSX packages into Biml
It’s possible to write custom code generators
Different target runtime than DTSX
12.12.2015
Source object types - digression
Views in custom schema, e.g. BI.Person
Isolation
Doesn’t need to modify production objects
Can be in different database than production on the same instance
(if absolutely no modification of production is allowed)
Abstraction
Rename columns
Convert data, create new columns
Model data – e.g. join with other sources
Filter data
Add extended properties
Security
Dedicated ETL user can be granted access only to views in this
schema, doesn’t need access to underlying tables
12.12.2015
Separation of duties
External metadata
Business analysts don’t touch the packages
Programmers don’t model the business
Easy storage
MDS – Excel client!
Extended properties
custom
12.12.2015
Complete demo
Complete ETL solution
Storage of metadata (connections, mappings) outside of code
Dynamic import of database table definitions
Dynamic generation of stage tables
Annotations and SQL extended properties as a metadata source
Execute SQL during BimlScript expansion
LogicalDisplayFolder as a metadata source
Conditional logic
Type conversion on the fly
Nonstandard column mapping
Master package
Dynamic package project
Dynamic rename of objects during ETL
Package to create stage tables
Tiers
MSBuild
12.12.2015
Demo!
How to benefit it?
BI Project Decision Makers
Faster initial delivery
Lower cost of change
Immediate Biml/Mist ROI
BI Architects
Reusable Design Patterns with Biml/BimlScript
One project compatible with SQL Server 2005-2014
Easily manage large BI code base, tasks, and issues
using TFS
Plan for BI Continuous Integration/Continuous Delivery
12.12.2015
How to benefit it?
ETL developers
Fast Biml learning curve
Generate your DTSX faster with Biml instead of
drag’n’drop
Embrace DRY in BI development
Use proper version control to manage your sources
BI consultants
Increased productivity
Build your Biml/BimlScript code library and reuse it in
different projects
No runtime license costs for customers
12.12.2015
12.12.2015
Thank you to our AWESOME sponsors!
12.12.2015
Questions?
[email protected]
[email protected]
12.12.2015