Metadata-Driven_Bimlx
Download
Report
Transcript Metadata-Driven_Bimlx
Metadata-driven Automatic
Package Creation with
Notes from the field
Sponsors
Andrzej Kukuła
20 years of professional experience in IT
Microsoft Certified Solutions Expert: Data Platform
Performance tuning, troubleshooting
Biml Expert
System Architect
C#, F# Developer by heart
Long-time Contributor to pymssql project
Passionate about many aspects of IT world
[email protected]
Marcin Szeliga
Data Philosopher
BI Expert and Consultant
Data Platform Architect
20 years of experience with SQL Server
Ph.D. Candidate at Politechnika Śląska
[email protected]
Agenda
Problems with SSIS
Introduction to Biml
BimlScript
Code generation
RootNode
Code Reuse
Annotations
LogicalDisplayFolder
Demos
ObjectTags
Extending Biml object model
Summary
SSIS and its problems
Manual package creation using visual editor + drag &
drop
Slow, boring, error prone
Not adaptable to requirements
(think about changing logic in 10/100/… packages)
Not developer-friendly
Not generic
Version control is hardly possible
No easy API
No code templates/design patterns/DRY
No CI/CD
Enter
<Package Name="Load Person">
<Tasks>
<Dataflow Name="Copy to Stage">
<Transformations>
<OleDbSource Name="Person" ConnectionName="Source">
<TableInput TableName="AdventureWorks2012.Person.Person" />
</OleDbSource>
<DerivedColumns Name="Add LoadDateTime">
<Columns>
<Column Name="LoadDateTime" DataType="DateTime2" Scale="7">
@[System::ContainerStartTime]</Column>
</Columns>
</DerivedColumns>
<OleDbDestination Name="StgPerson" ConnectionName="Stage">
<TableOutput TableName="Stage.AW.Person" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
What’s Biml?
Business Intelligence Markup Language
A really easy XML-based language to describe BI
assets
Connections, Tables, Views
SSIS Packages, SSIS Projects
Dimensions, Measure Groups, Cubes
and more…
Smart default values and behavior
Don’t bother with metadata (most of the time)
Easy to integrate with VC (TFS, Git, and more)
Document your BI solution
Demo!
Example Biml elements
Biml and BimlScript
Biml = XML Domain Specific Language
BimlScript = Biml + C# code nuggets
called template
Demo!
BimlScript
We have full power of .NET at our disposal
Execute code during compilation
Read files
Access remote data
Even create database objects (e.g. semi-temporary
tables)
Compilation is just automation of SSIS
The same rules apply
Objects must exist in database to be able to create
SSIS metadata and mappings
Biml and BimlScript
Biml is compiled to DTSX + DTProj
Can be directly opened in SSDT-BI (BIDS)
Execution using DTExec / SSISDB / SSDT
No additional runtime license required for
package user
BimlScript
First, BimlScript is expanded to Biml
Then Biml is compiled to DTSX
BimlScript
Expansion is ordered by tiers
Source file with BimlScript is called template
Template of tier x see all Biml code with tier
x-1
BimlScript compiler will automatically detect
dependencies between templates
Code generation
• Biml + BimlScript source files
• C#/VB compiler
• BimlScript expansion to Biml
• Population of in-memory Object Model
• Single, in-memory, expanded,
compiled and merged Biml
Code generator
…
Code generation
Packages can also be generated for assets
database/schema/table, …
In practice it’s better to roll out custom
package which takes care of this
There’s GetDropAndCreateDdl() method to help
RootNode
a .NET object which represents all assets and metadata
Populated during parsing of Biml files
Useful in BimlScript
Allows creation of very dynamic solution
RootNode.Connections
RootNode.Databases
RootNode.Tables
RootNode.Packages
and many many more – inspect it in Intellisense
RootNode
The model of all assets in the project
(how we want the database and
packages to look like)
Read/write
RootNode
Code Reuse
Can be used anywhere in Biml file
Expands file given in argument and return
expanded Biml
Can specify additional arguments
Code Reuse
Include or copy specified Biml file, or
resource, before current file
Code Reuse
Included file can be normal template or
declaration scope:
Complete demo
Complete ETL solution
Static and dynamic tables
Extended properties as a metadata source
Execute SQL during BimlScript expansion
LogicalDisplayFolder as a metadata source
Conditional logic
Type conversion on the fly
Nonstandard column mapping
Master package
Dynamic package project
Tiers
Demo!
Annotations
<Annotation></Annotation>
Lightweight metadata
Annotations can associate any string data
with any asset, statically or programmatically
Annotations can be assigned to connections,
tables, columns, packages, tasks, …
Can be used to customize packages even
more
Example use case of annotations
Import SQL extended properties to
annotations during import of source objects
Can be used to add automatic data
conversion during package creation
Most popular application - custom date values
(integer) -> proper DateTime data type
Set using AddAnnotation() method
Read using GetTag()
LogicalDisplayFolder
Logical separation of assets
E.g. „Source” versus „Stage”
Can be set/read programmatically
Visible in IDE
Source object types - digression
Views in custom schema, e.g. BI.Person
Isolation
Doesn’t need to modify production objects
Can be in different database than production on the same
instance (if absolutely no modification of production is allowed)
Abstraction
Rename columns
Convert data, create new columns
Model data – e.g. join with other sources
Filter data
Add extended properties
Security
Dedicated ETL user can be granted access only to views in this
schema, doesn’t need access to underlying tables
Separation of duties
External metadata
Business analysts don’t touch the packages
Programmers don’t model the business
Easy storage
MDS – Excel client!
Extended properties
custom
Another demo
Storage of connection strings and other
metadata outside of project
Here in SQL, but can also be in MDS
Dynamic source to stage object mapping with
on-the-fly rename
Dynamic creation of source objects and stage
tables in Biml
Package to create stage tables automatically
Demo!
Extending BimlScript object model
Contains plain C# code (no templating)
Can interact with Biml object model
Can use extended methods
Can define new classes
Subclass original Biml classes to change/extend their
behavior
ObjectTag
Property of all BimlScript .NET objects
Connections, tables, columns, packages, tasks, and more
Accessed only programmatically from BimlScript
Very convenient and extensible object storage
Can extend builtin Biml properties
Demo!
MSBuild
Batch builds from command line/a tool
CI/CD pipeline is possible
Generated SSIS projects can be deployed
using PowerShell, XCOPY/RoboCopy
Demo!
More features
Transformers and „Frameworks”
Ability to change the way how code is generated
Ability to „reverse engineer” DTSX packages
into Biml
Ability to write custom code generators
How to benefit it?
BI Project Decision Makers
Faster initial delivery
Lower cost of change
Immediate Biml/Mist ROI
BI Architects
Reusable Design Patterns with Biml/BimlScript
One project compatible with SQL Server 2005-2014
Easily manage large BI code base, tasks, and issues
using TFS
Plan for BI Continuous Integration/Continuous
Delivery
How to benefit it?
ETL developers
Fast Biml learning curve
Generate your DTSX faster with Biml instead of
drag’n’drop
Embrace DRY in BI development
Use proper version control to manage your sources
BI consultants
Increased productivity
Build your Biml/BimlScript code library and reuse it in
different projects
No runtime license costs for customers
Sponsors
Questions?
[email protected]
[email protected]