Metadata-Driven_Bimlx

Download Report

Transcript Metadata-Driven_Bimlx

Metadata-driven Automatic
Package Creation with
Notes from the field
Sponsors
Andrzej Kukuła
 20 years of professional experience in IT
 Microsoft Certified Solutions Expert: Data Platform
 Performance tuning, troubleshooting






Biml Expert
System Architect
C#, F# Developer by heart
Long-time Contributor to pymssql project
Passionate about many aspects of IT world
[email protected]
Marcin Szeliga






Data Philosopher
BI Expert and Consultant
Data Platform Architect
20 years of experience with SQL Server
Ph.D. Candidate at Politechnika Śląska
[email protected]
Agenda












Problems with SSIS
Introduction to Biml
BimlScript
Code generation
RootNode
Code Reuse
Annotations
LogicalDisplayFolder
Demos
ObjectTags
Extending Biml object model
Summary
SSIS and its problems
 Manual package creation using visual editor + drag &
drop
 Slow, boring, error prone
 Not adaptable to requirements
(think about changing logic in 10/100/… packages)
 Not developer-friendly





Not generic
Version control is hardly possible
No easy API
No code templates/design patterns/DRY
No CI/CD
Enter
<Package Name="Load Person">
<Tasks>
<Dataflow Name="Copy to Stage">
<Transformations>
<OleDbSource Name="Person" ConnectionName="Source">
<TableInput TableName="AdventureWorks2012.Person.Person" />
</OleDbSource>
<DerivedColumns Name="Add LoadDateTime">
<Columns>
<Column Name="LoadDateTime" DataType="DateTime2" Scale="7">
@[System::ContainerStartTime]</Column>
</Columns>
</DerivedColumns>
<OleDbDestination Name="StgPerson" ConnectionName="Stage">
<TableOutput TableName="Stage.AW.Person" />
</OleDbDestination>
</Transformations>
</Dataflow>
</Tasks>
</Package>
What’s Biml?
 Business Intelligence Markup Language
 A really easy XML-based language to describe BI
assets








Connections, Tables, Views
SSIS Packages, SSIS Projects
Dimensions, Measure Groups, Cubes
and more…
Smart default values and behavior
Don’t bother with metadata (most of the time)
Easy to integrate with VC (TFS, Git, and more)
Document your BI solution
Demo!
Example Biml elements
Biml and BimlScript
 Biml = XML Domain Specific Language
 BimlScript = Biml + C# code nuggets
 called template
Demo!
BimlScript
 We have full power of .NET at our disposal
 Execute code during compilation
 Read files
 Access remote data
 Even create database objects (e.g. semi-temporary
tables)
 Compilation is just automation of SSIS
 The same rules apply
 Objects must exist in database to be able to create
SSIS metadata and mappings
Biml and BimlScript
 Biml is compiled to DTSX + DTProj
 Can be directly opened in SSDT-BI (BIDS)
 Execution using DTExec / SSISDB / SSDT
No additional runtime license required for
package user
BimlScript
 First, BimlScript is expanded to Biml
 Then Biml is compiled to DTSX
BimlScript
 Expansion is ordered by tiers
 Source file with BimlScript is called template
 Template of tier x see all Biml code with tier
x-1
 BimlScript compiler will automatically detect
dependencies between templates
Code generation
• Biml + BimlScript source files
• C#/VB compiler
• BimlScript expansion to Biml
• Population of in-memory Object Model
• Single, in-memory, expanded,
compiled and merged Biml
Code generator
…
Code generation
 Packages can also be generated for assets
 database/schema/table, …
 In practice it’s better to roll out custom
package which takes care of this
 There’s GetDropAndCreateDdl() method to help
RootNode




a .NET object which represents all assets and metadata
Populated during parsing of Biml files
Useful in BimlScript
Allows creation of very dynamic solution





RootNode.Connections
RootNode.Databases
RootNode.Tables
RootNode.Packages
and many many more – inspect it in Intellisense
RootNode
 The model of all assets in the project
(how we want the database and
packages to look like)
 Read/write
RootNode
Code Reuse
 Can be used anywhere in Biml file
 Expands file given in argument and return
expanded Biml
 Can specify additional arguments
Code Reuse
 Include or copy specified Biml file, or
resource, before current file
Code Reuse
 Included file can be normal template or
declaration scope:
Complete demo











Complete ETL solution
Static and dynamic tables
Extended properties as a metadata source
Execute SQL during BimlScript expansion
LogicalDisplayFolder as a metadata source
Conditional logic
Type conversion on the fly
Nonstandard column mapping
Master package
Dynamic package project
Tiers
Demo!
Annotations
 <Annotation></Annotation>
 Lightweight metadata
 Annotations can associate any string data
with any asset, statically or programmatically
 Annotations can be assigned to connections,
tables, columns, packages, tasks, …
 Can be used to customize packages even
more
Example use case of annotations
 Import SQL extended properties to
annotations during import of source objects
 Can be used to add automatic data
conversion during package creation
 Most popular application - custom date values
(integer) -> proper DateTime data type
 Set using AddAnnotation() method
 Read using GetTag()
LogicalDisplayFolder
 Logical separation of assets
 E.g. „Source” versus „Stage”
 Can be set/read programmatically
 Visible in IDE
Source object types - digression
 Views in custom schema, e.g. BI.Person
 Isolation
 Doesn’t need to modify production objects
 Can be in different database than production on the same
instance (if absolutely no modification of production is allowed)
 Abstraction





Rename columns
Convert data, create new columns
Model data – e.g. join with other sources
Filter data
Add extended properties
 Security
 Dedicated ETL user can be granted access only to views in this
schema, doesn’t need access to underlying tables
Separation of duties
 External metadata
 Business analysts don’t touch the packages
 Programmers don’t model the business
 Easy storage
 MDS – Excel client!
 Extended properties
 custom
Another demo
 Storage of connection strings and other
metadata outside of project
 Here in SQL, but can also be in MDS
 Dynamic source to stage object mapping with
on-the-fly rename
 Dynamic creation of source objects and stage
tables in Biml
 Package to create stage tables automatically
Demo!
Extending BimlScript object model
 Contains plain C# code (no templating)
 Can interact with Biml object model
 Can use extended methods
 Can define new classes
 Subclass original Biml classes to change/extend their
behavior
ObjectTag
 Property of all BimlScript .NET objects
 Connections, tables, columns, packages, tasks, and more
 Accessed only programmatically from BimlScript
 Very convenient and extensible object storage
 Can extend builtin Biml properties
Demo!
MSBuild
 Batch builds from command line/a tool
 CI/CD pipeline is possible
 Generated SSIS projects can be deployed
using PowerShell, XCOPY/RoboCopy
Demo!
More features
 Transformers and „Frameworks”
 Ability to change the way how code is generated
 Ability to „reverse engineer” DTSX packages
into Biml
 Ability to write custom code generators
How to benefit it?
 BI Project Decision Makers
 Faster initial delivery
 Lower cost of change
 Immediate Biml/Mist ROI
 BI Architects
 Reusable Design Patterns with Biml/BimlScript
 One project compatible with SQL Server 2005-2014
 Easily manage large BI code base, tasks, and issues
using TFS
 Plan for BI Continuous Integration/Continuous
Delivery
How to benefit it?
 ETL developers
 Fast Biml learning curve
 Generate your DTSX faster with Biml instead of
drag’n’drop
 Embrace DRY in BI development
 Use proper version control to manage your sources
 BI consultants
 Increased productivity
 Build your Biml/BimlScript code library and reuse it in
different projects
 No runtime license costs for customers
Sponsors
Questions?
[email protected]
[email protected]