SQL Server Analysis Services Tabular Model

Download Report

Transcript SQL Server Analysis Services Tabular Model

SQL Server Analysis
Services Tabular Model
A BR I E F E N D TO E N D
BY KI RSTEN BUR N E T T - E N T ELEC T
Our Journey through Tabular Models
1.
Tabular in a nut shell
2.
The Business Intelligence Semantic Model
3.
A brief history of Tabular
4.
Tabular benefits over Power Pivot
5.
Tabular vs Multidimensional and choosing between the two
6.
xVelocity In Memory Analytics Engine
7.
DirectQuery
8.
Managing a Tabular Model (Processing and Partitioning)
9.
Automating with ASSL/TMSL and AMO
10. Creating and Deploying a Tabular model (Demo)
11. Tabular and xVelocity in the market and the future for SQL Server Analysis Services
Image from: https://sqldusty.com/2013/08/06/performance-tuning-of-tabular-models-in-ssas-2012-whitepaper-now-available/
Tabular in a nutshell
Measures,
KPI’s,
Hierarchies
Perspectives
&
Translations
DAX
ASSL, AMO,
TMSL,
DMV’s
DirectQuery
Role and Row
Security
xVelocity
Relationships,
Tables,
Columns,
Rows
SSAS
Database
Partitioning
& Processing
Tabular
Various Data
Sources
SSDT &
SSMS
http://www.nutritionbynature.com.au/blog/nuts-seeds-are-for-birds-and-squirrels-not-humans
Business Intelligence Semantic Model
BI APPLICATIONS
SSRS
EXCEL
3RD PARTY APPS
DATA MODEL
MULTIDIMENSIONAL
TABULAR
BUSINESS LOGIC
MDX
DAX
DATA ACCESS
DATA SOURCES
ROLAP
RDBMS
MOLAP
FILE
DIRECT
QUERY
LOB
xVelocity
CLOUD
A Brief History of Tabular
2 Internal papers in 2006 by Amir
Netz about a BI Sandbox and Inmemory technology
Power Pivot for Excel was first
released as an add-in in May
2010 as part of the SQL Server
2008 R2 release
SQL Server 2012 saw the
introduction of a completely new
data model, SQL Server Analysis
Services Tabular model
Microsoft’s entry into the Self-Service BI market.
https://news.microsoft.com/stories/people/amir-netz.html
Power Pivot vs Tabular
SCALABILITY
TABULAR
BENEFITS OVER
POWER PIVOT
MANAGEABILITY
SECURABILITY
PowerPivot has essentially become the desktop version of Analysis Services, but remains an
excellent prototyping tool for Tabular.
Tabular vs Multidimensional
EXCELLENT QUERY
PERFORMANCE
VERY LARGE
DATABASES
(TERABYTES)
VARIED DATA
SOURCES
EASIER, QUICKER,
NEWER
TECHNOLOGY
CALCULATED
COLUMNS/TABLES
TABULAR
MORE
GRANULAR
SECURITY
PROVEN, STABLE,
OLDER
TECHNOLOGY
MULTIDIMENTIONAL
WRITEBACK,
AGGREGATIONS,
ACTIONS,
NAMED SETS
xVelocity In Memory Analytics Engine
Compression algorithm
Compression techniques
• Less RAM usage and decreased costs
• Increased performance as less data
to scan
• Dictionary/Hash Encoding
• Value Encoding
• Run Length Encoding
Originally called the VertiPaq engine, xVelocity is an in-memory columnar database
where all of the data in a model lives in RAM.
Columnstore indexes use this same compression technology.
xVelocity Dictionary Encoding
A dictionary of unique values in a column is built and then data values are replaced with the values of the
dictionary index.
SIZE
Regardless of datatype the same
performance gains are achieved.
Small
Small
SIZE ID
0
ID
SIZE
0
0
Small
Extra Large
1
1
Extra Large
Medium
2
2
Medium
Medium
2
3
Extra Small
Extra Small
3
4
Large
Large
4
xVelocity Value Encoding
xVelocity looks for mathematical relationships between the values in a column, then uses the
value itself, with or without a mathematical operation applied, to decrease the memory usage.
Provides better performance as computations
can operate directly on the data without a
separate lookup step as with dictionary
encoding.
PRODUCT_KEY
100023
100024
100025
100026
PRODUCT_KEY 100000
23
24
25
26
100027
27
100028
28
100029
29
Less storage used for smaller value by subtracting 100000.
Adds the value back in when required.
xVelocity Run Length Encoding
Complementary compression technique that tries to reduce the size of a dataset by avoiding
repeated values.
QUARTER
Values are replaced with a slightly
more complex structure that
contains the value only once, with
the number of contiguous rows
having the same value.
Q1
Q1
Q1
Repeated
210 times
…..
Q2
Q2
Q2
QUARTER
START
COUNT
Q1
1
210
Q2
211
260
Repeated
260 times
…
Example from Ferrari, Alberto; Russo, Marco. The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI (Business Skills) (p. 411). Pearson Education. Kindle Edition.
xVelocity Re-encoding
SAMPLE
CHOOSE
COMPRESS
RE-ENCODE
COMPRESS
• Read a
sample of the
data
• Choose a
compression
technique
based on
sample
• Compress the
data using
chosen
technique
• Determine a
new
compression
technique, if
required
• Compress the
data using
chosen
technique
Provide a good sample of data when doing an initial partition
process to keep re-encoding to a minimum.
xVelocity Relationships
To improve performance, relationships between tables are stored in memory as pairs of ID’s and
row numbers.
Given the ID in one table, it
can immediately find the
corresponding rows in the
joined table.
SALE AMOUNT
PRODUCT ID
10
2
12
3
34
7
PRODUCT ID
PRODUCT
2
Nutty Kitty
ROW 1
3
Chocolate Bunny
ROW 2
7
Nutty Chocolate
Bunny
ROW 3
SALES[PRODUCT ID]
PRODUCT[ROW NUMBER]
2
1
3
2
7
3
xVelocity Design Considerations
A column can have both Run Length Encoding AND Dictionary OR Value encoding.
1. CARDINALITY
Number of unique values in the column,
lower is better
2. DISTRIBUTION
Repeated values have higher
compression rates than those columns
with frequently changing values
DESIGN
CONSIDERATIONS
3. TABLE SIZE
Number of rows in the table
4. DATA TYPE
Affects the dictionary size
DirectQuery
REAL TIME
REPORTING
VERY LARGE
DATABASES
DAX CONVERTED
TO SQL
DIRECTQUERY
SEMANTIC LAYER
DIFFERENCES IN
DAX FORMULAS
AND SUPPORT
SINGLE
RELATIONAL
SOURCE
Partitioning
Partitions divide a table into logical parts and provides several benefits.
AMOUNT
DATE
220
2015-12-31
….
…..
468
2015-01-01
1290
2014-12-31
…
…
34
2014-01-01
3
2013-12-31
…
…
98
2013-01-01
DECREASED PROCESSING TIME
Partition for 2015
Partition for 2014
Partition for 2013
EASILY REMOVE OR UPDATE DATA
CONSOLIDATE DATA FROM VARIOUS
SOURCES
Processing – Full, Add, Recalc, Default
FULL
ADD
RECALC
DEFAULT
REFRESHES ALL STRUCTURES
UPDATES PARTITION WITH
NEW DATA
UPDATES AND RECALCULATES
MAKES OBJECT QUERYABLE
AVAILABLE AT DATABASE,
TABLE AND PARTITION LEVEL
AVAILABLE AT PARTITION
LEVEL
AVAILABLE AT DATABASE
LEVEL
AVAILABLE AT DATABASE, TABLE
AND PARITION LEVEL
MODEL IS QUERYABLE
THROUGHOUT
MODEL IS QUERYABLE
THROUGHOUT
NEEDS TO BE MANUALLY
ISSUED AFTER PROCESS DATA
ONLY OPERATIONS NECCESARY
TO MAKE OBJECT QUERYABLE
NO AUTOMATIC DETECTION AFFECTS HIERARCHIES, CALCS
AND RELATIONSHIPS
OF DUPLICATES
CAN RUN PROCESS DATA AND
OR PROCESS RECALC
ENTIRE COPY OF DATABASE
STORED IN MEMORY
Processing - Data, Clear, Defrag
PROCESS DATA
PROCESS CLEAR
PROCESS DEFRAG
LOADS DATA ONLY
DROPS ALL DATA
REBUILDS TABLE
DICTIONARY
AVAILABLE AT TABLE AND
PARTITION LEVEL
AVAILABLE AT DATABASE,
TABLE AND PARTITION LEVEL
AVAILABLE AT TABLE
LEVEL
MODEL IS NOT QUERYABLE
AFTERWARDS
MODEL IS NOT QUERYABLE
AFTERWARDS
USED WHEN PARTITIONS
FREQUENTLY CHANGE
RUN PROCESS DEFAULT OR
PROCESS RECALC AFTER
NEED TO LOAD DATA AND
RECALC
USELESS AFTER PROCESS
FULL OR PROCESS DATA
SQL Server 2016 Analysis Services (SSAS) includes parallel processing for tables with two or more partitions.
Automating with ASSL, TMSL & AMO
Analysis Services Scripting Language (ASSL) is
an extension of the XMLA Open Standard and
can be used to automate admin tasks.
Tabular Model Scripting Language (TMSL) –
JSON. New in SQL Server 2016 with relevant
Tabular language elements.
ASSL
TMSL
AMO
Analysis Management Objects (AMO) –
Managed Interface used to run XMLA, MDX
etc.
Basic Tabular Demo
Image from: http://quoteaddicts.com/topic/good-luck-funny/
The Future
MERGED
TECHNOLOGIES
TABULAR REPLACES
MULTI-DIMENSIONAL
?
ANALYSIS SERVICES
AS A SERVICE
(AZURE)
HYBRID SOLUTIONS
References
http://sqlblog.com/
Ferrari, Alberto; Russo, Marco; Webb, Chris. Microsoft SQL Server 2012 Analysis Services: The BISM
Tabular Model (Developer Reference) Pearson Education. Kindle Edition.
https://msdn.microsoft.com/en-us/library/hh212945.aspx
https://www.mssqltips.com/sqlservertip/2818/understanding-the-sql-server-2012-bi-semanticmodel-bism/
Credit: Business Intelligence in Microsoft SharePoint 2010 By Norm Warren, Mariano Neto, John
Campbell, Stacia Misner
Ferrari, Alberto; Russo, Marco. The Definitive Guide to DAX: Business intelligence with Microsoft
Excel, SQL Server Analysis Services, and Power BI (Business Skills) (p. 411). Pearson Education.
Kindle Edition.
Thank you to our
sponsors