Analysis Services
Download
Report
Transcript Analysis Services
Analysis Services
Choosing between
Multidimensional and Tabular
Helena Cabral - BI4ALL
[email protected]
Our Main Sponsors:
Say Thank you to Volunteers:
They spend their FREE time to give you this
event.
Because they are crazy.
Because they want YOU
to learn from the BEST IN THE WORLD.
Paulo Matos:
Pedro Simões:
André Batista:
Paulo Borges:
André Melancia:
Murilo Miranda:
Quilson Antunes:
5 Sponsor Sessions at 15:10
Don’t miss them, they might be getting
distributing some awesome prizes!
Rumos
BI4ALL
Bold Int
CozyRoc
Pythian
Important Activities:
WIT – Women in Technology
15:10 at BizSpark Room (Ground Floor)
SQLClinic Challenges
10:00 DEV (Neil Hambly)
11:50 DBA (Uwe Ricken)
17:00 BI (Steph Locke)
Agenda
SSAS Evolution
Tabular: What’s it and how does it work?
Multidimensional vs Tabular
Development experience
Data modeling
Scalability
Query performance
Process performance
Security
Client applications support
Conclusions
v12
SQL Server 2014
v11
SQL Server 2012
v10
SQL Server 2008
(and R2)
v9
SQL Server 2005
v8
SQL Server 2000
v7
SQL Server 7.0
•First release with OLAP Services
•MOLAP, ROLAP and HOLAP architectures
•MDX query language
•SSDT-BI
•Power View for Multidimensional Models (using DAX)
•xVelocity engine
•Power Pivot and Tabular Model
•DAX language
•Business Intelligence Semantic Model (BISM)
•Improved Scalability and Performance
•Attribute and User Hierarchies
•Business Intelligence Development Studio (BIDS)
•Unified Dimension Model (UDM)
•Data Mining services
•Improved dimension design (changing dimensions, parent-child,
virtual dimensions)
•Improved calculation engine (custom rollups, cell calculations)
•Dimension security
•Connectivity over HTTP
SSAS Evolution
SSAS Evolution
Why two approaches?
Only one data modeling tool for
different needs
Similar development
experience
Easy migration between SelfService and Corporate BI
New column-oriented and inmemory storage technology
Multidimensional concepts are
still difficult to learn
What is it? How does it work?
Does it still have dimensions and measure groups?
SSAS TABULAR
SSAS Tabular
What is it?
Relational modeling
Uses xVelocity which is an in-memory
engine with column-oriented storage
Provides higher compression and faster
in-memory operations such as
aggregations
Supports for both MDX and DAX query
languages
Available in SQL Server 2012/14
Enterprise and Business Intelligence
editions
Also supports a DirectQuery mode
You can add calculated columns and
measures to any table in the model
You can create hierarchies in any table
in the model
You can partition any table in the model
Unlike Multidimensional models, there’s no
difference between a dimension table and a
measure group table.
Sometimes we don’t need a dimensional
model!
Dimensions vs Fact Tables
Which development tools can I use?
How long is the learning curve?
DEVELOPMENT EXPERIENCE
Development tools
TABULAR
SQL Server Data Tools for
Visual Studio
Open an existing Power Pivot
model
MULTIDIMENSIONAL
SQL Server Data Tools for
Visual Studio
Development process
TABULAR
Easy to wrap a model over raw
sources or DW for prototyping
purposes
Model can be created from a
Self-Service experience
Smaller learning curve (Excel
like)
Need to learn DAX
MULTIDIMENSIONAL
Model should be created over
a DW source
Not so easy prototyping
Higher learning curve
IT required
Wide knowledge base
Need to learn MDX
Development experience
TABULAR
Workspace database directly
related with the project
WYSIWYG experience, but
also can make the
development experience too
slow
Single developer at a time (one
model per solution)
MULTIDIMENSIONAL
Need to be deployed and
processed to server to see the
results
Support for multiple developers
Does Tabular have the same advanced model options?
Do I have to learn DAX?
DATA MODELING
Data Modeling
TABULAR
MULTIDIMENSIONAL
Relational Modeling:
Tables and more tables
Dimensional Modeling:
Facts and Dimensions
Basic Relationships (…and
advanced relationships
through DAX calculations)
Advanced Relationships
natively supported (parentchild, many-to-many, attribute
relationships, …)
No role playing dimension (two
tables can only have one
active relationship)
Support multiple granularity
DAX calculations
Do not support multiple
granularity in the same
measure group
MDX calculations
Other (un)supported features
TABULAR
MULTIDIMENSIONAL
Some features (as Actions,
Translations and Display
Folders) are possible by
editing the generated XMLA or
by using BIDS Helper tool
available at CodePlex
Named-Sets, Scopes, Actions,
Translations
Scopes can be defined by
using calculated columns
Display Folders, custom format
of measures, default values for
dimensions attributes
Write-back, Data Mining,
Custom Assemblies, Custom
Rollups
Process behavior configuration
DimSalesReason
SalesReasonKey
SalesReasonName
SalesReasonReasonType
FactInternetSalesReason
SalesOrderNumber
SalesOrderLineNumber
SalesReasonKey
Relational model
FactInternetSales
SalesOrderNumber
SalesOrderLineNumber
SalesAmount
SalesAmountWithReason :=
CALCULATE (
SUM ( 'Internet Sales'[Sales Amount] );
FILTER (
FactInternetSalesReason;
CALCULATE ( COUNTROWS ( DimSalesReason ) > 0 )
)
)
Many to many relationships are not
supported by Tabular (natively)
But can be achieved by using some
advanced DAX calculations!
DAX Calculation
DEMO #1
DimProduct
ProductKey
(...)
Relational model
FactProductInventory
DimDate
ProductKey
DateKey
UnitsIn
UnitsOut
DateKey
(...)
DAX Query
EVALUATE
SUMMARIZE (
CALCULATETABLE (
'Product Inventory';
'Product Category'[Product Category Name] = "Clothing"
);
'Product'[Product Name];
'Date'[Date];
"Total Units Movement", [Total Units Movement]
)
ORDER BY 'Product'[Product Name]; 'Date'[Date]
Calculating balances by using
transactions
Need to sum all transactions up to a
given day
A common solution is to create a
snaphost table with the quantity
(calculated in ETL) for each product/day
Calculate balances on the fly using
Tabular/DAX!
DEMO #2
Does Tabular support a large volume of data?
SCALABILITY AND DEPLOY
Operation modes
TABULAR
MULTIDIMENSIONAL
In-memory/cached:
All data is loaded in memory
and all queries are answered
from there
MOLAP:
Data is pulled from relational
store, is aggregated and stored
in a multidimensional format
Direct Query:
Bypass the in-memory model,
queries are answered directly
from database source (SQL
Server only)
ROLAP:
All data stay in the relational
store and additiona objetcs are
created for calculated
aggregations
Model scalability
TABULAR
MULTIDIMENSIONAL
Data volume mostly limited to
physical memory
Data volumes can scale to
multiple terabytes
Small/medium size models
Medium/large size models
Processor centric:
Memory is the most critical
resource
Disk centric:
Disk performance is important
because is used as the primary
data storage for preaggregated data
Disk performance is not an
important factor
How fast is Tabular compared to Multidimensional?
Is always faster?
QUERY PERFORMANCE
Querying scenarios
TABULAR
Always read columnar data
from RAM
Aggregate data in memory
Second (warm) run is just as
fast the first (cold) one
MULTIDIMENSIONAL
With no aggregations defined:
read atomic data from disk and
aggregate data in memory
With predefined aggregations:
Cold Cache: read aggregated
data from disk
Warm Cache: read
aggregated data from RAM
Querying preparation
TABULAR
MULTIDIMENSIONAL
Use DAX for detailed reports
Use MDX or DAX for querying
Use MDX for aggregated data
Need to define aggregations
No special tuning is required
Benefits from warmup cache
mechanisms
Outperforms Multidimensional:
Report on low granularity data
Outperforms Tabular:
Report on aggregated data
with predefined aggregations
on Warm Cache
Report on aggregated data
with no predefined
aggregations
MDX Query
SELECT
{
[Measures].[Discount Amount]
,[Measures].[Reseller Freight Cost]
,[Measures].[Reseller Gross Profit]
,[Measures].[Reseller Total Product Cost]
,[Measures].[Reseller Sales Amount]
,[Measures].[Reseller Tax Amount]
} ON COLUMNS,
NON EMPTY
(
[Geography].[Geography].[City]
*[Product].[Product].[Product]
*[Sales Territory].[Sales Territory Region].[Sales
Territory Region]
*[Promotion].[Promotion].[Promotion]
)
ON ROWS
FROM [Adventure Works]
Detailed queries are normally an
headache
Most of the times, we opt to query
directly the DW relational source
With Tabular, this detailed queries
perform better!
DEMO #3
DimCustomer
CustomerKey
Relational model
(...)
FactInternetSales
CustomerKey
OrderDateKey
SalesAmount
DimDate
Distinct Count (DC) operations are
expensive in Multidimensional
Moreover, need to create and maintain
multiple measure groups (one for each
DC measure)
With Tabular, DC measures performs
better without need special modeling!
DateKey
(...)
DAX Query
NewCustomers :=
CALCULATE (
DISTINCTCOUNT ( 'Internet Sales'[CustomerKey] );
FILTER (
ALL ( Date );
'Date'[DateKey] <= MAX ( 'Date'[DateKey] )
)
)
- CALCULATE (
DISTINCTCOUNT ( 'Internet Sales'[CustomerKey] );
FILTER (
ALL ( Date );
'Date'[DateKey] < MAX ( 'Date'[DateKey] )
)
)
DEMO #4
What are the process options?
Is there any dependencies?
PROCESS PERFORMANCE
Processing preparation
TABULAR
Every table can be partitioned
No processing dependencies
(one table can be processed
without processing related
tables)
MULTIDIMENSIONAL
Only measure groups can be
partitioned
Processing dependencies
(need to process dimensions
first)
Don’t support parallel partition
processing
Support for parallel partition
processing
Data in other tables/partitions
can be queried during
processing
Aggregations must be
recalculated when dimensions
processing occurs
Processing options
TABULAR
Full process of a whole DB
might consume too much
memory
Process Defrag: rebuilds the
table dictionary
Process Recalc: recalculates
calculated columns, rebuilds
relationships, user hierarchies
and other internal engine
structures
MULTIDIMENSIONAL
• Process Index: creates or
rebuilds indexes and
aggregations for all processed
partitions
Does Tabular have the same security options as Multidimensional?
SECURITY
Security
TABULAR
MULTIDIMENSIONAL
Don’t support cell security, only
table and row level security
• Advanced security features that
can go up to cell-level
Drill through permission is set
to all roles
• It’s possible to set drill through
permission at role level
Dynamic Security
• Dynamic Security
Only supports visual totals
• Supports both visual totals and
non-visual totals
Use DAX to restrict access to
Row Sets
Do I have to change my client applications?
CLIENT APPLICATIONS SUPPORT
Client Applications Support
TABULAR
Query languages:
MULTIDIMENSIONAL
Query languages:
In Cached Mode: MDX and
DAX
In DirectQuery mode: only
DAX
• MDX and DAX (as of SQL
Server 2012 SP1 CU4)
Client tool
Language
Microsoft Excel PivotTable
MDX
Power View
DAX
SQL Server Reporting Services
MDX / DAX
PerformancePoint
MDX
Other applications (ADO.NET of OLE DB
provider)
MDX / DAX
So.. What do I choose?
CONCLUSIONS
You should choose Multidimensional!
You have an extremely large cube that exceeds server memory
You need some of the features not supported by Tabular (even
with DAX)
You have a complex model (budgeting/forecasting, multiple
many-to-many, parent-child…)
You are an MDX expert
You hate workarounds
Your reporting needs are typically aggregated
You have all your models in Multidimensional mode and want to
migrate to Tabular just because!
You should choose Tabular!
You have small/medium models that fit in server memory
You need a non-multidimensional model
You have simple models (and not need neither of the
unsupported features)
You really never understood MDX (just give a try to DAX,
although is not so simple than they say)
You are an enthusiastic about workarounds!
You have detailed and ad-hoc reports that takes to long to run
You have a new model with no special needs and it will be great
if you can show it just after lunch (prototyping!)
Our Main Sponsors: