SSIS - Prodata Blog | SQL Centre of Excellence

Download Report

Transcript SSIS - Prodata Blog | SQL Centre of Excellence

18th July, 2014
Honest Bob’s Cube Processing
Bob Duffy
Database Architect
Prodata SQL Centre of Excellence
Bob Duffy
•
•
•
•
•
•
•
20 years in database sector, 250+ projects
Senior Consultant with Microsoft 2005-2008
One of about 25 MCA for SQL Server globally (aka SQL Ranger)
SQL MCM on SQL 2005 and 2008
SQL Server MVP 2009+
SSAS Maestro
Database Architect at Prodata SQL Centre of Excellence
•
•
http://blogs.prodata.ie/author/bob.aspx
[email protected]
@bob_duffy
What we Will Cover
How Processing Works
Tips to Optimise Processing
Benchmarking Tools
Processing Options and Strategies
Tips for Tabular Models
What's a cube ?
Dimension Processing
Build Processing Schedule
Process Dimension
Allocate
Resources
Process Attribute
BuildIndex
Write Data
Attribute
Build Decode Stores (Hierarchies)
Begin
Processing
???
Read Data
Write Decode Stores
Process
Data
Read Data
Execute SQL
Write Data
Wait Time
Read
Buffer 1
Data
Cache 1
Read
Buffer 2
Data
Cache 1
End
Attribute
Processing
Measure Group Processing
Process Cube
Process Measure Group
Process Partition
Write Data
Read Data
Execute SQL
BuildAggsAndIndexes
BuildIndex
Wait Time
Aggregate
MergeAggsOnDisk
Recap: Tuning Attribute Properties
Key Column
Name Column
AttributeHierarchyEnabled
AttributeHierarchyOptimized
AttributeHierarchyOrdered
Add correct relationships
Avoid Large Dimensions
Or
DELETE THEM
Determining Size of Objects
>http://blogs.prodata.ie/post/Tool-for-baselining-MOLAP-Processing.aspx
Slow Execute SQL ?
Read Data
1. Check WAIT Statistics
•
What does ASYNC_NETWORK mean ?
2. For Dimensions
•
Optimise for Select Distinct
Or Switch “by Table Mode”
•
3. For Facts
1. Ensure narrow base table (no joins!)
2. Optimise Scan speed
•
•
•
•
512k Read Ahead
Fast Disks
Good File Group Layout
No Extent Fragmentation (-E trace flag)
Execute SQL
Wait
Time
Process Data
Write Data
Review Grain
Read Data
Increase BufferRecordLimit
Or Aggregate in SQL
Too many Measures ?
Use Calculated Measures
Do we have disk bottleneck
Do we have Network bottleneck
Are we consuming a lot of CPU
Match SSAS to SQL Types
Idle Server Resources
Increase Parallelism
Execute SQL
Optimising Bitmap Indexes
1.
2.
3.
4.
5.
Optimise Attributes on Dimensions
Ensure good CPUs !
Ensure max parallelism
Don’t over partition
Will sorting and Compression Help ?
BuildAggsAndIndexes
BuildIndex
Aggregate
Optimising Compression by Sorting
Each Segment in MOLAP is 64k
http://dennyglee.com/2013/09/30/analysis-servicesmultidimensional-it-is-the-order-of-things/
Optimising Aggregations
Use correct Attribute Relationships
Don’t over aggregate (say 20)
Consider consolidating aggregations
Avoid spilling to disk
Ensure max parallelism
CPU should be saturated!
BuildAggsAndIndexes
BuildIndex
Aggregate
MergeAggsOnDisk
Benchmarking Processing Time
Use XMLA to Process interesting objects
Use XMLA to capture a trace file
Use logman to capture perfmon counters
Disk throughput and latency
CPU
SSAS Threads
Automate with a batch file
Analysing the Trace File
>http://blogs.prodata.ie/post/Tool-for-baselining-MOLAP-Processing.aspx
Case Study – Tuning AW
Size Before
8.8 GB Data, 1.2 GB Index
Size After
8.5 GB, 0.8 GB Index
ENGINE TUNING TIPS
Connection String
Check Data Source
Try OLEDB driver instead of SqlClient
Max TCP/IP Packet Size set to 32,756
Up default connections from 10
Don’t use TCP/IP on shared server
INI Settings
INI Setting
Impact
OLAP\Process\
AggregationMemoryLimitMin
AggregationMemoryLimitMax
Increase Max if spill to disk
Decrease Min if lots of partitions
ThreadPool\Process\MaxThreads
Increase if maxed out
CoordinatorExecutionMode
Increase Threads per core
BufferMemoryLimit
Improve data grouping
TempDir
Can place on faster disk
See Analysis Services Performance Tuning paper
Processing Options & Strategy
Dimensions
Process Full
ProcessData
ProcessIndex
Process Update
ProcessAdd
Cube/Measure Group
ProcessFull
ProcessData
ProcessIndex
ProcessAdd
Can we use ROLAP ?
Cube Processing Deep Dive
THE TABULAR MODEL
Tabular Processing Events
Command Begin/End (XMLA)
Meta Data Load
Load
Build Processing Schedule
ExecuteSQL
Process Table
Process Measure Group
Process Partition
Hierarchy Processing
Process Table (GUID)
ReadData
Vertipaq
(Per segment)
Compress Segment
(per Column / Segment)
Analyze / Encode
Relationship Prepare
Tabular Processing Types
ProcessFull
ProcessData
ProcessAdd
ProcessRecalc
ProcessDefrag
ProcessClear
Baselining Tabular Processing
>http://blogs.prodata.ie/post/Baselining-Tabular-Model-Processing-with-a-trace-file.aspx
Sizing Tool
>http://www.powerpivotblog.nl/what-is-using-all-that-memory-on-my-analysis-server-instance/
Optimising Columns
Don’t store unused columns like fact keys
Reduce Precision
Split high cardinality columns
Sort we improve sorting
ProcessingTimeBoxSecPerMRow
Review Segment Size
DefaultSegmentRowCount
Smaller may process faster in parallel
Larger may compress better
Other Tips for tabular
Replace simple calculated columns with DAX
Optimise the ExecuteSQL for table scan
Replace multiple ProcessFull with
ProcessData and Process Recalc
Avoid NUMA hardware
Should we Partition ?
CANNOT be processed in parallel
No support for partition elimination
Thank You
SQL SATURDAY #310
DUBLIN
LEARN FROM INTERNATIONAL EXPERTS
- Sept 19th: 6 x Precons
- Sept 20th: 6 x Tracks – BI,DEV,DBA,2014,Cloud
20th
Sept
DUBLIN
STAY
PARTY
Hilton Hotel
Charlemont – rooms
from €159 B&B
After party and
entertainment during
the event
REGISTER
http://www.sqlsaturday.
com/310/
THE CRAIC