Transcript Document
--A Gem of SQL Server 2012, particularly for Data Warehousing--
Present By Steven Wang
Steven Wang
• Senior DBA/Senior BI Specialist at BNZ
• MCITP/MCTS: BI Developer, Database Developer and
Database Administrator
• Blog: www.msbicoe.com
• Email: [email protected]
Agenda
•
•
•
•
•
•
•
•
•
The Nitty-gritty of Columnstore Indexes (Demo)
Columnstore Indexes Under the Hood (Demo)
Columnstore Indexes Performance Tuning (Demo)
The Limitations of the Columnstore Indexes
How to load data for a colunstore indexed table
The Columnstore Indexes Best Practises
The Impact of Columnstore Indexes on BI
Resources
Q&A
The Nitty-gritty of Columnstore Indexes
• What is columnstore?
The Nitty-gritty of Columnstore Indexes
• What is columnstore?
Row-Store
--Serialize all of the
attribute values in a
row together, then the
values in the next row,
and so on.
ProductKey
…………...
DateKey
…………...
ResellerKey
…………...
Quantity
…………...
Price
…………...
Amount
…………...
Columns-Store
--Serializes all of the
values of an attribute
(column) together
The Nitty-gritty of Columnstore Indexes
• Why columnstore?
--It’s All about I/O;
--It’s all about compression efficiency;
--In columnstore, compression algorithms are working better
and a higher compression ratio can be achieved;
In
Memory
--Columnstore compression improves CPU performance;
--The maximum memory is limited by OS,1TB? 2TB?;
--Get only columns which are queried;
The Nitty-gritty of Columnstore Indexes
• SQL Server 2012 Columnstore Indexes
For the current release, SQL server 11 RTM,only
nonclustered columnstore indexe is allowed
A nonclustered index covers all possible columns and
stores data in column-wise fashion.
Data is highly compressed by using xVelocity
Engine (aka, “Vertipaq”)
A new “Batch mode” execution model is introduced
A columstore index breaks each column into 1 million
row chunks called segment.
Delivers order-of-magnitude gains for DW queries
Base table is not able to INSERT, UPDATE, DELETE while
a nonclustered columnstore index exists
The Nitty-gritty of Columnstore Indexes
• How to create a columnstore index?
CREATE NONCLUSTERED
COLUMNSTORE INDEX
[ci_FactResellerSales]
ON [dbo].[FactResellerSales]
(
[ProductKey],
[OrderDateKey],
[DueDateKey],
[ShipDateKey],
[ResellerKey],
[UnitPrice],
[ExtendedAmount],
[UnitPriceDiscountPct],
[DiscountAmount],
[ProductStandardCost],
[ShipDate]
);
The Nitty-gritty of Columnstore Indexes
• Demo
Lots of fundamental of columnstore indexes
Performance difference between B-Tree and CI
And more…
Columnstore Indexes Under the Hood
• Data is highly compressed. Dramatically reduced IO. More
Data can fit into memory.
Row Store
Column Store
Column Store, Compressed
Columnstore Indexes Under the Hood
• Data is highly compressed. Dramatically reduced IO. More
data can fit into memory.
Row Store
Column Store, Compressed
Columnstore Indexes Under the Hood
Run-length Encoding
How data is compressed?
RLE
Columnstore Indexes Under the Hood
Dictionary Encoding
How data is compressed?
Columnstore Indexes Under the Hood
• A vector-based query execution method called “Batch Mode”
processing is implemented. Dramatically reduced CPU
consumption time.
A Batch is an object that contains about 1000 rows
Columnstore Indexes Under the Hood
• Segment elimination can skip large chunks of data to speed
up scans.
A segment a 1 million of rows
Each segment stores min and max value
If no rows qualify, then the entire segment is skipped for
scan
Columnstore Indexes Under the Hood
• Demo
Columnstore Indexes Performance Tuning
• Maximizing the use of “Batch Mode” processing
The Limitations of the Columnstore Indexes
• Base table is not able to insert, delete and update data
• For current release, no clustered columnstore index
• Columnstore Indexes are not designed for
needle-in-the-haystack kind of queries. No Seek.
Data Type not allowed:
Other restrictions:
No Sparse column
No unique columnstore index
No customer sort option for
columns
No replication
No change tracking
No change data capture
No filestream column
Binary and varbinary
Varchar(max) and nvarchar(max)
Decimal/Numeric precision > 18
Ntext, text, and image
Uniqueidentifier
Rowversion
Sql-variant
Datetimeoffset(>2)
CLR data type
XML
How to load data for a colunstore indexed table
• There is really no simple way that loads data into a
columnstore indexed table
• Columnstore is not born for data updating
• 3 possible ways:
Drop columnstore index, load the data,
recreate. (or disable, load the data, rebuild)
Partition the columnstore indexed table and using
partition switching. Seems the best way.
Using t 2 tables, one with historical data with columnstore
index, another is just a normal table. Complicated, lots of
maintenance
The Columnstore Indexes Best Practises
• Include all columns in the columnstore index whenever
possible
• Put columnstore indexes on large tables only
• Consider to create a clustered index on columns which
are frequently used, like date column
• Structure your queries as star joins with grouping and
aggregation as much as possible
• Avoid joins and string filters directly on columns of
columnstore indexed tables.
• Whenever possible, avoid constructing queries with outer
join, Union all, and not in directly on columnstore indexed
• Using integer whenever possilbe
• Using table partitions
The Impact of Columnstore Indexes on BI
•
•
•
•
•
Data type consideration for Data warehouse designing
The columns chosen for creating a clustered index
Partition clustering
ROLAP vs MOLAP
DirectQuery for the Tabular BI semantic model
Resources
• Eric Hanson: SQL Server Columnstore Performance Tuning
http://social.technet.microsoft.com/wiki/contents/articles/4995.sqlserver-columnstore-performance-tuning.aspx
• BOL: Columnstore Indexes
http://msdn.microsoft.com/en-us/library/gg492088.aspx
• Joe Sack: Exploring Columnstore Index Metadata, Segment
Distribution and Elimination Behaviors
http://www.sqlskills.com/blogs/joe/post/Exploring-Columnstore-IndexMetadata-Segment-Distribution-and-Elimination-Behaviors.aspx
• Benjamin Nevarez: Improve the Performance of Data
Warehouse Queries with Columnstore Indexes
http://www.sqlmag.com/article/sqlserverdenali/data-warehousequeries-columnstore-indexes-141712
• Stavros Harizopoulos, Daniel Abadi: Column-Oriented
Database system
Q&A