PowerPoint-præsentation
Download
Report
Transcript PowerPoint-præsentation
Stories from the trenches
- Partitioning as a design pattern
•
•
•
About the FLIS project
Data warehouse heroes and architecture
Partitioning in many disguises
The FLIS Project
Access to your own data?
• ~ 72 mill. kr. to
get data for 4
years
• (10 mill. euro)
Bill
Ralph
Dan
par·ti·tion (pär-tshn)n.
1.a. The act or process of dividing something into parts.
b. The state of being so divided.
2.a. Something that divides or separates,
as a wall dividing one room or cubicle from another.
b. A wall, septum, or other separating membrane in an organism.
3. A part or section into which something has been divided.
…
Source: http://www.thefreedictionary.com/partitioning
Dividing something into parts
• Files
• Database
• Table
State of being so divided
• ETL developers vs. Customer
• Developers vs (evil?) DBA’s
• Backups (what is a full backup anyway)
• Xml, csv, xls, fixed
format
• 1 file
• Data from
• 1 or more tables
• Data from 1 or more municipality
• 30 different file
naming schemes
• Csv
• 1 file
• 1 table
• Data from just 1 municipality
• 1 file naming scheme
Files
(complex)
File-dsa
mappings
(complex)
Dsa tables
Files
(complex)
Preprocessor
(complex)
File-dsa
mappings
(simple)
Dsa tables
1 data warehouse layer = 1 database
• Scaling
• IO-pattern for a data flow
1 database
• 4 LUNS, RAID 10
• 4 datafiles
Filegroups
• PRIMARY
• 160MB
• BIG ONE
• A lot of data
Table partitioning (2005 EE feature)
• Pruning
• Divide the data warehouse into 100 parts 1/100 the size
• Switching
• Separation of readers and writers
• Fast
• DSA
• Partition by (municipality_id, month_year)
• EDW and data marts
• Partition by municipality_id
CREATE TABLE [dbo].[partition_test](
[Kommune_id] [varchar](11) NULL,
[Kommunenummer] [nvarchar](4000) NULL,
[Distrikt_kode] [nvarchar](4000) NULL,
[Distrikt_type] [nvarchar](4000) NULL,
[Distrikt_tekst] [nvarchar](4000) NULL,
[Cpr_Dist_Tekst_TS] [nvarchar](4000) NULL
) ON [partition_test_pt_sc]([Kommune_id])
GO
• Affinity masking
• Not really possible in
Oracle – even on
Windows
FlowType
FlowTypeID (int)
FlowTypeName (varchar)
TransformationSchema
TransformationSchemaID (int)
TransformationTypeID (int)
TransformationSchemaName (varchar)
Flow
FlowID (int)
FlowTypeFK (int)
FlowName (varchar)
TransformationType
TransformationTypeID (int)
TransformationTypeName (varchar)
Table
TableID (int)
DWLayerFK (int)
TableTypeFK (int)
TableName (varchar)
DWLayer
DWLayerID (int)
DWLayerName (varchar)
Transformation
TransformationID (int)
FlowFK (int)
TransformationSchemaFK (int)
Logic (varchar)
Order???
Column
ColumnID (int)
ColumnTypeFK (int)
TableFK (int)
ColumnName (varchar)
DataTypeFK (int)
StandardColumn
StandardColumnID (int)
DWLayerFK (int)
StandardColumnName (varchar)
Logic (varchar)
Mapping
MappingID (int)
TransformationFK (int)
DestinationColumnFK (int)
SourceColumnFK (int)
ColumnType (source data)
ColumnTypeID (int)
ColumnTypeName (varchar)
TableType
TableTypeID (int)
TableTypeName (varchar)
DataType
DataTypeID (int)
DataTypeName (varchar)
Developers vs.
(evil) DBA’s
•
•
• Meta data on table definitions
• Script ddl from meta data
• Hide partitioning in ddl
• Partition Scheme
• Change File group design
• Backups: DBA’s vs. Backup administrator
• Partitioning helps us divide data into
• Hot
• (C)old (and therefore read only)
• Only backup
hot partitions
• Restoring: DBA vs. SQL server
• PRIMARY filegroup
• 150 MB
• Default filegroup
• Big
• Restore database
• only primary filegroup
=> online
• DW vs OLTP and OLAP
• DW server
• ETL
• processing OLAP databases/cubes
• Reporting server
• Sharepoint databases (OLTP)
• restoring OLAP databases/cubes
Page compression results
code
DA000
DA000
DA000
DA001
DA001
DA001
DA009
DA009
DA009
vartype
DGALT
DGCAT
DGPROP
DGALT
DGCAT
DGPROP
DGALT
DGCAT
DGPROP
varval
00K00
06M38A
26X01
00K00
06M38A
26X01
00K00
06M38A
26X01
table
dg1
dg1
dg1
Factor
(size)
cpu time elapsed time
Num rows num reads (ms)
(ms)
1
89132
372
31
739
2
178264
743
31
1517
10
891320
3781
281
7551
dg1_page_compr
1
89132
143
15
748
dg1_page_compr
2
178264
285
62
1512
dg1_page_compr
10
891320
1427
421
7596
or
Evaluation
Scale:
1 = Very bad
2 = Bad
3 = Relevant
4 = Good
5 = Very Good!
Questions:
• Speaker Performance
• Relevance according to your work
• Match of technical level
according to published level
• Comments
Evaluation
Create a Text message on your phone and send it to 1919 with the content:
Session Code
DB301 5 5 5 I liked it a lot
Kennie
Performance
(1 to 5)
Relevance
(1 to 5)
Match of
technical Level
(1 to 5)
Comments
(optional)
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing
market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this
presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.