Parallel Data Warehouse - Optimize Our Inventory

Microsoft Data Warehousing Vision
Make SQL Server the gold standard for data warehousing, offering customers:
• Massive Scalability at Low Cost
• Hardware Choice
• Improved Business Agility and Alignment
• Democratized Business Intelligence
Customer Challenges
• Increased volumes of data
• Need to reduce costs
• Increased adoption of DW appliances
• Move to MPP
• Demand for flexibility and mixed workloads across the enterprise
• Desire for real-time analytics
• Growing importance of data quality
[Charts: TDWI survey results, four panels]
• Effect of current recession on data warehouse teams and projects: budget reduced; hiring frozen; approved projects on hold; priorities shift to short-term gains; no impact so far; new tool and platform acquisitions frozen; some team members laid off; focus shifted from new dev to admin of old solutions; other; don't know
• Approximate volume of data managed by primary data warehouse, today and in 3 years: less than 500 GB; 500 GB – 1 TB; 1 – 3 TB; 3 – 10 TB; more than 10 TB
• Processing architectures of primary data warehouse solution, have today and would prefer: Symmetrical Multiprocessing (SMP); Massively Parallel Processing (MPP); other
• Techniques and DW technologies used today and in 3 years: real-time data warehousing; data quality tool; mixed workloads; data warehouse appliance; advanced analytics (e.g. data mining/predictive)
Source: TDWI
Data Warehouse Industry Trends
[Chart: data warehouse technologies plotted by anticipated growth in the next 3 years (decreasing usage to increasing usage) against plan to use (narrow commitment to broad commitment); a marker flags areas of strategic investment for Microsoft]
• Chart regions: good growth, good commitment; good growth, moderate commitment; good growth, small commitment; flat growth, good/moderate commitment; declining usage despite commitment
• Technologies shown: Centralized EDW; HA for DW; 64-bit; MDM; Web Services; Security; Real-time DW; Streaming Data; Mixed Workloads; SOA; DW Appliance; Server Virtualization; DW Bundles; MPP; SMP; DBMS Built for Transactions; DBMS Built for DW; Data Quality; Analytics within EDW; Analytics Outside EDW; Advanced Analytics; Blades in Racks; Data Federation; Low-Power Hardware; Columnar DBMS; In-Memory DBMS; SaaS; Open Source OS; Open Source Reporting; Open Source Data Integration; Software Appliance; Open Source DBMS; Public Cloud
Source: TDWI
Gartner
Forrester
• SQL Server is a Leader in Data Warehousing
• Microsoft is the most aggressive DBMS vendor with a strong road map
IDC
• SQL Server ships more units than Oracle and IBM combined
• SQL Server is the fastest growing of the top 5 Data Warehouse Vendors
The Magic Quadrant is copyrighted February, 2008 by Gartner, Inc. and is reused with permission. The Magic Quadrant is a graphical representation of a marketplace at and for a specific time period. It depicts Gartner’s analysis of how certain vendors measure
against criteria for that marketplace, as defined by Gartner. Gartner does not endorse any vendor, product or service depicted in the Magic Quadrant, and does not advise technology users to select only those vendors placed in the “Leaders” quadrant. The
Magic Quadrant is intended solely as a research tool, and is not meant to be a specific guide to action. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular
purpose.
Credit Card Company Runs its Business with 17-Terabyte Mission Critical BI Solution
Mediterranean Shipping Company Managing 22 Terabytes of data with SQL Server 2008
Raiffeisen Banking Group builds a central data warehouse to analyze 55 terabytes of data
Danish Supermarket Group Manages 10 Terabytes of Business Intelligence with SQL
• Integrate your data
• Connect to any source
• Develop visually
• Predictable Response
• Simplified Management
• Scale across mixed workloads
• Fast Query Performance
• Deliver Relevant information
• Share insights
High speed Adapters
Data Compression
Star Join Query Optimization
MERGE SQL Statement
Backup Compression
Parallel Query Enhancements
Change Data Capture (CDC)
Resource Governor
Scale-out Shared Databases
Persistent Lookups
Policy Based Administration
Data Mining Improvements
Data Profiling
Partition-Aligned Indexed Views
New - Report Builder 2.0
Solution to help customers and partners accelerate their data warehouse deployments
•
method
•
• Best practices
configurations
Software:
• SQL Server 2008 Enterprise
• Windows Server 2008
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
Hardware:
• Tight specifications for servers, storage and networking
• ‘Per core’ building block
Fast Track Data Warehouse 2.0
Twelve SMP Reference Architectures
SI Solution Templates
Server | CPU | CPU Cores | SAN | Data Drive Count | Initial Capacity* | Max Capacity**
HP Proliant DL 385 G6 | (2) AMD Opteron Istanbul six core 2.6 GHz | 12 | (3) HP MSA2312fc | (24) 300GB 15k SAS | 6TB | 12TB
HP Proliant DL 380 G6 | (2) Intel Xeon® 5500 Series quad core | 8 | (2) HP MSA2312 | (16) 300GB 15k SAS | 4TB | 8TB
HP Proliant DL 585 G6 | (4) AMD Opteron Istanbul six core 2.6 GHz | 24 | (6) HP MSA2312fc | (48) 300GB 15k SAS | 12TB | 24TB
HP Proliant DL 580 G5 | (4) Intel Xeon® 7400 Series six core | 24 | (6) HP MSA2312 | (48) 300GB 15k SAS | 12TB | 24TB
HP Proliant DL 785 G6 | (8) AMD Opteron Istanbul six core 2.8 GHz | 48 | (12) HP MSA2312 | (96) 300GB 15k SAS | 24TB | 48TB
Dell PowerEdge R710 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB
Dell PowerEdge R900 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB
IBM X3650 M2 | (2) Intel Xeon Nehalem quad core 2.67 GHz | 8 | (2) IBM DS3400 | (16) 200GB 15k FC | 4TB | 8TB
IBM X3850 M2 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) IBM DS3400 | (24) 300GB 15k FC | 12TB | 24TB
IBM X3950 M2 | (8) Intel Xeon Nehalem four core 2.13 GHz | 32 | (8) IBM DS3400 | (32) 300GB 15k SAS | 16TB | 32TB
Bull Novascale R460 E2 | (2) Intel Xeon Nehalem quad core 2.66 GHz | 8 | (2) EMC AX4 | (16) 300GB 15k FC | 4TB | 8TB
Bull Novascale R480 E1 | (4) Intel Xeon Dunnington six core 2.67 GHz | 24 | (6) EMC AX4 | (48) 300GB 15k FC | 12TB | 24TB
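The reference architecture figures above follow a regular pattern that matches the 'per core' building block idea: every listed configuration has one SAN enclosure per 4 cores, 0.5 TB of initial capacity per core, and a maximum capacity of twice the initial capacity. The Python sketch below reproduces those columns from a core count. The ratios are read off the listed configurations and are an assumption for illustration, not an official Fast Track sizing formula; the names `FastTrackSizing` and `size_from_cores` are hypothetical.

```python
from dataclasses import dataclass

# Ratios observed in the listed reference architectures (assumptions,
# not an official Fast Track sizing rule).
CORES_PER_SAN = 4
INITIAL_TB_PER_CORE = 0.5
MAX_TO_INITIAL_RATIO = 2


@dataclass(frozen=True)
class FastTrackSizing:
    cores: int
    san_enclosures: int
    initial_capacity_tb: float
    max_capacity_tb: float


def size_from_cores(cores: int) -> FastTrackSizing:
    """Derive SAN count and capacities from a core count using the observed ratios."""
    if cores <= 0 or cores % CORES_PER_SAN != 0:
        raise ValueError("cores must be a positive multiple of 4")
    initial_tb = cores * INITIAL_TB_PER_CORE
    return FastTrackSizing(
        cores=cores,
        san_enclosures=cores // CORES_PER_SAN,
        initial_capacity_tb=initial_tb,
        max_capacity_tb=initial_tb * MAX_TO_INITIAL_RATIO,
    )


if __name__ == "__main__":
    # Core counts that appear in the listed configurations.
    for cores in (8, 12, 24, 32, 48):
        print(size_from_cores(cores))
```

The drive counts are left out on purpose: most rows use 8 data drives per SAN enclosure, but two of the IBM rows do not, so no single ratio fits that column.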
Fast Track Data Warehouse Benefits
Appliance-like time to value: reduces DBA effort; fewer indexes, much higher level of sequential I/O
Choice of HW platforms: Dell, HP, Bull, EMC and IBM – more in future
Low TCO: through commodity hardware and value pricing; lower storage costs
High scale: new reference architectures scale up to 48 TB (assuming 2.5x compression)
Reduced risk: validated by Microsoft; better choice of hardware; application of best practice
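The "48 TB (assuming 2.5x compression)" figure depends on the compression ratio, so it helps to see the arithmetic in both directions. The Python sketch below assumes the 48 TB refers to user data before compression; that reading and the function names are assumptions made for illustration.

```python
def on_disk_size_tb(user_data_tb: float, compression_ratio: float) -> float:
    """Space needed on disk for a given amount of uncompressed user data."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return user_data_tb / compression_ratio


def user_data_capacity_tb(on_disk_tb: float, compression_ratio: float) -> float:
    """Uncompressed user data that fits in a given amount of disk space."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return on_disk_tb * compression_ratio


if __name__ == "__main__":
    # 48 TB of user data at 2.5x compression needs 48 / 2.5 TB on disk.
    print(on_disk_size_tb(48, 2.5))
    # The same disk space holds less user data if the data compresses only 2x.
    print(user_data_capacity_tb(on_disk_size_tb(48, 2.5), 2.0))
```

Under this reading, data that compresses less well than 2.5x reduces the usable capacity proportionally.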
Fast Track Data Warehouse Timeline
2.0
Enterprise ETL Services
Star Join Query Optimizations
Data Compression
Partitioned table parallelism
2008
New Reference Architectures from IBM
Updated Configurations from HP, Dell and Bull
EMC as a Service Partner for Fast Track
2009
Fast Track vNext
Future Partners to create new Validated Reference Architectures with Test Harness
Incorporates SQL vNext
2010
Beyond
Test Harness for Partners
DW Reference Architectures
Predictable performance at low cost
Faster time to solution
Microsoft to create Test Harness for validation of new Fast Track configurations
NEC to validate new Reference Architectures
Parallel Data Warehouse Node
[Diagram: a node consists of a Database Server and a Storage Node]
Parallel Data Warehouse Appliance - Hardware Architecture
[Diagram labels: Client Drivers; Data Center Monitoring; Control Nodes (Active / Passive); Management Servers; Landing Zone (ETL Load Interface); Backup Node (Corporate Backup Solution); Database Servers, each running SQL; Spare Database Server; Storage Nodes; Dual InfiniBand; Dual Fibre Channel; Corporate Network; Private Network]
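The architecture above separates control nodes from the database servers that hold the data, which is the usual massively parallel processing (MPP) pattern: rows are spread across nodes, each node works on its own share, and a coordinator combines the partial results. The Python sketch below is a toy in-memory model of that pattern, not a description of how the appliance is implemented. Hash distribution on a key column, the CRC32 hash, and all function names are assumptions chosen for illustration.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def node_for(key: str, node_count: int) -> int:
    """Pick a node for a distribution key (CRC32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count, key_column):
    """Hash-distribute rows across node_count in-memory 'database servers'."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(str(row[key_column]), node_count)].append(row)
    return nodes


def local_sum(rows, group_column, value_column):
    """Work done on one node: aggregate only the rows stored locally."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)


def control_node_query(nodes, group_column, value_column):
    """Coordinator: fan the query out to every node, then merge partial sums."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(
            pool.map(lambda rows: local_sum(rows, group_column, value_column), nodes)
        )
    merged = defaultdict(float)
    for partial in partials:
        for group, subtotal in partial.items():
            merged[group] += subtotal
    return dict(merged)


if __name__ == "__main__":
    sales = [
        {"store_id": "s1", "prod_type": "grocery", "amount": 10.0},
        {"store_id": "s2", "prod_type": "grocery", "amount": 5.0},
        {"store_id": "s3", "prod_type": "apparel", "amount": 20.0},
        {"store_id": "s4", "prod_type": "apparel", "amount": 1.0},
    ]
    nodes = distribute(sales, node_count=4, key_column="store_id")
    print(control_node_query(nodes, "prod_type", "amount"))
```

Because each node returns one subtotal per group rather than its raw rows, the amount of data sent to the coordinator depends on the number of groups, not on the size of the fact table.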
Parallel Data Warehouse demo at BI conference 2008
• Query
− Cache flushed
− Inner joins
Report
Retailer: day-part analysis
Sales, Time, Date, Prod type
Sample Results
625K rows returned in 11 seconds from 1 trillion row table
Final product will be even faster
Existing Environment
Hardware: 16 CPU HP 8620 Itanium; Hitachi Storage 27TB Raw SATA, 21 LUNs
Software: Windows 2003 SP2; SQL Server 2008; SSIS/SSRS
Data Warehouse: 18 Terabytes; Star Schema; 80 Fact Tables; 500+ Dimensions

Current Challenges | Madison Highlights
Data Load Speeds | Improved by 300%
Analytic Capacity | 30TB/160 Cores
Analytic Speed | Query Speeds 70X Improvement
Mixed Workload Concurrency | Mixed Workload
Total Cost of Ownership | TCO Lowered by 50%
Parallel Data Warehouse
2008
Microsoft announces intention to acquire DATAllegro (July)
Acquisition closes (Sept)
150TB demo of DATAllegro on SQL Server run at BI Conference (Oct)
2009
Project “Madison”
Compatibility with DATAllegro v3
MS BI integration
MTP Program launched
Circa 10 customers provided with early Madison benchmark
Madison named as SQL Server 2008 R2 Parallel Data Warehouse
List price at $57.5K per proc
2010
MTP 2 Program to launch (fully functional, fully performant)
TAP Program (on client site)
RTM in Summer 2010
Beyond
PDW vNext
Focus on continually lowering the costs of high end DW, while increasing performance
Additional hardware partners
Closer functional alignment with SQL Server
Better integration with SQL and tools and technologies
Hub and Spoke – Flexible Business Alignment
Parallel database copy technology enables rapid data movement and consistency between hub and spokes
Support user groups with very different SLAs: Performance, Capacity, Loading, Concurrency
Create SQL Server 2008, Fast Track Data Warehouse, and SQL Server Analysis Services spokes
Departmental data marts enable mixed workloads, but make it difficult to consolidate information across the enterprise
An EDW provides a “single version of truth” but makes it difficult to support mixed workloads and multiple user groups, each requiring SLAs
Hub and Spoke solution gives you the flexibility to add/change diverse workloads/user groups, while maintaining data consistency across the enterprise
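The hub-and-spoke idea can be illustrated with a toy in-memory model: the hub holds the authoritative tables, each spoke subscribes to the subset its user group needs, and a refresh copies hub tables to all spokes in parallel. The Python sketch below is only an illustration under those assumptions; it does not represent the parallel database copy technology itself, and the `Spoke` class and function names are hypothetical.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class Spoke:
    """A departmental spoke that holds copies of the hub tables it subscribes to."""

    def __init__(self, name, subscribed_tables):
        self.name = name
        self.subscribed = set(subscribed_tables)
        self.tables = {}


def copy_to_spoke(hub_tables, spoke):
    """Copy every subscribed table from the hub into the spoke."""
    for table_name in spoke.subscribed:
        spoke.tables[table_name] = copy.deepcopy(hub_tables[table_name])
    return spoke.name


def refresh_spokes(hub_tables, spokes):
    """Refresh all spokes concurrently; returns the names of refreshed spokes."""
    with ThreadPoolExecutor(max_workers=max(1, len(spokes))) as pool:
        return list(pool.map(lambda spoke: copy_to_spoke(hub_tables, spoke), spokes))


def is_consistent(hub_tables, spoke):
    """True when every subscribed table in the spoke matches the hub."""
    return all(
        spoke.tables.get(table_name) == hub_tables[table_name]
        for table_name in spoke.subscribed
    )


if __name__ == "__main__":
    hub = {
        "sales": [{"sku": "A", "qty": 3}],
        "inventory": [{"sku": "A", "on_hand": 40}],
    }
    spokes = [Spoke("finance", ["sales"]), Spoke("supply_chain", ["sales", "inventory"])]
    refresh_spokes(hub, spokes)
    print([(s.name, is_consistent(hub, s)) for s in spokes])
```

Each spoke can then be sized and tuned for its own workload, while the consistency check shows when a spoke has drifted from the hub and needs another refresh.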
DW products positioning
[Chart: products positioned by Scale and Complexity; annotations: HA by default, SW-HW integration]
1 SQL Server 2008 (start here): Minimal HW tune up/optimization. Supports mixed workloads.
2 SQL Server 2008 with Fast Track Reference Architecture: Balanced solution for mostly scan centric workloads.
3 PDW: Max HW tune up for most DW scenarios.
4 PDW with Hub-and-spoke: Most flexible architecture for handling all DW scenarios.
Microsoft Data Warehousing portal
Fast Track
Parallel Data Warehouse
DW Portal
Summary
Microsoft DW offers customers:
• Massive scalability at low cost
• Hardware choice
Fast Track Data Warehouse offers customers:
• Appliance-like ease of deployment, scalability and performance for SMP
Parallel Data Warehouse offers customers:
• Massively parallel scale and performance
• Appliance experience
Hub & Spoke Architecture offers customers:
• Better solution for customers than consolidation
• ‘Best of both worlds’ solution
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions,
it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.