Transcript Slide 1
Scalable Data Management
with DB2
Matthias Nicola
IBM Silicon Valley Lab
[email protected]
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
2
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2 Data Server Editions
DB2 for z/OS
DB2 Enterprise Edition /
IBM InfoSphere Warehouse
DB2
DB2 Workgroup Edition
DB2 Express-C (free!)
DB2 Everyplace
3
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Business Value of Scalability
More historical data = more precise forecasts
– Data mining needs a lot of data for pattern accuracy
– OLAP needs a lot of data for forecast accuracy
Predictable costs when growth occurs
– Often the budget is the controlling factor, not technology
– Low maintenance cost is important
No forced migrations from technology limitations
– Enabling very large databases
4
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2 Scalability for OLTP and Data Warehousing
5
Database Partitioning Feature (DPF)
DB2 pureScale
Range partitioning
Multi-Dimensional Clustering (MDC)
Compression
Self-Tuning Memory Management (STMM)
Automatic Storage
Workload Management
High Availability, Disaster Recovery
Recovery
Security and Compliance
Utilities: Load, Backup & Restore, Redistribute
Archiving
etc.
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
6
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2's Database Partitioning Feature (DPF)
select … from table
Tables
FCM network
Engine
Engine
Engine
Engine
…
data+log
data+log
data+log
data+log
Partition 1
Partition 2
Partition 3
Partition n
Database
Database is divided into multiple database partitions
Database partitions run on same or separate servers (shared-nothing)
Each partition has its own table spaces, log, configuration, etc.
Data is spread over N database partitions
Queries are executed in parallel on all database partitions
7
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Flexible configuration options
Possible hardware configurations
All database partitions on a single machine (logical partitions)
easy exploitation of multi-core systems
All database partitions on separate machines (physical partitions)
Hybrid: multiple machines with several logical partitions on each
FCM (Fast Communication Manager)
DB2
Partition
DB2
Partition
DB2
Partition
SMP server
DB2
Partition
SMP server
DB2
Partition
DB2
Partition
DB2
Partition
SMP server
DB2
Partition
SMP server
I/O Channels
I/O Channels
Storage server
Storage server
Example: 4 physical machines, 2 database partitions per machine
http://publib.boulder.ibm.com/infocenter/db2luw/v9r7/topic/com.ibm.db2.luw.admin.partition.doc/doc/c0004569.html
8
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2's Database Partitioning Feature (DPF)
….
….
9
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
The Distribution Map
• Distribution key can consist of one
or multiple columns.
• Avoid low cardinality columns, such as
"gender", "state", etc.
• Unique indexes must contain all columns
of the distribution key
Distribution key
C1
000120
column name
column value
DB2 hash algorithm
5
Distribution map
i
0
1
2
3
4
5
6
7
…
…
p(i)
1
2
3
4
1
2
3
4
…
…
Partition1
10
Partition2
Partition3
32k
Partition4
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
11
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Single Server
12
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2 Database Partitioning Feature = Divide Work
Database Partition 1
13
Database Partition 2
Database Partition 3
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Range Partitioning Further Reduces I/O
Database Partition 1
Database Partition 2
Database Partition 3
January
February
CREATE TABLE sales (recordID
INT,
salesdate
DATE,
...
details
XML)
DISTRIBUTE BY HASH (recordID)
PARTITION BY RANGE (salesdate) EVERY 1 MONTHS ;
March
14
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Multi-Dimensional Clustering to Further Reduce I/O
Database Partition 1
January
February
Database Partition 2
Database Partition 3
CREATE TABLE sales (recordID
INT,
salesdate
DATE,
productID
INTEGER,
storeID
INTEGER,
...
details
XML)
DISTRIBUTE BY HASH (recordID)
PARTITION BY RANGE (salesdate) EVERY 1 MONTHS
ORGANIZE BY (productID, storeID) ;
March
15
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Compression Reduces I/O by a Factor of 3x to 4x
Database Partition 1
Database Partition 2
Database Partition 3
January
February
March
16
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Data Partitioning and Placement Options
Can distribute a table across some or all database partitions.
Can replicate a table to have an identical copy on each partition.
Database Partitions
Part.
1
Part.
2
Part.
3
Part.
4
Part.
5
Part.
6
Part.
7
Part.
8
Table 1: Sales
Table 2: Customer
Table 3:
Product
19
Table 3:
Product
Table 3:
Product
Table 3:
Product
Table 3:
Product
Table 3:
Product
Table 3:
Product
Table 3:
Product
(copy)
(copy)
(copy)
(copy)
(copy)
(copy)
(copy)
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
20
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Join Processing - Example
create table tab1(pk1 int, c1 int,...)
distribute by hash (pk1);
create table tab2(pk2 int, c2 int,...)
distribute by hash (pk2);
Logical data in the tables:
tab1
pk1 c1
1
3
2
3
3
4
7
7
8 12
11 10
12 15
Physical data distribution:
database
partition 1
tab2
pk2
3
4
5
7
8
10
12
15
c2
2
8
3
4
15
10
12
7
distribute by hash*
tab1
pk1 c1
1
3
3
4
7
7
11 10
tab2
pk2 c2
3
2
5
3
7
4
15
7
database
partition 2
tab1
pk1 c1
2
3
8 12
12 15
tab2
pk2 c2
4
8
8 15
10 10
12 12
*For simplicity, this example hashes odd key values to partition 1 and even key values to partition 2
21
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Collocated Join
create table tab1(pk1 int, c1 int,...) distribute by hash (pk1);
create table tab2(pk2 int, c2 int,...) distribute by hash (pk2);
select * from tab1, tab2 where tab1.pk1 = tab2.pk2;
• Both tables are partitioned
by the join key
• Any join matches are guaranteed to
be within any given partition
("co-located")
• No join matches across partitions
partition 1
tab1
pk1
1
3
7
11
tab2
pk2
3
5
7
15
partition 2
tab1
pk1
2
8
12
tab2
pk2
4
8
10
12
• Allows local joins within each
partition, no data movement
• Best case, best performance
22
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Directed Join
select * from tab1, tab2 where tab1.c1 = tab2.pk2;
permanent storage
partition 1
tab1
pk1 c1
1
3
3
4
7
7
11 10
tab2
pk2
3
5
7
15
on the fly / in memory
partition 2
tab1
pk1 c1
2
3
8 12
12 15
partition 1
tab2
pk2
4
8
10
12
DTQ
tab1'
pk1 c1
1
3
2
3
7
7
12 15
tab2
pk2
3
5
7
15
partition 2
tab1'
pk1 c1
3
4
11 10
8 12
tab2
pk2
4
8
10
12
Send rows from tab1 to those partitions where they can find join matches in tab2,
i.e. redistribution of tab1, based on hashing of the join key c1.
23
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Single Partition Directed Join
select * from tab1, tab2
where tab1.c1 = 3 and tab1.c1 = tab2.pk2;
partition 1
tab1
pk1 c1
1
3
3
4
7
7
11 10
tab2
pk2
3
5
7
15
partition 2
tab1
pk1 c1
2
3
8 12
12 15
partition 1
tab2
pk2
4
8
10
12
DTQ
tab1'
pk1 c1
1
3
2
3
partition 2
tab2'
pk2
3
Value predicates are used to optimize (reduce) the data flow
and eliminate irrelevant partitions from the join processing.
24
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Repartitioned Join
select * from tab1, tab2 where tab1.c1 = tab2.c2;
partition 1
tab1
pk1 c1
1
3
3
4
7
7
11 10
tab2
pk2 c2
3
2
5
3
7
4
15
7
partition 2
tab1
pk1 c1
2
3
8 12
12 15
pk2
4
8
10
12
partition 1
tab2
c2
8
15
10
12
DTQ
tab1'
pk1 c1
1
3
2
3
7
7
12 15
tab2'
pk2 c2
5
3
15
7
4 11
8 15
partition 2
tab1'
pk1 c1
3
4
11 10
8 12
tab2'
pk2 c2
3
2
7
4
10 10
12 12
DTQ
Redistribute both tables by hashing on their join keys so that matching
rows end up on the same partition.
25
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Broadcast Join
select * from tab1, tab2
partition 1
tab1
pk1
1
3
7
11
tab2
pk2
3
5
7
15
partition 2
tab1
pk1
2
8
12
tab2
pk2
4
8
10
12
partition 1
tab1'
BTQ
pk1
1
3
7
11
2
8
12
tab2
pk2
3
5
7
15
partition 2
tab1'
pk1
1
3
7
11
2
8
12
tab2
pk2
4
8
10
12
Broadcast a copy of one table to all database partitions.
26
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Data Placement Option: Replicated Table
permanent storage
partition 1
tab1
pk1
1
3
7
11
2
8
12
tab2
pk2
3
5
7
15
partition 2
tab1(copy)
pk1
1
3
7
11
2
8
12
tab2
pk2
4
8
10
12
Good choice for small tables with infrequent insert/update/delete activity,
such as dimension tables in a star schema.
27
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
28
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Scalability vs. Performance
Performance: Time to complete a given task with given resources
Scalability: Ability to add resources to
– complete the same task more quickly
– handle a bigger task in about the same time
Example: Mowing the lawn…
– Peter does it alone in 8 hours
– Peter and Bob work together and take 4 hours
Scalability is perfect, performance is poor!
– Jim does it alone in 1 hour
– Jim and John together do it in 1hrs20min
Performance is great, scalability is awful !
– Mary mows the lawn in 30 minutes
– Mary and Susan together need 15 minutes
Performance is great, scalability is also great !
29
© 2009 IBM Corporation
Fixed Database Size
Mathematically, these
two approaches are
equivalent….
# of partitions
Make queries against a DB of a fixed
size faster by adding partitions
(“speedup”). Amount of data per
partition shrinks.
Query elapsed time
Query elapsed time
Scalability Metrics
Increasing Database Size
Database size &
# of partitions
Hold response time constant for a
growing database by adding partitions in
proportion (“scaleup”/"scale-out").
Amount of data per partition remains
constant.
Basic assumption: Queries executed against a bigger
database examine more data
30
Our Test Design
Query elapsed time
Increasing database size:
250GB / 500GB / 1TB
Increasing number of database
partitions
Fixed ratio of data volume to
number of partitions
n partitions
250GB
31
n*2 partitions
500GB
n*4 partitions
1 TB
Show constant query elapsed
times to prove scalability
TPoX Benchmark
TPoX = Transaction Processing over XML Data
Open Source Benchmark: http://tpox.sourceforge.net/
Financial transaction processing scenario: “online brokerage”
Realistic test for XML databases
Custacc
Customers
Customer
1
n Account
1
11
n
Order
Brokerage
House
DB
CustAcc.xsd
4 – 20 kb
FIXML
(41 XSD files)
1 – 2 kb
n
Holding
n
1
n
1
Security
Security.xsd
2 – 9 kb
FIXML: Standardized Financial XML Schema for Securities Trading !
32
Document structures and join relationships
ID
Name
DateOfBirth
Address
Phone
…
Account
Order
CustAcc
ID
Currency
OpeningDate
Balance
…
Holding Symbol
Name
Type
Quantity
Holding
ID
OrignDt
TrdDt
Acct
Side
Qty
Sym
…
Symbol
Name
Type
Quantity
33
ID
Currency
OpeningDate
Balance
…
Holding Symbol
Name
Type
Quantity
Sector
Industry
Category
OutstShares
FundInformation
Holding…
Account
ID
Security
Symbol
Name
SecurityType
SecurityInformation
StockInformation
FundFamily
Sector
Industry
AssetGroup
FixedIncome
ExpenseRatio
TotalAssets
MinInitialInvestment
MinSubsequentInvest.
Price/LastTrade
Ask/Bid
50DayAvg
200DayAvg
…
TPoX Data & Schema
Customer
1
n
Account
11
n
Holding
n
1
FIXML: financial
industry XML Schema
CustAcc: modeled after
a real banking system
that uses XML
Security: information
similar to investment
web sites
34
n
Order
CustAcc.xsd
1
n
FIXML
(41 XSD files)
1
Security
Security.xsd
Database schema for a non-DPF DB2 database:
create table custacc ( cadoc XML )
create table security ( sdoc XML )
create table order ( odoc XML )
Scale Factor “M”, 1 TB raw data
500M Order documents, 50M CustAcc documents
20,833 Securities, independent of scale factor
3 Simple Tables + XML Indexes
TPoX Database Schema for DPF
- Extract certain XML element values into relational cols as distribution keys
- Goal: enable partitioning of both tables by a common key
!
ID
Name
DateOfBirth
Address
Phone
…
Account
Order
custid secsym
integer varchar
35
ID
OrignDt
TrdDt
Acct
Side
Qty
Sym
…
odoc
XML
order table (500M rows)
custid
integer
CustAcc
ID
Currency
OpeningDate
Balance
…
Holding Symbol
Name
Type
Quantity
Holding
Holding…
Account
Symbol
Name
Type
Quantity
ID
Currency
OpeningDate
Balance
…
Holding Symbol
Name
Type
Quantity
cdoc
XML
custacc table (50M rows)
What is TPoX-DSS*?
• Decision Support workload on top of the
regular XML data of the TPoX benchmark
• A set of complex SQL/XML queries
– Includes massive table scans, aggregation,
grouping, OLAP functions, etc.
• Focus on single-user query response time
36
* we might come up with a better name in the near future
Business Questions Complex SQL/XML Queries
Q1: Popular Securities
Find securities that have more shares bought than sold across all orders.
List their order quantities grouped by year.
Q2: Top 10 Most Popular Trading Weeks, Ranked by Order Volume
For each year, find the ten most active weeks and return the buy, sell, and
total order volumes for each week.
Q3: Average Account Balance of Premiun Customers
Calculate the average account balance of all premium customers, grouped
by their number of accounts.
Q4: Average Balance per Number Of Accounts
Calculate the average account balance of all customers, grouped by their
number of accounts.
Q5: Percentage of buy orders per sector and gender
For each stock in a given sector of securities, find the percentage of buy
orders placed by male vs. female clients.
37
Business Questions Complex SQL/XML Queries
Q6: Max Stock Orders for an Industry
List the 20% (or: x%) most expensive orders for customer in a given state and
for a given industry (subset of securities).
Q7: Order Amounts for Two Major Currencies
Calculate the min, max and avg order amount for all orders in a given
timeframe grouped by buy/sell for two major currencies.
Q8: Order Amounts for All Currencies
Calculate the min, max and avg order amount for all orders in a given
timeframe grouped by buy/sell and the order’s currency.
Q9: Balance per Currency
Each account is in a specific currency. Calculate the average account balance
for each currency.
Q10: Sleeping Customers
Find all customers having less than x orders in a given timeframe.
38
TPoX DSS: Query Characteristics
Query
Tables
Characteristics
Q1
Popular Securities
O, S
2 x XMLTABLE,
Group By, Order By
Q2
Top 10 Most Popular Trading Weeks
O
Full scan of all orders,
OLAP Function rank()
Q3
Average Account Balance of Premiun
Customers
C
Indexed access to premium
customers, Group By, Order By
Q4
Average Balance per Number Of Accounts
C
Full scan of all customers
Q5
Percentage of buy orders per sector and
gender
C, O, S
Aggregation, SQL OLAP Functions,
3 x XMLTABLE, 2 x XMLEXISTS
Q6
Max Stock Orders for an Industry
C, O, S
2 x XMLTABLE, 2 x XMLEXISTS
Q7
Order Amounts for Two Major Currencies
O
Several predicates, CASE expression
Q8
Order Amounts for All Currencies
O
4 aggregation functions,
Group By two XML attributes
Q9
Balance per Currency
C
Full scan of all accounts, aggregation
and grouping
Q10
Sleeping Customers
C, O
Common table expression
All queries available upon request, in SQL/XML notation.
39
Q5: Percentage of buy orders per sector and gender
SELECT DISTINCT secsector, gender,
SUM(ordqty) OVER (PARTITION BY secsector, gender) AS orderqty,
SUM(ordqty) OVER (PARTITION BY secsector, gender) * 100
/ SUM(ordqty) OVER (PARTITION BY secsector) AS percentage
FROM security, order, custacc,
XMLTABLE(' declare namespace s="http://tpox-benchmark.com/security";
$SDOC/s:Security'
COLUMNS secsector VARCHAR(30) PATH '*:SecurityInformation//*:Sector',
secname VARCHAR(50) PATH '*:Name') AS T1,
XMLTABLE(' declare default element namespace "http://www.fixprotocol.org/FIXML-4-4";
$ODOC/FIXML/Order'
COLUMNS ordqty
BIGINT
PATH '*:OrdQty/@Qty') AS T2,
XMLTABLE(' declare namespace c="http://tpox-benchmark.com/custacc";
$CADOC/c:Customer'
COLUMNS gender
VARCHAR(10) PATH '*:Gender') AS T3
WHERE order.secsym = security.secsym AND
order.custid = custacc.custid
AND
XMLEXISTS(' declare namespace s="http://tpox-benchmark.com/security";
$SDOC/s:Security/s:SecurityInformation/*[s:Industry="OfficeSupplies" and
s:MinInitialInvestment=5000]')
AND
XMLEXISTS(' declare default element namespace "http://www.fixprotocol.org/FIXML-4-4";
$ODOC/FIXML/Order[@Side = "2"]')
ORDER BY secsector, gender;
40
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
41
© 2009 IBM Corporation
Data Partitioning in a Cluster
Each node has 2 Intel Xeon 5169 dual-core CPUs, and 32GB RAM.
4 cores per node we use 4 database partitions per node.
8 processing nodes
Node
1
Node
2
Node
3
Node
4
Node
5
Node
6
8 database
partitions, 250GB
16 database partitions, 500 GB
32 database partitions, 1TB
45
Node
7
Node
8
Scalability Results: Cluster
TPoX/DSS Query Response Times (Cluster)
Elapsed time (seconds)
250GB / 8 partitions
500GB / 16 partitions
1TB / 32 partitions
Q1
Q2
Q3
Q4
Source: IBM internally measured results, September 2009
Q5
Q6
Q7
Q8
Q9
Q10
Query
Query response times for 500GB and 1TB are close to the 250GB results!
47
Information Management – DB2
Information Management Software
Agenda
Introduction
DB2 Scalability for OLTP and Data Warehousing
DB2's Database Partitioning Feature (DPF)
– Overview
– Data partitioning, clustering, placement
– Join Methods
TPoX Scalability in a DPF database
– Scalability vs. Performance
– Benchmark configuration & results
pureScale Overview
Summary
48
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2 pureScale
Goals
Unlimited Capacity
Any transaction processing or ERP workload
Start small
Grow easily, with your business
Application Transparency
Avoid the risk and cost of tuning your applications to the database topology
Continuous Availability
Maintain service across planned and unplanned events
Webcast: http://www.channeldb2.com/video/db2-purescale-a-technology
Web site: http://www.ibm.com/software/data/db2/linux-unix-windows/editions-features-purescale.html
49
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
DB2 pureScale : Technology Overview
Clients connect anywhere,…
… see single database
Clients
Clients connect into any member
Automatic load balancing and client reroute may change
underlying physical member to which client is connected
Single Database View
DB2 engine runs on several host computers
Member
CS
Member
CS
Integrated cluster services
Member
Member
CS
Co-operate with each other to provide coherent access to the
database from any member
CS
Failure detection, recovery automation, cluster file system
In partnership with STG (GPFS,RSCT) and Tivoli (SA MP)
Low latency, high speed interconnect
Cluster Interconnect
Special optimizations provide significant advantages on RDMAcapable interconnects (eg. Infiniband)
PowerHA pureScale technology
CS
2nd-ary
CS
Log
Log
Log
Shared Storage Access
Database
50
Log
Primary
Efficient global locking and buffer management
Synchronous duplexing to secondary ensures availability
Data sharing architecture
Shared access to database
Members write to their own logs
Logs accessible from another host (used during recovery)
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Scale with Ease
Without changing
applications
Efficient coherency protocols
designed to scale without
application change
Applications automatically and
transparently workload balanced
across members
Without administrative
complexity
Single Database View
DB2
DB2
DB2
DB2
DB2
No data redistribution required
Log
Log
Log
Log
Log
To 128 members in initial
release
Limited by testing resources
51
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
What is a PowerHA pureScale ?
Software technology that assists
in global buffer coherency
management and global locking
db2 agents & other
threads
Derived from System z Parallel Sysplex &
Coupling Facility technology
Software based
Services provided include
db2 agents & other
threads
log buffer,
dbheap, &
other heaps
bufferpool(s)
log buffer,
dbheap, &
other heaps
bufferpool(s)
Group Bufferpool (GBP)
Global Lock Management (GLM)
Shared Communication Area (SCA)
Primary
Log
Log
Members duplex GBP, GLM,
SCA state to both a primary and
secondary
GBP GLM SCA
Secondary
Done synchronously
Duplexing is optional (but recommended)
Set up automatically, by default
Shared database
(Single database partition)
54
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
The Role of the GBP
GBP acts as fast disk cache
Client A :
Select from T1
Client C :
where C2=Y
Select from T1
where C2=Y
Client B :
Update T1 set C1=X
where C2=Y
Commit
Member 1
Member 2
Dirty pages stored in GBP, then later,
written to disk
Provides fast retrieval of such pages when
needed by other members
bufferpool(s)
bufferpool(s)
GBP includes a “Page Registry”
Keeps track of what pages are buffered in
each member and at what memory
address
Used for fast invalidation of such pages
when they are written to the GBP
Force-at-Commit (FAC) protocol
ensures coherent access to
data across members
DB2 “forces” (writes) updated pages to
GBP at COMMIT (or before)
GBP synchronously invalidates any copies
of such pages on other members
55
New references to the page on other
members will retrieve new copy from GBP
In-progress references to page can continue
GBP
GLM
SCA
Page
Registry
M1 M2
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Stealth System Maintenance
Goal: allow DBAs to apply
system maintenance without
negotiating an outage window
Procedure:
Single Database View
DB2
DB2
DB2
DB2
1. Drain (aka Quiesce)
2. Remove & Maintain
3. Re-integrate
4. Repeat until done
57
Log
Log
Log
Log
Enables continuous availability
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Achieving Efficient Scaling : Key Design Points
Deep RDMA exploitation over
low latency fabric
Enables round-trip response time
~10-15 microseconds
Lock Mgr
Lock Mgr
Lock Mgr
Lock Mgr
Buffer Mgr
Silent Invalidation
Informs members of page updates
requires no CPU cycles on those
members
No interrupt or other message
processing required
Increasingly important as cluster grows
GBP
GLM
SCA
Hot pages available without
disk I/O from GBP memory
RDMA and dedicated threads enable
read page operations in
~10s of microseconds
58
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
of Transaction Throughput
61
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Questions / Discussion
[email protected]
63
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Backup
Slides
64
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Features to Minimize Planned Outages
Backup: Fast, scalable, granular
–
–
–
–
–
–
–
Online or offline
Fully parallel and scalable
Can be throttled
Partition-level backup
Table space-level backup
Full, Incremental, or Delta
Volume snapshot support
Load: Fast, scalable and granular
–
–
–
–
Fully parallel and scalable
Partition-level
Online load
Online index rebuild
Automatic log management
65
Other utilities
– Online statistics collection
– Online index create and
reorganization
– Online reorganization
– Online inspect
Dynamic operations
– Configuration parameters
– Buffer pool operations
– Container operations
Space management
– Online container management
– Automatic storage
– Online index reorganization
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Features to Minimize Unplanned Outages
Hardware failures
– Integration with TSA cluster
manager
– Built-in redundancy can't be turned
off
– Consistency bits
– Log mirroring
– Automatic mirroring of critical data
files
– Support for RAID
Fast recovery
–
–
–
–
–
66
High availability
– Clustering / failover support
– Integrated with TSM
– Automatic client reroute
Human and Application Errors
– Point-in-Time (POT) recovery
– Drop table recovery
Miscellaneous
– Infinite active logging
– Online container operations
Continuous check pointing
Parallel recovery
Automatic recovery tuning
Filtered recovery
Dynamic debugging capability
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
OLAP Optimization Advisor
InfoSphere Warehouse will
design the aggregates to
support dimensional
analysis for you using:
– Hybrid line
– Statistics
– Meta-data that describes the
cubes
• Hierarchies, dimensions,
measures, etc.
– Optimizes to understand
impact to load times and
performance trade-off
67
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
Universal Cubing Services Access
Portals, Web Applications, Dashboards, Interactive Reports,
Ad Hoc Analysis, Common Desktop Tools
IBM Cognos 8 BI
IBM DataQuant
& DB2 QMF
Microsoft Excel
Cubeware Cockpit
Universal Cube Access
(ODBO, XMLA)
InfoSphere Warehouse
68
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
InfoSphere Warehouse Data Mining
Data Mining Embedded into Applications and Processes
SOA Processes
BI Analytical Tools
Web Analytical Apps
Mining Visualizer
SQL Interface
DB2 InfoSphere Warehouse
• Enterprise-Level Data
Mining
• High-Speed, In-Database
Scoring
SQL
Structured &
Unstructured
Data
69
Model
Results
Modeling
In-Database
Data Mining
Scoring
Functions
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
InfoSphere Warehouse Text Analytics
Analyze and extract structured data from text
– Makes data available to normal reporting and analysis tools
– From customer call center records, claim forms, etc.
Benefits
– Target specific information hidden within text
– Competitive edge by driving further business insight
– Drives a greater ROI for your applications
Business value examples
–
–
–
–
–
70
Better product categorization
Early warning on customer attrition
Fraud detection
Product defect analysis
Better customer profiling
Simple text analysis capabilities for text
columns stored in warehouse tables
Pattern matching rules and simple linguistics
Enhance existing reports and data mining
with insights gleaned from text
Simple rules and dictionary editor
© 2009 IBM Corporation
Information Management – DB2
Information Management Software
InfoSphere Warehouse Design Studio
Leverage and extend InfoSphere Data Architect:
– Design and modify database physical models (schema & storage design, etc)
– Design and model OLAP objects
– Design and model warehouse transformation and mining flows
Key Features:
Database design, or reverse engineer an
existing database or DDL (RDA)
–
View/Modify the schema
–
Compare/Sync DB objects
–
Analyze design (best practices and
dependencies), Validation
–
DB2 Storage Modeling: Table Space,
Buffer Pool, Partition
Generate script & Deploy: on data
models, and flow models
Impact Analysis: on data models and flow
models
71
© 2009 IBM Corporation71
What’s new in TPoX 2.0
TPoX 2.0 includes pervasive change to the benchmark
TPoX 2.0 test results not comparable to previous versions of TPoX
Data Generator
Workload and WorkloadDriver
TPoX V1.3 and Earlier
TPoX 2.0
TPoX V1.3 and Earlier
TPoX 2.0
Based on Toxgene
A single java based program
3rd party tool, lack of support
Complete rewrite
Workload description file in
proprietary format, hard to read
Workload description file in
XML format, easy to read and
create
Slow (> 5 days for 1TB data)
Fast (6 hours for 1TB data)
Can’t generate dense account IDs
for CUSTACC
Account IDs are now dense
WorkloadDriver reads input
documents from large amount
of small files
Large amount of small XML files
Small amount of larger files,
each contains 50K XML
documents
WorkloadDriver reads input
documents from smaller
amount of larger files,
improved performance for
reading XML input
documents
Update transaction U1, U5 and
U6 select account for update
based on customer ID
Update transaction U1, U5
and U6 select account for
update based on account ID
Data Distribution
TPoX V1.3
and Earlier
TPoX 2.0
# of CUSTACC vs # of ORDER
1:5
1:10
XML document size range
1-20KB
1-23KB
ACCOUNT IDs of customer
Not dense
Dense
Total XML document size of
“100GB” scale
Slightly less
than 100GB
Slight larger
than 100GB
avg # of accounts per customer
1.5
2.0
Changes have improved performance of
generating and consuming TPoX XML data
in large scale TPoX benchmarks !
NOTE: please refer to TPoX V2.0 Release Note at http://sourceforge.net/projects/tpox for more detail
72
More information on XML
data management in
DB2 for Linux, UNIX, Windows
and
DB2 for z/OS
http://tinyurl.com/pureXML
73