DB2 pureScale Overview and Technology Deep Dive

<<Speaker Name Here>>
<<Speaker Title Here>>
<<For questions about this presentation contact Kelly Schlamb ([email protected])>>
April 6, 2016
Critical IT Applications Need Reliability and Scalability
 Local Databases are Becoming Global
– Successful global businesses must deal with exploding data and server needs
– Competitive IT organizations need to handle rapid change
Clients need a highly scalable, flexible solution for the growth of their information with the ability to easily grow existing applications
 Down-time is Not Acceptable
– Any outage means lost revenue and permanent customer loss
– Today’s distributed systems need reliability
Introducing DB2 pureScale
 Extreme capacity
– Buy only what you need, add capacity as your needs grow
 Application transparency
– Avoid the risk and cost of application changes
 Continuous availability
– Deliver uninterrupted access to your data with consistent performance
Learning from the undisputed Gold Standard... System z
DB2 pureScale Architecture
Leveraging IBM’s System z Sysplex Experience and Know-How
• Multiple DB2 members for scalable and available database environment
• Client application connects into any DB2 member to execute transactions
– Automatic workload balancing
• Shared storage for database data and transaction logs
• Cluster caching facilities (CF) provide centralized global locking and page cache management for highest levels of availability and scalability
– Duplexed, for no single point of failure
• High speed, low latency interconnect for efficient and scalable communication between members and CFs
• DB2 Cluster Services (CS) provides integrated failure detection, recovery automation and the clustered file system
[Diagram: clients connect to any of several members over a cluster interconnect; a primary and secondary CF coordinate the cluster; the database and each member's logs reside on shared storage – together forming the DB2 pureScale cluster (instance)]
DB2 vs. DPF (Shared Nothing) vs. pureScale (Shared Data)
[Diagram: three architectures side by side]
• Core DB2: a single DB2 server with its own log and database
• DB2 with Database Partitioning Feature (shared nothing): a single database view, with each query split into parts (SQL 1 → SQL 1', 1'', 1''') executed across partitions (Part 1, 2, 3), each with its own log and data – ideal for warehousing and OLAP scale out and massively parallel query processing
• DB2 pureScale data sharing (shared data): a single database view, with whole transactions (Tran 1, 2, 3) running on any member against shared data, each member with its own log – ideal for active/active OLTP/ERP scale out
Comparing pureScale with Other DB2 HA Options
Integrated Clustering
• Active/passive
• Hot/cold, with failover typically in minutes
• Easy to set up
• DB2 ships with integrated TSA failover software
• No additional licensing required

HADR
• Active/passive or active/active (with Reads on Standby)
• Hot/warm or hot/hot (with RoS), with failover typically less than one minute
• Easy to set up
• DB2 ships with integrated TSA failover software
• Minimal licensing (full licensing required if standby is active)
• Perform system and database updates without interruption

pureScale
• Active/active
• Hot/hot, with automatic and online failover
• Integrated solution includes CFs, clustering, and shared data access
• Included as part of DB2 "Advanced" editions
• Perform system and database updates in rolling online fashion
Machine Deployment Examples
 Highly flexible topologies due to logical nature of member and CF
– A member and CF can share the same machine
– For AIX, separate members and CFs in different LPARs
– Virtualized environments via VMware and KVM
 Dedicated cores for CFs
– Optimizes response time
 No pureScale licenses required for CF hosts
– You only need to license the CPUs for hosts on which members are running
 pureScale included in
– DB2 Advanced Workgroup Server Edition
– DB2 Advanced Enterprise Server Edition
– DB2 Developer Edition
– DB2 Business Application Continuity Offering
[Diagrams: example topologies – two machines (each hosting a member plus a CF), four machines (four members plus primary and secondary CFs), and ten machines (eight members plus primary and secondary CFs)]
pureScale Client Configuration
 Workload Balancing (WLB)
– Application requests balanced across all members or subsets of members
– Takes server load of members into consideration
– Connection-level or transaction-level balancing
– Enabled on the client side via the db2dsdriver.cfg file (see the sketch below)
 Client Affinity
– Direct different groups of clients or workloads to specific members in the cluster
– Consolidate separate workloads/applications on same database infrastructure
– Define list of members for failover purposes
 Automatic Client Reroute (ACR)
– Client automatically connected to healthy member in case of member failure
– May be seamless in that no error messages returned to client
– Application may have to re-execute the transaction
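A minimal client configuration sketch for WLB and ACR in db2dsdriver.cfg follows; the database alias, host name, and port are hypothetical, and parameter defaults vary by client driver level, so treat this as illustrative rather than definitive:

   <configuration>
      <dsncollection>
         <dsn alias="PSDB" name="PSDB" host="member0.example.com" port="50000"/>
      </dsncollection>
      <databases>
         <database name="PSDB" host="member0.example.com" port="50000">
            <!-- workload balancing across the cluster -->
            <wlb>
               <parameter name="enableWLB" value="true"/>
               <parameter name="maxTransports" value="50"/>
            </wlb>
            <!-- automatic client reroute; seamless mode retries without surfacing errors -->
            <acr>
               <parameter name="enableACR" value="true"/>
               <parameter name="enableSeamlessACR" value="true"/>
            </acr>
         </database>
      </databases>
   </configuration>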
Online Recovery from Failures
 DB2 pureScale design point is to maximize availability during failure recovery processing
 When a database member fails, only in-flight data remains locked until member recovery completes
– In-flight = data being updated on the failed member at the time it failed
 Target time to availability of rows associated with in-flight updates on failed member is measured in seconds
[Chart: % of data available over time during a database member failure – only data with in-flight updates is locked during recovery, and availability returns to 100% in roughly seconds]
"We pulled cards, we powered off systems, we uninstalled devices, we did everything we could do to make the cluster go out of service, and we couldn't make it happen."
-- Robert M. Collins Jr. (Kent), Database Engineer, BNSF Railway Inc.
Scale with Ease
 Scale up or out… without changing your applications
– Efficient coherency protocols designed to scale without application changes
– Applications automatically and transparently workload balanced across members
– Up to 128 members
 Without impacting availability
– Members can be added while cluster remains online
 Without administrative complexity
– No data redistribution required
[Diagram: a new member, with its own log, is added online to a cluster of members and duplexed CFs]
"DB2 pureScale is the only solution we found that provided near linear scalability... It scales 100 percent, which means when I add servers and resources to the cluster, I get 100 percent of the benefit. Before, we had to 'oversize' our servers, and used only 50 - 60 percent of the available capacity so we could scale them when we needed."
-- Robert M. Collins Jr. (Kent), Database Engineer, BNSF Railway Inc.
DB2 pureScale Daily Licensing
[Chart: monthly workload CPU demand (January through December) plotted against licensing need – with daily licensing you license to actual daily demand rather than peak capacity, and the gap between the two is license cost savings]
Online System and Database Maintenance
 Transparently perform maintenance to the cluster in an online rolling fashion
– DB2 pureScale fix packs
– System updates such as operating system fixes, firmware updates, etc.
 No outage experienced by applications
 DB2 fix pack install involves a single installFixPack command to be run on each member/CF
– Quiesces member
• Existing transactions allowed to finish
• New transactions sent to other members
– Installs binaries
– Updates instance
• Member still behaves as if running on previous fix pack level
– Unquiesces member
 Followed up by final cluster-wide installFixPack command to complete and commit updates across cluster
– Instance now running at new fix pack level
Rolling Database Fix Pack Updates (cont.)
Transactions are routed away from the member undergoing maintenance, so no application outages are experienced; workload balancing brings work back after maintenance is finished. The cluster is effectively running at the GA level until the commit is performed, even as each member and CF moves from the GA to the FP1 code level:
1 > installFixPack -online (member 1)
2 > installFixPack -online (member 2)
3 > installFixPack -online (member 3)
4 > installFixPack -online (primary CF)
5 > installFixPack -online (secondary CF)
6 > installFixPack -check_commit
7 > installFixPack -commit_level
Simplified installFixPack commands are shown here for example purposes; a consolidated sketch follows below.
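Putting the sequence together, a hedged sketch of a full rolling update as run from a shell (only the options shown on this slide are used; real invocations take additional options such as install paths, so consult the fix pack readme):

   # Step 1: on each member host, then on each CF host, one at a time:
   ./installFixPack -online        # quiesces, installs binaries, updates instance, unquiesces
   # Step 2: once every member and CF has been updated, from one host:
   ./installFixPack -check_commit  # verify the cluster is ready to commit
   ./installFixPack -commit_level  # commit; instance now runs at the new fix pack level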
Continuous Availability During Maintenance and System Growth
[Chart: total transactions per second over a 4-hour run – throughput holds steady while the secondary CF, primary CF, and members 1, 2 & 3 are updated in rolling fashion, then rises when a 4th member is started and more app clients are added]
Database servers
• SUSE Linux Enterprise Server 11 SP1
• 6 IBM x3950 X5 (Intel Xeon X7560 @ 2.27 GHz, 4s/32c/64t)
• Mellanox ConnectX-2 IB card
• 128 GB system memory
Storage server
• 1 IBM Storwize V7000
• 8 SSD drives (2 TB usable capacity): 4 for data, 4 for logs
DB2 pureScale Supported Hardware and OS
• Servers: IBM POWER6, POWER7/7+, or POWER8; IBM Flex System; BladeCenter HS22/HS23; or x86 Intel-compatible servers
• Interconnect: TCP/IP sockets, or high speed, low latency RDMA-based interconnect (InfiniBand, 10 GE RoCE)
• Storage: GPFS-compatible storage (ideally storage that supports SCSI-3 PR fast I/O fencing)
Cluster Interconnect Options
 RDMA-capable interconnect for best performance and scalability
– Requires specialized network adapter cards
• InfiniBand
• 10 Gigabit Ethernet RoCE (RDMA over Converged Ethernet)
 TCP/IP sockets interconnect for faster cluster setup and lower cost deployments using commodity network hardware
– 10 Gigabit Ethernet (10GE) strongly recommended for production installations
– Appropriate for smaller clusters with moderate data sharing workloads where availability is the primary motivator for pureScale
 No compromise in availability as both options provide exactly the same levels of high availability
 Choice of interconnect based on your performance and scalability requirements
Relative Performance Between RDMA and TCP/IP Sockets
[Charts: relative number of transactions per second for 1, 2, 3, and 4 members, comparing InfiniBand (RDMA) with sockets (TCP/IP over Ethernet), for a transactional workload with 90% reads / 10% writes and one with 70% reads / 30% writes]
Configuration:
• 1-4 members, 2 CFs
• Intel x86 servers, 32 logical CPUs per server
• Single adapter per server
• IBM DS3000 storage
Virtualized Deployments of DB2 pureScale
 Virtualized environment options include
– RDMA-capable interconnect
• AIX LPARs, with dedicated RDMA network adapters per partition
• KVM with RHEL, with dedicated 10 GE RoCE network adapters per partition
– TCP/IP sockets interconnect
• AIX LPARs
• VMware (ESXi, vSphere) with RHEL or SLES
• KVM with RHEL
 Virtualized environments provide a lower cost of entry and are perfect for
– Development
– QA and testing
– Production environments with moderate workloads
– Getting hands-on experience with pureScale
Virtualized Deployments: Supported Configurations
Operating System   Virtualization Technology        InfiniBand   10GE RoCE   TCP/IP Sockets
AIX, SLES, RHEL    No virtualization (bare metal)   Yes *        Yes *       Yes
AIX                PowerVM (LPARs)                  Yes *        Yes *       Yes
SLES               VMware                           No           No          Yes
RHEL               VMware                           No           No          Yes
RHEL               KVM                              No           Yes *       Yes

* Dedicated interconnect adapter(s) per host/partition
 VMware supported with
– Any x64 system that is supported by both the VM and DB2 pureScale
– Any Linux distribution that is supported by both the VM and DB2 pureScale
 KVM supported with
– Any x64 system that is supported by both RHEL 6.2 and DB2 pureScale
– RHEL 6.2 and higher
DB2 pureScale Supported Storage
 Full explanation in "Shared Storage Considerations" section of the Information Center
– http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.qb.server.doc/doc/c0059360.html
 pureScale supports all storage area network (SAN) and directly attached shared block storage, referenced as a logical unit number (LUN)
 pureScale exploits two specific features in storage, when available, for best results with recovery
– Fast I/O fencing
– Tie-breaker support
• Storage that supports both features is considered "category 1"
Fast I/O Fencing
 SCSI-3 Persistent Reserve (PR) provides fast I/O fencing for fast recovery times
– Fencing in as little as 1 - 2 seconds, allowing for host failure detection and I/O fencing in as little as 3 seconds
 Guarantees protection of shared data in the event that one or more errant hosts splits from the network
 Substantially more robust than technology used by others (self-initiated reboot-based algorithms or STONITH)
 Allows re-integration of a split host into the cluster when the network heals, without requiring a reboot
Storage Categories
 Storage controller and multipath I/O driver combinations are divided into three categories
– Category 1
• Verified with pureScale to support both fast I/O fencing and act as a tie-breaker
– Category 2
• Verified with pureScale to support tie-breaker but not fast I/O fencing
– Category 3
• Has not been verified with pureScale to support tie-breaker, fast I/O fencing, or both
• The storage controller or multipath I/O driver may not support them – OR – it may support them but we have just not been able to validate it yet
 IBM is working closely with many storage vendors to move them up to category 1
Current Category 1 Storage List
Includes fast I/O fencing and tiebreaker support.
See "Shared Storage Considerations" section of the Information Center for the latest information:
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.qb.server.doc/doc/c0059360.html
What is a DB2 Member?
 A DB2 engine address space
– i.e. a db2sysc process and its threads
 Members share data
– All members access the same shared database
– Also known as "data sharing"
 Each member has its own
– Buffer pools
– Memory regions
– Log files
 Members are logical; a machine or LPAR can host
– 1 member (recommended)
– >1 member (not recommended)
[Diagram: members 0 and 1, each a db2sysc process with db2 agents and other threads, its own bufferpool(s), log buffer, dbheap and other heaps, and its own logs, all accessing a shared database (single database partition)]
What is a Cluster Caching Facility (CF)?
 Software technology that assists in global buffer coherency management and global locking
– Shared lineage with System z Parallel Sysplex
– Software based
 Services provided include
– Group Bufferpool (GBP)
– Global Lock Management (GLM)
– Shared Communication Area (SCA)
 Members duplex GBP, GLM, SCA state to both a primary and secondary
– Done synchronously
– Having a secondary is optional (but recommended)
– Set up automatically, by default
[Diagram: members duplex GBP, GLM, and SCA state to a primary and a secondary CF; all members access a shared database (single database partition)]
CF Self-Tuning Memory
 CF memory is optimally distributed between consumers based on workload
– Less administrative overhead for DBA, with reduction in memory monitoring and management
 Can function at two levels
– Dynamic distribution of CF memory between multiple databases in an instance
– Dynamic distribution of a database's CF memory between its consumers
• Group buffer pool (GBP)
• Global lock manager (GLM)
• Shared communication area (SCA)
 Beneficial for multi-tenant environments where multiple databases are consolidated within the same DB2 pureScale cluster (a configuration sketch follows below)
[Diagram: CF memory divided between DB #1 and DB #2, each with its own GBP, GLM, and SCA allocations]
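As a hedged sketch of how this is switched on (the database name PSDB is hypothetical; per the roadmap at the end of this deck, automatic CF memory arrived in DB2 10.5 FP5, so verify these CF memory parameters against your level):

   db2 update dbm cfg using CF_MEM_SZ AUTOMATIC
   db2 update db cfg for PSDB using CF_DB_MEM_SZ AUTOMATIC CF_GBP_SZ AUTOMATIC CF_LOCK_SZ AUTOMATIC CF_SCA_SZ AUTOMATIC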
Achieving Efficient Scaling – Key Design Points
 Deep RDMA exploitation over low latency fabric
– Enables round-trip response time of ~10-15 microseconds
 Silent invalidation
– Informs members of page updates
– Requires no CPU cycles on those members
– No interrupt or other message processing required
– Increasingly important as cluster grows
 Hot pages available without disk I/O from GBP memory
– RDMA and dedicated threads enable read page operations in ~10s of microseconds
[Diagram: each member's buffer manager and lock manager communicate directly with the CF's GBP, GLM, and SCA]
Scalability Demonstration
[Chart: throughput relative to 1 member – 1.98x @ 2 members, 3.9x @ 4 members, 7.6x @ 8 members, and 10.4x @ 12 members]
Configuration:
• OLTP 80/20 R/W workload, no affinity
• 12 8-core p550 members, 64 GB, 5 GHz each
• Duplexed CFs on 2 additional 8-core p550s, 64 GB, 5 GHz each
• 20Gb/s IB HCAs, 7874-024 IB switch
• DS8300 storage, 576 15K disks, two 4Gb FC switches
Top SAP Certified Transaction Banking (TRBK) Benchmark Result with DB2 pureScale
 Benchmark reflects the typical day-to-day operations of a retail bank
 Day processing:
– 90 million accounts and 1.8 billion postings
– Over 56 million postings per hour
 Night processing:
– Over 22 million accounts balanced per hour
 Benchmark configuration
– Five x3690 X5 servers
– Total database size: 9 TB uncompressed, 3.5 TB compressed
http://www.sap.com/solutions/benchmark/trbk3_results.htm
DB2 pureScale and SAP TRBK: Near Linear Scalability From One To Four Members
[Chart: postings per hour (millions) as a function of number of members – roughly 2x at two members, 3x at three members, and 3.9x at four members]
Note: This data was run separately and is not officially certified as part of the benchmark result
Member Hardware Failure: Member Restart on Guest Host
 Power cord tripped over accidentally
 DB2 Cluster Services loses heartbeat and declares member down
– Informs other members and CF servers
– Fences member from logs and data
– Initiates automated member restart on another ("guest") host
> Using reduced, pre-allocated memory model
– Member restart is like a database crash recovery in a single system database, but is much faster
• Redo limited to in-flight transactions (due to FAC)
• Benefits from page cache in CF
 In the meantime, client connections are automatically re-routed to healthy members
– Based on least load (by default), or
– Pre-designated failover member
 Other members remain fully available throughout – "online failover"
– Primary CF retains update locks held by the member at the time of failure
– Other members can continue to read and update data not locked for write access by failed member
 Member restart completes
– Retained locks released and all data fully available
Automatic. Ultra fast. Online.
[Diagram: clients see a single database view; updated pages and global locks are held in the primary and secondary CFs over shared data while the failed member restarts]
Almost all data remains available. Affected connections transparently re-routed to other members.
Member Failback
 Power restored and system re-booted
 DB2 Cluster Services automatically detects system availability
– Informs other members and CFs
– Removes fence
– Brings up member on home host
 Client connections automatically rerouted back to member
[Diagram: clients see a single database view; the recovered member rejoins the cluster alongside the other members, the primary and secondary CFs, and shared data]
Primary CF Hardware Failure
 Power cord tripped over accidentally
 DB2 Cluster Services loses heartbeat and declares primary down
– Informs members and secondary
– CF service momentarily blocked
– All other database activity proceeds normally
• E.g. accessing pages in bufferpool, existing locks, sorting, aggregation, etc.
 Members send missing data to secondary
– E.g. read locks, page registrations
 Secondary becomes primary
– CF service continues where it left off
– No errors are returned to DB2 members
Automatic. Ultra fast. Online.
[Diagram: the secondary CF, holding duplexed updated pages and global locks, takes over as primary]
All data remains available. Completely transparent to members and transactions.
CF Failback
 Power restored and system re-booted
 DB2 Cluster Services automatically detects system availability
– Informs members and primary
 New system assumes secondary role in 'catchup' state
– Members resume duplexing
– Members asynchronously send lock and other state information to secondary
– Members asynchronously castout pages from primary to disk
[Diagram: the surviving CF continues as primary (peer state) while the restored CF rejoins as secondary (catchup state), rebuilding its copy of updated pages and global locks]
Single Failure Scenarios
[Table: single failure scenarios – failure of one member, the primary CF, or the secondary CF; in each case the other members remain online and recovery is automatic and transparent, with connections to a failed member transparently moving to another member]
Examples of Simultaneous Failures
[Table: examples of simultaneous failures, such as a member failing together with the primary or secondary CF, or multiple members failing at once – in each case the other members remain online, and connections to a failed member transparently move to another member]
DB2 pureScale is Easy to Deploy
 Single installation for all components
 Single installation for fix packs and updates
 Simple command to add and remove members
 Monitoring integrated into Optim tools
pureScale is DB2
 A pureScale environment looks and feels very much like a "regular" DB2 environment
– Same code base shared by DB2, DPF, and pureScale
– In DB2 10.1 and 10.5, pureScale is just an installable feature of DB2
 Immediate productivity from DBAs and application developers
– Single system view for utilities
• Act and behave exactly like they do in non-pureScale
• Backup, restore, rollforward, reorg, load, … (see the sketch below)
– Applications don't need to know or care that there are multiple members
• In general, can run SQL statements or commands on any member
– SQL, data access methods, and isolation levels are the same
– Backup/recovery processes are the same
– Security is managed in the same way
– Environment (even the CFs) still managed by database manager and database configuration parameters
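As a concrete illustration of that single system view, a standard online backup is issued exactly as on a non-pureScale system and can be run from any member (the database name and target path here are hypothetical):

   db2 backup db PSDB online to /backups
   db2 rollforward db PSDB to end of logs and complete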
Installation and Adding Capacity
 Complete pre-requisite work
– OS installed, hosts on the network, access to shared disks enabled
 Initial installation
– Copies the DB2 pureScale image to the Install Initiating Host
– Installs the code on the specified hosts using a response file
– Creates the instance, members, and primary and secondary CFs as directed
– Adds members, primary and secondary CFs, hosts, HCA cards, etc. to the domain resources
– Creates the cluster file system and sets up each member's access to it
 Add a member online
1. Complete pre-requisite work (OS installed, on the network, access to shared disks)
2. Add the member (example below):
db2iupdt -add -m <MemHostName> -mnet <MemInterconnectName> InstName
3. DB2 does all tasks to add the member to the cluster
– Copies the image and response file to new host (scp)
– Runs install
– Adds new host to the resources for the instance
– Sets up access to the cluster file system for host
 You can also:
– Drop member
– Add / drop CFs
[Diagram: the Install Initiating Host (host0) copies the DB2 pureScale image locally, then installs members 0-3 plus cluster services (CS) on host0-host3 and the primary and secondary CFs on host4-host5; adding member 4 copies the image and response file to host6]
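Filling in the placeholders of the command above, a hypothetical invocation (the host name, interconnect name, and instance name are made up for illustration):

   db2iupdt -add -m host6 -mnet host6-ib0 db2sdin1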
Instance and Host Status
> db2start
08/24/2008 00:52:59     0   0   SQL1063N  DB2START processing was successful.
08/24/2008 00:53:00     1   0   SQL1063N  DB2START processing was successful.
08/24/2008 00:53:01     2   0   SQL1063N  DB2START processing was successful.
08/24/2008 00:53:01     3   0   SQL1063N  DB2START processing was successful.
SQL1063N DB2START processing was successful.

db2nodes.cfg:
0 host0 0 - MEMBER
1 host1 0 - MEMBER
2 host2 0 - MEMBER
3 host3 0 - MEMBER
4 host4 0 - CF
5 host5 0 - CF

> db2instance -list

Instance status:
ID   TYPE     STATE     HOME_HOST   CURRENT_HOST   ALERT
0    MEMBER   STARTED   host0       host0          NO
1    MEMBER   STARTED   host1       host1          NO
2    MEMBER   STARTED   host2       host2          NO
3    MEMBER   STARTED   host3       host3          NO
4    CF       PRIMARY   host4       host4          NO
5    CF       PEER      host5       host5          NO

Host status:
HOST_NAME   STATE    INSTANCE_STOPPED   ALERT
host0       ACTIVE   NO                 NO
host1       ACTIVE   NO                 NO
host2       ACTIVE   NO                 NO
host3       ACTIVE   NO                 NO
host4       ACTIVE   NO                 NO
host5       ACTIVE   NO                 NO

[Diagram: clients connect through a single database view to DB2 members on host0-host3, with CFs on host4-host5 and shared data]
Workload Balancing
 Run-time load information used to automatically balance load across the cluster
– Shares design with System z Sysplex
– Load information of all members kept on each member
– Piggy-backed to clients regularly
– Used to route next connection (or optionally next transaction) to least loaded member
– Routing occurs automatically (transparent to application)
 Failover
– Load of failed member evenly distributed to surviving members automatically
 Fallback
– Once the failed member is back online, fallback does the reverse
Workload Balancing Across Member Subsets
 Workload balancing can be configured to take place across a subset of members, which enables
– Isolation of batch from transactional workloads within a single database (see the sketch below)
– Workloads for multiple databases in a single instance isolated from each other
[Diagram: a mix of OLTP and batch spread across members 0-4 versus OLTP directed to members 0-2 and batch to members 3-4 – an example of isolating a batch workload from a transactional workload]
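A hedged sketch of defining a batch subset like the one above, using the WLM_CREATE_MEMBER_SUBSET procedure (the subset name, database alias, and member numbers are hypothetical; clients that connect through the alias are then routed only to those members):

   CALL SYSPROC.WLM_CREATE_MEMBER_SUBSET(
      'BATCH_SUBSET',
      '<databaseAlias>BATCHDB</databaseAlias><member>3</member><member>4</member>')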
Optional Affinity-Based Routing
 Allows you to target different groups of clients or workloads to different members in the cluster
– Maintained after failover …
– … and fallback
 Example use cases
– Consolidate separate workloads/applications on same database infrastructure
– Minimize total resource requirements for disjoint workloads
 Easily configured through client configuration
– db2dsdriver.cfg file (see the sketch below)
[Diagram: app server groups A, B, C, and D each affinitized to a different member]
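A hedged sketch of the affinity-related sections of db2dsdriver.cfg, nested under the <database> element shown earlier (server names, hosts, ports, and the client host name are hypothetical, and the element names follow the documented client affinities configuration as best recalled, so verify against your driver level):

   <acr>
      <parameter name="enableACR" value="true"/>
      <alternateserverlist>
         <server name="m0" hostname="member0.example.com" port="50000"/>
         <server name="m1" hostname="member1.example.com" port="50000"/>
      </alternateserverlist>
      <!-- ordered failover list: clients in group A prefer m0, fail over to m1 -->
      <affinitylist>
         <list name="groupA" serverorder="m0,m1"/>
      </affinitylist>
      <clientaffinitydefined>
         <client name="apphost1" hostname="apphost1.example.com" listname="groupA"/>
      </clientaffinitydefined>
   </acr>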
Managing and Monitoring pureScale Using DB2 Tooling
Task                                    pureScale Support                                               Product
Database Administration                 Perform common administration tasks across members and CFs;    Data Studio
                                        integrated navigation through shared data instances
                                        Recover database objects safely, precisely, and quickly        DB2 Recovery Expert
                                        High speed unload utility                                       DB2 High Performance Unload
                                        Merge incremental backups into a full backup                    DB2 Merge Backup
System Monitoring                       Integrated alerting and notification                            Data Studio Web Console
                                        Seamless view of status and statistics across all members       Optim Performance Manager
                                        and CFs
Configuration Tracking and              Full support for tracking and reporting of configuration        Optim Configuration Manager
Client Management                       changes across clients and servers
Application Development                 Full support for developing Java, C, and .NET applications      Data Studio
                                        against a DB2 pureScale environment
Query Tuning                            Full support for query, statistics, and tuning advice for       Optim Query Workload Tuner
                                        applications on pureScale systems
Optim Performance Manager
 Provides monitoring metrics about the cluster caching facility (CF) on the Overview dashboard
– CF CPU and memory utilization
– Group buffer pool hit ratio
– CF lock timeouts, lock escalations, and transaction lock wait time
 Shows enhanced system information on the System dashboard
– Host status, instance status, CF requests and time, more CPU values, ....
 Member information for locking problems on the Locking dashboard
 Provides DB2 pureScale system templates
– DB2 pureScale production with all details
– DB2 pureScale production with low overhead
Optim Performance Manager (cont.)
 Cluster caching facility (CF) monitoring metrics include
– Group buffer pool hit ratio per connection, statement, buffer pool or table space
– CF locking information, CF requests/time on connection or statement level
– Page reclaim information
– CF configuration parameters in database and database manager reports
 Health alerts can notify DBAs or others of CF or member failures
Overview Dashboard for DB2 pureScale System
[Screenshot]
System Dashboard for DB2 pureScale System
[Screenshot: CF details]
System Dashboard for DB2 pureScale System (cont.)
[Screenshot: Member details]
Optim Data Administrator pureScale Support
[Screenshots: launch the desired administration task assistant; select which member to quiesce before taking it offline; select quiesce options that define how and when the action should occur; view, modify, or execute the commands to complete a task]
Isolate Applications using OCM and Member Subsets
 Available for pureScale with Optim Configuration Manager for DB2 for LUW V3.1
 Penalty boxing
– Protect mission critical applications from cascading effects of misbehaving applications
 Proving grounds
– Test new applications with production data in a limited capacity environment
 Define and activate rules that dictate which member subset to use
– Fine-grained rules based on user, client workstation IP address, data source name, and other properties
– Newly activated rules applied at transaction boundaries
 Available for managed clients
– Requires Optim Data Tools Runtime Client to be installed on clients
– JDBC, ODBC/CLI, .NET
OPM and OCM Penalty Box Example
1. OPM alerts the DBA that App C is using excessive CPU, and shows that App A and App B are affected
2. The user defines and activates a rule in OCM to isolate App C to a restricted environment, without any outages
3. Performance of App A and App B goes back to normal
[Diagram: before – Apps A, B, and C share members 0-2 in regular operation; after – Apps A and B run on members 0-1 (regular operation) while App C is confined to member 2 (penalty box)]
Disaster Recovery Options for pureScale
 Variety of disaster recovery options to meet your needs
– HADR
– Storage Replication
– Q Replication
– InfoSphere Change Data Capture (CDC)
– Geographically Dispersed pureScale Cluster (GDPC)
– Manual Log Shipping
HADR in DB2 pureScale
 Integrated disaster recovery solution
– Very simple to set up, configure, and manage (a configuration sketch follows below)
 Support includes
– Asynchronous and super-asynchronous modes
– Time delayed apply
– Log spooling
– Both non-forced (role switch) and forced (failover) takeovers
 Member topology must match between primary and standby clusters
– Different physical configuration allowed (less resources, sharing of LPAR, etc.)
[Diagram: HADR log shipping from the primary pureScale cluster to the standby DR pureScale cluster]
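A hedged sketch of pairing the clusters (the database name, host names, and port are hypothetical; HADR_LOCAL_HOST and HADR_LOCAL_SVC must also be set on each cluster, and the braced form of HADR_TARGET_LIST is the documented way to list a pureScale standby's members):

   # On the primary cluster, point at the standby cluster's members:
   db2 update db cfg for PSDB using HADR_TARGET_LIST "{standby-m0:4000|standby-m1:4000}" HADR_SYNCMODE SUPERASYNC
   # Start the standby first, then the primary:
   db2 start hadr on db PSDB as standby     # run on the standby cluster
   db2 start hadr on db PSDB as primary     # run on the primary cluster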
HADR in DB2 pureScale: Highly Available By Design
 If a member in the primary cluster fails or cannot connect to the standby, logs for that member are shipped by another member to the standby (referred to as assisted remote catchup)
 If the replay member fails, then another member automatically takes over and becomes the replay member
[Diagram: members at the primary site ship logs 1-3 over TCP/IP to the preferred replay member at the standby site; a failed member's logs are sent by a healthy member via assisted remote catchup, and another standby member becomes the replay member if the preferred member fails]
Storage Replication
 Uses remote disk mirroring technology
– Maximum distance between sites is typically 100s of km for synchronous, 1000s of km for asynchronous
– For example: IBM Metro Mirror, EMC SRDF
 Transactions run against primary site only, DR site is passive
– If primary site fails, database at DR site can be brought online
– DR site must be an identical pureScale cluster with matching topology
 All data and logs must be mirrored to the DR site
– Synchronous replication guarantees no data loss
– Writes are synchronous and therefore ordered, but "consistency groups" are still needed
• If one volume fails to update, you don't want other volumes to get updated (leaving data and logs out of sync)
Q Replication
 High-throughput, low latency logical data replication
– Distance between sites can be up to thousands of km
 Asynchronous replication
 Includes support for:
– Delayed apply
– Multiple targets
– Replicating a subset of data
– Data transformation
 DB2 pureScale can be a source and/or target of replication
– If using pureScale as a source, target does not have to be pureScale
– Member topology does not have to match if pureScale is both source and target
 DR site can be active
– Bi-directional replication is supported for updates on both primary and DR sites
Geographically Dispersed pureScale Clusters (GDPC)
 A "stretch" or geographically dispersed pureScale cluster (GDPC) spans two sites at distances of up to tens of kilometers
– Provides active/active DR for one or more shared databases across the cluster
– Enables a level of DR support suitable for many types of disasters (e.g. fire, data center power outage)
– Supported on AIX (InfiniBand, 10 GE RoCE, TCP/IP) and RHEL/SUSE Linux (10 GE RoCE, TCP/IP)
 Both sites active and available for transactions during normal operation
 On failures, client connections are automatically redirected to surviving members
– Applies to both individual members within sites and total site failure
[Diagrams: members M1, M3 and the primary CF at site A; M2, M4 and the secondary CF at site B, tens of km apart – workload fully balanced in normal operation, rebalanced on hardware failure, and rebalanced on site failure]
GDPC Suitability
[Chart: percentage of UPDATE, INSERT, or DELETE operations in the workload (0-40%) versus site-to-site distance (10-70 km) – workloads with lower write activity and shorter distances are good candidates for GDPC; heavier write activity at longer distances is a better fit for other replication technologies]
Comparison of pureScale Disaster Recovery Options
                                  HADR      Storage       Storage       GDPC     Q Replication   Manual Log
                                            Replication   Replication            / CDC           Shipping
                                            (Sync)        (Async)
Active/active DR                  No        No            No            Yes      Yes             No
Synchronous                       No        Yes           No            Yes      No              No
Requires matching pureScale       Yes       Yes           Yes           n/a      No              Yes
topology at DR site
Delayed apply                     Yes       No            No            No       Yes             Yes
Multiple DR target sites          No        No            No            No       Yes             Yes
Maximum distance between sites    1000s km  100s km       1000s km      10s km   1000s km        1000s km
DB2 Business Application Continuity (BAC) Offering
 Two member, active/active DB2 pureScale configuration
– All application workloads are directed to one primary active member
– Utilities and admin tasks allowed on the secondary admin member
– Application workloads quickly failover to secondary member during planned or unplanned outages
 Low cost active/passive licensing model
– Primary member fully licensed
– Secondary admin member licensed as idle/warm standby
– Available for DB2 Workgroup Server Edition (WSE) and DB2 Enterprise Server Edition (ESE)
 Administrative activities allowed on secondary member include
– Backup and restore
– Reorg and runstats
– Monitoring
– Usage of Data Definition Language (DDL)
– Database manager configuration
– Database configuration
– Log based capture utilities for the purpose of data capture
– Security administration and setup
[Diagram: apps connect to the primary member while the admin member handles administration – a pureScale cluster implemented using the BAC offering]
DB2 10.5 Trial Software Available
http://www-01.ibm.com/software/data/db2/linux-unix-windows/downloads.html (login with IBM ID)
 90 day trial
 Includes the pureScale feature
 With TCP/IP sockets support, no new hardware investment required to try it out
 VMware installs also possible
DB2 pureScale Reference Material
IBM DB2 pureScale Product Information: http://www-01.ibm.com/software/data/db2/linux-unix-windows/purescale/
IDUG Presentations: http://www.idug.org/p/se/in/q=purescale&sb=1
DB2 Information Center: http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.licensing.doc/doc/c0057442.html
DB2 pureScale Redbook: http://www.redbooks.ibm.com/abstracts/sg248018.html
DB2 pureScale Book: http://public.dhe.ibm.com/common/ssi/ecm/en/imm14079usen/IMM14079USEN.PDF
developerWorks Articles: http://www.ibm.com/search/csass/search/?q=purescale&sn=dw&dws=dw
What Can DB2 pureScale Do For You?
 Deliver higher levels of scalability and superior availability
 Better concurrency during regular operations
 Better concurrency during member failure
 Less application design and rework due to transparent scalability
 Improved SLA attainment
 Lower overall costs for applications that require high transactional performance and ultra high availability
"In all other respects, from scalability to flexibility, through ease of use and high availability, to cost (at least at list prices), IBM appears to offer significant advantages."
-- Phillip Howard, Research Director
Backup Slides
pureScale Roadmap Highlights (DB2 9.8 and DB2 10.1)
• 9.8 GA on Power (12/09)
• 9.8 FP1 (03/10): POWER7
• 9.8 FP2 (08/10): SLES Linux on System x; DR with QRep; PIT recovery
• 9.8 FP3 (12/10): RHEL Linux; multiple databases; 10 GE with RDMA (RoCE) for SLES; XML
• GDPC (04/11): stretch cluster for AIX
• 9.8 FP4 (07/11): multiple CF HCAs; multiple switches; set write suspend for snapshots; improved serviceability; monitoring enhancements; IBM BladeCenter
• 9.8 FP5 (06/12): code fixes
• 10.1 GA (06/12): range partitioning; workload management; table space recovery; RoCE for AIX/RHEL; enhanced CF monitoring; performance optimizations; pureScale installable feature of DB2
• 10.1 FP1 (09/12): KVM support
• 10.1 FP2 (12/12): GDPC for RHEL; multiple interconnect adapters for members; support for generic Intel x86 rack-mounted servers
• 10.1 FP3 (09/13): snapshot backup scripts
• 10.1 FP4 (05/14): code fixes
pureScale Roadmap Highlights (DB2 10.5)
• 10.5 GA (06/13): HADR for pureScale DR; online add member; performance optimizations; topology-changing backup/restore; pureScale/non-pureScale backup/restore; snapshot backup scripts; random key indexes; multi-tenancy enhancements with member subsets and per-member STMM
• 10.5 FP1 (08/13): online fix pack updates; explicit hierarchical locking; support for generic Intel x86 rack-mounted servers
• 10.5 FP2 (10/13): code fixes
• 10.5 FP3 (02/14): code fixes
• DB2 Cancun Release (10.5 FP4) (08/14): commodity Ethernet interconnect (sockets); VMware/KVM with sockets; incremental backup/restore; online table reorg; more GDPC configurations; POWER8 support; integrated flash copy backups; DB2 Spatial Extender; improved install and upgrade experience; …
• 10.5 FP5 (12/14): automatic CF memory; GDPC with TCP/IP; Business Application Continuity Offering *; code fixes