RAC_CacheFusion_RSM_SunFireLink
Data Dependent Routing may not be
necessary when using Oracle RAC
Through Technology Improvements in:
Oracle 9i - RAC
Oracle 9i - CacheFusion
Solaris - RSM
Sun Cluster – SunFire Link
Ken Gottry
Apr-2003
Objective
• To provide a brief overview of several new technologies
that have been implemented by Oracle and Sun over the
past 18 months. These include:
   • Oracle 9i RAC database cluster
   • Oracle 9i CacheFusion
   • Solaris Remote Shared Memory (RSM)
   • Sun Cluster SunFire Link
• To suggest that, based on the above improvements,
application logic to implement data dependent routing may
no longer be as important when using an Oracle RAC
database cluster.
www.gottry.com
Agenda
• Executive Summary
• HA-Oracle vs. OPS/RAC
• Pinging in OPS
• Pinging in RAC
• Data dependent routing (DDR)
• Oracle 9i CacheFusion
• Solaris remote shared memory (RSM)
• Sun Cluster Interconnect – SunFire Link
Executive Summary
• What was called Oracle Parallel Server (OPS) in 8i is now
called Real Application Clusters (RAC) in 9i
• CacheFusion in 9i reduces pinging degradation from 20%
in OPS to 5-10%
• Oracle 9i can use Solaris Remote Shared Memory (RSM)
to move CacheFusion into the kernel level. Pinging
degradation may be reduced to 3-5%
• Sun Cluster supports SunFire Link, a 1.6 Gbps pipe
between cluster nodes with less than 1 ms latency. Up to 6
SunFire Link interconnects between nodes will allow
striping of data transfer. Pinging degradation may be
reduced to 1-3%
• With such reduction in pinging degradation, is data
dependent routing (DDR) a design concern any more?
HA-Oracle vs. RAC
[Diagram: app servers, DB servers, and a shared database, shown for HA-Oracle (with failover to the second DB server) and for RAC (with GCS on each DB server)]

HA-Oracle
• Only one DB server active at a time
• Failover may take a long time

RAC
• Both DB servers active, so throughput is often 80-90% more than with HA-Oracle
• Distributed Lock Mgr (DLM) called Global Cache Service (GCS) in 9i
• Failover is immediate
• Requires application coding
Pinging with 8i OPS
Pinging
Reduced throughput when DB node #1 has to ask DB node #2 if it has the needed block before DB node #1 can update it.

Oracle 8i OPS
DB node #2 had to flush the block to disk before DB node #1 could have it.

[Diagram: app server, DB servers #1 and #2 (each with a DLM), and the shared database. Numbered steps:]
1. App server sends to DB server #1: UPDATE salary = $1M where emp="Gottry"
2. DB server #1 asks DB server #2: "do you have block #123?"
3. DB server #2 writes block #123 to disk
4. DB server #2 replies: "it's all yours"
5. DB server #1 reads and updates block #123
6. Update complete

Throughput was degraded about 20% with OPS pinging.
Example: assume one DB node could process 100 tps.
When adding a second DB node, you would expect the
OPS database cluster to process 200 tps. However, due
to the pinging overhead, you would normally see
(100 + 100) – (20% * (100 + 100)) = 200 – 40 = 160 tps
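
The arithmetic above generalizes to other node counts and degradation rates. A minimal Python sketch, assuming the degradation applies uniformly to the combined throughput of all nodes; the function name and parameters are illustrative and not part of any Oracle tool:

```python
def cluster_throughput(nodes, per_node_tps, degradation):
    """Expected cluster tps after pinging overhead.

    degradation is a fraction (0.20 for 20%) applied to the ideal
    combined throughput, as in the slide's example.
    """
    if not 0.0 <= degradation <= 1.0:
        raise ValueError("degradation must be between 0 and 1")
    ideal = nodes * per_node_tps
    return ideal - degradation * ideal


# Two 100-tps nodes with 20% OPS pinging overhead.
print(cluster_throughput(2, 100, 0.20))  # 160.0
```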
Pinging with 9i RAC
Oracle 9i RAC
Using CacheFusion, DB node #2 pushes the block to DB node #1 over the cluster interconnect.

Pinging still occurs within RAC, but is much faster because the block is transferred between caches without a disk write by DB node #2.

[Diagram: app server, DB servers #1 and #2 (each with GCS), and the shared database. Numbered steps:]
1. App server sends to DB server #1: UPDATE salary = $1M where emp="Gottry"
2. DB server #1 asks DB server #2: "do you have block #123?"
3. DB server #2 replies over CacheFusion: "Here's block #123. It's all yours"
4. Block #123 is updated in cache and written to disk
5. Update complete

Throughput degraded about 10% with RAC pinging.

Example: assume one DB node could process 100 tps.
When adding a second DB node, you would expect the
RAC database cluster to process 200 tps. However, due
to the pinging overhead, you would normally see
(100 + 100) – (10% * (100 + 100)) = 200 – 20 = 180 tps
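
The difference between the two pinging slides can be pictured with a toy model. The Python sketch below is an illustration only, assuming each cache and the disk are plain dictionaries; it counts the disk I/Os that must happen before the requesting node can update the block, and it does not reflect how Oracle implements the DLM or GCS.

```python
def transfer_block(mode, block_id, holder_cache, requester_cache, disk):
    """Move a block from the holder's cache to the requester's cache.

    Returns the number of disk I/Os needed before the requester can
    update the block. Toy model only.
    """
    if mode not in ("OPS", "RAC"):
        raise ValueError(f"unknown mode: {mode}")
    block = holder_cache.pop(block_id)
    if mode == "OPS":
        disk[block_id] = block                      # holder flushes to disk
        requester_cache[block_id] = disk[block_id]  # requester reads it back
        return 2
    requester_cache[block_id] = block               # sent over the interconnect
    return 0


for mode in ("OPS", "RAC"):
    node1, node2, disk = {}, {123: "salary=old"}, {}
    ios = transfer_block(mode, 123, node2, node1, disk)
    print(mode, "disk I/Os before update:", ios)
```

In both modes the updated block is eventually written to disk; the model only shows that RAC takes the disk out of the hand-off between nodes.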
Data Dependent Routing (DDR)
To minimize the impact of pinging, architects often partition the DB, making one DB node primarily responsible
for one half of the DB and the other DB node primarily responsible for the other half. The application must then
contain data dependent routing logic that decides to which DB node to send each SQL call.
App knows that DB server #1 is the primary handler
of the portion of the DB containing Patient IDs 1-500. So, app sends SQL request for patient ID 200
to DB server #1 to minimize impact of pinging.
App knows that DB server #2 is the primary handler
of the portion of the DB containing Patient IDs 501-1000. So, app sends SQL request for patient ID 800
to DB server #2 to minimize impact of pinging.
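
The routing decision described here can be sketched in a few lines of Python. This is a minimal illustration, assuming the two Patient ID ranges from the slide; the node names `db-server-1` and `db-server-2` and the `route` function are hypothetical, and a real application would map the result to its own connection pools.

```python
# Hypothetical partition map taken from the slide's example:
# Patient IDs 1-500 -> DB server #1, 501-1000 -> DB server #2.
PARTITIONS = [
    (range(1, 501), "db-server-1"),
    (range(501, 1001), "db-server-2"),
]


def route(patient_id):
    """Return the DB node primarily responsible for this patient ID."""
    for id_range, node in PARTITIONS:
        if patient_id in id_range:
            return node
    raise LookupError(f"no partition covers patient ID {patient_id}")


print(route(200))  # db-server-1
print(route(800))  # db-server-2
```

This is the "application coding" cost of DDR: every SQL call has to pass through logic like this, and the partition map has to stay in step with the database layout.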
[Diagram: two panels, one per request (Patient_ID = 200 and Patient_ID = 800). Each shows an app server, DB servers #1 and #2 (each with GCS), and the database partitioned into Patient ID 1-500 and Patient ID 501-1000. Steps in each panel:]
1. App server sends the SQL request to the DB server responsible for that Patient_ID
2. That DB server asks the other: "do you have block #123?" (block #456 in the second panel)
3. The other DB server answers: "Nope"
4. The first DB server reads and updates the block

Notice ping still happens, but no block transfer is required. It's the block transfer that can degrade throughput by up to 5-20%.
CacheFusion and
Remote Shared Memory (RSM)
Oracle 9i CacheFusion makes
the cache on multiple DB nodes
act as one. This speeds up block
transfer when it’s needed.
Taking a closer look,
CacheFusion is implemented at
the application (Oracle) level
Solaris’ Remote Shared Memory
(RSM) allows clustered apps to
share memory at the kernel level.
Oracle 9.1 implements RSM-API
[Diagram: DB Node #1 and DB Node #2, each drawn as a stack of cache, Oracle, cluster, and kernel layers. In the first pair the caches are joined by CacheFusion at the Oracle level; in the second pair they are joined through RSM at the kernel level.]
SunFire Link Interconnect
Nodes of a cluster use a private network connection between the nodes to communicate.

Heartbeat ("are you alive") info is exchanged over the cluster interconnect.

Previously Sun Cluster supported two types of interconnect:
• Ethernet (100 Mbps)
• proprietary SCI (200 Mbps)

In Apr-2003, Sun Cluster announced support for the proprietary SunFire Link interconnect (1.6 Gbps). Up to 6 SFL interconnects can be used to stripe the data as it is transferred.
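
To make the idea of striping concrete, here is a minimal Python sketch. The slides do not describe how SunFire Link distributes data, so the round-robin scheme, the chunk size, and the function names `stripe` and `reassemble` are all assumptions chosen for illustration.

```python
def stripe(data, links, chunk_size):
    """Split data into chunks and deal them round-robin across links.

    Returns one list of chunks per link.
    """
    if not 1 <= links <= 6:
        raise ValueError("the slide describes 1 to 6 SunFire Link interconnects")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    queues = [[] for _ in range(links)]
    for i in range(0, len(data), chunk_size):
        queues[(i // chunk_size) % links].append(data[i:i + chunk_size])
    return queues


def reassemble(queues):
    """Interleave the per-link chunks back into the original byte order."""
    out = []
    longest = max(len(q) for q in queues)
    for n in range(longest):
        for q in queues:
            if n < len(q):
                out.append(q[n])
    return b"".join(out)


block = bytes(range(256)) * 32          # an 8 KB block, as an example payload
queues = stripe(block, links=6, chunk_size=1024)
print([len(q) for q in queues])         # [2, 2, 1, 1, 1, 1]
assert reassemble(queues) == block
```

With six links, each carries only a fraction of the block, which is why striping can shorten the transfer time for a pinged block.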
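[Diagram: DB Node #1 and DB Node #2, each with cluster and kernel layers, joined by the cluster interconnect; with SunFire Link interconnects the transfer between the nodes is striped.]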
Is Data Dependent Routing Needed?
This chart and table show the relative improvement in throughput using the new technologies.

Perhaps this improvement is good enough to avoid adding data dependent routing logic to your application.

[Chart: throughput in tps (axis 0 to 250) for OPS, CacheFusion, RSM, and SunFireLink, compared with the ideal throughput of 200 tps]

Configuration                    Degradation   Throughput (tps)   Total (tps)
OPS                              20%           80 + 80            160
RAC with CacheFusion             10%           90 + 90            180
RAC with RSM                     7%            93 + 93            186
RAC with RSM and SunFire Link    3%            97 + 97            194
Ideal                            0%            100 + 100          200

Based on a 2 node DB cluster with each node capable of 100 tps
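
One way to read the table is to ask how much throughput DDR could win back at each stage. The Python sketch below uses the degradation figures from the table and assumes, optimistically, that DDR would remove all of the pinging overhead; the names are illustrative.

```python
IDEAL_TPS = 200  # two nodes at 100 tps each, as in the table

# Degradation figures copied from the slide's table.
DEGRADATION = {
    "OPS": 0.20,
    "RAC with CacheFusion": 0.10,
    "RAC with RSM": 0.07,
    "RAC with RSM and SunFire Link": 0.03,
}


def ddr_headroom(degradation, ideal_tps=IDEAL_TPS):
    """Upper bound on tps that DDR could recover (assumes it removes all overhead)."""
    return round(ideal_tps * degradation)


for config, d in DEGRADATION.items():
    print(f"{config}: up to {ddr_headroom(d)} tps")
```

Under these assumptions the possible gain shrinks from 40 tps with OPS to 6 tps with RSM and SunFire Link, which is the basis for questioning whether the routing logic is still worth its cost.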