ETC Real Application Clusters Demo


Transcript ETC Real Application Clusters Demo

Introduction
• 3-Tier Architecture
– Workload Generator
– Application Server
– Database Server
Workload Generator Tier
 42 x Compaq DL360
– 512 MB RAM
– 2 CPU (Pentium III 1 GHz)
– Running Mercury LoadRunner to simulate application users
– Rack-mounted in a single cabinet
Application Server Tier
• 12 x Compaq ES40
– 16 GB RAM
– 3 CPU (833 MHz)
– Tru64 5.1
– Local Storage (no Fibre Channel)
– No Memory Channel Cards
• As there is no shared resource in the Application Servers, we can expect linear scalability
Database Server Tier
 4 x Compaq ES40
– 16 GB RAM
– 3 CPU (833 MHz)
– Tru64 5.1 Patchkit 4
– Dual Fibre Channel Cards
– Dual Memory Channel Interconnects
– Tru64 Clustered Filesystem used for OS + Oracle Binaries + Datafiles
Overall Picture
[Diagram: the rack of DL360 workload generators drives OLTP and concurrent-processing workloads through the application server tier into the clustered database]
9i Real Application Clusters
• Scalability
• Availability
• Reduce Total Cost of Ownership
- Hardware Procurement Costs
- Database Server Consolidation
Traditional Shared-Disk Clustered Databases
• Maintaining data coherency is a hard problem
– Need to synchronize updates to shared data
– The disk is the only medium for data sharing
• Disk I/O latencies appear in the critical path when multiple nodes access shared data
• Disk-based coherency is the main bottleneck to achieving a scalable shared-disk cluster
– Only synthetic, fully partitioned workloads scale!
Disk-Based Coherency – Parallel Server
[Diagram: two instances, each with its own SGA (shared SQL, log buffer), sharing Block A through the disk ("ping"), coordinated by the DLM]
1. Instance 1 reads Block A
2. Instance 1 updates Block A
3. Instance 2 requests access to Block A
4. Instance 1 writes ("pings") Block A to disk
5. Instance 2 reads Block A from disk and accesses it
• The fix: move to interconnect-based coherency
Oracle Real Application Clusters (RAC)
• An application-transparent clustered database
– single-node applications run and scale with no changes to application logic or to Oracle database structures
• The cluster interconnect fabric replaces the disk as the medium for inter-node data sharing
• The Cache Fusion protocol for data sharing results in a scalable cluster for multiple differing workload types
A 9i RAC Database
[Diagram: users connect over the network through a hub or switch to the clustered database servers, which are joined by a low-latency, high-speed interconnect and attached through a switch fabric to a mirrored disk subsystem on a Storage Area Network]
Oracle9i Real Application Clusters
Base Technology (patented): DB Cache Fusion
[Diagram: two SGAs (Shared Pool + Buffer Cache) joined across the interconnect so that DB cache 1 and DB cache 2 form a single "fused" DB cache]
All DB cache operations reference:
1. the local DB cache
2. all remote DB caches
What is Cache Fusion?
 The underlying technology that enables RAC
 Protocol that allows instances to combine their data
caches into a shared global cache
– Global Cache Service (GCS) coordinates sharing
 Key features are
– Direct sharing of volatile buffer caches
– Efficient inter-node messaging framework
– Fast recovery from node failures using cache and
CPU resources from all surviving nodes
• Benchmarks and customer implementations show 1.85x to 1.9x scalability with the addition of each node
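The quoted per-node figure can be turned into back-of-envelope arithmetic. The sketch below is my illustration, not from the demo; reading "1.9x per added node" as "each extra node contributes ~90% of a single node's throughput" is an assumption.

```python
def projected_speedup(nodes: int, per_node_factor: float = 1.9) -> float:
    """Speedup over one node, assuming the second node takes the cluster
    to per_node_factor x and each further node adds the same increment."""
    increment = per_node_factor - 1.0   # each extra node adds ~0.9 "nodes"
    return 1.0 + (nodes - 1) * increment

for n in (1, 2, 3, 4):
    print(n, round(projected_speedup(n), 2))
# Four nodes project to ~3.7x a single node, versus an ideal 4x
```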
Data Sharing Problem
• Read Sharing for Queries
– a query needs to read a data block that is currently in another instance's buffer cache
• Write Sharing for Updates
– an update needs to modify a data block that is currently in another instance's buffer cache
• With Cache Fusion, a disk read is performed only if the block is not already in the global shared cache
Cache Fusion Read Sharing
• Uses Oracle's Consistent Read (CR) scheme
– undo is applied to make a block transactionally consistent to a System Change Number (SCN)
– a CR copy is shipped to the requesting instance
[Diagram: 1. a query at SCN 200 on Instance B needs a data block held at SCN 225 by Instance A; 2. Instance A applies undo to build a CR copy at SCN 200; 3. the CR copy is shipped to Instance B]
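The CR scheme can be sketched as a toy model. This is an illustration of the idea only, not Oracle internals; the block and undo record shapes are invented.

```python
def cr_copy(block, undo, query_scn):
    """Roll a block back until it is consistent with query_scn.

    block: {'scn': last-change SCN, 'value': contents}
    undo:  list of (prior_scn, prior_value), newest change first
    """
    copy = dict(block)
    for prior_scn, prior_value in undo:
        if copy["scn"] <= query_scn:
            break                       # already consistent with the query
        copy["scn"], copy["value"] = prior_scn, prior_value
    return copy

# Instance A holds the block at SCN 225; the query on Instance B is at SCN 200
blk = {"scn": 225, "value": "new row"}
undo = [(200, "old row")]               # the SCN-225 change overwrote "old row"
print(cr_copy(blk, undo, 200))          # {'scn': 200, 'value': 'old row'}
```

The CR copy is what gets shipped to the requesting instance; the holder's current block is untouched.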
Cache Fusion Write Sharing
• Multiple dirty copies of a data block can exist in the global cache, but only one is current
• The current copy can move between instances without first being written to disk
– Changes are logged if not already on disk
• Non-current dirty copies can directly service queries from any node, as well as instance recovery
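A minimal sketch of the idea (invented data structures, not Oracle's GCS implementation): the Global Cache Service tracks which instance holds the current copy of a block, and an update request moves that copy across the interconnect rather than through disk.

```python
class GCS:
    """Toy Global Cache Service: maps block id -> current holder."""
    def __init__(self):
        self.holder = {}

    def request_current(self, block, requester, caches):
        old = self.holder.get(block)
        if old is not None and old != requester:
            # Holder ships the current copy over the interconnect; its own
            # non-current copy could still serve queries and recovery.
            caches[requester][block] = caches[old][block]
        self.holder[block] = requester

caches = {"A": {}, "C": {10: ("row data", 225)}}   # C holds Block 10 at SCN 225
gcs = GCS()
gcs.holder[10] = "C"
gcs.request_current(10, "A", caches)               # A wants to update Block 10
print(gcs.holder[10], caches["A"][10])             # A ('row data', 225)
```

No disk write happens in the transfer, which is exactly the "ping" that disk-based coherency required.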
Cache Fusion Write Sharing
[Diagram: the requester (Instance A) asks the GCS master (Instance B) to update Block 10 (1); the master forwards the request to the current holder, Instance C (2); Instance C ships the current copy at SCN 225 to Instance A, keeping a non-current copy at SCN 200 (3); Instance A now holds the current copy (4)]
Recovery in a RAC Database
• Survival of one instance guarantees data availability
• Recovery cost is proportional to the number of failures, not the total number of RAC nodes
– cached copies in surviving nodes are used
– only redo logs from failed instances are applied
• Eliminates disk reads for blocks that are present in a surviving instance's buffer cache
• The global cache is available after an initial log scan, well before redo application begins
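The two recovery bullets can be illustrated with a toy filter (assumed record shapes, not Oracle's recovery code): only the failed instances' redo threads are replayed, and blocks already cached on a survivor avoid a disk read.

```python
# (redo thread, block id, change description)
redo = [
    ("inst4", 10, "change A"),
    ("inst1", 11, "change B"),   # survivor's redo: not needed for recovery
    ("inst4", 12, "change C"),
]
failed = {"inst4"}
surviving_cache = {10}           # block 10 is cached on a surviving node

to_recover = [(blk, chg) for inst, blk, chg in redo if inst in failed]
disk_reads = [blk for blk, _ in to_recover if blk not in surviving_cache]

print(to_recover)   # [(10, 'change A'), (12, 'change C')]
print(disk_reads)   # [12] -- only the uncached block needs a disk read
```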
Database Server Consolidation with 9iRAC
Phase 1: Mixed apps and database environment (App1, App2 and App3 each with their own database)
Phase 2: Mixed apps with a common database but separate data models (App1, App2 and App3 on one Real Application Cluster)
Phase 3: Mixed apps with a common database and a 'single version of the truth' (App1, App2 and App3 on one Real Application Cluster)
• 9iRAC will support mixed workloads (OLTP/DSS) within a common DB
Comparison of Common Interconnects

Name              Latency             Protocol  Throughput
Memory Channel    0.003 milliseconds  RDG       100 MB/s
Fast Ethernet     a few milliseconds  UDP       10 MB/s
Gigabit Ethernet  a few milliseconds  UDP       100 MB/s
HP HyperFabric 2  0.022 milliseconds  HMP       400 MB/s

RDG = Reliable Datagram
UDP = User Datagram Protocol
HMP = Hyper Messaging Protocol
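As a rough illustration of what these figures mean for Cache Fusion, the sketch below (my arithmetic; the 8 KB block size and the ~2 ms reading of "a few milliseconds" are assumptions) estimates the time to ship one block over each interconnect.

```python
BLOCK = 8 * 1024  # bytes; 8 KB is a common Oracle block size (assumed here)

interconnects = {
    # name: (latency in seconds, throughput in bytes/second)
    "Memory Channel":   (0.003e-3, 100e6),
    "Fast Ethernet":    (2e-3,     10e6),   # "a few ms" taken as ~2 ms
    "Gigabit Ethernet": (2e-3,     100e6),
    "HP HyperFabric 2": (0.022e-3, 400e6),
}

for name, (latency, bandwidth) in interconnects.items():
    t = latency + BLOCK / bandwidth      # simple latency + serialization model
    print(f"{name:18s} {t * 1e3:.3f} ms per 8 KB block")
```

On this model the low-latency interconnects ship a block in well under 0.1 ms and Ethernet in a few milliseconds; all are far below typical disk I/O times, which is the point of interconnect-based coherency.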
Failover - 1
[Diagram: Servers 1-4, running Instances 1-4 against the shared database]
• NB. This is not what you would do in production!
Failover - 1
[Diagram: Instance 1 runs the Concurrent Manager (batch); Instances 2, 3 and 4 each carry 2027 OLTP users against the shared database]
• No headroom within the cluster to fail over any users in the event of an unplanned outage
Failover - 1
[Diagram: Instance 1 runs the Concurrent Manager (batch); Instances 2, 3 and 4 each carry 2027 OLTP users]
• Uncontrolled shutdown of Instance 4
Failover - 1
[Diagram: Instance 1 runs the Concurrent Manager (batch); Instances 2 and 3 still carry 2027 OLTP users each; Instance 4 is down and its users are disconnected]
• A surviving instance (1, 2 or 3) performs instance recovery
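The lack of headroom in this scenario comes down to simple arithmetic. A sketch (the assumption that each node's capacity equals its 2027-user steady-state load is mine, for illustration):

```python
capacity = 2027                                       # users per node (assumed)
load = {"inst2": 2027, "inst3": 2027, "inst4": 2027}  # OLTP instances

failed = "inst4"
survivors = [i for i in load if i != failed]
per_survivor = load[failed] / len(survivors)          # 1013.5 extra users each

headroom = all(load[i] + per_survivor <= capacity for i in survivors)
print(headroom)   # False: the survivors are already at capacity
```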
Failover - 2 (Mixed Workload)
[Diagram: Servers 1-4, running Instances 1-4 against the shared database]
• NB. This is what you could do in production!
Failover - 2 (Mixed Workload)
[Diagram: Instance 1 runs the Concurrent Manager (batch); Instance 2 carries the OLTP workload; Instance 3 is a free instance; Instance 4 carries the data warehouse workload]
• Headroom exists within the cluster to fail over any users in the event of an unplanned outage
Failover - 2 (Mixed Workload)
[Diagram: Instance 1 runs the Concurrent Manager (batch); Instance 2 carries the OLTP workload; the data warehouse workload now runs on Instance 3; Instance 4 is down]
• Uncontrolled shutdown of Instance 4
• DW workload fails over to the free Instance 3
Transparent Application Failover (TAF)
(Recovery with Hot Failover)
• Login context maintained
• Little or no user downtime
[Diagram: four instances, each with its own SGA (shared SQL, log buffer), against the shared-disk database; sessions from a failed instance reconnect to the survivors]
Transparent Application Failover (TAF) – Mission-Critical Availability
TAF protects, or fails over:
• applications using OCI8 (ODBC, thick JDBC, Oracle Objects for OLE, SQL*Plus)
• client-server connections
• user session state
• active cursors (SELECT statements)
TAF can also be used to gracefully shut down a system
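What TAF automates inside the Oracle client libraries can be caricatured as try-the-next-node reconnect logic. Everything below is a hypothetical stand-in (the connect() function, the node names), not a real Oracle API.

```python
def connect(node):
    """Pretend driver call: raises if the node is down (node4 in this demo)."""
    if node == "node4":
        raise ConnectionError(f"{node} unreachable")
    return f"session on {node}"

def failover_connect(nodes):
    """Try each node in turn, roughly what TAF does behind the scenes."""
    for node in nodes:
        try:
            return connect(node)
        except ConnectionError:
            continue            # fail over to the next instance
    raise ConnectionError("no instance available")

print(failover_connect(["node4", "node3"]))   # session on node3
```

The difference with real TAF is that the failover, and optionally the replay of in-flight SELECTs, happens transparently to the application.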