Inside the Interconnect


RAC Basics
Julian Dyke
Independent Consultant
Web Version - February 2008
© 2008 Julian Dyke
juliandyke.com
Agenda

Real Application Clusters
 The Theory
 The Reality
RAC
The Theory
RAC
Redundancy

Single Point of Failure
 If component fails, system will be inaccessible

Redundancy
 Duplicate components
 If component fails another can be used
 Active-Active or Active-Passive

Examples include
 Power Supplies
 RAID
 Bonded Networks
 IO Multipathing
 Oracle RAC
RAC
4-node cluster
[Diagram: four nodes (Node 1 to Node 4), each running one instance (Instance 1 to Instance 4). All nodes are attached to the public network, the private network (interconnect) and the storage network, which leads to the shared storage.]
RAC
Cache Coherency

RAC must ensure changes made by any instance
 Are not overwritten by another instance
 Maintain ACID properties

Current Blocks
 Blocks can be updated by any instance
 Only current version of a block can be updated
 Only one current version of a block can exist across all instances

Consistent Read Blocks
 A theoretically unlimited number of consistent versions of a block can exist
  in each instance
  across all instances
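
The two rules above can be illustrated with a toy model. The following Python sketch is only an illustration, not how Oracle implements cache coherency; the class `ClusterCache` and its methods are hypothetical names. It records a single holder for the current version of each block, while each instance may accumulate any number of consistent read copies.

```python
class ClusterCache:
    """Toy model: one current version per block, many consistent-read copies."""

    def __init__(self, instances):
        self.current_holder = {}                           # block id -> instance name
        self.cr_copies = {name: [] for name in instances}  # instance -> [(block, scn)]
        self.scn = 0                                       # hypothetical change counter

    def update(self, instance, block):
        """Update a block; the single current version moves to this instance."""
        self.scn += 1
        self.current_holder[block] = instance
        return self.scn

    def consistent_read(self, instance, block, as_of_scn):
        """Keep a consistent-read copy of a block as of an earlier change number."""
        self.cr_copies[instance].append((block, as_of_scn))


cache = ClusterCache(["PROD1", "PROD2"])
cache.update("PROD1", block=42)
cache.update("PROD2", block=42)                   # current version is now on PROD2 only
cache.consistent_read("PROD1", 42, as_of_scn=1)   # both instances may hold CR copies
cache.consistent_read("PROD2", 42, as_of_scn=1)
print(cache.current_holder[42])
```

Because `current_holder` maps each block to exactly one instance, the model cannot represent two current versions of the same block, which is the invariant the slide describes.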
RAC
Cluster Manager

All clusters must have cluster management software
 Manages node membership and evictions

Oracle Clusterware
 Mandatory for RAC in Oracle 10.1 and above
 Known as Cluster Ready Services (CRS) in Oracle 10.1 only
 Can be combined with vendor clusterware
  IBM HA/CMP
  HP ServiceGuard
  Sun Cluster
 Must be running before ASM/RDBMS instances can be started on a node
 Can be used with non-RAC databases and applications
  Oracle 10.2 and above
RAC
Interconnect

Used for inter-node communication by:
 Oracle Clusterware
 ASM Instances
 RDBMS Instances

Optimally high bandwidth / low latency

Typically Gigabit Ethernet (1Gb/s)
 Uses TCP / UDP protocols
 NIC interfaces often bonded for availability

Other physical networks supported e.g. Infiniband
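
To put the bandwidth figure in context, the arithmetic below estimates an upper bound on the number of database blocks a link could carry per second. It is a rough sketch only: the 8 KB block size is an assumption (it is not stated on the slide), and protocol overhead, latency and contention are ignored, so real figures will be lower.

```python
def max_blocks_per_second(link_gbit_per_s, block_size_bytes=8192):
    """Theoretical ceiling on blocks per second, ignoring all protocol overhead."""
    bytes_per_second = link_gbit_per_s * 1_000_000_000 / 8   # bits to bytes
    return bytes_per_second / block_size_bytes


# Assumed 8 KB blocks over a single Gigabit Ethernet link
print(round(max_blocks_per_second(1)))
```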
RAC
Shared Storage

Required for:
 Oracle Clusterware Files
  Oracle Cluster Registry (OCR)
  Voting Disk
 Database Files
  Control Files
  Datafiles
  Online Redo Logs
  Server Parameter File

Strongly recommended for:
 Archived redo logs
 Backup copies
RAC
Shared Storage

Can use:
 Storage Area Network (SAN) e.g.:
  EMC Clariion / Symmetrix
  HP MSA / EVA / XP series
  Hitachi
  Fujitsu
 Network Attached Storage (NAS) e.g.:
  Network Appliance
  Pillar Data Systems
  Sun StorageTek
  EMC Celerra
 JBOD (with ASM)
RAC
Shared Storage

Fibre Channel
 SCSI protocol - block based
 Normally 2Gb or 4Gb
 Requires one or more Host Bus Adapters (HBA) per node
 Requires fabric switches

iSCSI
 SCSI protocol - block based
 Packets sent over dedicated IP network
 Can use standard network components
 Processing often offloaded to NIC firmware

NFS
 File-based
 Uses standard network components
RAC
Shared Storage

Cluster-aware File Systems:

Automatic Storage Management

Cluster File Systems
 Oracle Cluster File System (OCFS/OCFS2)
 Red Hat GFS
 IBM GPFS
 Sun StorEdge QFS
 Veritas CFS

Network File System
 On supported Network Attached Storage only
RAC
Automatic Storage Management (ASM)
Introduced in Oracle 10.1
 Additional functionality in 10.2 and 11.1
 Generic code (all supported platforms)
Available for both single-instance and RAC databases
 Provides shared storage for RAC
Can optionally provide mirroring:
 Normal Redundancy (mirrored)
 High Redundancy (triple mirroring)
 Useful with JBOD or extended clusters
Mandatory for Oracle 10g Standard Edition RAC
Presents storage as disk groups containing
 Physical disks
 Logical files
Requires additional ASM instance on each node
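
The effect of the mirroring options on capacity can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions: it divides raw capacity by the number of copies kept, and ignores ASM metadata and any free space reserved for re-mirroring after a disk failure. The `external` entry (no ASM mirroring) is added for comparison and does not appear on the slide; the function name is hypothetical.

```python
# Copies of each extent kept per redundancy level
COPIES = {"external": 1, "normal": 2, "high": 3}


def usable_capacity_gb(disk_sizes_gb, redundancy):
    """Approximate usable space in a disk group, ignoring metadata overhead."""
    return sum(disk_sizes_gb) / COPIES[redundancy]


disks = [100] * 6                            # six hypothetical 100 GB disks
print(usable_capacity_gb(disks, "normal"))   # mirrored
print(usable_capacity_gb(disks, "high"))     # triple mirrored
```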
RAC
Licensing

Standard Edition
 RAC option free
 Maximum two nodes
 Maximum four CPUs
 Must use Oracle Clusterware
 Must use Automatic Storage Management (ASM)
 No extended clusters

Enterprise Edition
 RAC option 50% extra (per EE license)
 No limit on number of nodes
 No limit on number of CPUs
 Can use any shared storage (ASM, CFS or NFS)
 Can use Enterprise Manager Packs (Diagnostics, Tuning..)
RAC
Process Architecture
[Diagram: two-node cluster. Each node runs the Clusterware daemons (OPROCD, OCSSD, CRSD, EVMD), an ASM instance (+ASM1 on Node 1, +ASM2 on Node 2) and an RDBMS instance (PROD1 on Node 1, PROD2 on Node 2). Each of the four instances is drawn with the background processes PMON, SMON, LGWR, DBWn and ARCH, plus LMON, LCK0, LMD0, LMSn and DIAG.]
RAC
Reasons For Deployment
Availability
 Node failure
 Instance failure
Scalability
 Distribute workload across multiple instances
 Scale out
Manageability
 Economies of scale
 Administration / Monitoring / Backups / Standby
Reduction in total cost of ownership
 Database consolidation
 Commodity hardware
RAC
Availability
Ensure continued availability of database in event of node or instance failure
 Automatic failover
 No human intervention required
In the event of node or instance failure:
 All sessions connected to failed node are terminated
 Sessions connected to remaining nodes
  are temporarily suspended while resources are re-mastered
  resume after brown-out period
 New sessions will be connected to remaining nodes only
Ensuring availability requires spare capacity during normal operations
 Either additional node
 Or reduction in service level
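
The spare capacity requirement can be made concrete with a little arithmetic. The sketch below assumes identical nodes and a workload that is spread evenly before the failure and redistributes evenly afterwards; both are simplifying assumptions, and the function name is hypothetical. It returns the highest average utilisation each node can run at if the surviving nodes are to absorb the full workload.

```python
def max_safe_utilisation(nodes, failures_tolerated=1):
    """Highest per-node utilisation that still leaves room to absorb failures."""
    if not 0 <= failures_tolerated < nodes:
        raise ValueError("failures_tolerated must be between 0 and nodes - 1")
    return (nodes - failures_tolerated) / nodes


for n in (2, 3, 4):
    print(n, "nodes:", max_safe_utilisation(n))
```

Under these assumptions a 2-node cluster must run each node at no more than half capacity, which is why the alternative on the slide is a reduced service level after a failure.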
RAC
Availability
[Diagram: four nodes (Node 1 to Node 4), each running one instance (Instance 1 to Instance 4), attached to the public network, the private network (interconnect) and the storage network leading to the shared storage.]
RAC
Scalability
Workload can be distributed across multiple nodes
 Workload can be balanced across all nodes using connection management
  Client-side using Oracle Net
  Server-side using listener processes
 Workload can be directed to specific nodes using services
 Level of scalability dependent on application

[Charts: two plots with axes labelled Resources and Throughput]
RAC
Scalability

Factors that can degrade scalability
 Excessive parsing
 Consistent reads
 SELECT FOR UPDATE / user defined locking
 DDL
 Object-oriented code

Features that can improve scalability
 Services
 Automatic Segment Space Management
 Partitioning
 Sequences
 Reverse key indexes
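
The idea behind a reverse key index is that reversing the bytes of each key scatters consecutive values across the index instead of concentrating inserts in one block. The sketch below illustrates the principle only: it assumes keys are stored as 4-byte big-endian integers, which is not Oracle's internal number format, and `reverse_key` is a hypothetical helper.

```python
def reverse_key(value, width=4):
    """Return the bytes of an integer key in reversed order."""
    return value.to_bytes(width, "big")[::-1]


keys = range(1000, 1004)                       # e.g. consecutive sequence values
normal = [k.to_bytes(4, "big") for k in keys]
reverse = [reverse_key(k) for k in keys]

for n, r in zip(normal, reverse):
    print(n.hex(), "->", r.hex())
```

The unreversed keys share their leading bytes and therefore sort next to each other, whereas the reversed keys differ in the first byte, so inserts from different instances are less likely to contend for the same index block.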
RAC
Manageability

Advantages
 Consolidation
 Economies of scale
 Administration
 Monitoring
 Backup and recovery
 Standby database

Disadvantages
 Increased planned downtime
 Complexity
 Dependencies
 Skills
RAC
Total Cost of Ownership

Benefits
 Lower hardware costs - commodity hardware
 Lower support costs
 Management economies of scale

Costs
 Redundant hardware
 Servers, Storage, NIC, HBA, Switches, Fabric
 Oracle licenses
 Experienced staff
 Application modifications
RAC
Applications
Most applications should run on RAC without modification
 Performance is not guaranteed
 Applications that perform well in single-instance have best chance of scaling in RAC
 Applications performing badly in single-instance will perform worse in RAC
 Some features do not port easily to RAC e.g.:
  DBMS_ALERT, DBMS_PIPE, External files
 Applications that can be logically partitioned tend to scale best
  Minimize use of interconnect
  Maximize use of buffer caches
Implementation more likely to succeed if you have direct or indirect access to source code
RAC
Database Services
Allow sessions with similar workload characteristics to be logically grouped and managed
Services can be assigned to
 set of preferred instances - used if available
 set of available instances - used if preferred instances not available
 failover to available instances is automatic
 failback to preferred instances is manual
Services can be configured to maximize instance affinity
Limited statistics reported at service level
 Can also be reported at service / module / action level
Trace can be enabled at service level
 Can also be enabled at service / module / action level
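
The preferred / available behaviour described above can be sketched as a small state model. This is an illustration only, not Oracle Clusterware's implementation; the `Service` class and its methods are hypothetical. It shows failover happening automatically when an instance fails, and failback happening only when an administrator explicitly relocates the service.

```python
class Service:
    """Toy model of a service with preferred and available instances."""

    def __init__(self, name, preferred, available):
        self.name = name
        self.preferred = list(preferred)
        self.available = list(available)
        self.running_on = set(preferred)     # starts on the preferred instances

    def instance_failed(self, instance):
        """Automatic failover to the first unused available instance."""
        if instance not in self.running_on:
            return
        self.running_on.discard(instance)
        for candidate in self.available:
            if candidate not in self.running_on:
                self.running_on.add(candidate)
                break

    def instance_restarted(self, instance):
        """Failback is not automatic, so a restart changes nothing here."""

    def relocate(self, source, target):
        """Manual failback, performed explicitly by an administrator."""
        if source in self.running_on:
            self.running_on.discard(source)
            self.running_on.add(target)


svc = Service("SERVICE1", preferred=["PROD1"], available=["PROD2"])
svc.instance_failed("PROD1")       # service now runs on PROD2
svc.instance_restarted("PROD1")    # still on PROD2
after_restart = set(svc.running_on)
svc.relocate("PROD2", "PROD1")     # manual failback to the preferred instance
print(after_restart, svc.running_on)
```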
RAC
Database Services
[Diagram: SERVICE1 before and after a failover, with Listener1 and Listener2 in front of instances PROD1 and PROD2. PROD1 is the PREFERRED instance for SERVICE1 and PROD2 is the AVAILABLE instance.]
RAC
Extended Clusters

Currently the Holy Grail of high availability

RAC nodes located at physically separate sites
 Implicit disaster recovery
 Requires Enterprise Edition licences + RAC option

In the event of a site failure, database is still available
 Storage is duplicated at each site
 Can use ASM or vendor-supplied storage technology

Active / Active configuration
 Users can access database via either site

Configuration and performance tuning are complex
 Cache fusion traffic between sites
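
One reason inter-site traffic complicates tuning is that distance adds latency to every interconnect round trip. The estimate below is a sketch under a stated assumption: light travels through optical fibre at roughly 200,000 km/s. It covers propagation delay only and ignores switches, repeaters and protocol processing, so real latencies will be higher. The function name is hypothetical.

```python
FIBRE_KM_PER_SECOND = 200_000   # assumption: roughly two thirds of the speed of light


def round_trip_ms(distance_km):
    """Minimum round-trip propagation delay between two sites, in milliseconds."""
    return 2 * distance_km / FIBRE_KM_PER_SECOND * 1000


for km in (1, 10, 100):
    print(km, "km:", round_trip_ms(km), "ms")
```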
RAC
Extended Clusters
[Diagram: extended cluster. Node 1 (Instance 1) at Site1 and Node 2 (Instance 2) at Site2 are joined by the public network and the private network. Each site has its own storage network and its own copy of the database. A quorum is located at Site3.]
RAC
Disaster Recovery
Data Guard and RAC are fully compatible
 Can configure any permutation e.g.
  Primary          Standby
  Single instance  Single instance
  RAC              Single instance
  RAC              RAC
  Single instance  RAC
All instances can participate in redo log shipping
Only one instance can perform managed recovery
 Standby database might be a potential bottleneck
RAC
Alternatives

Single Instance Databases
 No RAC overhead
 Simpler to install / configure / manage
 Single point of failure

Oracle Products
 Oracle Streams
 Oracle Clusterware

Proprietary Clustering Solutions
 HP ServiceGuard
 IBM HA/CMP
 Sun Cluster
RAC
The Reality
RAC
The Reality
Many sites running RAC
 Mostly Oracle 10.2
 A few still running Oracle 10.1
 Still some Oracle 9.2
Most RAC users develop their own applications or use bespoke applications developed by a third party
Probably around 20 extended clusters in production across Europe
Many Oracle 10.2 sites run ASM
 Very few run OCFS or raw devices
 Very few use third-party cluster file systems
Most sites using SAN - fewer using NAS
In the UK most users currently deploy on Linux x86-64
 Solaris very popular in other regions
RAC
The Reality

Few Oracle 10g users run vendor clusterware

Most RAC deployments for availability
 Decreased unplanned downtime
 Increased planned downtime

Increasing number of deployments for scalability
 Workload balancing
 Services

Manageability benefits very doubtful
 Economies of Scale versus Additional complexity

TCO reductions possible in some circumstances
 Replace large SMP boxes
 Replace legacy active-passive clusters
RAC
The Reality

Most users run 2-node clusters
 Some have 3-node or 4-node clusters
 A handful run five nodes or more

Most users only have one database per cluster
 Few grids

Oracle Clusterware scales well
 Number of nodes does not impact performance

Oracle RAC databases might scale well
 Dependent on application
 Additional nodes may improve or degrade performance
RAC
The Reality

ASM currently the most popular RAC storage technology

Deployed in numerous Oracle 10.2 RAC production systems

No operating system utilities
 ASMCMD in Oracle 10.2 and above

Generally disliked by storage administrators
 Gives too much control to DBAs

Acceptable performance
 ASM instance provides metadata
 RDBMS instances read and write blocks directly from files
Thank you for your interest

References
 http://www.juliandyke.com/References/References.html

Questions
 [email protected]