A Rough Guide to RAC
Julian Dyke
Independent Consultant
Web Version
© 2005 Julian Dyke
juliandyke.com
Agenda
Introduction
Availability
Scalability
Manageability
Total Cost of Ownership
Conclusion
Introduction
Some RAC Terminology
OCRDUMP
RAC
CSS
CRSCTL
SRVCTL
GCS
LMD
CLUVFY
OCFS2
LMS
PI
OCR
OIFCFG
LCK
OCRCHECK
VIP
OCSSD
CRSD
GRD
DIAG
VIPCA
CRS
FAN
EVMD
ONS
LMON
BAST
OCFS
AST
OCRCONFIG
GES
ASM
TAF
LKDEBUG
FCF
CRS_STAT
GSD
What is RAC?
Multiple instances running on separate servers (nodes)
Single database on shared storage accessible to all nodes
Instances exchange information over an interconnect network
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, linked by the interconnect; each node has a local disk and both nodes access shared storage]
Instances versus Databases
A RAC cluster includes
one database
one or more instances
A database is a set of files
Located on shared storage
Contains all persistent resources
An instance is a set of memory structures and processes
Contains all transient resources
Can be started and stopped independently
Instances versus Databases
[Diagram: Instances 1 to 4 run on Nodes 1 to 4, one instance per node; the nodes are joined by a private network (interconnect) and a public network, and reach a single database over a storage network]
What is a RAC Database?
Located on shared storage accessible by all instances
Includes
Control Files
Data Files
Online Redo Logs
Server Parameter File
May optionally include
Archived Redo Logs
Backups
Flashback Logs (Oracle 10.1 and above)
Block change tracking file (Oracle 10.1 and above)
What is a RAC Database?
Contents similar to single instance database except
One redo thread per instance
ALTER DATABASE ADD LOGFILE THREAD 2
GROUP 3 SIZE 51200K,
GROUP 4 SIZE 51200K;
ALTER DATABASE ENABLE PUBLIC THREAD 2;
If using Automatic Undo Management, one UNDO tablespace is
also required per instance
CREATE UNDO TABLESPACE "UNDOTBS2"
DATAFILE SIZE 25600K AUTOEXTEND ON
MAXSIZE UNLIMITED EXTENT MANAGEMENT
LOCAL;
Additional dynamic performance views (V$, GV$ but not X$)
created by $ORACLE_HOME/rdbms/admin/catclust.sql
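The per-instance requirements above (one redo thread and, with Automatic Undo Management, one undo tablespace per instance) can be sketched as a small Python helper that builds the statements for a given instance number. This is only an illustration: the helper name `ddl_for_instance`, the convention of two redo log groups per thread numbered consecutively, and the default sizes are assumptions taken from the thread 2 example, not an Oracle rule.

```python
def ddl_for_instance(n, log_size_kb=51200, undo_size_kb=25600):
    """Return the DDL needed before instance number n can be started.

    Hypothetical helper. Assumes two redo log groups per thread,
    numbered 2n-1 and 2n, which matches the thread 2 example
    (groups 3 and 4).
    """
    first_group = 2 * n - 1
    return [
        # One redo thread per instance
        f"ALTER DATABASE ADD LOGFILE THREAD {n} "
        f"GROUP {first_group} SIZE {log_size_kb}K, "
        f"GROUP {first_group + 1} SIZE {log_size_kb}K",
        f"ALTER DATABASE ENABLE PUBLIC THREAD {n}",
        # One undo tablespace per instance (Automatic Undo Management)
        f'CREATE UNDO TABLESPACE "UNDOTBS{n}" '
        f"DATAFILE SIZE {undo_size_kb}K AUTOEXTEND ON "
        f"MAXSIZE UNLIMITED EXTENT MANAGEMENT LOCAL",
    ]


if __name__ == "__main__":
    for statement in ddl_for_instance(3):
        print(statement + ";")
```

The generated text should be reviewed before being run in SQL*Plus; group numbers in particular must not clash with groups that already exist.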
What is the Interconnect?
Instances communicate with each other over the interconnect
(network)
Information transferred between instances includes
data blocks
locks
SCNs
Typically Gigabit Ethernet (1 Gb/s)
UDP protocol
Often teamed in pairs to avoid SPOFs
Can also use InfiniBand
Fewer levels in stack
Other proprietary protocols are available
Why Use Shared Storage?
Mandatory for
Database files
Control files
Online redo logs
Server Parameter file (if used)
Optional for
Archived redo logs (recommended)
Executables (Binaries)
Password files
Parameter files
Network configuration files
Administrative directories
Alert Log
Dump Files
What Shared Storage is Supported?
Oracle supplied options
Oracle Cluster File System (OCFS)
Version 1
Windows and Linux
Supports database and archived redo logs
No executables
Version 2 - August 2005
Linux, Windows and Solaris
As OCFS1 plus executables
Automatic Storage Management (ASM)
Oracle 10.1 and above
More transparent in Oracle 10.2 and above
Both require underlying SAN or NAS
Do not require LVM
What Shared Storage is Supported?
Can use (continued)
Network Attached Storage
NFS-based
Potentially lower cost - no fibre channel required
Easy to administer
Raw devices
Difficult to administer
Cannot be used with archived redo logs
Third-party Cluster File System
Still a popular choice with many sites
Others (not supported)
Firewire - maximum two nodes - recommended in 10g
NBD - Network Block Devices - Solaris and Linux
NFS - not supported, but might still work
What is a Shared Oracle Home?
Can install multiple copies of Oracle executables on local
disks on each node
Can also install Shared Oracle Home
single copy of Oracle executables on shared storage
Oracle 9.2
Only Oracle database software
Oracle 10.1
Cluster Ready Services (CRS)
Oracle database software + ASM
Oracle 10.2
Oracle Clusterware (CRS)
ASM
Oracle database software
Internal Structures and Services
Global Resource Directory (GRD)
Records current state and owner of each resource
Contains convert and write queues
Distributed across all instances in cluster
Global Cache Services (GCS)
Implements cache coherency for database
Coordinates access to database blocks for instances
Maintains GRD
Global Enqueue Services (GES)
Controls access to other resources (locks) including
library cache
dictionary cache
Background Processes
Each RAC instance has a set of standard background
processes e.g.
PMON
SMON
LGWR
DBWn
ARCn
RAC instances use additional background processes to
support GCS and GES including
LMON
LCK0
LMDn
LMSn
DIAG
Portability
Most single-instance applications should port to RAC
Some exceptions
Application must scale well on single instance
Can be difficult to evaluate
Some features do not work e.g.
DBMS_ALERT
DBMS_PIPE
External inputs/outputs may need modification
Flat files etc
Some RAC features require additional coding
TAF
Code may need upgrading to use RAC functionality e.g.
FCF requires JDBC Implicit Connection Cache
Why Do Users Deploy RAC?
Users may deploy RAC to achieve
Increased availability
Increased scalability
Improved manageability
Reduced total cost of ownership
Why Do DBAs Deploy RAC?
DBAs may want to deploy RAC because:
Realistic next step for experienced Oracle DBAs
Intellectual challenge
Job protection - ties organisation to Oracle technology
Possible improved earnings
It looks good on their CV
Availability
What is Failover?
If one node or instance fails
Node detecting failure will
Read redo log of failed instance from last checkpoint
Apply redo to datafiles including undo segments (roll
forward)
Rollback uncommitted transactions
Cluster is frozen during part of this process
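The recovery sequence above (read redo from the last checkpoint, roll forward, then roll back uncommitted transactions) can be illustrated with a toy model. This is a sketch for intuition only, not Oracle's recovery algorithm: the record layout, the function name `recover` and the use of a dictionary as a "datafile" are all assumptions made for the example.

```python
def recover(datafile, redo):
    """Toy instance recovery: roll forward all redo, then roll back
    changes made by transactions that never committed.

    datafile: dict of block -> value as of the last checkpoint
    redo:     list of ("change", txn, block, new_value) or ("commit", txn)
    """
    blocks = dict(datafile)
    undo = []          # (txn, block, before_image) in apply order
    committed = set()

    # Roll forward: apply every change since the last checkpoint,
    # remembering the before image so it can be undone later.
    for record in redo:
        if record[0] == "change":
            _, txn, block, value = record
            undo.append((txn, block, blocks.get(block)))
            blocks[block] = value
        elif record[0] == "commit":
            committed.add(record[1])

    # Roll back: restore before images of uncommitted work, newest first.
    for txn, block, before in reversed(undo):
        if txn not in committed:
            if before is None:
                blocks.pop(block, None)
            else:
                blocks[block] = before
    return blocks


if __name__ == "__main__":
    redo_log = [
        ("change", "T1", "A", 10),
        ("change", "T2", "B", 20),
        ("commit", "T1"),
    ]
    print(recover({"A": 1, "B": 2}, redo_log))
```

In the example, transaction T1 committed before the failure so its change survives, while the change made by T2 is undone.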
[Diagram: Instance 1 on Node 1 and Instance 2 on Node 2, linked by the interconnect]
What are Database Services?
Database Services are logical groups of sessions
Can be configured using
DBCA
Enterprise Manager (10.2 and above)
Can also be configured using
SRVCTL (Oracle Cluster Registry only)
SQL*Plus (Data Dictionary only)
Text editor (Network Configuration)
In Oracle 10.1 and above, each service has
Preferred Nodes (used by default)
Available Nodes (used if preferred node fails)
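The preferred / available distinction can be sketched as a simple placement rule. The function below is a hypothetical illustration of the idea on this slide, not how Oracle Clusterware actually decides placement; the name `place_service` and the node names are assumptions.

```python
def place_service(preferred, available, running):
    """Return the nodes a service would run on.

    preferred: nodes used by default
    available: spare nodes, used only when a preferred node is down
    running:   set of nodes that are currently up
    """
    up = [node for node in preferred if node in running]
    missing = len(preferred) - len(up)
    # Replace each failed preferred node with a running available node.
    spares = [node for node in available if node in running]
    return up + spares[:missing]


if __name__ == "__main__":
    print(place_service(["node1", "node2"], ["node3"],
                        {"node1", "node3"}))
```

With all nodes up the service stays on its preferred nodes; the available node is only used once a preferred node has failed.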
What are Database Services?
Can be used with Resource Manager to control resource
usage e.g.
CPU
Parallel execution
Can be used for monitoring
V$SERVICE_STATS
Can be used for diagnostics
DBMS_MONITOR
trace
statistics
What is Oracle Clusterware?
Introduced in Oracle 10.1 (Cluster Ready Services - CRS)
Renamed in Oracle 10.2 to Oracle Clusterware
Cluster Manager providing
Node membership services
Global resource management
High availability functions
On Linux
Configured in /etc/inittab
Implemented using three daemons
CRS - Cluster Ready Service
CSS - Cluster Synchronization Service
EVM - Event Manager
In Oracle 10.2 includes High Availability framework
Allows non-Oracle applications to be managed
What is the OCR?
Oracle Cluster Registry (OCR)
Configuration information for Oracle Clusterware / CRS
Introduced in Oracle 10.1
Replaced Server Management (SRVM) disk/file
Similar to Windows Registry
Located on shared storage
In Oracle 10.2 and above can be mirrored
Maximum two copies
What is the OCR?
Defines cluster resources including:
Databases
Instances
RDBMS
ASM
Services
Node Applications
VIP
ONS
GSD
Listener Process
What is a Voting Disk?
Known as Quorum Disk / File in Oracle 9i
Located on shared storage accessible to all instances
Used to determine RAC instance membership
In the event of node failure voting disk is used to determine
which instance takes control of cluster
Avoids split brain
In Oracle 10.2 and above can be mirrored
Odd number of copies (1, 3, 5 etc)
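The preference for an odd number of voting disk copies follows from majority arithmetic: a node needs access to more than half of the copies, so an extra even-numbered copy adds no fault tolerance. A minimal sketch, with hypothetical function names:

```python
def votes_needed(copies):
    """Strict majority of the configured voting disk copies."""
    return copies // 2 + 1


def failures_tolerated(copies):
    """How many copies can be lost while a majority is still reachable."""
    return copies - votes_needed(copies)


if __name__ == "__main__":
    for copies in range(1, 6):
        print(copies, votes_needed(copies), failures_tolerated(copies))
```

Two copies tolerate no more failures than one, and four no more than three, which is why 1, 3 or 5 copies are the sensible choices.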
What is VIP?
Node application introduced in Oracle 10.1
Allows Virtual IP address to be defined for each node
All applications connect using Virtual IP addresses
If node fails Virtual IP address is automatically relocated to
another node
Only applies to newly connecting sessions
What is TAF?
TAF is Transparent Application Failover
Sessions connected to a failed instance will be terminated
Uncommitted transactions will be rolled back
Sessions can be reconnected to another instance
automatically if using TAF
Can optionally re-execute in-progress SELECT statements
Statement re-executed with same SCN
Fetches resume at point of failure
Session state is lost including
Session parameters
Package variables
Class and ADT instantiations
What is TAF?
TAF is Transparent Application Failover
Requires additional coding in client
Requires configuration in TNSNAMES.ORA
RAC_FAILOVER =
(DESCRIPTION =
(ADDRESS_LIST =
(FAILOVER = ON)
(ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
)
(CONNECT_DATA =
(SERVICE_NAME = RAC)
(SERVER = DEDICATED)
(FAILOVER_MODE =(TYPE=SELECT)(METHOD=BASIC)(RETRIES=30)(DELAY=5))
)
)
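The RETRIES and DELAY settings in FAILOVER_MODE describe a bounded reconnect loop: up to RETRIES connection attempts, waiting DELAY seconds between them. TAF itself is handled inside the Oracle client libraries, so the sketch below only illustrates those semantics; the function name `reconnect` and the `connect` callable are hypothetical stand-ins.

```python
import time


def reconnect(connect, retries=30, delay=5, sleep=time.sleep):
    """Try to reconnect up to `retries` times, pausing `delay` seconds
    between attempts (mirrors RETRIES=30, DELAY=5 in the entry above).

    connect: hypothetical callable that returns a session or raises
             ConnectionError while no instance is reachable.
    """
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt < retries:
                sleep(delay)
    raise ConnectionError(f"gave up after {retries} attempts")
```

With the values shown in the TNSNAMES.ORA entry, a client would wait for roughly two and a half minutes in total before giving up.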
What is FAN?
Fast Application Notification (FAN)
Introduced in Oracle 10.1
Method by which applications can be informed of changes in
cluster status
Handle node failures
Workload balancing
Applications must connect using services
Can be notified using
Server side callouts
Fast Connection Failover (FCF)
ONS API
What is ONS?
Oracle Notification Service (ONS)
Introduced in Oracle 10.1
Allows out-of-band messages to be sent to
Nodes in cluster
Middle-tier application servers
Clients
Underlying mechanism for Fast Application Notification (FAN)
Does RAC Increase Availability?
Depends on definition of availability
May achieve less unplanned downtime
May have more time to respond to failures
Instance failover means any node can fail without total loss of
service
Must provide overcapacity in cluster to survive failover
Additional Oracle and RAC licenses
Load can be distributed over all running nodes
Can use Grid to provision additional nodes
Does RAC Increase Availability?
Can still get data corruptions
Human errors / software errors
Only one logical copy of data
Only one logical copy of application / Oracle software
Lots of possibility for human errors
Power / network cabling / storage configuration
Upgrades and patches are more complex
Can upgrade software on subset of nodes
If database is affected then still need downtime
Scalability
What is Scalability?
RAC overhead means that linear scalability is difficult to
achieve
Global Cache Services (blocks)
Global Enqueue Services (locks)
As number of instances increases, probability that instance is
a resource master decreases
Scaling factor of 1.8 is considered good
Dependent on application design and implementation
Scaling factor improves with
Node affinity
Elimination of contention
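A scaling factor can be computed as the ratio of cluster throughput to single-instance throughput. The sketch below assumes the figure of 1.8 refers to a two-instance cluster, which the slide does not state explicitly; the function names and the throughput numbers are illustrative only.

```python
def scaling_factor(single_instance_throughput, cluster_throughput):
    """Throughput of the cluster relative to one instance."""
    return cluster_throughput / single_instance_throughput


def scaling_efficiency(factor, instances):
    """Fraction of ideal (linear) scaling actually achieved."""
    return factor / instances


if __name__ == "__main__":
    # Hypothetical measurements: 1000 tps on one instance, 1800 tps on two.
    factor = scaling_factor(1000, 1800)
    print(factor, scaling_efficiency(factor, 2))
```

Under that assumption, a factor of 1.8 on two instances means about 90% of linear scaling, with the remainder lost to GCS and GES overhead.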
What is Scalability?
Scalability is the relationship between increments of
resources and workloads
Can be any resource but with RAC normally refers to adding
instances
Scalability can be
linear - optimal but rare
non-linear - suboptimal but normal
[Charts: workload plotted against resource, one linear and one non-linear]
What is Workload Balancing?
Balancing of workload across available instances
Can have
Client-side connection balancing
Server-side connection balancing
Client-side connection balancing
Workload distributed randomly across nodes
RAC =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
(LOAD_BALANCE = ON)
(FAILOVER = ON)
(CONNECT_DATA =
(SERVICE_NAME = RAC)
(FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC))
)
)
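The effect of LOAD_BALANCE and FAILOVER in the entry above can be sketched as follows: the client picks addresses in random order and, with failover on, moves to the next address if one is unreachable. This is a simplified illustration of the behaviour, not Oracle Net code; `address_order`, `connect` and `try_connect` are hypothetical names.

```python
import random


def address_order(addresses, load_balance=True, rng=random):
    """Order in which the client tries the listed addresses."""
    order = list(addresses)
    if load_balance:
        rng.shuffle(order)      # LOAD_BALANCE = ON: random order
    return order


def connect(addresses, try_connect, load_balance=True, failover=True,
            rng=random):
    """Return the first address that accepts a connection, or None.

    try_connect: hypothetical callable, True if the address is reachable.
    """
    order = address_order(addresses, load_balance, rng)
    if not failover:
        order = order[:1]       # FAILOVER = OFF: only one attempt
    for address in order:
        if try_connect(address):
            return address
    return None
```

Because the choice is random rather than load-aware, client-side balancing spreads connections evenly only on average; it takes no account of how busy each node is.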
What is Workload Balancing?
Server-side connection balancing
Dependent on current workload on each node
PMON monitors workload and updates listeners
Behaviour depends on whether connections are long-lived or short-lived
In Oracle 10.1
Set PREFER_LEAST_LOADED_NODE in listener.ora
OFF for long connections
ON for short connections (default)
In Oracle 10.2
Can specify load balancing goal for each service
NONE, SERVICE_TIME or THROUGHPUT
Can also specify connection load balancing goal
SHORT or LONG
Increasing Scalability
If application scales well on a single instance then it should
scale well on RAC
Eliminate contention
Use sequences
Use locally partitioned tables and indexes
Attempt to achieve node affinity
Avoid contention for single blocks
Distribute rows for hot blocks
Small block size e.g. 2048 or 4096
ALTER TABLE MINIMIZE RECORDS PER BLOCK
High PCTFREE / Low PCTUSED
Filler columns e.g. CHAR (2000)
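The techniques for distributing rows across hot blocks all reduce the number of rows that fit in one block. A rough back-of-the-envelope sketch follows; it deliberately ignores block header and per-row overhead, so the results are approximations, and the row sizes used are made-up examples.

```python
def approx_rows_per_block(block_size, row_size, pctfree=10):
    """Approximate rows per block, ignoring block and row overhead.

    block_size and row_size are in bytes; pctfree is the percentage of
    the block reserved for updates.
    """
    usable = block_size * (100 - pctfree) // 100
    return usable // row_size


if __name__ == "__main__":
    print(approx_rows_per_block(8192, 100))    # typical small rows
    print(approx_rows_per_block(2048, 100))    # smaller block size
    print(approx_rows_per_block(8192, 2100))   # row padded by CHAR(2000)
    print(approx_rows_per_block(8192, 100, pctfree=90))  # high PCTFREE
```

Fewer rows per block means fewer sessions competing for the same block, at the cost of more blocks to store and cache.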
Increasing Scalability
Use Automatic Segment Space Management
Default in Oracle 10.2
Use larger block size for read-only objects
Reduce number of GCS messages required
Minimize lock usage
Eliminate unnecessary parsing
Increase size of shared pool
Bind variables
Cursor sharing
Use optimistic locking
Eliminate unnecessary SELECT FOR UPDATE statements
Manageability
Server Parameter File
Introduced in Oracle 9.0.1
Must reside on shared storage
Shared by all RAC instances
Binary (not text) files
Parameters can be changed using ALTER SYSTEM
Can be backed up using the Recovery Manager (RMAN)
Created using
CREATE SPFILE [ = 'SPFILE_NAME' ]
FROM PFILE [ = 'PFILE_NAME' ];
init.ora file on each node must contain SPFILE parameter
SPFILE = <pathname>
Parameters
RAC uses same parameters as single-instance
Some must be different on each instance
Some must be same on each instance
Can be global or local
[*.]<parameter_name> = <value>
<sid>.<parameter_name> = <value>
Must be set using ALTER SYSTEM statement
ALTER SYSTEM SET parameter = value
[ SCOPE = MEMORY | SPFILE | BOTH ]
[ SID = <sid>]
ALTER SYSTEM RESET parameter
[ SCOPE = MEMORY | SPFILE | BOTH ]
[ SID = <sid>]
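The global and local forms can be pictured as a lookup in which an entry prefixed with an instance's SID takes precedence over the `*.` entry for the same parameter. The sketch below models the server parameter file as a plain dictionary, which is an assumption for illustration (the real file is binary); the function name and sample values are hypothetical.

```python
def effective_value(entries, sid, name):
    """Value of parameter `name` as seen by instance `sid`.

    entries: dict keyed by "<sid>.<parameter>" or "*.<parameter>"
    A SID-specific entry overrides the global "*." entry.
    """
    specific = entries.get(f"{sid}.{name}")
    if specific is not None:
        return specific
    return entries.get(f"*.{name}")


if __name__ == "__main__":
    spfile = {
        "*.undo_management": "AUTO",
        "*.open_cursors": "300",
        "RAC1.undo_tablespace": "UNDOTBS1",
        "RAC2.undo_tablespace": "UNDOTBS2",
        "RAC2.open_cursors": "500",
    }
    for sid in ("RAC1", "RAC2"):
        print(sid, effective_value(spfile, sid, "undo_tablespace"),
              effective_value(spfile, sid, "open_cursors"))
```

In the sample data RAC1 inherits the global `open_cursors` value while RAC2 uses its own, and each instance sees a different `undo_tablespace`.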
Parameters
Some parameters must be same on each instance including *:
ACTIVE_INSTANCE_COUNT
ARCHIVE_LAG_TARGET
CLUSTER_DATABASE
CONTROL_FILES
DB_BLOCK_SIZE
DB_DOMAIN
DB_FILES
DB_NAME
DB_RECOVERY_FILE_DEST
DB_RECOVERY_FILE_DEST_SIZE
DB_UNIQUE_NAME
MAX_COMMIT_PROPAGATION_DELAY
TRACE_ENABLED
UNDO_MANAGEMENT
* Correct for Oracle 10.1
Parameters
Some parameters, if used, must be different on each instance
including
THREAD
INSTANCE_NUMBER
INSTANCE_NAME
UNDO_TABLESPACE
ROLLBACK_SEGMENTS
DML_LOCKS must be identical on each instance if set to zero
DBCA
Can be used to
Create RAC database and instances
Create ASM instance
Manage ASM instance (10.2)
Add RAC instances
Create RAC database templates
structure only
with data
Create clone RAC database (10.2)
Create, Manage and Drop Services
Drop instances and database
What is SRVCTL?
Utility used to manage cluster database
Configured in Oracle Cluster Registry (OCR)
Controls
Database
Instance
ASM
Listener
Node Applications
Services
Options include
Start / Stop
Enable / Disable
Add / Delete
Show current configuration
Show current status
SRVCTL - Examples
Starting and Stopping a Database
srvctl start database -d RAC
srvctl stop database -d RAC
Starting and Stopping an Instance
srvctl start instance -d RAC -i RAC1
srvctl stop instance -d RAC -i RAC1
Starting and Stopping a Service
srvctl start service -d RAC -s SERVICE1
srvctl stop service -d RAC -s SERVICE1
Starting and Stopping ASM on a specified node
srvctl start asm -n node1
srvctl stop asm -n node1
Enterprise Manager
In Oracle 10.1 and above
Database Control
Installed by DBCA
Controls single cluster
Grid Control
Uses separate repository
Oracle 10.2 version available
Requires Oracle 10.1 database
Fully supports RAC in both versions
Except
Oracle 10.1 cannot create / delete services
Oracle 10.2 better interconnect performance monitoring
What is CLUVFY?
Introduced in Oracle 10.2
Supplied with Oracle Clusterware
Can be downloaded from OTN (Linux and Windows)
Written in Java - requires JRE (supplied)
Also works with 10.1 (specify -10gR1 option)
Checks cluster configuration
stages - verifies all steps for specified stage have been
completed
components - verifies specified component has been
correctly installed
CLUVFY
Stages include
-post hwos
post-check for hardware and operating system
-pre cfs
pre-check for CFS setup
-post cfs
post-check for CFS setup
-pre crsinst
pre-check for Oracle Clusterware installation
-post crsinst
post-check for Oracle Clusterware installation
-pre dbinst
pre-check for database installation
-pre dbcfg
pre-check for database configuration
CLUVFY
Components include
nodereach
Checks reachability between nodes
nodecon
Checks node connectivity
cfs
Checks CFS integrity
ssa
Checks shared storage accessibility
space
Checks space availability
sys
Checks minimum system requirements
clu
Checks cluster integrity
clumgr
Checks cluster manager integrity
ocr
Checks OCR integrity
crs
Checks Oracle Clusterware (CRS) integrity
nodeapp
Checks node applications exist
admprv
Checks administrative privileges
peer
Compares properties with peers
CLUVFY
For example, to check configuration before installing Oracle
Clusterware on node1 and node2 use:
sh runcluvfy.sh stage -pre crsinst -n node1,node2
Checks:
node reachability
user equivalence
administrative privileges
node connectivity
shared storage accessibility
If any checks fail append -verbose to display more information
Other Utilities
Additional RAC utilities and diagnostics include
OCRCONFIG
OCRCHECK
OCRDUMP
CRSCTL
CRS_STAT
Additional RAC diagnostics can be obtained using
ORADEBUG utility
DUMP option
LKDEBUG option
Events
Does RAC Improve Manageability?
Advantages
Fewer databases to manage
Easier to monitor
Easier to upgrade
Easier to control resource allocation
Resources can be shared between applications
Disadvantages
Upgrades potentially more complex
Downtime may affect more applications
Requires more experienced operational staff
Higher cost / harder to replace
Total Cost of Ownership
Reduction in TCO?
Possible for sites with legacy systems
Mainframes / Minicomputers
Applications / Packages
RAC option adds 50% to licence costs except for
Users with site licences
Standard edition (10.1+, max 4 CPU with ASM)
Retrain existing staff or use dedicated staff
Consolidation may bring economies of scale
Monitoring
Backups
Disaster Recovery
Reduction in TCO?
Additional resources required
Redundant hardware
Nodes
Network switches
SAN fabric
Hardware e.g. fibre channel cards
Reduction in hardware support costs
May not require 24 hour support
Viable to hold stock of spare components
What are the Alternatives to RAC?
Data Guard
Physical Standby
Introduced in Oracle 7.3.4
Stable, well proven technology
Requires redundant hardware
Implemented by many sites
Can be used with RAC
Logical Standby
Introduced in Oracle 9.2
Still not widely adopted
Streams
Introduced in Oracle 9.2
Implemented by increasing number of sites
Advanced Replication
What are the Alternatives to RAC?
Symmetric Multiprocessing (SMP) Systems
Single Point of Failure
Simplified configuration
Eliminate RAC overhead
Parallel systems
For systems with deterministic input
Messaging
Data Warehouses
Other Clustering Technologies
SAN
Operating System
etc
Conclusion
Success of RAC deployments dependent on
Application design and implementation
Failover requirements
IT infrastructure
Flexibility and commitment of IT department(s)
Before deploying RAC
Investigate and reject alternatives
Perform proof of concept
Test application
Evaluate benefits and costs
Learn RAC concepts and administration
Buy a good book :)
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:
[email protected]
My website address is:
www.juliandyke.com