Clustering Types of Clustering

Objectives




At the end of this module the student will understand the following tasks and concepts.
What clustering is and why you would want it
Clustering options
Differences between various types of clustering; advantages and disadvantages
Factors to consider when choosing a cluster type
What is a cluster?

My definition


Multiple systems performing a single function
Black box
Why Cluster?



Performance
Availability
Recoverability
Features

Scalability

Speedup: faster response times; transactions finish faster
Scaleup: more work done; more capacity, more concurrent transactions
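The slide describes speedup and scaleup only qualitatively. As a rough sketch, the two can be put into numbers using their conventional definitions as ratios; the function and parameter names below are illustrative assumptions, not taken from the slides.

```python
def speedup(time_single_node: float, time_cluster: float) -> float:
    """Speedup: how many times faster the same workload finishes on the cluster."""
    return time_single_node / time_cluster


def scaleup(work_single_node: float, work_cluster: float) -> float:
    """Scaleup: how many times more work the cluster completes in the same time."""
    return work_cluster / work_single_node
```

For example, a batch job that takes 120 seconds on one node and 40 seconds on the cluster has a speedup of 3; a system that handled 1000 transactions per hour on one node and 3500 on the cluster has a scaleup of 3.5.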
Single Node Scaling



Scales to multiple CPUs
Doesn’t scale beyond one node
Multiple single points of failure
[Diagram: Users, Server, Database]
Cluster Definitions




Shared Nothing (Federated)
Replicated Site
Shared Disk
Failover



Active/Passive
Active/Active
Shared Everything
Shared Nothing Cluster




Only one CPU is connected to a disk
May have shared memory
MPP Systems are Shared Nothing
Other vendors have “Shared Nothing” clusters
Federated (Shared Nothing) Cluster

Distributed database (separate database on each machine)
Data is spread across nodes; each machine has part of the data
Function is spread across nodes
[Diagram: Two-Phase Commit between two Server/Database nodes, in three numbered steps with the messages “Got it?”, “Got it!”, and “Good!”]
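Because each node in a federated cluster holds a separate database, a transaction that touches several nodes needs a two-phase commit. The following is a minimal in-memory sketch of that protocol, not any vendor's implementation; the `Participant` class and `two_phase_commit` function are hypothetical names introduced for illustration.

```python
class Participant:
    """One node of a federated cluster holding its own separate database."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: set False to simulate a "no" vote
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local part of the transaction can commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every node committed, False if the transaction was rolled back."""
    # Phase 1 (prepare): every node must vote yes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (commit): the coordinator tells every node to commit.
        for p in participants:
            p.commit()
        return True
    # Any "no" vote rolls the transaction back on every node.
    for p in participants:
        p.rollback()
    return False
```

The point of the design is that no node commits until all nodes have confirmed they can, so the separate databases never end up with only part of a transaction applied.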
Replicated System




Data replicated at the server (network) level or at the storage (SAN) level
Multiple copies of the same database
Most common implementation is Active/Passive
Failover between nodes
[Diagram: Active Node (Server, Database) replicating to Passive Node (Server, Database) via server-level or storage-level replication]
Shared Disk Cluster





Shared file system
Multiple systems attached to the same disk
All nodes must have access to data
Only one database instance; only one node has “ownership” of the shared disk
Synchronization between systems; if one node fails, then the other takes over
Cluster Interconnect

Most Shared Disk clusters require some form of Cluster Interconnect
  Network – e.g. Gigabit Ethernet
  Specialized – e.g. InfiniBand, Myrinet
  Multiple nodes require a switch
  Usually separated from the LAN
Most clusters implement a “heartbeat” between cluster nodes to monitor node health
  Some shared disk clusters implement a “heartbeat” mechanism to a quorum disk via the SAN in addition to/instead of network heartbeat
Oracle RAC implements Cache Fusion across the interconnect
  Extra network traffic increases the throughput requirements
  UDP implementation requires a separate network
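The heartbeat idea can be sketched as a table of last-seen timestamps with a timeout. This is a simplified illustration only: the `HeartbeatMonitor` class, its method names, and the timeout value are assumptions, and real cluster software sends heartbeats over the interconnect or a quorum disk rather than through method calls.

```python
class HeartbeatMonitor:
    """Tracks when each cluster node last reported in."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node name -> timestamp of the last heartbeat

    def record_heartbeat(self, node, now):
        # Called whenever a heartbeat arrives from a node.
        self.last_seen[node] = now

    def failed_nodes(self, now):
        # A node is presumed failed once it has been silent longer than the timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

Timestamps are passed in explicitly so the behavior is easy to follow: with a 5 second timeout, a node last heard from at time 0 is reported as failed when checked at time 6.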
Failover Cluster




One system is a standby system for
another
Only one system doing work at a time
Pseudo-Shared Disk
Limited scalability in active/passive mode
Failover Clustering
Fault tolerant systems; highly available
Basic failover clusters don’t scale beyond two nodes
[Diagram: Users connected to two Servers, each with a Database]
Active/Passive vs. Active/Active


Both are failover only
Active/Passive



One node is active
The other is passive until failover
Active/Active





Still uses active/passive technology
2 separate databases
One is active on node A and passive on node B
The second database is active on node B and passive on node A
Separate applications and user connections to each of the different databases
Active/Passive
[Diagram: Node A and Node B]
Node A is active
Node B is passive until/unless Node A fails
Only one Oracle license is required
Active/Passive
[Diagram: Node A marked X (failed), Node B]
If Node A fails …
Active/Passive
[Diagram: Node A marked X (failed), Node B]
Node B becomes active
Node A is dead (definitely passive!) until repaired and then “failed back” if necessary.
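The failover and optional failback sequence on these three slides can be traced with a small state sketch. The `ActivePassivePair` class and its method names are hypothetical, chosen only to mirror the slide wording; they do not correspond to any particular cluster product.

```python
class ActivePassivePair:
    """Two-node active/passive cluster: one node works, the other waits."""

    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby
        self.active = primary  # the primary starts out active
        self.failed = set()

    def node_failed(self, node):
        self.failed.add(node)
        if node == self.active:
            # Failover: the surviving node becomes active.
            survivor = self.standby if node == self.primary else self.primary
            if survivor in self.failed:
                raise RuntimeError("no surviving node to fail over to")
            self.active = survivor

    def node_repaired(self, node, fail_back=False):
        self.failed.discard(node)
        # Failback is optional: only move work back to the primary if requested.
        if fail_back and node == self.primary:
            self.active = self.primary
```

With `pair = ActivePassivePair("A", "B")`, calling `pair.node_failed("A")` makes Node B active. A later `pair.node_repaired("A")` leaves Node B active, and `pair.node_repaired("A", fail_back=True)` returns the work to Node A.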
Active/Active
[Diagram: Node A runs Application A for User Group A and is the passive failover for B; Node B runs Application B for User Group B and is the passive failover for A]
Application Group A and User Group A are active on Node A
Application Group B and User Group B are active on Node B
Each node serves as failover for the other.
2 separate databases. Both nodes are not accessing the same data at the same time.
Oracle license required on each node
Switchover vs. Failover




Many cluster systems utilize the concept of Service Groups
Service Groups allow granular control of individual software packages (e.g. individual Oracle instances)
An individual group can be manually moved to another server without affecting other service groups – a “switchover” versus a “failover”
Adds greater management flexibility
N-to-1 Failover Configuration
[Diagram: Nodes A, B, and C each run three applications with their user groups; Node C fails (X) and its Applications G, H, and I fail over to Node D, with a failback to the repaired node afterwards]
Node D is a dedicated failover node for failures on Node A, B, and C
Extends number of active nodes
A problem is that once the failed node is available, the Service Groups on Node D (failover node) must fail back to the original server to restore High Availability
N + 1 Failover Configuration
[Diagram: Nodes A, B, and C each run three applications with their user groups; Node C fails (X) and its Applications G, H, and I fail over to Node D]
Node D is a dedicated failover node for failures on Node A, B, and C
Extends number of active nodes
Once Node C is restored, it becomes the failover node, leaving Node D in production.
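The difference between the N-to-1 and N + 1 configurations comes down to what happens when the failed node is repaired. The sketch below models both with one hypothetical class, `SpareNodeCluster`, where the `rotate_spare` flag is an assumption introduced here to switch between the two behaviors; it handles a single failure at a time.

```python
class SpareNodeCluster:
    """Failover cluster with one dedicated spare node.

    rotate_spare=False models N-to-1: groups must fail back to their original node.
    rotate_spare=True models N + 1: the repaired node becomes the new spare.
    """

    def __init__(self, groups_by_node, spare, rotate_spare):
        # Copy the layout so the caller's dictionary is not modified.
        self.groups_by_node = {n: list(g) for n, g in groups_by_node.items()}
        self.groups_by_node[spare] = []
        self.spare = spare
        self.rotate_spare = rotate_spare
        self.failed_node = None

    def fail(self, node):
        # Move every service group from the failed node to the spare.
        self.groups_by_node[self.spare] = self.groups_by_node[node]
        self.groups_by_node[node] = []
        self.failed_node = node

    def repair(self):
        node = self.failed_node  # the node that has just been repaired
        if self.rotate_spare:
            # N + 1: no failback; the repaired node becomes the spare.
            self.spare = node
        else:
            # N-to-1: fail back so the dedicated spare is free again.
            self.groups_by_node[node] = self.groups_by_node[self.spare]
            self.groups_by_node[self.spare] = []
        self.failed_node = None
```

In the N-to-1 case the service groups move twice (to Node D and back), while in the N + 1 case they move once and the spare role rotates to the repaired node.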
N-to-N Failover Configuration
[Diagram: Nodes A, B, C, and D each run three applications (A to L) with their user groups; Node C fails (X) and its Applications G, H, and I fail over to the other nodes]
Node C fails, and its Service Groups are redistributed across surviving nodes
Optimal solution for > 2 nodes
Implemented on third party failover clusters and Oracle RAC
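Redistribution across surviving nodes can be sketched as follows. The slides do not say how a cluster chooses the target node for each service group, so the least-loaded-node policy here is an assumption, as are the function name `redistribute` and the example layout.

```python
def redistribute(groups_by_node, failed_node):
    """Spread the failed node's service groups across the surviving nodes."""
    survivors = {
        node: list(groups)
        for node, groups in groups_by_node.items()
        if node != failed_node
    }
    if not survivors:
        raise ValueError("no surviving nodes")
    for group in groups_by_node[failed_node]:
        # Assumed policy: give each group to the currently least-loaded survivor,
        # breaking ties by node name.
        target = min(survivors, key=lambda node: (len(survivors[node]), node))
        survivors[target].append(group)
    return survivors
```

With four nodes running three groups each, a failure of Node C hands one of its groups to each of Nodes A, B, and D, so no dedicated idle spare is needed.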
Third Party Clusters



Support for extended cluster nodes – up to 32 nodes for vendor Clustering
Supports N + 1 and N-to-N failover clustering
Integrated with hardware and/or software replication for long distance “clusters”
Clustering Solutions from Oracle






Oracle Failsafe
Oracle Data Guard
Advanced Replication
Shared Nothing Cluster
Oracle Parallel Server
Real Application Clusters (RAC)
Failsafe




MS Clustering Enabled
Two servers, one disk subsystem
Switches in the event of a hardware failure
Requires recovery
Standby Database




Copy of Database (usually remote)
Kept up to date with Archive Logs
Oracle 8i feature
Oracle 9i-10g version of a standby database is Data Guard
Oracle Data Guard


Mirrored Server
Physical Standby
  Archive Logs are applied to the remote database
  Switchover occurs in the event of a failure
Logical Standby
  Log Miner technology is used to generate SQL
  Standby Database can also be used for read-only reporting
Advantages



Safe from user failure
Can be in different location
No recovery required
Advanced Replication



Uses Updatable-Snapshots
Replicates to another system
Systems stay in sync
Oracle Parallel Server





Shared disk cluster product
Loosely Coupled
Scalable performance
No downtime in the event of a system
failure
Replaced by RAC in 9i
True Shared Disk Server (RAC)

ONE database
Separate multiple instances (processes & memory)
All nodes can access data simultaneously
Shared Everything Cluster
Transparent Application Failover
Oracle license required on each node
Highest level of cluster functionality
[Diagram: Node A and Node B]
Factors to Consider for Clustering

Which do you need most?
  High Availability – Failover Clusters, Synchronous Replication, Data Guard
  Performance scalability – Active/Active failover clusters, N-to-N failover clusters
  Both – Oracle RAC
Administration complexity
  Failover clusters – relatively low
  Oracle RAC – relatively high
    Substantially less complex for 10g RAC than 9i RAC
Local or long distance?
  Local – Failover, RAC
  Remote – Federated database, Replication, Standby database/Data Guard
Oracle license costs
  Active/Passive failover clusters – active nodes only
  Active/Active failover clusters, RAC – per node
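The license rule of thumb above can be written as a small helper. It encodes only what the slides state (active nodes only for active/passive, every node for active/active and RAC); actual Oracle licensing terms may differ, and the function name and cluster-type strings are assumptions made for illustration.

```python
def licensed_nodes(cluster_type: str, active_nodes: int, passive_nodes: int = 0) -> int:
    """Number of nodes needing an Oracle license under the slides' rule of thumb."""
    if cluster_type == "active/passive":
        return active_nodes  # passive standby nodes are not counted
    if cluster_type in ("active/active", "rac"):
        return active_nodes + passive_nodes  # licensed per node
    raise ValueError(f"unknown cluster type: {cluster_type}")
```

A two-node active/passive pair therefore counts one license, while a two-node active/active pair or a two-node RAC cluster counts two.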
Review




What type of commit is required for a Federated (shared nothing) cluster?
What is the difference in how the database is kept up-to-date in Oracle Data Guard vs. Advanced Replication?
What is the difference between N-to-1 failover clusters and N + 1 failover clusters?
How many databases are there in an 8 node Oracle RAC cluster?
Summary

Types of clusters:
  Shared Nothing Clusters
    Federated databases
    Replication
  Shared Disk Clusters
    Failover
    Oracle RAC
  Failover Clusters
    Active/Passive
    Active/Active
    N-to-1
    N+1
    N-to-N
  Shared Everything Clusters
    Oracle RAC
Choosing a cluster type involves trade-offs in functionality, costs, and administration complexity