High Availability and Fault-Tolerance in Real-Time Databases
Jan Lindström
University of Helsinki
Department of Computer Science
Overview
The causes of downtime
Availability solutions
CASE 1: Clustra
CASE 2: TelORB
CASE 3: RODAIN
The Causes of Downtime
Planned downtime
• Hardware expansion
• Database software upgrades
• Operating system upgrades
Unplanned downtime
• Hardware failure
• OS failure
• Database software bugs
• Power failure
• Disaster
• Human error
Traditional Availability Solutions
Replication
Failover
Primary restart
CASE 1: Clustra
Developed for telephony applications such as mobility management and intelligent networks.
Relational database with location and
replication transparency.
Real-Time data locked in main memory and
API provides precompiled transactions.
NOT a Real-Time Database !
Clustra hardware architecture
Data distribution and replication
How Clustra Handles Failures
Real-Time failover: Hot-standby data is up to date, so
failover occurs in milliseconds.
Automatic restart and takeback: Restart of the failed node
and takeback of operations is automatic, and again
transparent to users and operators.
Self-repair: If a node fails completely, data is copied from
the complementary node to standby. This is also automatic
and transparent.
Limited failure effects
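The failover and self-repair behaviour described above can be illustrated with a minimal sketch. It assumes a two-node pair in which every write is applied to both copies; the names `Node`, `ReplicatedPair`, `fail_primary` and `restart_and_repair` are hypothetical and are not Clustra's API. Takeback of operations by the restarted node is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)
    alive: bool = True


class ReplicatedPair:
    """Hypothetical primary/hot-standby pair (illustration only)."""

    def __init__(self, primary: Node, standby: Node):
        self.primary = primary
        self.standby = standby

    def write(self, key, value):
        # Apply every update to both copies so the standby stays up to date.
        self.primary.data[key] = value
        if self.standby.alive:
            self.standby.data[key] = value

    def read(self, key):
        return self.primary.data[key]

    def fail_primary(self):
        # Failover: the standby already holds current data, so it only
        # changes role; nothing has to be loaded before it can serve.
        self.primary.alive = False
        self.primary, self.standby = self.standby, self.primary

    def restart_and_repair(self):
        # Self-repair: copy the data from the surviving node to the
        # restarted node, which then acts as the standby.
        self.standby.data = dict(self.primary.data)
        self.standby.alive = True


pair = ReplicatedPair(Node("a"), Node("b"))
pair.write("subscriber:1", "cell-17")
pair.fail_primary()                      # node "a" fails, "b" takes over
value_after_failover = pair.read("subscriber:1")
pair.write("subscriber:1", "cell-18")    # served by "b" alone
pair.restart_and_repair()                # "a" returns as an up-to-date standby
```

Because the standby is kept current on every write, the role switch in this sketch involves no data movement, which is the property that makes fast failover possible.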
How Clustra Handles Upgrades
Hardware, operating system, and database software upgrades without ever going down.
• Process called “rolling upgrade”
– I.e. the required changes are performed node by node.
– Each upgraded node catches up to the state of its complementary node.
– When this is completed, the operation is performed on the next node.
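A rolling upgrade of this kind can be sketched as a loop over the nodes. The dictionary layout (`version`, `online`, `log_pos`) and the function name `rolling_upgrade` are assumptions made for illustration, and catching up is reduced to copying a log position.

```python
def rolling_upgrade(nodes, new_version):
    """Upgrade nodes one at a time and return the upgrade order.

    `nodes` maps a node name to a dict with 'version', 'online' and
    'log_pos' keys (a hypothetical layout, not Clustra's).
    """
    if len(nodes) < 2:
        raise ValueError("a rolling upgrade needs at least two nodes")
    order = []
    for name, node in nodes.items():
        others = [n for other, n in nodes.items() if other != name]
        # Only take a node down while all of its peers are serving.
        if not all(n["online"] for n in others):
            raise RuntimeError(f"cannot upgrade {name}: a peer is offline")
        node["online"] = False           # take just this node out of service
        node["version"] = new_version    # perform the required change
        # Catch up to the state of the complementary node before moving on.
        node["log_pos"] = max(n["log_pos"] for n in others)
        node["online"] = True
        order.append(name)
    return order


cluster = {
    "node0": {"version": 1, "online": True, "log_pos": 40},
    "node1": {"version": 1, "online": True, "log_pos": 42},
}
upgrade_order = rolling_upgrade(cluster, new_version=2)
```

At every step at most one node is offline, so the service as a whole stays available during the upgrade.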
CASE 2: TelORB
Characteristics
Very high availability (HA), robustness implemented in SW
(soft) Real Time
Scalability by using loosely coupled processors
Openness
Hardware: Intel/Pentium
Language: C++, Java
Interoperability: CORBA/IIOP, TCP/IP, Java RMI
3rd-party SW: Java
TelORB Availability
Real-time object-oriented DBMS supporting
• Distributed Transactions
• ACID properties expected from a DBMS
• Data Replication (providing redundancy)
Network Redundancy
Software Configuration Control
Automatic restart of processes that originally executed on a faulty processor on the ones that are working
Self healing
In service upgrade of software with no disturbance to operation
Hot replacement of faulty processors
Automatic reconfiguration and reloading
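The automatic restart of processes on working processors can be pictured with a small sketch. The placement map, the function name `reassign_processes` and the least-loaded placement policy are all assumptions; the slides do not say how TelORB chooses the target processor.

```python
def reassign_processes(placement, failed):
    """Return a new placement with the failed processor's processes moved.

    `placement` maps processor name -> list of process names
    (hypothetical layout). The input is not modified.
    """
    survivors = {p: list(procs) for p, procs in placement.items() if p != failed}
    if not survivors:
        raise RuntimeError("no working processor left")
    for proc in placement.get(failed, []):
        # Assumed policy: restart each orphaned process on the processor
        # that currently runs the fewest processes (ties broken by name).
        target = min(survivors, key=lambda p: (len(survivors[p]), p))
        survivors[target].append(proc)
    return survivors


placement = {"cpu0": ["db", "call-ctl"], "cpu1": ["billing"], "cpu2": []}
new_placement = reassign_processes(placement, failed="cpu0")
```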
Software upgrade
Smooth software upgrade when old and new versions of the same process can coexist
Possibility for the application to arrange for state transfer between the old and new static process (unless the important state is already stored in the database)
Partitioning: Types and Data
[Figure: data items 17 to 22 distributed over partitions A and B]
Advantages
Standard interfaces through CORBA
Standard languages: C++, Java
Based on commercial hardware
(Soft) Real-time OS
Fault tolerance implemented in software
Fully scalable architecture
Includes powerful middleware: A database management system and
functions for software management
Fully compatible simulated environment for development on
Unix/Linux/NT workstations
CASE 3: RODAIN
Real-Time Object-Oriented Database Architecture for Intelligent Networks
Real-Time Main-Memory Database System
Runs on Real-Time OS: Chorus/ClassiX
(and Linux)
Rodain Cluster
RODAIN Database Node
[Figure: a Database Primary Unit and a Database Mirror Unit with a shared disk. Each unit contains the following subsystems.]
• User Request Interpreter Subsystem
• Distributed Database Subsystem
• Object-Oriented Database Management Subsystem
• Watchdog Subsystem
• Fault-Tolerance and Recovery Subsystem
RODAIN Database Node II
[Figure: Database Primary Unit and Database Mirror Unit with a shared disk; each unit contains the User Request Interpreter, Distributed Database, Object-Oriented Database Management, Watchdog, and Fault-Tolerance and Recovery Subsystems.]
ORD Architecture
[Figure: components labeled Index, OCC, Data, TRP, ORD, DDS, FTRS]
Fault-Tolerance
Based on logs and mirroring
Logs are sent to the Mirror
Mirror stores the logs on disk in SSS
Mirror maintains a copy of the main-memory database
Mirror makes disk copies of its database image
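The log-and-mirror scheme above can be sketched as two cooperating objects. This is an illustration under assumptions, not RODAIN code: the class and method names are hypothetical, plain Python lists and dicts stand in for disk storage in SSS, and the primary is assumed to ship each log record at commit time.

```python
class Mirror:
    def __init__(self):
        self.mmdb = {}         # copy of the main-memory database
        self.disk_log = []     # stands in for the log stored on disk
        self.disk_image = {}   # stands in for the database image on disk

    def receive(self, record):
        # Store the log record, then apply it to the in-memory copy.
        self.disk_log.append(record)
        key, value = record
        self.mmdb[key] = value

    def checkpoint(self):
        # Write a disk copy of the current database image.
        self.disk_image = dict(self.mmdb)


class Primary:
    def __init__(self, mirror):
        self.mmdb = {}
        self.mirror = mirror

    def commit(self, key, value):
        self.mmdb[key] = value
        # Assumption: each committed update is sent to the mirror as a log record.
        self.mirror.receive((key, value))


mirror = Mirror()
primary = Primary(mirror)
primary.commit("subscriber:1", "cell-17")
mirror.checkpoint()
primary.commit("subscriber:1", "cell-18")
```

In this sketch the primary never writes to disk itself; durability comes from the mirror's log and its periodic image copies.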
Recovery
Based on role switching
When Primary fails
• Mirror brings its MMDB up to date
• Mirror starts acting as the new Primary
• Active transactions are restarted or lost
When Mirror fails
• Primary stores logs directly to SSS
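The two failure cases can be sketched as follows. The `Unit` class, its attributes and the two handler functions are hypothetical names; the sketch also assumes that a unit promoted to primary logs directly to SSS until a mirror is available again.

```python
class Unit:
    """One unit of a primary/mirror pair (illustration only)."""

    def __init__(self, role):
        self.role = role           # "primary" or "mirror"
        self.mmdb = {}             # main-memory database
        self.unapplied = []        # log records received but not yet applied
        self.log_sink = "mirror"   # where a primary sends its logs
        self.sss_log = []          # stands in for log storage in SSS

    def commit(self, key, value, peer=None):
        self.mmdb[key] = value
        if self.log_sink == "mirror" and peer is not None:
            peer.unapplied.append((key, value))
        else:
            self.sss_log.append((key, value))


def on_primary_failure(mirror, active_transactions):
    # Bring the mirror's MMDB up to date from the logs it already holds.
    for key, value in mirror.unapplied:
        mirror.mmdb[key] = value
    mirror.unapplied.clear()
    mirror.role = "primary"
    mirror.log_sink = "sss"   # assumption: no mirror exists right after the switch
    # Transactions active on the failed primary never committed; return them
    # so the caller can restart them (otherwise they are lost).
    return list(active_transactions)


def on_mirror_failure(primary):
    # Without a mirror, the primary stores its logs directly in SSS.
    primary.log_sink = "sss"


primary, mirror = Unit("primary"), Unit("mirror")
primary.commit("x", 1, peer=mirror)
primary.commit("y", 2, peer=mirror)
to_restart = on_primary_failure(mirror, active_transactions=["t7"])
mirror.commit("z", 3)   # the former mirror now serves as primary
```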
Recovery II
During recovery the failed Node
• always starts as a mirror node
• loads the most recent database image from disks in SSS
• applies the log tail to the loaded image
• receives the logs from the primary node
• continues as a normal mirror node
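The recovery steps can be read as a short replay procedure. The sketch below assumes that log records are simple (key, value) pairs and that the image and logs are passed in as plain Python structures; the function name `recover_as_mirror` is hypothetical.

```python
def recover_as_mirror(disk_image, log_tail, logs_from_primary):
    """Rebuild a failed node's database and return it in the mirror role."""
    # 1. Load the most recent database image from disk.
    mmdb = dict(disk_image)
    # 2. Apply the log tail, i.e. the records written after that image.
    for key, value in log_tail:
        mmdb[key] = value
    # 3. Apply the logs received from the current primary.
    for key, value in logs_from_primary:
        mmdb[key] = value
    # 4. The recovered node always continues as a mirror.
    return {"role": "mirror", "mmdb": mmdb}


node = recover_as_mirror(
    disk_image={"x": 1, "y": 2},
    log_tail=[("y", 3)],
    logs_from_primary=[("z", 4), ("x", 5)],
)
```

Applying the records in log order matters: later records overwrite earlier values, so the rebuilt copy ends in the same state as the primary.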
Further reading
Bratsberg, Humborstad: Online Scaling in a Highly Available Database, Proceedings of the 27th VLDB Conference, Rome, Italy, pp. 451-460, 2001.
Clustra Database: Technical Overview, http://www.clustra.com
Björnerstedt, Ketoja, Sintorn, Sköld: Replication between Geographically Separated Clusters - An Asynchronous Scalable Replication Mechanism for Very High Availability, Proceedings of the International Workshop on Databases in Telecommunications II, LNCS vol. 2209, pp. 102-115, 2001.
Lindström, Niklander, Porkka, Raatikainen: A Distributed Real-Time Main-Memory Database for Telecommunications, Proceedings of the International Workshop on Databases in Telecommunications, LNCS vol. 1819, pp. 158-173, 2000.