High Availability and Fault-Tolerance in Real-Time Databases
Jan Lindström
University of Helsinki
Department of Computer Science
Overview
 The causes of downtime
 Availability solutions
 CASE 1: Clustra
 CASE 2: TelORB
 CASE 3: RODAIN
The Causes of Downtime

Planned downtime
• Hardware expansion
• Database software upgrades
• Operating system upgrades

Unplanned downtime
• Hardware failure
• OS failure
• Database software bugs
• Power failure
• Disaster
• Human error
Traditional Availability Solutions
 Replication
 Failover
 Primary restart
CASE 1: Clustra
 Developed for telephony applications such as mobility management and intelligent networks.
 Relational database with location and replication transparency.
 Real-Time data locked in main memory and API provides precompiled transactions.
 NOT a Real-Time Database!
Clustra hardware architecture
Data distribution and replication
How Clustra Handles Failures
 Real-Time failover: Hot-standby data is up to date, so failover occurs in milliseconds.
 Automatic restart and takeback: Restart of the failed node and takeback of operations are automatic and, again, transparent to users and operators.
 Self-repair: If a node fails completely, data is copied from the complementary node to a standby. This is also automatic and transparent.
 Limited failure effects
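The failover and self-repair steps can be illustrated with a small sketch. It is not Clustra's implementation; the `Fragment` class, the node names, and the idea of a single named spare node are assumptions made only for this example.

```python
class Fragment:
    """One data fragment with a serving copy and a hot-standby copy (hypothetical model)."""

    def __init__(self, primary, standby):
        self.primary = primary  # name of the node currently serving this fragment
        self.standby = standby  # name of the node holding the hot-standby copy


def handle_node_failure(fragments, failed, spare):
    """React to the complete failure of one node."""
    for frag in fragments:
        if frag.primary == failed:
            # Failover: the hot standby is already up to date, so it takes over.
            frag.primary = frag.standby
            # Self-repair: a new standby copy is built on the spare node.
            frag.standby = spare
        elif frag.standby == failed:
            # Only the standby copy was lost; rebuild it on the spare node.
            frag.standby = spare
```

After the call, every fragment again has two copies on working nodes, which is what keeps the effect of a single failure limited.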
How Clustra Handles Upgrades
 Hardware, operating system, and database software upgrades without ever going down.
• Process called “rolling upgrade”
– I.e., the required changes are performed node by node.
– Each upgraded node catches up to the status of its complementary node.
– When this is completed, the operation is performed on the next node.
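The node-by-node procedure can be sketched as a loop. This is a minimal illustration under stated assumptions, not Clustra's actual mechanism: the `Node` class, the version strings, the update counter, and the pairing of each node with exactly one complementary node are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    online: bool = True
    applied_updates: int = 0  # how many updates this replica has applied


def rolling_upgrade(pairs, new_version, updates_during_upgrade=3):
    """Upgrade one node at a time; each pair is (node, complementary node).

    Returns the smallest number of nodes that were online at any point.
    """
    min_online = 2 * len(pairs)
    for pair in pairs:
        for node, partner in (pair, pair[::-1]):
            node.online = False  # take a single node out of service
            online_now = sum(n.online for p in pairs for n in p)
            min_online = min(min_online, online_now)
            # The complementary node keeps serving and accumulates updates.
            partner.applied_updates += updates_during_upgrade
            node.version = new_version  # perform the required change
            node.applied_updates = partner.applied_updates  # catch up
            node.online = True  # only now move on to the next node
    return min_online
```

With two pairs, at most one of the four nodes is offline at any moment, so every pair always has one member serving.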
CASE 2: TelORB
Characteristics
 Very high availability (HA), robustness implemented in SW
 (Soft) Real Time
 Scalability by using loosely coupled processors
Openness
 Hardware: Intel/Pentium
 Language: C++, Java
 Interoperability: CORBA/IIOP, TCP/IP, Java RMI
 3rd-party SW: Java
TelORB Availability
 Real-time object-oriented DBMS supporting
• Distributed transactions
• ACID properties expected from a DBMS
• Data replication (providing redundancy)
 Network redundancy
 Software configuration control
 Automatic restart, on the working processors, of processes that originally executed on a faulty processor
 Self-healing
 In-service upgrade of software with no disturbance to operation
 Hot replacement of faulty processors
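Restarting the processes of a faulty processor on the remaining ones can be sketched as a simple reassignment. This is not TelORB's algorithm; the dictionary layout and the least-loaded placement policy are assumptions chosen for the example.

```python
def reassign_processes(placement, faulty):
    """Return a new placement with the faulty processor's processes moved.

    placement maps a processor name to the list of process names it runs.
    """
    working = [p for p in placement if p != faulty]
    if not working:
        raise RuntimeError("no working processor left")
    new_placement = {p: list(placement[p]) for p in working}
    for proc in placement.get(faulty, []):
        # Assumed policy: restart on the processor that currently runs the fewest processes.
        target = min(working, key=lambda p: len(new_placement[p]))
        new_placement[target].append(proc)
    return new_placement
```

For example, `reassign_processes({"p1": ["a", "b"], "p2": ["c"], "p3": []}, "p1")` spreads `a` and `b` over `p2` and `p3` and drops `p1` from the placement.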
Automatic Reconfiguration
[Figure: reconfiguration diagram; surviving label: "reloading"]
Software upgrade
 Smooth software upgrade when old and new versions of the same process can coexist
 Possibility for the application to arrange for state transfer between the old and new static process (unless important states are already stored in the database)
Partitioning: Types and Data
[Figure: two layouts of items numbered 17-22 with labels A and B]
Advantages
 Standard interfaces through CORBA
 Standard languages: C++, Java
 Based on commercial hardware
 (Soft) Real-time OS
 Fault tolerance implemented in software
 Fully scalable architecture
 Includes powerful middleware: a database management system and functions for software management
 Fully compatible simulated environment for development on Unix/Linux/NT workstations
CASE 3: RODAIN
 Real-Time Object-Oriented Database Architecture for Intelligent Networks
 Real-Time Main-Memory Database System
 Runs on Real-Time OS: Chorus/ClassiX (and Linux)
RODAIN Cluster
RODAIN Database Node
[Figure: a Database Primary Unit, a Database Mirror Unit, and a shared disk. Each unit contains a User Request Interpreter Subsystem, a Distributed Database Subsystem, an Object-Oriented Database Management Subsystem, a Watchdog Subsystem, and a Fault-Tolerance and Recovery Subsystem.]
RODAIN Database Node II
[Figure: a Database Primary Unit, a Database Mirror Unit, and a shared disk. Each unit contains a User Request Interpreter Subsystem, a Distributed Database Subsystem, an Object-Oriented Database Management Subsystem, a Watchdog Subsystem, and a Fault-Tolerance and Recovery Subsystem.]
ORD Architecture
[Figure labels: Index, OCC, Data, TRP, ORD, DDS, FTRS]
Fault-Tolerance
 Based on logs and mirroring
 Logs are sent to the Mirror
 Mirror stores the logs on disk in SSS
 Mirror maintains a copy of the main-memory database
 Mirror makes disk copies of its database image
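The log-and-mirror scheme can be sketched in a few lines. This is not RODAIN code; the `Primary` and `Mirror` classes, the `(key, value)` log record format, and the in-memory lists standing in for disk storage are assumptions for illustration only.

```python
class Mirror:
    def __init__(self):
        self.db = {}        # the Mirror's copy of the main-memory database
        self.disk_log = []  # stands in for log storage on disk

    def receive(self, record):
        self.disk_log.append(record)  # store the log record first
        key, value = record
        self.db[key] = value          # then apply it to the database copy


class Primary:
    def __init__(self, mirror=None):
        self.db = {}
        self.mirror = mirror
        self.disk_log = []  # used only when no Mirror is available

    def commit(self, key, value):
        record = (key, value)
        if self.mirror is not None:
            self.mirror.receive(record)   # ship the log to the Mirror
        else:
            self.disk_log.append(record)  # no Mirror: store the log directly
        self.db[key] = value
```

In this sketch the Primary never writes to disk while a Mirror is attached, which reflects the idea that disk logging is moved off the Primary's commit path.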
Recovery
 Based on role switching
 When Primary fails
• Mirror brings its MMDB up to date
• Mirror starts acting as new Primary
• Active transactions are restarted or lost
 When Mirror fails
• Primary stores logs directly to SSS
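Role switching on a Primary failure can be sketched as follows. The `Unit` class, the pending-log buffer, and the decision to return every active transaction for restart are assumptions made for this example, not details taken from RODAIN.

```python
class Unit:
    def __init__(self, role):
        self.role = role        # "primary" or "mirror"
        self.db = {}            # main-memory database (MMDB)
        self.pending_logs = []  # log records received but not yet applied

    def receive_log(self, record):
        self.pending_logs.append(record)

    def apply_pending(self):
        for key, value in self.pending_logs:
            self.db[key] = value
        self.pending_logs.clear()


def on_primary_failure(mirror, active_transactions):
    """Switch roles and return the transactions that should be restarted."""
    mirror.apply_pending()    # bring the MMDB up to date
    mirror.role = "primary"   # start acting as the new Primary
    # Transactions that were active on the old Primary did not complete;
    # this sketch hands all of them back for restart.
    return list(active_transactions)
```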
Recovery II
 During recovery the failed Node
• always starts as a mirror node
• loads the most recent database image from disks in SSS
• applies the log tail to the loaded image
• receives the logs from the primary node
• continues as a normal mirror node
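The recovery steps above can be sketched as one function. The log sequence numbers (LSNs), the `(lsn, key, value)` record format, and the function signature are assumptions introduced for the example; the slides do not describe how RODAIN identifies log records.

```python
def recover_node(disk_image, image_lsn, disk_log, primary_stream):
    """Rebuild a failed node as a mirror.

    disk_image     -- most recent database image loaded from disk
    image_lsn      -- LSN of the last record contained in that image
    disk_log       -- log records stored on disk, as (lsn, key, value)
    primary_stream -- log records sent by the current primary node
    """
    role = "mirror"          # a recovering node always starts as a mirror
    db = dict(disk_image)    # load the most recent image
    last = image_lsn
    for lsn, key, value in disk_log:
        if lsn > last:       # apply only the log tail newer than the image
            db[key] = value
            last = lsn
    for lsn, key, value in primary_stream:
        if lsn > last:       # skip records already covered by the log tail
            db[key] = value
            last = lsn
    return role, db, last
```

Once the stream from the primary has been applied, the node holds a current copy of the database and continues as a normal mirror.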
Further reading
 Bratsberg, Humborstad: Online Scaling in a Highly Available Database, Proceedings of the 27th VLDB Conference, Rome, Italy, pp. 451-460, 2001.
 Clustra Database: Technical Overview, http://www.clustra.com
 Björnerstedt, Ketoja, Sintorn, Sköld: Replication between Geographically Separated Clusters - An Asynchronous Scalable Replication Mechanism for Very High Availability, Proceedings of the International Workshop on Databases in Telecommunications II, LNCS vol. 2209, pp. 102-115, 2001.
 Lindström, Niklander, Porkka, Raatikainen: A Distributed Real-Time Main-Memory Database for Telecommunications, Proceedings of the International Workshop on Databases in Telecommunications, LNCS vol. 1819, pp. 158-173, 2000.