Recovery in Parallel Database Systems
Transaction Communication
Ela Sharda
Outline
Transaction
Concurrency Transparency
ACID properties
Distributed Transactions
Actors
Activity Log
Shadow Paging
The Two Phase Commit Protocol
References
Transaction
Fundamental unit of interconnection between client and
server processes in a database system.
Database Transaction: Sequence of synchronous
request/reply operations that satisfy ACID properties.
Communication Transaction: Set of asynchronous
request/reply communications
- that have ACID properties.
- without a sequential ordering constraint on the operations.
Concurrency Transparency
Enables several processes to operate concurrently
using shared resources without interference between
them.
Execution of a transaction appears to take place in
a critical section.
But the operations from different transactions are
interleaved to gain more concurrency.
ACID Properties[1]
Transactions are communications with ACID properties.
ACID is mainly concerned with the concurrency transparency
of a distributed system.
- Atomicity
- Consistency
- Isolation
- Durability
Atomicity
A transaction is an atomic unit of processing.
Either all the operations in a transaction are performed
or none of them are.
It is the responsibility of the transaction recovery
subsystem to ensure atomicity.
Consistency
Complete execution of a transaction takes the system from one
consistent state to another.
Execution of interleaved transactions is equivalent to a
serial execution of the same transactions.
This is also referred to as serializability.
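One common way to test whether an interleaved execution is equivalent to some serial one is conflict serializability: build a precedence graph between transactions and check that it has no cycle. The sketch below assumes that criterion (it is not taken from the slides), and the function name and schedule format are illustrative.

```python
def conflict_serializable(schedule):
    """schedule: list of (transaction, op, item), op is 'r' or 'w'.

    Returns True if the precedence graph of the schedule is acyclic.
    """
    # Edge t1 -> t2 when an operation of t1 precedes a conflicting
    # operation of t2 (same item, at least one of the two is a write).
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))

    # Repeatedly remove transactions with no incoming edge from the
    # remaining ones; if none can be removed, there is a cycle.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        free = {n for n in nodes
                if not any(b == n and a in nodes for a, b in edges)}
        if not free:
            return False
        nodes -= free
    return True


# T1 finishes its work on x before T2 touches it: equivalent to T1; T2.
serial_like = [("T1", "r", "x"), ("T1", "w", "x"),
               ("T2", "r", "x"), ("T2", "w", "x")]

# Both read x before either writes it: the classic lost update.
lost_update = [("T1", "r", "x"), ("T2", "r", "x"),
               ("T1", "w", "x"), ("T2", "w", "x")]

ok = conflict_serializable(serial_like)
bad = conflict_serializable(lost_update)
```

Conflict serializability is a sufficient condition only; a schedule can fail this test and still be equivalent to a serial one under weaker notions of equivalence.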
Isolation
A transaction should appear as if it is being executed
in isolation from other transactions.
Execution of the transaction should not be
interfered with by any other transactions executing
concurrently.
- Partial results of an incomplete transaction are
not visible to others before the transaction is
successfully committed.
Durability
The results of a committed transaction will be made
permanent even if a failure occurs after the
commitment.
Once a transaction completes successfully (commits),
its changes to the state survive failures.
Distributed Transactions
Consists of one coordinator (initiator of the transaction)
and several participating processes (remote process)
At commit
- Atomicity: either all nodes commit or none do
- Isolation: effects of the transaction not made visible until
all nodes have made an irrevocable decision to commit
or abort
ACID properties can be achieved by the two-phase
commit (2PC) protocol.
There is one coordinator and multiple participants. Each
of them has access to some stable storage.
Actors
Coordinator : The processor that initiates the
transaction.
- The coordinator oversees the activities of the other participants in the
transaction to ensure a consistent outcome.
Participants : All the remaining processors.
Activity Log [2]
Each participant keeps track of updated data objects by
maintaining a private work space.
Each update record contains the old and the new value.
Each site has an activity log which is kept on the disk.
- On abort: undo of uncommitted transactions (rollback)
- After crash: redo of committed transactions (roll forward)
Needed for durability of committed transactions.
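The undo and redo roles of the activity log can be sketched as follows. This is a minimal in-memory illustration, not a real disk-backed log; the class, function names and record layout are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ActivityLog:
    """Toy activity log; update records carry both old and new values."""
    records: list = field(default_factory=list)

    def log_update(self, txn, key, old, new):
        self.records.append(("update", txn, key, old, new))

    def log_commit(self, txn):
        self.records.append(("commit", txn))

    def committed(self):
        return {rec[1] for rec in self.records if rec[0] == "commit"}


def undo(log, store, txn):
    """On abort: restore the old values of txn's updates, newest first."""
    for rec in reversed(log.records):
        if rec[0] == "update" and rec[1] == txn:
            _, _, key, old, _ = rec
            store[key] = old


def redo(log, store):
    """After a crash: reapply new values of committed transactions, oldest first."""
    done = log.committed()
    for rec in log.records:
        if rec[0] == "update" and rec[1] in done:
            _, _, key, _, new = rec
            store[key] = new


log = ActivityLog()
store = {"x": 1, "y": 10}

# T1 updates x and commits; T2 updates y and is then aborted.
log.log_update("T1", "x", 1, 2)
store["x"] = 2
log.log_commit("T1")
log.log_update("T2", "y", 10, 99)
store["y"] = 99
undo(log, store, "T2")          # rollback of the uncommitted transaction

# Simulate a crash that lost the in-memory updates, then roll forward.
recovered = {"x": 1, "y": 10}
redo(log, recovered)
```

Keeping both values in each record is what lets the same log serve rollback (old value) and roll forward (new value).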
Shadow Paging [3]
Here, transaction logs are not required.
Two directories created during the life of transaction
- current directory
- shadow directory
When transaction starts, both directories are same.
Shadow directory never changed during the transaction.
Current directory updated when write operation is
performed.
When transaction commits, shadow directory is discarded
and current directory is copied to the storage.
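A minimal sketch of the two-directory idea, assuming an in-memory dictionary stands in for disk pages. The class and method names are hypothetical, and reclaiming pages that are no longer referenced is left out.

```python
class ShadowPagedDB:
    """Toy shadow paging: directories map logical names to page ids."""

    def __init__(self, initial):
        self.pages = {}        # page id -> contents (stands in for disk)
        self.next_page = 0
        self.shadow = {}       # directory as of the last commit
        self.current = None    # directory of the running transaction
        for name, value in initial.items():
            self.shadow[name] = self._alloc(value)

    def _alloc(self, value):
        page_id = self.next_page
        self.next_page += 1
        self.pages[page_id] = value
        return page_id

    def begin(self):
        # When the transaction starts, both directories are the same.
        self.current = dict(self.shadow)

    def write(self, name, value):
        # The write goes to a fresh page; only the current directory changes.
        self.current[name] = self._alloc(value)

    def read(self, name):
        directory = self.current if self.current is not None else self.shadow
        return self.pages[directory[name]]

    def commit(self):
        # The old shadow directory is discarded; the current one replaces it.
        self.shadow = self.current
        self.current = None

    def abort(self):
        # The untouched shadow directory still points at the old pages.
        self.current = None


db = ShadowPagedDB({"a": "old"})

db.begin()
db.write("a", "new")
seen_in_txn = db.read("a")
db.abort()
after_abort = db.read("a")

db.begin()
db.write("a", "new")
db.commit()
after_commit = db.read("a")
```

Because the shadow directory is never modified, abort needs no undo work; the cost moves to copying pages on write and cleaning up orphaned pages afterwards.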
The Two Phase Commit Protocol
Coordinator:
- precommit the transaction
- send request to all participants
- collect all replies
- if all votes are unanimous YES
  then commit and send COMMIT
  else abort and send ABORT
- received response
Participant:
- received request message
- if ready
  then precommit and send YES
  else abort transaction and say NO
- receive decision
- if commit then COMMIT
- if abort then ABORT
- send response
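The message flow above can be simulated in a single process. This is a sketch under simplifying assumptions: no failures, no timeouts, and direct method calls in place of network messages. The class, state names and the "ACK" response are illustrative.

```python
class Participant:
    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready
        self.state = "active"

    def on_request(self):
        # Vote: precommit and answer YES if ready, otherwise abort and answer NO.
        if self.ready:
            self.state = "precommitted"
            return "YES"
        self.state = "aborted"
        return "NO"

    def on_decision(self, decision):
        # Apply the coordinator's decision, then send a response.
        if self.state != "aborted":
            self.state = "committed" if decision == "COMMIT" else "aborted"
        return "ACK"


def run_two_phase_commit(participants):
    """Play the coordinator: collect votes, decide, distribute the decision."""
    votes = [p.on_request() for p in participants]
    decision = "COMMIT" if all(v == "YES" for v in votes) else "ABORT"
    acks = [p.on_decision(decision) for p in participants]
    return decision, acks


all_ready = [Participant("p1"), Participant("p2")]
decision_ok, acks_ok = run_two_phase_commit(all_ready)

one_refuses = [Participant("p1"), Participant("p2", ready=False)]
decision_bad, _ = run_two_phase_commit(one_refuses)
```

A single NO vote is enough to abort everywhere, which is what gives the all-or-nothing outcome across nodes.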
What does the coordinator write to the log?
When the coordinator sends request, it writes a
start-2PC record and a list containing the identities
of the participants to its log. It also sends this list to
each participant at the same time as the request
message.
Before the coordinator sends Commit to the
participants, it writes a commit record in the log.
If the coordinator sends Rollback to the
participants, it writes a rollback record to the log.
What does the participant write to the log?
If a participant votes Yes, it writes a yes record to its
log before sending yes to the coordinator. This log
record contains the name of the coordinator and the
list of the participants.
If this participant votes No, it writes a rollback record
to the log and then sends the No vote to the
coordinator.
After receiving Commit / Rollback, a participant writes
a commit / rollback record into the log.
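The logging rules for both roles can be traced as an ordered list of events, which makes the "write the record before sending the message" ordering visible. This is a sketch: the event tuples and function names are hypothetical. Two points are assumptions rather than statements from the slides: the coordinator logs its rollback record before sending Rollback, and a participant that already logged rollback when voting No does not log it a second time.

```python
def participant_vote(name, coordinator, participants, ready, events):
    """Append the participant's log write and vote message to events."""
    if ready:
        # The yes record names the coordinator and lists the participants.
        events.append((name, "log", ("yes", coordinator, tuple(participants))))
        events.append((name, "send", "YES"))
        return "YES"
    events.append((name, "log", ("rollback",)))
    events.append((name, "send", "NO"))
    return "NO"


def run_with_logging(coordinator, readiness):
    """readiness: dict participant name -> True (ready) or False."""
    events = []
    participants = list(readiness)

    # Coordinator: start-2PC record with the participant list, then the request.
    events.append((coordinator, "log", ("start-2PC", tuple(participants))))
    events.append((coordinator, "send", "REQUEST"))

    votes = [participant_vote(p, coordinator, participants, readiness[p], events)
             for p in participants]

    # Coordinator: decision record first, decision message second.
    decision = "commit" if all(v == "YES" for v in votes) else "rollback"
    events.append((coordinator, "log", (decision,)))
    events.append((coordinator, "send", decision.upper()))

    # Participants that voted Yes record the decision on receipt.
    for p, vote in zip(participants, votes):
        if vote == "YES":
            events.append((p, "log", (decision,)))
    return events


commit_trace = run_with_logging("C", {"P1": True, "P2": True})
rollback_trace = run_with_logging("C", {"P1": True, "P2": False})
```

Because each record reaches the log before the matching message is sent, a site that restarts can tell from its own log how far the protocol had progressed.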
Categories of Recovery Actions [1]
Failures before a precommit
Action : Simply abort the transaction.
Equivalent to voting NO for transaction.
Failures after a precommit but before a commit
Action : Abort the transaction.
Remulticast the request message.
Failures after a commit
Action : Resend the commit message.
References
[1] Distributed Operating Systems & Algorithms, by Randy
Chow and Theodore Johnson, 1997
[2] Operating System Principles, by Silberschatz and Galvin,
Seventh edition
[3] Recovery in Parallel Database Systems, by Svein-Olaf
Hvasshovd, second edition
[4] http://en.wikipedia.org/wiki/Two-phase_commit_protocol
[5] http://www.cnds.jhu.edu/courses/cs437/Week6.pdf