11_TransactionMgmt_S..

Download Report

Transcript 11_TransactionMgmt_S..

COMP5138
Relational Database
Management Systems
Dr. Uwe Röhm
Lecture 11
Transaction Management
and Concurrency Control
L11
Transaction Mgmt.
Today’s Agenda
 Transaction Management
 Transaction Concept
 Serializability
 Concurrency Control Mechanisms
 Locking
 Snapshot Isolation
 Distributed Transactions
 2-Phase Commit Protocol
 Recovery
 Write-Ahead Logging
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-2
L11
Transaction Mgmt.
Transaction Concept
 A transaction is a collection of one or more operations on
one or more databases,which reflects a discrete unit of work
 In the real world, this happened (completely) or it didn’t happen at all
(Atomicity)
 Can be read-only (e.g. RequestBalance), but typically updating
 Commerce examples
 Transfer money between accounts
 Purchase a group of products
 Withdraw money from ATM
 Student record system
 Register for a class (either waitlist or allocated)
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-3
L11
Transaction Mgmt.
Transaction Definition in SQL
 Data manipulation language must include a construct for
specifying the set of actions that comprise a transaction.
 In many DBMS such as Oracle, a transaction begins
implicitly
 Some other DBMS (e.g. Sybase or SQL Server) provide a
BEGIN TRANSACTION command
 A transaction in SQL ends by:
 COMMIT requests to commit current transaction

The system might commit the transaction, or it might abort if needed.
 ROLLBACK causes current transaction to abort - always satisfied.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-4
L11
Transaction Mgmt.
Transaction Example
 Pseudocode for a product order transaction:
BEGIN TRANSACTION
display greeting
get order request
SELECT product record
IF product is available THEN
UPDATE quantityOnOrder of product record
INSERT order record
send message to shipping departement
COMMIT
ELSE
ROLLBACK
END IF
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-5
L11
Transaction Mgmt.
Another Transaction Example
 Transaction in Embedded SQL
1.
2.
3.
4.
5.
6.
7.
8.
EXEC SQL BEGIN DECLARE SECTION
int flight; START TRANSACTION
char date[10]
char seat [3]
int occ;
EXEC SQL END DECLARE SECTION
Void chooseSeat() {
/* C code to prompt the user to enter flight, date, and seat and store these in the three variables
with those name */
9. EXEC SQL Select occupied into :occ
10.
From Flights
11.
Where fltNum=:flight and fltDate=:date and fltSeat=:seat;
12.If (!occ) {
13. EXEC SQL Update Flights
14.
Set occupied = true
15.
Where fltNum=:flight and fltDate=:date and fltSeat=:seat;
16. /*C and SQL code to record the seat assignment and inform the user of the assignment */
17.}
18. Else /* C code to notify user of unavailability and ask for another seat selection */
19.} EXEC COMMIT;
 Cf. also our JDBC program!!!
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-6
L11
Transaction Mgmt.
ACID Properties of Transactions
 Concurrent execution of user programs is essential for
good DBMS performance.
 Problem: In a multi-user environment, simultaneous access to data
can result in interference and data loss
 Transaction execution must maintain the correctness and
consistency of the database
 Hence additional requirements are placed on the execution
of transactions beyond those placed on ordinary programs:
 Atomicity
 Consistency
 Isolation
 Durability
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-7
L11
Transaction Mgmt.
A C I D Properties
 Atomicity. Transaction should either complete or have no
effect at all
 In case of a failure, all effects of operations of not-completed
transactions are undone.
 Consistency. Execution of a transaction in isolation
preserves the consistency of the database.
 Isolation. Although multiple transactions may execute
concurrently, each transaction must be unaware of other
concurrently executing transactions.
 Intermediate transaction results must be hidden from other
concurrently executed transactions.
 Durability. The effect of a transaction on the database state
should not be lost once the transaction has committed
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-8
L11
Transaction Mgmt.
Atomicity of Transactions
 A transaction might commit after completing all its actions,
or it could abort (or be aborted by the DBMS) after
executing some actions.
 Atomicity: a user can think of a transaction as always
executing all its actions in one step, or not executing any
actions at all.
 DBMS logs all actions so that it can undo the actions of aborted
transactions.
 Also, in case of a failure, all actions of not-committed transactions
are undone.
 ACID properties handled transparent for the transaction by
the DBMS
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-9
L11
Transaction Mgmt.
Commit and Abort
 If the transaction successfully completes it is said to commit
 The system is responsible for ensuring that all changes to the
database have been saved
 If the transaction does not successfully complete, it is said
to abort
 The system is responsible for undoing, or rolling back, all changes the
transaction has made
 Possible reasons for abort:

System crash
 Transaction aborted by system
•
•
•
•

Execution cannot be made atomic (a site is down)
Execution did not maintain database consistency (integrity constraint violated)
Execution was not isolated
Resources not available (deadlock)
Transaction requests to roll back
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-10
L11
Example
Transaction Mgmt.
 Let‘s consider two transactions:
T1:
T2:
BEGIN
BEGIN
A=A-100,
A=1.05*A,
B=B+100
B=1.05*B
END
END
 Transaction T1 is transferring $100 from account A to account B.
The second transaction credits both accounts with a 5% interest payment.
 Atomicity requirement — all updates are reflected in the db or none.
 Consistency requirement – T1 does not change the total sum of A and
B, and after T2, this total sum is 5% higher.
 Isolation requirement — There is no guarantee that T1 will execute
before T2 or vice-versa, if both are submitted together. However, the net
effect must be equivalent to these transactions running serially in some
order.
 Durability requirement — once a transaction has completed, the updates
to the database by this transaction must persist despite failures
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-11
L11
Transaction Mgmt.
Example (cont’d)
 Consider a possible interleaving (schedule):
T1: A=A-100,
T2:
B=B+100
A=1.05*A,
B=1.05*B
 This is OK. But what about:
T1: A=A-100,
T2:
B=B+100
A=1.05*A, B=1.05*B
 The DBMS’s view of the second schedule:
T1: R(A),W(A),
R(B),W(B)
T2:
R(A),W(A),R(B),W(B)
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-12
L11
Transaction Mgmt.
Anomalies with Interleaved
Execution
 Reading Uncommitted Data (WR Conflicts, “dirty reads”):
T1: R(A),W(A),
R(B),W(B),Abort
T2:
R(A),W(A),Commit
 Unrepeatable Reads (RW Conflicts):
T1: R(A),
R(A),W(A),Commit
T2:
R(A),W(A),Commit
 Overwriting Uncommitted Data (WW Conflicts):
T1:
T2:
W(A),
W(B),Commit
W(A),W(B),Commit
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-13
L11
Transaction Mgmt.
Serializability
 The Issue: Maintaining database correctness when many
transactions are accessing the database concurrently
 Basic Assumption – Each transaction preserves database consistency.
 Thus serial execution of a set of transactions preserves database
consistency.
 Schedule – sequence of operations that indicates the
chronological order in which instructions of concurrent
transactions are executed.
 Serializability:
A schedule is serializable if it is equivalent to a serial
schedule
 Different notions of schedule equivalence…
 This lecture concentrates on conflict serializability
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-14
L11
Transaction Mgmt.
Conflict Serializability
 Two schedules are conflict serializable if:
 Involve the same actions of the same transactions
 Every pair of conflicting actions is ordered the same way
 Two actions ai and aj of transactions Ti and Tj conflict if and
only if they access the same data X and at least one of
these actions wrote X. (ai, aj) are called a conflict pair.
1. ai = read(X), aj = read(X). don’t conflict.
2. ai = read(X), aj = write(X). they conflict.
3. ai = write(X), aj = read(X). they conflict
4. ai = write(X), aj = write(X). they conflict
 Note: With SQL - Select corresponds to read,
Insert, Delete, Update correspond to write
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-15
L11
Transaction Mgmt.
Serializable Schedule Example 1
 The concurrent schedule
S: r1(x) w2(z) w1(y)
is equivalent to the serial schedules of T1 and T2 in either
order:
 T1, T2: r1(x) w1(y) w2(z)
 T2, T1: w2(z) r1(x) w1(y)
and
 Reason: operations of distinct transactions on different data
items commute.
 Hence, S is a serializable schedule
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-16
L11
Transaction Mgmt.
Serializable Schedule Example 2
 The concurrent schedule
S: r1(z) r2(q) w2(z) r1(q) w1(y)
is equivalent to the serial schedule T1,T2:
r1(z) r1(q) w1(y) r2(q) w2(z)
since read operations of distinct transactions on the same
data item commute.
 Hence, S is a serializable schedule
 However, S is not equivalent to T2,T1 since read and write
operations (or two write operations) of distinct transactions
on the same data item do not commute.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-17
L11
Non-Serializable Schedule Example
Transaction Mgmt.
 Example: course registration; cur_reg is the number of
current registrants
T1: r(cur_reg : 29)
w(cur_reg : 30)
T2:
r(cur_reg : 29) w(cur_reg : 30)
 Schedule not equivalent to T1,T2 or T2,T1
 Database state no longer corresponds to real-world state,
integrity constraint violated
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-18
L11
Transaction Mgmt.
Testing for Conflict Serializability
 Consider some schedule of a set of transactions T1,T2,...,Tn
 Precedence graph:
 direct graph where the vertices are the transactions.
 edge from Ti to Tj if the two transaction conflict, and Ti accessed the
data item on which the conflict arose earlier.
 Central Theorem:
A schedule is conflict
serializable if and only if its
precedence graph is acyclic.
 Example:
T1 and T2 have
2 conflict pairs
T1
T2
 The serializability order can be
obtained by a topological sorting of
the graph.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-19
L11
Transaction Mgmt.
Today’s Agenda
 Transaction Management
 Transaction Concept
 Serializability
 Concurrency Control Mechanisms
 Locking
 Snapshot Isolation
 Distributed Transactions
 2-Phase Commit Protocol
 Recovery
 Write-Ahead Logging
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-20
L11
Transaction Mgmt.
Concurrency Control vs.
Serializability Tests
 Testing a schedule for serializability after it has executed is
a little bit too late!
 Concurrency Control: The protocol that manages
simultaneous operations against a database so that
serializability (actually: isolation levels) is assured.
 Such protocols will generally not examine the precedence graph as it
is being created; instead a protocol will impose a discipline that
avoids non-seralizable schedules.
 Tests for serializability help understand why a concurrency control
protocol is correct.
 Two important techniques:
 Locking Protocol
 Versioning
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-21
L11
Transaction Mgmt.
Lock-based Concurrency Control
 Two-phase Locking (2PL) Protocol:
 Locks are associated with each data item
 A transaction must obtain a S (shared) lock on object before reading,
and an X (exclusive) lock on object before writing.

exclusive (X) lock: Data item can be both read as well as written
by just one transaction
 shared (S) lock: Data item can only be read (but shared by transactions)
 All locks held by a transaction are released when the transaction
complete, and a transaction can not request additional locks once it
releases any locks.
 If a transaction holds an X lock on an object, no other transaction
can get a lock (S or X) on that object.

Similar if a transaction requests a X lock of an already locked data object
 Instead, such transactions must wait until the conflicting lock is released
from the previous transaction(s)
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-22
L11
Transaction Mgmt.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
Locking Example
11-23
L11
2PL versus Strict 2PL
Transaction Mgmt.
 Two-Phase locking: All locks are acquired before any lock is
released
locks
However: possible
cascading aborts!
t
Phase 1
Phase 2
Strict: Transaction holds all locks until completion
locks
t
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-24
L11
Lock Compatibility Matrix
and Lock Granularity
Transaction Mgmt.
Held Shared
Exclusive
Requested







Shared
OK
T2 wait on T1
Exclusive
T2 wait on T1
T2 wait on T1
Locking Granularity: size of the database item locked
database
table / index
page
row
column
Tradeoff between waiting time and overhead
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-25
L11
Transaction Mgmt.
Pitfalls of Lock-Based Protocols
 Consider the partial schedule
 Neither T3 nor T4 can make progress
 Such a situation is called a deadlock.
 To handle a deadlock one of T3 or T4 must be rolled back and its
locks released.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-26
L11
Transaction Mgmt.
Deadlocks
 Deadlock: Cycle of transactions waiting for locks to be
released by each other.
Two ways of dealing with deadlocks:
 Deadlock prevention
 E.g. priorities based on timestamps
 Deadlock detection
 A transaction in the cycle must be aborted by DBMS (since
transactions will wait forever)
 DBMS uses deadlock detection algorithms or timeout to deal with it
 Most commonly used
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-27
L11
Transaction Mgmt.
Isolation Levels
 Serializability provides a conservative definition of correctness
 For a particular application there might be many acceptable nonserializable schedules
 Requiring serializability might degrade performance
 In addition to serializable, every DBMS offers a variety of less
stringent isolation levels
 SERIALIZABLE is the most stringent (correct for all applications)
 Lower levels of isolation give better performance

Might allow incorrect schedules
 Might be adequate for some applications
 Performance requirements might not be achievable if schedules are
serializable
 Application programmer is responsible for choosing
appropriate level!!!
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-28
L11
Transaction Mgmt.
Anomalies in Non-Serializable
Schedules
 Dirty read (previous example – write lock given up early)
w1(x) r2(x) abort1
 Non-Repeatable Read (read lock given up early)
r1(x) w2(x) commit2 r1(x)
 Lost Update (result of non-repeatable read – read lock
given up early)
 Two transactions trying to deposit in the same bank account – the
deposit of transaction 2 is lost
r1(Bal) r2(Bal) w2(Bal) commit2 w1(Bal) commit1
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-29
L11
Transaction Mgmt.
ANSI Standard Isolation Levels
 Serializable — default according to SQL-standard…
 In practice, most systems have weaker default level!
 Repeatable read — only committed records to be read,
repeated reads of same record must return same value.
However, a transaction may not be serializable – it may find
some records inserted by a transaction but not find others.
 Read committed — only committed records can be read,
but successive reads of record may return different (but
committed) values. (most common in practice!)
 Read uncommitted - even uncommitted records may be read
 Lower degrees of consistency useful for gathering
approximate information about the database, e.g., statistics
for query optimizer.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-30
L11
Transaction Mgmt.
ANSI Standard Isolation Levels
 Serializable — default according to SQL-standard…
 In practice, most systems have weaker default level!
 Repeatable read — only committed records to be read,
repeated reads of same record must return same value.
However, a transaction may not be serializable – it may find
some records inserted by a transaction but not find others.
 Read committed — only committed records can be read,
but successive reads of record may return different (but
committed) values. (most common in practice!)
 Read uncommitted - even uncommitted records may be read
 Lower degrees of consistency useful for gathering
approximate information about the database, e.g., statistics
for query optimizer.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-31
L11
Transaction Mgmt.
Locks and Isolation Levels
 DBMS guarantees that each SQL statement is isolated
 Early (non-strict) lock release used to implement levels
 Short-term locks - held for duration of single statement
 Long-term locks - held until transaction completes (strict)
 At all levels, transactions obtain long-term write locks
 This means for isolation levels:
 READ UNCOMMITTED - no read locks (dirty reads possible since
transaction can read a write-locked item)
 READ COMMITTED - short-term read locks on rows (non-repeatable
reads possible since transaction releases read lock after reading)
 REPEATABLE READ - long-term read locks on rows (phantoms
possible)
 SERIALIZABLE - combination of long-term table, row, and index locks
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-32
L11
Transaction Mgmt.
Snapshot Isolation
 Not an ANSI standard isolation level, but used in some
major DBMS (Oracle, SQL Server 2005, PostgreSQL 8)
 Multiversion database: The old value of an item is not
overwritten when it is updated. Instead, new version created
 DBMS can construct, for any i, the state of an item as a result of the
execution of the first i transactions to commit
 Snapshot: The database state produced by the execution of the first
i transactions to commit (snapshots never change)
 No read locks necessary: a transaction reads all values from
latest snapshot at time it started. Thus, read/only
transactions do not wait.
 A transaction T that has updated x can commit if no other
transaction that concurrrently updated x has committed
 First-committer-wins
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-33
L11
Transaction Mgmt.
Discussion
 Locking is a conservative approach in which conflicts are
prevented. Disadvantages:
 Lock management overhead.
 Deadlock detection/resolution.
 Lock contention for heavily used objects.
 Snapshot Isolation
 Good performance (‚readers never block‘)
 BUT: Serializability not guaranteed:
r1(x) r1(y) r2(x) r2(y) w1(x) w2(y)
 Implementation complicated by need to maintain multiversion
database. Eventually old versions must be discarded (creates
problems for long-running transactions
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-34
L11
Transaction Mgmt.
Today’s Agenda
 Transaction Management
 Transaction Concept
 Serializability
 Concurrency Control Mechanisms
 Locking
 Snapshot Isolation
 Distributed Transactions
 2-Phase Commit Protocol
 Recovery
 Write-Ahead Logging
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-35
L11
Transaction Mgmt.
Distributed Transactions
 Information supporting a large enterprise is generally stored
on multiple computer systems scattered through the
enterprise
 E.g., Bank has a database at each branch recording local branch
data and a database at the main office recording aggregate data
 Each DBMS is independent, supporting local transactions at
a site
 With increased enterprise integration and automation,
global, or distributed, transactions, involving multiple sites
must also be supported
 A distributed transaction is a transaction that invokes local
transactions
 Eg. Bank transfer: invoke withdraw at one site and deposit at another
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-36
L11
Transaction Mgmt.
Example Distributed Transaction
 Incorporate transactions at multiple servers into a single
(distributed) transaction
 Not all distributed applications are legacy systems; some are built
from scratch as distributed systems
tx_begin;
order_part;
withdraw_part;
payment;
tx_commit;
Inventory
Application
DBMS 1 Site B
Billing
Application
DBMS 2 Site C
Site A
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-37
L11
Transaction Mgmt.
Correctness of Distributed Tx
 Although a distributed transaction is consistent, maintaining
ACID properties in a multi-database is an important issue
 Each subtransaction is locally ACID (e.g., local constraints
maintained, locally serializable), but we also strive for globally ACID
 Atomicity:
 Either all subtransactions of a distributed transaction commit or all
abort in spite of failures (e.g., message loss, site crash)
 requires the use of a two-phase commit protocol
 Isolation:
 Concurrently executing subtransactions are globally serializable
 Even if local sites are serializable, subtransactions of two distributed
transactions might be serialized in different orders at different sites

At site A, T1A is serialized before T2A
 At site B, T2B is serialized before T1B
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-38
L11
Transaction Mgmt.
2-Phase Commit Protocol
 Implemented as an exchange of messages between the
coordinator and the subordinates
 The subordinates are the individual subtransactions that participated
in the transaction
 The coordinator polls the subordinates to see if they want to commit
 When a distributed transaction wants to commit:
1. Coordinator sends prepare message to each subordinate
2. Each subordinate ‘prepares’ (flushes abort or prepare log entry) and
then sends a vote message (yes or no) to the coordinator
3. Coordinator decides on global outcome, logs it, and then sends a
commit/abort message to each subordinate
4. Subordinate write a commit or abort log entry and then send an
acknowledge message back to the coordinator
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-39
L11
Transaction Mgmt.
Two-Phase Commit Protocol
QuickTime™ and a
TIFF (Uncompressed) decompressor
are needed to see this picture.
COMP5138 "Relational Database
Managment Systems" - 2005 (U. Röhm)
Source: http://www.vermicelli.pasta.cs.uit.no/ipv6/students/andrer/doc/html/node18.html
11-40
L11
Transaction Mgmt.
Global Serializability
 Theorem: If all sites use a two-phase locking protocol and
a two-phase commit protocol is used, transactions are
globally serializable
 Transactions are serialized in the same order at every site – the
order in which the transactions committed
 Problems:
Global deadlock can be another result of implementing twophase locking and two-phase commit protocols
 Example: At site A, T1A is waiting for a lock held by T2A
At site B, T2B is waiting for a lock held by T1B
 Need for a distributed deadlock detection algorithm or timeout…
 2PC protocol is blocking and adds writes -> it does not scale
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-41
L11
Transaction Mgmt.
Today’s Agenda
 Transaction Management
 Transaction Concept
 Serializability
 Concurrency Control Mechanisms
 Locking
 Snapshot Isolation
 Distributed Transactions
 2-Phase Commit Protocol
 Recovery
 Write-Ahead Logging
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-42
L11
Transaction Mgmt.
Motivation for Recovery
 Atomicity:
 Transactions may abort (“Rollback”).
 Durability:
 What if DBMS stops running? (Causes?)
 Desired behavior
after system restarts:
 T1, T2 & T3
should be durable.
 T4 & T5 should be
aborted (effects not seen).
T1
T2
T3
T4
T5
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
crash!
11-43
L11
Transaction Mgmt.
Aborting a Transaction
 If a transaction Ti is aborted, all its actions have to be
undone.
 if Tj reads an object last written by Ti, Tj must be aborted as well!
 Most systems avoid such cascading aborts by releasing a
transaction’s locks only at commit time.
 If Ti writes an object, Tj can read it only after Ti commits.
 To undo the actions of an aborted transaction,DBMS
maintains a log in which every write is recorded.
 The log is also used to recover from system crashes:
all active Xacts at the time of the crash are aborted when the system
comes back up.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-44
L11
Transaction Mgmt.
Basic Idea: Logging
 Record REDO and UNDO information, for every update, in a
log.
 Sequential writes to log (put it on a separate disk).
 Minimal information written to log, so multiple updates fit in a single
log page.
 Log: An ordered list of REDO/UNDO actions
 Log record contains:
<TID, pageID, offset, length, old data, new data>
 additional control information
 Log records are chained together by Xact id, so it’s easy to
undo a specific Xact.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-45
L11
Transaction Mgmt.
Write-Ahead Logging (WAL)
 The Write-Ahead Logging Protocol:
1. Must force the log record for an update before the corresponding
data page gets to disk.
2. Must write all log records for a transaction before commit.

#1 guarantees Atomicity.
 #2 guarantees Durability.
 Log is often duplexed and archived on stable storage.
 All log related activities (and in fact, all CC related activities
such as lock/unlock, dealing with deadlocks etc.) are
handled transparently by the DBMS.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-46
L11
Crash Recovery
Transaction Mgmt.
Oldest log
rec. of Xact
active at
crash
Start from a checkpoint.
DBMS periodically writes a checkpoint
Smallest
recLSN in
dirty page
table after
Analysis
Three phases. Need to:
Figure out which transactions committed
since checkpoint, which failed (Analysis).
REDO all actions.
(repeat history)
UNDO effects of failed transactions.
Last chkpt
CRASH
A R U
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-47
L11
Transaction Mgmt.
Summary
 Concurrency control and recovery are among the most
important functions provided by a DBMS.
 Transactions: Databases guarantees ACID properties
 but implementations give freedom of lower isolation levels..
 Users do not need to worry about concurrency.
 System automatically schedules actions of different transactions in
such a way that the schedule is equivalent to a serial execution.
 Two main approaches: 2-Phase Locking and Snapshot Isolation
 DBMS provides several facilities for recovery
 Backup, Logging, Checkpointing
 Automatically restores the system to a consistent state after a crash.
 Consistent state: Only the effects of commited transactions seen.
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-48
L11
Transaction Mgmt.
Further Reading
 Transaction concept: Standard database texts, e.g. textbook
(Ramakrishnan/Gehrke) Chapter 16
 Main implementation techniques:
e.g. textbook (Ramakrishnan/Gehrke) Chapters 17 and 18
 Big picture: “Principles of Transaction Processing” by P.
Bernstein and E. Newcomer
 Theory: “Transactional Information Systems” by G. Weikum
and G. Vossen
 The gory details:
“Transaction Processing” by J. Gray and A. Reuter
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-49
L11
Transaction Mgmt.
Next Week
 Data Warehousing and OLAP
 ROLAP, MOLAP
 ETL Process
 Star Schema
 Warehousing Architectures
 Textbook
 Chapter 25
COMP5138 "Relational Database Managment Systems" - 2005 (U. Röhm)
11-50