Unit 8 - Transaction Mgmt
Overview of Transaction Management
Introduction
The concept of a transaction is the foundation for concurrent
execution and recovery from system failure in a DBMS.
What is a Transaction?
Any action that reads from and/or writes to a database.
Examples: A simple SELECT statement to generate a list of table
contents
A series of related UPDATE statements to change the values
of attributes in various tables
A series of INSERT statements to add rows to one or more
tables
A combination of SELECT, UPDATE, and INSERT statements
ACID Properties
A DBMS must ensure four important properties of
transactions to maintain data in the face of
concurrent access and system failures
Atomicity: A very important property guaranteed by the
DBMS for all transactions is that they are atomic. That
is, a user can think of a transaction as either carrying out
all of its actions or none of them.
The DBMS logs all actions so that it can undo the actions of
aborted transactions, and redo the actions of committed
transactions that had not yet been propagated to disk when
the system crashed.
ACID Properties (continued)
Durability: Once a transaction has successfully
completed, its effects should persist even if the system
crashes before all its changes are reflected on disk.
Consistency (serializability): Ensures that the
concurrent execution of several transactions yields
consistent results.
Isolation: Transactions are isolated, or protected, from
the effects of concurrently scheduled transactions.
Data used during the execution of one transaction cannot be
used by a second transaction until the first one has completed.
Transactions and Schedules
The read action of a transaction is denoted by R(A),
where A is the object being read. Similarly, a write action
is denoted by W(A).
A schedule represents an actual or potential execution
sequence of the actions of a set of transactions.
For example, the execution order for the actions of two
transactions T1 and T2:
T1              T2
R(A)
W(A)
                R(B)
                W(B)
R(C)
W(C)
Transactions and Schedules
We move forward in time as we go down from
one row to the next
A schedule that contains either an abort or a
commit for each transaction whose actions are
listed in it is called a complete schedule
If the actions of different transactions are not
interleaved – that is, transactions are executed
from start to finish, one by one – the schedule is
called a serial schedule
Interleaved Execution – Concurrent Execution
Concurrent Execution of Transactions
Why Concurrent Execution?
While one transaction is waiting for a page to be
read in from disk, the CPU can process another
transaction, thereby reducing the idle time of the CPU
and disks.
It also increases system throughput (i.e., the average
number of transactions completed in a given time).
Interleaved execution of a short transaction with a
long transaction usually allows the short transaction to
complete quickly.
In serial execution, a short transaction could get stuck
behind a long transaction, leading to delays in response
time.
Concurrent Execution of Transactions
A serializable schedule over a set S of
committed transactions is a schedule whose effect
on any consistent database instance is guaranteed
to be identical to that of some complete serial
schedule over S.
That is, the database instance that results from
executing the given schedule is identical to the
database instance that results from executing
the transactions in some serial order.
A Serializable Schedule
T1              T2
R(A)
W(A)
                R(A)
                W(A)
R(B)
W(B)
                R(B)
                W(B)
COMMIT
                COMMIT
Example
Consider two transactions (Xacts):

T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END

• The 1st xact transfers $100 from B's account to A's.
• The 2nd credits both accounts with 6% interest.

Assume at first A and B each have $1000. What are the
legal outcomes of running T1 and T2?
• T1 ; T2 (A=1166, B=954)
• T2 ; T1 (A=1160, B=960)
• In either case, A+B = $2000 * 1.06 = $2120
• There is no guarantee that T1 will execute before T2 or
vice-versa, if both are submitted together.
Example (Contd.)
Consider a possible interleaved schedule:

T1: A=A+100,                        B=B-100
T2:             A=1.06*A,                         B=1.06*B

• This is OK (same as T1;T2). But what about:

T1: A=A+100,                                      B=B-100
T2:             A=1.06*A, B=1.06*B

• Result: A=1166, B=960; A+B = 2126, the bank loses $6!

The DBMS's view of the second schedule:

T1: R(A), W(A),                                   R(B), W(B)
T2:              R(A), W(A), R(B), W(B)
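To make the arithmetic concrete, here is a minimal sketch (in Python, which the slides themselves do not use; all names are illustrative) that replays the two serial orders and the bad interleaving over A = B = $1000:

def run(schedule):
    # Replay a list of actions against fresh accounts A = B = 1000.
    acct = {"A": 1000.0, "B": 1000.0}
    for action in schedule:
        action(acct)
    return acct

def t1_add(acct): acct["A"] += 100      # T1: A = A + 100
def t1_sub(acct): acct["B"] -= 100      # T1: B = B - 100
def t2_int_a(acct): acct["A"] *= 1.06   # T2: A = 1.06 * A
def t2_int_b(acct): acct["B"] *= 1.06   # T2: B = 1.06 * B

print(run([t1_add, t1_sub, t2_int_a, t2_int_b]))  # T1;T2 -> A=1166, B=954 (up to float rounding)
print(run([t2_int_a, t2_int_b, t1_add, t1_sub]))  # T2;T1 -> A=1160, B=960
# The bad interleaving: all of T2 runs between T1's two actions.
print(run([t1_add, t2_int_a, t2_int_b, t1_sub]))  # A=1166, B=960 -> A+B = 2126

Only the first two runs correspond to a serial order; the third leaves A+B at $2126 instead of $2120.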
Anomalies due to Interleaved Execution
Reading Uncommitted Data (WR conflicts)
Unrepeatable Reads (RW conflicts)
Overwriting Uncommitted Data (WW conflicts)

Overwriting Uncommitted Data (WW conflicts, "blind writes"):

T1: W(A),                        W(B), C
T2:            W(A), W(B), C
Anomalies with Interleaved Execution
Reading Uncommitted Data (WR conflicts, "dirty reads"):

T1: A=A+100,                        B=B-100
T2:             A=1.06*A, B=1.06*B

Unrepeatable Reads (RW conflicts):

T1: R(A),                        W(A), C
T2:            R(A), W(A), C
Schedules Involving Aborted Transactions
Extended definition of a serializable schedule:
A serializable schedule over a set S of transactions is
a schedule whose effect on any consistent database
instance is guaranteed to be identical to that of some
complete serial schedule over the set of committed
transactions in S.

For example, suppose the transfer transaction T1 (A=A+100,
B=B-100) aborts after the interest transaction T2 has read
the value T1 wrote for A and committed:

T1: A=A+100,                                          Abort
T2:             A=1.06*A, B=1.06*B, commit

The DBMS's view, an unrecoverable schedule: T2 read T1's
uncommitted write of A and committed, so T1's abort cannot
be cleanly undone:

T1: R(A), W(A),                                              Abort
T2:              R(A), W(A), R(B), W(B), Commit
Concurrency Control with Locking Methods
Lock
Guarantees exclusive use of a data item to
a current transaction
Required to prevent another transaction
from reading inconsistent data
Lock manager
Tracks lock requests, grants locks on
database objects when they become
available
Lock Granularity
Indicates the level of lock use
Locking can take place at the following
levels:
Database
Table
Page
Row
Field (attribute)
Lock Granularity (continued)
Database-level lock: The entire database is locked
Table-level lock: The entire table is locked
Page-level lock: The entire disk page is locked
Lock Granularity (continued)
Row-level lock
Allows concurrent transactions to access
different rows of the same table, even if the
rows are located on the same page
Field-level lock
Allows concurrent transactions to access the
same row, as long as they require the use
of different fields (attributes) within that
row
Lock Types
Binary lock
Has only two states: locked (1) or unlocked (0)
Exclusive lock
Access is specifically reserved for the transaction
that locked the object
Must be used when the potential for conflict
exists
Shared lock
Concurrent transactions are granted Read access
on the basis of a common lock
Lock-Based Concurrency Control
Two-Phase Locking (2PL) Protocol:
Each Xact must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on
object before writing.
If an Xact holds an X lock on an object, no other
Xact can get a lock (S or X) on that object.
DBMS internally enforces the above locking
protocol.
Two phases: acquiring locks, and releasing them;
these are also called the growing phase and the shrinking phase
Strict 2PL
• 2PL allows only serializable schedules but is subject to
cascading aborts.
• Example: a rollback of T1 requires a rollback of T2!

T1: R(A), W(A),                                   Abort
T2:              R(A), W(A), R(B), W(B)

• To avoid cascading aborts, use Strict 2PL.
• Strict Two-Phase Locking (Strict 2PL) Protocol:
  – Same as 2PL, except:
  – A transaction releases no locks until it completes.
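A minimal lock-manager sketch (Python; the class and method names are illustrative assumptions, not part of the slides or any real DBMS) showing how S/X compatibility and the Strict 2PL release rule could be enforced:

class LockManager:
    def __init__(self):
        self.shared = {}     # object -> set of Xact ids holding S locks
        self.exclusive = {}  # object -> Xact id holding the X lock

    def acquire_s(self, xact, obj):
        holder = self.exclusive.get(obj)
        if holder is not None and holder != xact:
            return False     # conflicts with another Xact's X lock: must wait
        self.shared.setdefault(obj, set()).add(xact)
        return True

    def acquire_x(self, xact, obj):
        holder = self.exclusive.get(obj)
        other_readers = self.shared.get(obj, set()) - {xact}
        if (holder is not None and holder != xact) or other_readers:
            return False     # any other S or X holder blocks an X request
        self.exclusive[obj] = xact
        return True

    def release_all(self, xact):
        # Strict 2PL: called only when the Xact commits or aborts.
        for readers in self.shared.values():
            readers.discard(xact)
        for obj in [o for o, h in self.exclusive.items() if h == xact]:
            del self.exclusive[obj]

lm = LockManager()
assert lm.acquire_x("T1", "A")          # T1 gets an X lock to write A
assert not lm.acquire_s("T2", "A")      # T2 blocks: no dirty read of A
lm.release_all("T1")                    # T1 commits, releasing everything
assert lm.acquire_s("T2", "A")          # now T2 may read the committed value

Because T2 cannot touch A until T1 commits, the cascading-abort schedule above cannot arise under Strict 2PL.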
Two-Phase Locking to Ensure Serializability
Defines how transactions acquire and relinquish
locks
Guarantees serializability, but it does not prevent
deadlocks
Growing phase, in which a transaction acquires
all the required locks without unlocking any
data
Shrinking phase, in which a transaction
releases all locks and cannot obtain any new
lock
Two-Phase Locking to Ensure Serializability (continued)
Governed by the following rules:
Two transactions cannot have conflicting locks
No unlock operation can precede a lock
operation in the same transaction
No data are affected until all locks are
obtained—that is, until the transaction is in its
locked point
Fig: Two-Phase Locking Protocol
Deadlocks
Condition that occurs when two transactions wait
for each other to unlock data
Possible only if one of the transactions wants to
obtain an exclusive lock on a data item
No deadlock condition can exist among shared
locks
Deadlocks are controlled through:
Prevention
Detection
Avoidance
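Deadlock detection is commonly described in terms of a waits-for graph: an edge Ti -> Tj means Ti is waiting for a lock that Tj holds, and a cycle means deadlock. A small sketch (Python; names are illustrative):

def has_deadlock(waits_for):
    # Depth-first search for a cycle in the waits-for graph.
    visited, on_stack = set(), set()

    def dfs(xact):
        visited.add(xact)
        on_stack.add(xact)
        for waited_on in waits_for.get(xact, ()):
            if waited_on in on_stack:              # back edge: cycle found
                return True
            if waited_on not in visited and dfs(waited_on):
                return True
        on_stack.discard(xact)
        return False

    return any(dfs(x) for x in list(waits_for) if x not in visited)

# T1 waits for T2 and T2 waits for T1: the classic deadlock.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_deadlock({"T1": ["T2"], "T2": []}))      # False

A DBMS that detects such a cycle typically aborts one transaction in the cycle to break the deadlock.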
The Transaction Log
Stores:
A record for the beginning of each transaction
For each transaction component (SQL statement):
Type of operation being performed (update, delete,
insert)
Names of objects affected by the transaction (the name
of the table)
“Before” and “after” values for updated fields
Pointers to previous and next transaction log entries for
the same transaction
The ending (COMMIT) of the transaction
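As a rough illustration of the fields listed above, one update record might look like this (a Python sketch; the field names are assumptions, not any real DBMS's log format):

from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    lsn: int                        # position of this record in the log
    xact_id: str                    # transaction that performed the action
    op: str                         # 'BEGIN', 'UPDATE', 'INSERT', 'DELETE', 'COMMIT'
    table: Optional[str] = None     # name of the object affected
    before: Optional[dict] = None   # "before" value, used to undo
    after: Optional[dict] = None    # "after" value, used to redo
    prev_lsn: Optional[int] = None  # previous log entry of the same transaction

log = [
    LogRecord(lsn=1, xact_id="T1", op="BEGIN"),
    LogRecord(lsn=2, xact_id="T1", op="UPDATE", table="accounts",
              before={"balance": 1000}, after={"balance": 1100}, prev_lsn=1),
    LogRecord(lsn=3, xact_id="T1", op="COMMIT", prev_lsn=2),
]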
Performance of Locking
Lock-based schemes use two basic mechanisms:
blocking and aborting. Both mechanisms involve a
performance penalty.
Blocked transactions hold locks that force other
transactions to wait. Delays due to blocking affect
throughput.
Aborting a transaction wastes the work done thus
far by that transaction.
A deadlock represents an extreme instance of
blocking.
In practice, fewer than 1% of transactions are involved in a
deadlock, so aborts due to deadlock are relatively rare.
Fig: Lock thrashing (throughput vs. number of active transactions)
Performance of Locking (contd.)
The first few transactions are unlikely to
conflict, and throughput rises in proportion to
the number of active transactions.
As more transactions run, delays due to blocking
increase, and throughput grows more slowly than
the number of active transactions.
Eventually there comes a point where adding
another active transaction actually reduces
throughput; we say that the system thrashes
at this point.
Performance of Locking (contd.)
If a system begins to thrash, the database
administrator should reduce the number of
transactions allowed to run concurrently.
Throughput can be increased in three ways:
By locking the smallest-sized objects possible
By reducing the time that transactions hold locks
By reducing hot spots. A hot spot is a database
object that is frequently accessed and modified,
and causes a lot of blocking delays.
Transaction Management with SQL
Transaction support in SQL is provided by two
statements: COMMIT and ROLLBACK.
In SQL:1999, two new features were added to
support applications that involve long-running
transactions:
Savepoints – for identifying a point in a transaction
and selectively rolling back operations carried out after
this point.
Chained transactions – minimize the overhead of
running several transactions one after the other:
a transaction is committed or rolled back, and another
transaction is initiated immediately.
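A runnable sketch of these statements in action (using Python's sqlite3 purely for illustration; the table and savepoint names are made up):

import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manage xacts by hand
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance REAL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES ('A', 1000), ('B', 1000)")
conn.execute("SAVEPOINT before_transfer")
conn.execute("UPDATE accounts SET balance = balance + 100 WHERE name = 'A'")
conn.execute("ROLLBACK TO SAVEPOINT before_transfer")  # undo only the UPDATE
conn.execute("COMMIT")                                 # the INSERT survives

print(conn.execute("SELECT name, balance FROM accounts").fetchall())
# [('A', 1000.0), ('B', 1000.0)]

Note that ROLLBACK TO SAVEPOINT undoes only the work done after the savepoint; the enclosing transaction stays open until COMMIT.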
Introduction to Crash Recovery
Recovery Manager
Upon recovery from a crash:
Ensures transaction atomicity and durability
Undoes the actions of transactions that did not commit
Redoes the lost actions of committed transactions
Must bring the DB to a consistent transactional state,
restoring work lost during system failures or media failures
Recovery Manager maintains log information
during normal execution of transactions for use
during crash recovery
The Log
Log consists of “records” that are written
sequentially.
Stored on a separate disk from the DB
Typically chained together by Xact id
Log is often duplexed and archived on stable storage
(guaranteed to survive system crashes and media failures)
All log-related activities are handled transparently by the
DBMS.
Write Ahead Logging (WAL) Property
Log entries describing a change to the database are written
to stable storage before the change itself is made.
Implemented via a handshake between the log manager and
the buffer manager.
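A minimal sketch of that handshake (Python; LogManager and BufferManager are illustrative names, not a real API): the buffer manager refuses to write a dirty page until the log manager has forced the records describing its changes.

class LogManager:
    def __init__(self):
        self.records = []
        self.flushed_lsn = -1            # highest LSN known to be on disk

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1     # this record's LSN

    def flush_to(self, lsn):
        # Force the log up to (and including) lsn to stable storage.
        self.flushed_lsn = max(self.flushed_lsn, lsn)

class BufferManager:
    def __init__(self, log):
        self.log = log

    def flush_page(self, page_lsn):
        # WAL: log records for the page's changes must reach disk first.
        if self.log.flushed_lsn < page_lsn:
            self.log.flush_to(page_lsn)
        # ... only now is it safe to write the page itself to disk ...

log = LogManager()
buf = BufferManager(log)
lsn = log.append({"xact": "T1", "page": "P1", "op": "update"})
buf.flush_page(lsn)                      # forces the log record first
assert log.flushed_lsn >= lsn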
Stealing Frames and Forcing Pages
Two approaches with respect to writing objects:
Steal approach: Changes made to an object O in the
buffer pool by a transaction T can be written to disk before T
commits.
Such writes are executed when another transaction wants to
bring in a page, and we say that the second transaction steals a
frame from T.
Force approach: When a transaction commits, all the
changes it has made to objects in the buffer pool are
immediately forced to disk.
From the standpoint of implementing a recovery
manager, it is simplest to use a buffer manager
with a no-steal, force approach
Stealing Frames and Forcing Pages
If a no-steal approach is used, we do not have to undo
the changes of an aborted transaction, and if a force
approach is used, we do not have to redo the
changes of committed transactions.
However, these policies have important drawbacks:
The no-steal approach assumes that all pages modified by
ongoing transactions can be accommodated in the buffer
pool; in the presence of large transactions, this assumption
is unrealistic.
The force approach results in excessive page I/O costs: if a
highly used page is updated in succession by 20
transactions, it is written to disk 20 times.
For these reasons, most systems use a steal, no-force approach.
Logging Continued
A log record must go to disk before the changed
page it describes!
As with CC-related activities such as lock/unlock and
deadlock handling, this is handled transparently by the DBMS.
ARIES Recovery
The ARIES recovery algorithm is designed to work with a steal, no-force
approach. There are three phases in the ARIES recovery protocol:
Analysis: Scan the log forward (from the most recent checkpoint)
to identify all Xacts that were active, and all dirty pages in the
buffer pool at the time of the crash.
Redo: Redoes all updates to dirty pages in the buffer pool, as
needed, to ensure that all logged updates are in fact carried out and
written to disk.
Undo: The writes of all Xacts that were active at the crash are
undone (by restoring the before value of the update, as found in the
log), working backwards in the log.
Finally, the database reflects only the actions of committed
transactions
Some care must be taken to handle the case of a crash occurring
during the recovery process!
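A toy sketch of the three passes (Python; real ARIES also uses checkpoints, a dirty page table, LSNs on pages, and compensation log records, none of which are modeled here):

def recover(log, db):
    # Analysis: scan forward to find Xacts still active at the crash ("losers").
    losers = set()
    for rec in log:
        if rec["op"] == "BEGIN":
            losers.add(rec["xact"])
        elif rec["op"] in ("COMMIT", "ABORT"):
            losers.discard(rec["xact"])
    # Redo: repeat history, reapplying every logged update (winners and losers).
    for rec in log:
        if rec["op"] == "UPDATE":
            db[rec["obj"]] = rec["after"]
    # Undo: walk backwards, restoring the before-values of the losers' updates.
    for rec in reversed(log):
        if rec["op"] == "UPDATE" and rec["xact"] in losers:
            db[rec["obj"]] = rec["before"]
    return db

log = [
    {"op": "BEGIN",  "xact": "T1"},
    {"op": "UPDATE", "xact": "T1", "obj": "A", "before": 1000, "after": 1100},
    {"op": "COMMIT", "xact": "T1"},
    {"op": "BEGIN",  "xact": "T2"},
    {"op": "UPDATE", "xact": "T2", "obj": "B", "before": 1000, "after": 900},
    # crash here: T2 never committed
]
print(recover(log, {"A": 0, "B": 0}))  # {'A': 1100, 'B': 1000}

After recovery the database reflects T1's committed update but none of T2's.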