CM20145 Database Design

Transcript CM20145 Database Design

Dr Alwyn Barry
Dr Joanna Bryson
CM20145
Recovery +
Intro. to Concurrency




Last Time…
Transaction Concepts


ACID
Possible States
Schedules
Serializability



Conflict
View
Others
Testing for Serializability



Precedence Graphs
Conflict
View
Now: Recovery & Some Concurrency
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Recovery Algorithms
 Recovery algorithms are techniques to
ensure database consistency and
transaction atomicity and durability
despite failures.
 Recovery algorithms have two parts
1. Actions taken during normal transaction
processing to ensure enough information
exists to recover from failures.
2. Actions taken after a failure to recover the
database contents to a state that ensures
atomicity, consistency and durability.
Failure – Classifications
 Transaction failure:
 Logical errors: transaction cannot complete due to
some internal error condition.
 System errors: the database system must
terminate an active transaction due to an error
condition (e.g., deadlock – covered in Lecture 15).
 System crash: a power failure or other
hardware or software failure causes the
system to crash.
 Fail-stop assumption: non-volatile storage
contents are assumed to not be corrupted by system
crash.
 Database systems have numerous integrity checks to
prevent corruption of disk data.
 Disk failure: a head crash or similar disk
failure destroys all or part of disk storage.
 Destruction is assumed to be detectable.
 Disk drives use checksums to detect failures.
Recoverability
 How do we address failures
when we are running
concurrent transactions?
 Recoverable schedule: if a transaction Tj
reads a data item previously written by a
transaction Ti , the commit operation of Ti
appears before the commit operation of Tj
 The schedule above is not recoverable if T9
commits immediately after the read.
 If T8 should abort, T9 would have read (and
possibly shown to the user) an inconsistent
database state. Hence database must ensure
that schedules are recoverable.
©Silberschatz, Korth and Sudarshan
Modifications & additions by J Bryson
Cascading Rollbacks
 Cascading rollback – a
single transaction failure
leads to a series of
transaction rollbacks.
 Consider the following
schedule where none of the
transactions has yet
committed (so the schedule
is recoverable).
 If T10 fails, T11 and T12 must
also be rolled back.
 Can lead to the undoing of
a significant amount of
work.
Cascadeless Schedules
 Cascadeless schedules
— cascading rollbacks
cannot occur; for each pair
of transactions Ti and Tj
such that Tj reads a data
item previously written by
Ti, the commit operation
of Ti appears before the
read operation of Tj
 Every cascadeless schedule
is also recoverable.
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Storage Hierarchy (Lecture 9)
Storage Structure
 Volatile storage:
 Does not survive system crashes.
 Examples: main memory, cache memory.
 Nonvolatile storage:
 Survives system crashes.
 Examples: disk, tape, flash memory,
non-volatile (battery backed up) RAM.
 Stable storage:
 A mythical form of storage that survives all
failures.
 Approximated by maintaining multiple
copies on distinct nonvolatile media.
Stable-Storage Implementation
 Maintain multiple copies of each
block on separate disks (&
locations…)
 Failure during data transfer can still
result in inconsistent copies. Block
transfer can result in:
 Successful completion,
 Partial failure – destination block
has incorrect information, or
 Total failure – destination block was
never updated.
Data Access
 Physical blocks: blocks residing on the disk.
 Buffer blocks: blocks residing temporarily in
main memory.
 Block movements between disk and main
memory are initiated through the following
two operations:
 input(B) transfers the physical block B to main
memory.
 output(B) transfers the buffer block B to the disk,
and replaces the appropriate physical block there.
 Each transaction Ti has its private work-area in
which local copies of all data items accessed
and updated by it are kept.
 Ti's local copy of a data item X is called xi.
 We assume, for simplicity, that each data item
fits in, and is stored inside, a single block.
Sample Data Access Diagram
buffer
input(A)
Buffer Block A
x
Buffer Block B
Y
A
output(B)
read(X)
write(Y)
x2
x1
B
y1
work area
of T1
memory
work area
of T2
disk
Data Access (Cont.)
 Transaction transfers data items between
system buffer blocks and its private workarea.
 Transactions
 Perform read(X) while accessing X for the first time;
 All subsequent accesses are to the local copy.
 After last access, transaction executes write(X).
 output(BX) need not immediately follow
write(X). System can perform the output
operation when it deems fit.
 Reminder: Volatile memory is faster, but more
vulnerable!
Protecting Storage
(FYI, not for exam)
 During data transfer two copies of each block:
1. Write the information onto the first physical block.
2. When the first write successfully completes, write the
same information onto the second physical block.
3. The output is completed only after the second write
successfully completes.
 To recover from failure:
1. First find inconsistent blocks:
1. Expensive solution: Compare the 2 copies of every disk block.
2. Better solution:



Record in-progress disk writes on non-volatile storage (Non-volatile
RAM or special area of disk).
Use this information during recovery to find blocks that may be
inconsistent, and only compare copies of these.
Used in hardware RAID systems.
2. If either copy of an inconsistent block is detected to have
an error (bad checksum), overwrite it by the other copy. If
both have no error, but are different, overwrite the second
block by the first block.
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Recovery and Atomicity
 Modifying the database without
ensuring that the transaction will
commit may leave the database in an
inconsistent state.
 To ensure atomicity despite failures, we
first output information describing the
modifications to stable storage without
modifying the database itself.
 Two approaches shown here:
 shadow-paging (naïve), and
 log-based recovery.
 We’ll assume that transactions run
serially (book goes further if you’re
curious).
Shadow Database
 Assume only one transaction is active at a time.
 db_pointer always points to the current
consistent copy of the database.
 Updates made on a copy of the database.
Pointer moved to updated copy after transaction
reaches partial commit & pages written.
 On transaction failure, old consistent copy
pointed to by db_pointer is used, and the
shadow copy is deleted.
Assumes disks don’t fail.
Useful for text editors,
but extremely inefficient
for large database -executing a single
transaction requires
copying the entire
database!
Log-Based Recovery
 A log is kept on stable storage.
 A log is a sequence of log records, records
the update activities on the database.
 When transaction Ti starts, it registers itself by writing a
<Ti start> log record.
 Before Ti executes write(X), a log record <Ti, X, V1, V2>
is written, where V1 is the value of X before the write, and
V2 is the value to be written to X.
 When Ti finishes its last statement, the log record
<Ti commit> is written.
 Assume here that log records are written
directly to stable storage (that is, they are
not buffered).
 Two approaches using logs:
 Deferred database modification.
 Immediate database modification.
Deferred Database Modification
 Deferred database modification scheme
records all modifications to the log, but defers
all writes to after partial commit.
 Transaction starts by writing <Ti start>
record to log.
 A write(X) operation results in a log record
<Ti, X, V> being written, where V is the new
value for X.
 Note: old value is not needed for this scheme.
 The real write is not performed on X at this
time, but is deferred.
 When Ti partially commits, <Ti commit> is
written to the log.
 Finally, the log records are used to actually
execute the previously deferred writes.
Assumes that transactions execute serially.
Deferred DB Modification (2)
 During recovery, a transaction needs to be
redone if and only if both <Ti start>
and<Ti commit> are there in the log.
 Redoing a transaction Ti ( redoTi) sets the
value of all data items updated by the
transaction to the new values.
 Crashes can occur while:
 the transaction is executing the original updates,
or
 while recovery action is being taken
 Example: T0 and T1 (T0 executes before T1):
T0: read (A)
T1 : read (C)
A: - A - 50
C:- C- 100
write (A)
write (C)
read (B)
B:- B + 50
write (B)
Deferred DB Modification (3)
 Consider a log at three instances of time.
 If log on stable storage at time of crash:
(a) No redo actions need to be taken.
(b) redo(T0) must be performed since <T0 commit> is
present.
(c) redo(T0) must be performed followed by redo(T1)
since <T0 commit> and <Ti commit> are present.
Immediate DB Modification
 The immediate database modification
scheme allows database updates of an
uncommitted transaction to be made as the
writes are issued.
 Since undoing may be needed, update logs
must have both old value and new value.
 Update log record must be written before
database item.
 Log record must be output directly to stable storage.
 Can postpone log record output, so long as prior to
execution of an output(B) operation, all log records
corresponding to items B are flushed to stable
storage.
 Output of updated blocks can take place at any
time before or after transaction commit.
 Order in which blocks are output can be
different from the order they are written.
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & Immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Checkpoints
 Problems with log-based recovery
procedure:
1. Searching the entire log is time-consuming.
2. We might unnecessarily redo transactions
which have already output their updates to
the database.
 Can streamline recovery procedure by
periodically performing checkpointing.
1. Output all log records currently residing in
main memory onto stable storage.
2. Output all modified buffer blocks to the
disk.
3. Write a log record <checkpoint> onto
stable storage.
Checkpoints & Recovery
 Need consider only transactions that didn’t
commit before checkpoint.
 Simple algorithm if serialized transactions:
1. Scan backwards from end of log to find the most
recent <checkpoint> record.
2. Continue scanning backwards till a record <Ti start>
is found.
3. Need only consider the part of log following above
start record. Earlier part of log can be ignored
during recovery, and can be erased whenever
desired.
4. For all transactions with no <Ti commit>, execute
undo(Ti).
5. Scanning forward in the log, for all transactions
starting from Ti or later with a <Ti commit>,
execute redo(Ti).
Example of Checkpoints
Tf
Tc
T1
T2
T3
T4
checkpoint
system failure
 T1 can be ignored (updates already
output to disk due to checkpoint)
 T2 and T3 redone.
 T4 undone
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & Immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Concurrency
 Goal – to develop concurrency control
protocols that will ensure serializability.
 These protocols will impose a discipline
that avoids nonseralizable schedules.
 A common concurrency control protocol
uses locks.
 While one transaction is accessing a data
item, no other transaction can modify it.
 Require a transaction to lock the item before
accessing it.
 Topic of Lecture 15!
 But we’ll introduce locking now.
Lock-Based Protocols



A lock is a mechanism to control
concurrent access to a data item.
Lock requests are made to
concurrency-control manager.
Transaction can proceed only after
request is granted.
Data items can be locked in two
modes:
1. exclusive (X) mode. Data item can be both
read and written. X-lock is requested using
the lock-X instruction.
2. shared (S) mode. Data item can only be
read. S-lock is requested using lock-S.
Lock-Based Protocols (2)
 Lock-compatibility matrix:
 A transaction may be granted a lock on an
item if the requested lock is compatible with
locks already held on the item by other
transactions
 Any number of transactions can hold shared
locks on an item, but if any transaction holds
an exclusive on the item no other transaction
may hold any lock on the item.
 If a lock cannot be granted, the requesting
transaction is made to wait till all incompatible
locks held by other transactions have been
released. The lock is then granted.
Lock-Based Protocols (3)
 Example of a transaction performing locking:
T2: lock-S(A);
read (A);
unlock(A);
lock-S(B);
read (B);
unlock(B);
display(A+B)
 Locking as above is not sufficient to guarantee
serializability — if A and B get updated between
the read of A and B, and the display of their
sum, that sum would be out of date.
 A locking protocol is a set of rules followed
by all transactions while requesting and
releasing locks. Locking protocols restrict the
set of possible schedules.
Pitfalls of Lock-Based Protocols
 Consider the partial schedule:
 Neither T3 nor T4 can make progress.
 Executing lock-S(B) causes T4 to wait for T3 to
release its lock on B, while executing lock-X(A)
causes T3 to wait for T4 to release its lock on A.
 Such a situation is called a deadlock.
 To handle a deadlock one of T3 or T4 must be
rolled back and its locks released.
Pitfalls of Locking (2)
 The potential for deadlock exists in most
locking protocols.
 Deadlocks are a necessary evil.
 Starvation is also possible if the
concurrency-control manager is badly
designed. For example:
 A transaction may be waiting for an X-lock
on an item, while a sequence of other
transactions request and are granted an
S-lock on the same item.
 The same transaction is repeatedly rolled
back due to deadlocks.
 Concurrency-control managers can be
designed to prevent starvation.
The Two-Phase Locking Protocol
 This is a protocol which ensures conflictserializable schedules.
 Phase 1: Growing Phase
 transaction may obtain locks
 transaction may not release locks
 Phase 2: Shrinking Phase
 transaction may release locks
 transaction may not obtain locks
 The protocol assures serializability. It
can be proved that the transactions can
be serialized in the order of their lock
points (i.e. the point where a
transaction acquired it’s final lock.)
The Two-Phase Locking Protocol
 Two-phase locking does not ensure
freedom from deadlocks
 Cascading roll-back is possible under
two-phase locking.
 Avoided with strict two-phase locking.
 Transaction must hold all its exclusive locks
till it commits/aborts.
 Rigorous two-phase locking is even
stricter:
 All locks are held till commit/abort.
 This lets protocol transactions be serialized
in the order in which they commit.
Lock Conversions
 Two-phase locking with lock conversions:
–
First Phase:
 can acquire a lock-S on item
 can acquire a lock-X on item
 can convert a lock-S to a lock-X (upgrade)
–
Second Phase:
 can release a lock-S
 can release a lock-X
 can convert a lock-X to a lock-S (downgrade)
 This protocol assures serializability. But still
relies on the programmer to insert the various
locking instructions.
Overview
 Recovery





Cascading Rollbacks
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & Immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Weak Levels of Consistency
 Degree-two consistency: differs from
two-phase locking in that S-locks may
be released at any time, and locks may
be acquired at any time
 X-locks must be held till end of transaction
 Serializability is not guaranteed,
programmer must ensure that no erroneous
database state will occur]
 Cursor stability:
 For reads, each tuple is locked, read, and
lock is immediately released
 X-locks are held till end of transaction
 Special case of degree-two consistency
FYI only – you aren’t responsible for this slide’s content .
Weak Consistency in SQL
 SQL allows non-serializable executions
 Serializable: is the default
 Repeatable read: allows only committed
records to be read, and repeating a read
should return the same value (so read locks
should be retained)
 However, the phantom phenomenon need not be
prevented
 T1 may see some records inserted by T2, but
may not see others inserted by T2
 Read committed: same as degree two
consistency, but most systems implement it
as cursor-stability
 Read uncommitted: allows even
uncommitted data to be read
FYI only – you aren’t responsible for this slide’s content .
Summary
 Recovery





Cascading & Its Avoidance
Storage & Data Access
Algorithms for:
 Shadow paging,
 Log-based recovery
 Deferred & immediate DB modifications.
Checkpoints
Intro. to Concurrency

Introduction to Locking

Weaker Levels of Consistency (for interest only)
 Pitfalls of Locking
 The Two-Phase Locking Protocol
Next: Concurrency Control
Reading & Exercises
 Reading
 Silberschatz Ch: 17.1-6.
 Connolly & Begg: 20.3
 You will need the rest of 20.2 for next
week, so if you want to stay in order go
ahead and read that.
 Exercises:
 Silberschatz 17.1-7
 Connolly & Begg 20.13-15, 20.27

CM20145 Database Design

Transcript CM20145 Database Design

Directory