transactions_and_recovery

Download Report

Transcript transactions_and_recovery

Advanced Databases CG096
Lecture 3: Transactions and
Recovery

Transactions (ACID)
 Recovery
Nick Rossiter [Emma-Jane Phillips-Tait]
1
Content
 What is a Transaction?
 ACID properties
 Transaction Processing
 Database Recovery
2
1. What is a Transaction?
Definition
 The sequence of logically linked actions that access a common
database

often used in online or live systems
Examples
 Airlines operation


ATM Cash operation


Check credentials. Check money. Withdraw amount from account.
Pay amount.
Credit card sale


Reserve an airline seat. Buy an airline ticket. Assemble cabin crew.
Fly.
Log on with the card. Verify credit card details. Check money.
Deliver goods. Issue withdrawal.
Internet sale

Request an item from an on-line catalogue. Check availability.
Provide credit card details. Check details. Issue order. Dispatch.
Issue withdrawal.
3
Origin and Needs for Transactions in DB
4
Automated Teller Machines (ATM)
5
2. A.C.I.D. properties

Transactions have 4 main properties
 Atomicity - all or nothing
 Consistency - preserve database integrity
 Isolation - execute as if they were run alone
 Durability - results are not lost by a failure
6
2.1 Atomicity



All-or-nothing, no partial results. An event either happens and
is committed or fails and is rolled back.
 e.g. in a money transfer, debit one account, credit the other.
Either both debiting and crediting operations succeed, or
neither of them do.
 Transaction failure is called Abort
Commit and abort are irrevocable actions. There is no undo for
these actions.
An Abort undoes operations that have already been executed
 For database operations, restore the data’s previous value
from before the transaction (Rollback-it); a Rollback
command will undo all actions taken since the last commit
for that user.
 But some real world operations are not undoable.
Examples - transfer money, print ticket, fire missile
7
2.2 Consistency



Every transaction should maintain DB consistency
 Referential integrity - e.g. each order references an existing
customer number and existing part numbers
 The books balance (debits = credits, assets = liabilities)
Consistency preservation is a property of a transaction, not of
the database mechanisms for controlling it (unlike the A, I,
and D of ACID)
If each transaction maintains consistency,
then a serial execution of transactions does also
8
2.3 Isolation
Intuitively, the effect of a set of transactions should
be the same as if they ran independently.

Formally, an interleaved execution of transactions is
serializable if its effect is equivalent to a serial one.

Implies a user view where the system runs each user’s
transaction stand-alone.
Of course, transactions in fact run with lots of
concurrency, to use device parallelism – this will be
covered later.

Transactions can use common data (shared data)

They can use the same data processing mechanisms
(time sharing)
9
2.4 Durability



When a transaction commits, its results will survive failures
(e.g. of the application, OS, DB system … even of the disk).
Makes it possible for a transaction to be a legal contract.
Implementation is usually via a log
 DB system writes all transaction updates to a log file
 to commit, it adds a record “commit(Ti)” to the log
 when the commit record is on disk, the transaction is
committed.
 system waits for disk ack before acknowledging to user
10
3. Transaction Processing
Can be automatic (controlled by the RDBMS) or
programmatic (programmed using SQL or other
supported programming languages, like PL/SQL)

Identifying critical points for database changes


Preparation for control over transaction progress


through set of database states
using labels of transaction states
Management of the transactions

using explicit manipulation of transaction states and enforcing
transaction operations
11
3.1 Database State and Changes
D1, D2 - Logically consistent states of the database data
TTransaction for changing the database
t1, t2 - Absolute time before and after the transaction
12
Transaction Parameters

diff D = D2  D1 can have different scale:

single data item in one memory area
 many items across several files and databases
 structural changes such as new database schema
 t = t2 - t1 is the time for executing T



T occupies real physical resources
 between D1 and D2 there may be intermediate states D11, D12 …;

some of them can be inconsistent

the final state D2 could be unreachable
When T fails
 first come back to D1 (recovery)
 then try again to reach D2 (redo)
13
Transaction Operations 1
•
For recovery purposes the system needs to keep track of
when a transaction starts, terminates and commits.

begin: marks the beginning of a transaction execution

end: specifies that the read and write operations have
ended


marks the end limit of transaction execution
commit: signals a successful end of the transaction

Any updates executed by the transaction can be safely
committed to the database and will not be undone
14
Transaction Operations 2

rollback: signals that the transaction has ended
unsuccessfully


undo: similar to rollback


Any changes that the transaction may have applied to the
database must be undone
but it applies to a single operation rather than to a whole
transaction
redo: specifies that certain transaction operations must be
redone

to ensure that all the operations of a committed transaction
have been applied successfully to the database
15
Reading and Writing
Specify read or write operations on the database items that are
executed as part of a transaction

read (X): reads a database item named X into a program variable also
named X.
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the buffer to the program variable

write (X): writes the value of program variable X into the database
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the program variable named X into its
current location in the buffer
4. store the updated block in the buffer back to disk (this step
updates the database on disk)
16
3.2 Transaction State and Progress
A transaction reaches its commit point when all
operations accessing the database are completed
and the result has been recorded in the log. It then
writes a [commit, <transaction-id>] and terminates.
BEGIN
END
active
READ , WRITE
COMMIT
partially
committed
committed
ROLLBACK
ROLLBACK
aborted
terminated
When a system failure occurs, search the log file for entries
[start, <transaction-id>]
and if there are no logged entries [commit, <transaction-id>]
then undo all operations that have logged entries
[write, <transaction-id>, X, old_value, new_value]
17
3.3 Controlling Transactions
18
Logging transaction states



Save the initial database state D1 before starting the transaction
T: D1->D2 (transaction begins)
Save all intermediate states D11, D12 … (checkpoint logs)
In the case of a failure at an intermediate state D1i before
reaching D2



restore D1 (rollback)
the simplest strategy is to apply a series of atomic actions R which
change the state to the initial state R: D1i->D1
In the case of successful reach of the last intermediate state D2,
force-write or flush the log file to disk and change the database
state to it (transaction ends)
Note: if the transactions are controlled in SQL (using COMMIT), the
rollback operation should be initiated explicitly (using
ROLLBACK)
19
Entries in the log file





[start, <transaction-id>]: the start of the
execution of the transaction identified
by transaction-id
[read, <transaction-id>, X]: the
transaction identified by transaction-id
reads the value of database item X
[write, <transaction-id>, X, old-value,
new-value]: the transaction identified by
transaction-id changes the value of
database item X from old-value to newvalue
[commit, <transaction-id>]: the
transaction identified by transaction-id
has completed all data manipulations
and its effect can be recorded
[rollback, <transaction-id>]: the
transaction identified by transaction-id
has been aborted and its effect lost
Procedure Credit (
trans_id INTEGER,
accno INTEGER,
bcode CHAR(6),
amount NUMBER)
old NUMBER;
new NUMBER;
begin
SELECT balance INTO old
FROM account
WHERE no = accno and
branch = bcode;
new := old + amount;
UPDATE account
SET amount = new
WHERE no = accno and
branch = bcode;
COMMIT;
EXCEPTION
WHEN FAILURE THEN
ROLLBACK;
END credit;
17
20
Controlling Subtransactions


All intermediate states of the transaction which are end
states of the defined subtransactions should become
consistent database states
In the case of successful reach of an intermediate state of
this type the actions are




temporary suspension of transaction execution
forced writing of all updated database blocks in main memory
buffers to disk
flush the log file
resume transaction execution
Note: If the transactions are controlled in SQL, the rollback
operation can be made to an intermediate state which is
labeled (using ROLLBACK TO <label>)
21
Adding checkpoints to the log file



A [checkpoint, <label>] record is created each time a new
checkpoint is encountered
[commit,<transaction-id>] entries for the active subtransactions
are automatically written when the system writes to the
database the effect of write operations of a successful
transaction
In the case of a rollback to a given checkpoint within a
transaction


an entry [commit,<transaction-id>] is logged against this
subtransaction
In the case of a rollback of the global transaction to a given
checkpoint

no subtransactions in the path will be committed either
22
4. Database Recovery


Need for recovery from failure during transaction
 for preventing the loss of data
 for avoiding global inconsistency of the database
 for analyzing the possible reasons for failure
Factors considered in database recovery
 what is the nature of the failure?
 when did the problem occur in the transaction?
 what do we need to recover?
23
4.1 Categories of Transactions at Failure
t check
t fail
T1
T2
T3
T4
T5
T1 - Can be ignored (committed before the previous checkpoint)
T2 - Must Redo complete (the database will be rolled back to a state
when the transaction was not committed)
T3 - Must Undo (not finished, and rollback to a state when not finished)
T4 - Must Redo if possible (finished, but not committed)
T5 - Must Undo (did not finish and the rollback will lead to a state
before it was even started)
24
4.2 Types of Failure
If an error or hardware/software crash occurs between the begin and
end of transaction, the database will be inconsistent

Catastrophic failure
 Restore a previous copy of the database from archival backup
 Apply transaction log

to reconstruct a more current state by redoing committed transaction
operations up to failure point
Perform an incremental dump logging each transaction
Non-catastrophic failure
 Reverse the changes that caused the inconsistency by undoing the
operations and possibly redoing legitimate changes which were
lost
 The entries kept in the system log are consulted during recovery.
 No need to use the complete archival copy of the database.


25
4.3 Recovery Strategy



Mirroring
 keep two copies of the database maintained simultaneously
Backup
 periodically dump the complete state of the database to some
form of tertiary storage
System Logging
 keep track of all transaction operations affecting the values of
database items.

the log is kept on disk so that it is not affected by failures except
for disk and catastrophic failures.
26
Write-ahead Logging






Deferred Update:

no actual update of the
database until the
transaction reaches its
commit point

1. Updates recorded in log

2. Transaction commit point
3. Force log to the disk

4. Update the database


FAILURE!
• REDO database from log
entries
• No UNDO necessary because
database has not been altered
Immediate Update:

the database may be updated
by some operations of a
transaction before it reaches
its commit point.
1. Update X recorded in log
FAILURE!
2. Update X in database
• UNDO X
3. Update Y recorded in log
4. Transaction commit point
5. Force log to the disk
FAILURE!
6. Update Y in database
• REDO Y
FAILURE!
• UNDO in reverse order to log
• REDO in committed log order
(using the write log entry)
27
Page Buffering Technique

Data is not updated ‘in place’

The database is considered to
be made up of a number of n
fixed-size disk blocks or pages,
for recovery purposes.


A page table with n entries is
constructed where the ith page
table entry points to the ith
database page on disk.
The current page table points
to most recent current database
pages on disk
Database data
pages/blocks
page 5
Page table
1
2
3
4
5
6
page 1
page 4
page 2
page 3
page 6
28
Paging Technique – cont.

When a transaction
begins executing
 the current page
table is copied into a
buffer page table
 buffer page table is
then saved
 buffer page table is
never modified
during transaction
execution
 write operations—
new copy of
database page is
created and current
page table entry
modified to point to
new disk page/block
Current page table
After updating
pages 2,6
1
2
3
4
5
6
Database
data pages
(blocks)
page 5
(old)
page 1
page 4
page 2
(old)
page 3
Buffer
page table
(not updated)
1
2
3
4
5
6
page 6
page 2
(new)
page 5
(new)
29
Paging Technique - final



Database
To recover from a failure
data pages
 check the state of the
(blocks)
database (before transaction
page 5
execution) through the buffer
Current page table (old)
page table
After updating
page 1
 free modified pages
pages 2,6
 discard current page table
page 4
 Recover state by reinstating 1
2
page 2
the buffer page table to
3
(old)
become the current page table 4
5
once more
page 3
6
Commit a transaction
page 6
 discard previous buffer page
 free old referenced page
page 2
tables
(new)
Garbage collection
page 5
(new)
Buffer
page table
(not updated)
1
2
3
4
5
6
30