Transaction Management

Transaction-Oriented Database Recovery
[Figure: DBMS architecture layers and the roles that tune them. Layers: Application; Query Processor; Indexes; Storage Subsystem; Concurrency Control; Recovery; Operating System; Hardware [Processor(s), Disk(s), Memory]. Roles: Application Programmer (e.g., business analyst, data architect); Sophisticated Application Programmer (e.g., SAP admin); DBA, Tuner]
Outline
• Principles of transaction-oriented database
recovery
• Recovery tuning
Transaction-Oriented Database Recovery
• Transaction properties
– A: Atomicity
– C: Consistency
– I: Isolation
– D: Durability
• A database is transaction-consistent (logically consistent) iff it contains the results of successful transactions
Failures To Recover From
• Transaction failure
– Self- or system-abort
– Recovery time: comparable to the execution time of a normal transaction
– Frequency: 10-100 times per minute
• System failure
– OS or DBMS crash
– Recovery time: about the time needed to execute all interrupted transactions
– Frequency: a few times per week
• Media failure
– Disk crash
– Recovery time: hours
– Frequency: a few times per year
Recovery Actions
• Transaction UNDO – roll back a specific active transaction
• Global UNDO – roll back all active transactions
• Partial REDO – re-instate some committed transactions
• Global REDO – re-instate all committed transactions
Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO
Log for UNDO/REDO
• Logical logging – operators & their arguments
– Requires atomic actions from physical layer
– Not always possible/justifiable
• Physical state logging
– Before and/or after image
• Physical transition logging
– Use XOR: commutative and associative
– Log entry XOR before image yields the after image (REDO)
– Log entry XOR after image yields the before image (UNDO)
– Lower space consumption (one entry per change; long runs of 0s compress well when only a small number of bits change)
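The XOR trick above can be sketched in a few lines of Python. This is only an illustration: the helper name xor_bytes and the sample page contents are assumptions, not part of the original slides.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length page images."""
    if len(a) != len(b):
        raise ValueError("page images must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical page images before and after one update.
before = b"balance=0100;owner=alice"
after = b"balance=0250;owner=alice"

# One log entry per change: the XOR difference of the two images.
log_entry = xor_bytes(before, after)

# REDO: applying the entry to the before image yields the after image.
redone = xor_bytes(log_entry, before)

# UNDO: applying the same entry to the after image yields the before image.
undone = xor_bytes(log_entry, after)

# Unchanged bytes XOR to zero, so the entry is mostly zeros and compresses well.
zero_bytes = log_entry.count(0)
```

Because the same entry serves both directions, a single transition record replaces a separate before image and after image.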
System Framework
Source: T. Haerder, A. Reuter
Log Timing
• UNDO entries must reach log file before
changes are written out – Write-Ahead Logging
(WAL) principle
– To enable roll-back if necessary
• REDO entries must reach log file before End-Of-Transaction (EOT) is acknowledged
– To enable re-instatement after failure
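A minimal sketch of the two timing rules, assuming a toy in-memory model. The classes Log and Database, their methods, and the tuple layout of log entries are all hypothetical names chosen for illustration.

```python
class Log:
    """Toy log: entries are buffered in memory until flushed to the stable log file."""

    def __init__(self):
        self.buffer = []  # volatile, lost in a crash
        self.stable = []  # survives a crash

    def append(self, entry):
        self.buffer.append(entry)

    def flush(self):
        self.stable.extend(self.buffer)
        self.buffer.clear()


class Database:
    def __init__(self):
        self.log = Log()
        self.cache = {}  # volatile buffer pool: page -> value
        self.disk = {}   # stable database pages

    def update(self, txn, page, new_value):
        old_value = self.cache.get(page, self.disk.get(page))
        self.log.append(("UNDO", txn, page, old_value))
        self.log.append(("REDO", txn, page, new_value))
        self.cache[page] = new_value

    def write_page(self, page):
        # WAL rule: UNDO entries reach the log file before the changed page is written out.
        self.log.flush()
        self.disk[page] = self.cache[page]

    def commit(self, txn):
        # Commit rule: REDO entries reach the log file before EOT is acknowledged.
        self.log.append(("EOT", txn))
        self.log.flush()
        return "committed"
```

In this sketch, commit does not write any data pages; it only forces the log, which is enough to re-instate the transaction after a failure.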
Dependency with Buffer Management
UNDO
• STEAL: Modified pages may be written anytime
• ~STEAL: Modified pages kept in buffer till after transaction commits
– Large buffers required
– No global UNDO
– Transaction UNDO within memory
– No logging required for UNDO
REDO
• FORCE: All modified pages written during EOT
– No need to log for partial REDO
– Need logging for global REDO
• ~FORCE: No propagation during EOT
At least one of global UNDO or partial
REDO is always required. Why?
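One way to reason about the question is to tabulate the four buffer policies. The sketch below uses a hypothetical function, recovery_needs, and assumes that a set of pages cannot be written to disk as one atomic step. Under that assumption, even ~STEAL with FORCE can crash midway through the writes at EOT: if the pages go out before the commit is recorded, the partial writes must be undone; if they go out afterwards, the missing writes must be redone.

```python
def recovery_needs(steal: bool, force: bool) -> list:
    """Crash-recovery actions a buffer policy must support (illustrative only)."""
    needs = []
    if steal:
        # Uncommitted changes may already be on disk.
        needs.append("global UNDO")
    if not force:
        # Committed changes may still be only in the buffer.
        needs.append("partial REDO")
    if not needs:
        # ~STEAL + FORCE: assumes the pages written at EOT are not propagated
        # atomically, so a crash can interrupt them.
        needs.append("global UNDO or partial REDO")
    return needs


for steal in (True, False):
    for force in (True, False):
        policy = ("STEAL" if steal else "~STEAL") + ", " + ("FORCE" if force else "~FORCE")
        print(policy, "->", recovery_needs(steal, force))
```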
Checkpointing to Optimize Recovery
• Problem
– With LRU buffer replacement, frequently used pages
will remain in buffer
– Partial REDO has to go back very far
• Checkpointing limits amount of partial REDO
• Checkpoint
– Write BEGIN-CHECKPOINT to temporary log
– Write checkpoint data to log
– Write END-CHECKPOINT to temporary log
Crash Recovery with Checkpoint
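[Figure: timeline of transactions T1-T5 relative to the oldest page in buffer, the checkpoint, and the crash; each transaction is marked Nothing, REDO, or UNDO. The recovery process consists of Analyze, UNDO, and REDO.]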
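The analysis step can be sketched as a scan over the log that sorts transactions into the three groups. The record layout (BOT, EOT, CHECKPOINT tuples), the function name classify, and the transaction names are assumptions made for illustration. The sketch also assumes that the checkpoint wrote out all modified pages, so transactions that committed before it need no work.

```python
def classify(log):
    """Decide per transaction what crash recovery must do; the crash is at the end of the log."""
    # Position of the most recent checkpoint record (-1 if there is none).
    last_checkpoint = max(
        (i for i, rec in enumerate(log) if rec[0] == "CHECKPOINT"), default=-1
    )
    started = []
    committed_at = {}
    for i, rec in enumerate(log):
        if rec[0] == "BOT":
            started.append(rec[1])
        elif rec[0] == "EOT":
            committed_at[rec[1]] = i

    result = {}
    for txn in started:
        if txn not in committed_at:
            result[txn] = "UNDO"      # still active at the crash
        elif committed_at[txn] > last_checkpoint:
            result[txn] = "REDO"      # committed after the checkpoint
        else:
            result[txn] = "nothing"   # committed and written out before the checkpoint
    return result


# Hypothetical log: Ta commits before the checkpoint, Tb after it, Tc never commits.
sample_log = [
    ("BOT", "Ta"), ("EOT", "Ta"),
    ("BOT", "Tb"), ("CHECKPOINT",), ("EOT", "Tb"),
    ("BOT", "Tc"),
]
actions = classify(sample_log)
```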
Transaction-Oriented Checkpoint (TOC)
• FORCE implies TOC
• The EOT record acts as both BEGIN-CHECKPOINT and END-CHECKPOINT
• Frequently used pages need to be written out each time a transaction commits
• Not suitable for large applications
Source: T. Haerder, A. Reuter
Transaction-Consistent Checkpoint (TCC)
Source: T. Haerder, A. Reuter
• When checkpoint generation is triggered
– All new update transactions are put on hold
– All incomplete update transactions are completed
– Write out all modified pages
• Both REDO and UNDO are bounded
– REDO starts from latest checkpoint
– UNDO back to latest checkpoint
• Drawbacks
– Delays new update transactions; not suitable for large multi-user DBMSs
– High checkpointing costs
Action-Consistent Checkpoint (ACC)
Source: T. Haerder, A. Reuter
• When checkpoint generation is triggered
– All new actions are put on hold
– All incomplete actions are completed
– Write out all modified pages
• Less disruptive than TCC
• Partial REDO only from the most recent
checkpoint
• Global UNDO not bounded
• Still costly when buffers are large
Fuzzy ACC
• During checkpointing, the page numbers of all dirty pages in the buffer are written to the log
• If a dirty page was already listed in the previous checkpoint and has not been written out since, write it out now
• Partial REDO from penultimate checkpoint
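The fuzzy checkpoint procedure can be sketched as follows. The Buffer class, its fields, and the log record layout are hypothetical; real buffer managers track considerably more state.

```python
class Buffer:
    """Toy buffer manager that tracks dirty page numbers only."""

    def __init__(self):
        self.dirty = set()            # modified pages not yet written out
        self.prev_checkpoint = set()  # pages listed at the previous checkpoint, still unwritten
        self.log = []
        self.written = []             # order in which pages were written out

    def modify(self, page_no):
        self.dirty.add(page_no)

    def write_out(self, page_no):
        self.dirty.discard(page_no)
        self.prev_checkpoint.discard(page_no)
        self.written.append(page_no)

    def fuzzy_checkpoint(self):
        # Log only the numbers of the dirty pages, not the pages themselves.
        self.log.append(("CHECKPOINT", sorted(self.dirty)))
        # Pages dirty at the previous checkpoint and unwritten since are written now.
        for page_no in sorted(self.dirty & self.prev_checkpoint):
            self.write_out(page_no)
        self.prev_checkpoint = set(self.dirty)
```

No page stays dirty across two consecutive checkpoints in this sketch, which is why partial REDO never has to go back further than the penultimate checkpoint.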
Archive Recovery
Source: T. Haerder, A. Reuter
Make sure the two paths are independent!!
Multi-Generation Archive Copies
• Archive copies are accessed very infrequently
• Subject to magnetic decay
• Keep several generations
Source: T. Haerder, A. Reuter
Duplicate Archive Logs
Source: T. Haerder, A. Reuter
• Archive log must extend back to the oldest
archive copy
• Log susceptible to magnetic decay as well
• Duplicate archive log
• Need to synchronize both archive logs with
temporary log at EOT
• Very expensive!
Decouple Archive Logs from EOT
Source: T. Haerder, A. Reuter
• Log entries written only to temporary log during
EOT
• Asynchronous process copies REDO entries to
archive log
• Need to replicate temporary log
• Synchronize both temporary logs at EOT
Summary
• Failure types
Failure Type    Recovery Action
Transaction     Transaction UNDO
System          Global UNDO, Partial REDO
Media           Global REDO
• Crash recovery
– TOC: Per transaction
– TCC: Transaction boundary
– ACC: Action boundary
• Archive recovery
– Multi-generation archive copy
– Duplicate archive logs
– Decouple archive log from EOT