Transcript ppt

Why Transactions?
Database systems are normally being
accessed by many users or processes at
the same time.
 Both queries and modifications.
2
0.5-5 Clocks
1-5 Clocks
50-100 clocks
10^5 clocks
If a CPU instruction= 1 second, a disk read takes
a month!
3
Why Transactions?
Concurrent execution of user programs is essential
for good DBMS performance.
– Disk accesses may be frequent and slow.
– Idea: keep CPUs humming by working on several user programs concurrently.
Optimize for throughput (# of TXs).
Trade for latency (time for any one TX).
4
Why Transactions?
Unlike operating systems, which
support interaction of processes, a
DBMS needs to keep processes from
troublesome interactions.
5
Example: Bad Interaction
You and your domestic partner each
take $100 from different ATM’s at about
the same time.
 The DBMS better make sure one account
deduction doesn’t get lost.
Compare: An OS allows two people to
edit a document at the same time. If
both write, one’s changes get lost.
6
Example: single statements
Client 1:
UPDATE Product
SET Price = Price - 1.99
WHERE pname = 'Gizmo'
Client 2:
UPDATE Product
SET Price = Price*0.5
WHERE pname = 'Gizmo'
Two managers attempt to discount products.
What could go wrong?
7
Example: multiple statements
Client 1:
INSERT INTO SmallProduct(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99
DELETE FROM Product
WHERE price <= 0.99
Client 2:
SELECT count(*)
FROM Product
SELECT count(*)
FROM SmallProduct
What’s wrong?
8
Example: crashes
Client 1:
INSERT INTO SmallProduct(name, price)
SELECT pname, price
FROM Product
WHERE price <= 0.99
(CRASH!)
DELETE FROM Product
WHERE price <= 0.99
What’s wrong?
9
Transactions
Transaction = process involving
database queries and/or modification.
Normally with some strong properties
regarding concurrency.
Formed in SQL from single statements
or explicit programmer control.
10
Revised Code
Client 1: START TRANSACTION
UPDATE Product
SET Price = Price - 1.99
WHERE pname = 'Gizmo'
COMMIT
Client 2: START TRANSACTION
UPDATE Product
SET Price = Price*0.5
WHERE pname = 'Gizmo'
COMMIT
Now it works like a charm
11
ACID Transactions
ACID transactions are:
 Atomic : Whole transaction or none is done.
 Consistent : Database constraints preserved.
 Isolated : It appears to the user as if only one
process executes at a time.
 Durable : Effects of a process survive a crash.
Optional: weaker forms of transactions are
often supported as well.
12
ACID: Atomicity
• Two possible outcomes for a transaction
– It commits: all the changes are made
– It aborts: no changes are made
• That is, a transaction’s activities are all or nothing
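The all-or-nothing behavior can be tried out with Python's built-in sqlite3 module. This is only a sketch: the helper name move_small_products and the sample rows are made up for illustration, and a raised exception stands in for the crash (a real crash is handled by the DBMS's recovery, not by application code).

```python
import sqlite3

def move_small_products(conn, crash=False):
    """Copy cheap products to SmallProduct, then delete them, as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO SmallProduct(name, price) "
                "SELECT pname, price FROM Product WHERE price <= 0.99")
            if crash:
                raise RuntimeError("simulated failure between the two statements")
            conn.execute("DELETE FROM Product WHERE price <= 0.99")
    except RuntimeError:
        pass  # the whole transaction was rolled back

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Product(pname TEXT, price REAL)")
conn.execute("CREATE TABLE SmallProduct(name TEXT, price REAL)")
conn.execute("INSERT INTO Product VALUES ('Gizmo', 0.50), ('Widget', 5.00)")
conn.commit()

move_small_products(conn, crash=True)   # aborts: no changes are made
after_abort = conn.execute("SELECT count(*) FROM SmallProduct").fetchone()[0]

move_small_products(conn)               # commits: all changes are made
after_commit = conn.execute("SELECT count(*) FROM SmallProduct").fetchone()[0]
products_left = conn.execute("SELECT count(*) FROM Product").fetchone()[0]
```

After the aborted run SmallProduct is still empty; after the committed run the cheap product has moved and Product has one row left.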
13
ACID: Consistency
• The state of the tables is restricted by integrity constraints
– Account number is unique
– Stock amount can’t be negative
– Sum of debits and of credits is 0
• Constraints may be explicit or implicit
• How consistency is achieved:
– Programmer makes sure a txn takes a consistent state to a consistent state
– System makes sure that the txn is atomic
14
ACID: Isolation
• A transaction executes concurrently with other transactions
• Isolation: the effect is as if each transaction executes in isolation from the others
15
ACID: Durability
• The effect of a transaction must continue to exist after the transaction, or the whole program, has terminated
• Means: write data to disk
Change on the horizon? Non-volatile RAM (NVRAM). Byte addressable.
16
Challenges for ACID properties
• In spite of failures: power failures, but not media failures
– Need to log what happened
• Users may abort the program: need to “rollback the changes”
• Many users executing concurrently
– Can be solved via locking
Performance!!
17
Example:
Consider two transactions:
T1: BEGIN A=A+100, B=B-100 END
T2: BEGIN A=1.06*A, B=1.06*B END
• T1 transfers $100 from B’s account to A’s account.
• T2 credits both accounts with a 6% interest payment.
18
Gold Standard
• Serial Execution
T1: A=A+100, B=B-100
T2:                    A=1.06*A, B=1.06*B
Execute T1 and T2, one after the other.
Or T2 then T1.
Database allows either! Tricky!
19
Database has freedom to interleave
Consider a possible interleaving (schedule):
T1: A=A+100,            B=B-100
T2:           A=1.06*A,          B=1.06*B
Seems OK… But what about:
T1: A=A+100,                      B=B-100
T2:           A=1.06*A, B=1.06*B
The DBMS’s view of the second schedule:
T1: R(A), W(A),                          R(B), W(B)
T2:              R(A), W(A), R(B), W(B)
20
Scheduling Definitions
Serial schedule: a schedule that does not interleave the
actions of different transactions.
Equivalent schedules: For any database state, the
effect on DB of executing the first schedule is
identical to the effect of executing the second
schedule.
Serializable schedule: A schedule that is equivalent to
some serial execution of the transactions.
21
Database has freedom to interleave
Are these serializable?
T1: A=A+100,            B=B-100
T2:           A=1.06*A,          B=1.06*B
Yes, equivalent to T1 then T2, even though the actions of T1 & T2 are interleaved.
T1: A=A+100,                      B=B-100
T2:           A=1.06*A, B=1.06*B
Starting from A=a, B=b:
T1, T2 → A = (a+100)*1.06, B = (b-100)*1.06
T2, T1 → A = (a*1.06)+100, B = (b*1.06)-100
This schedule → A = (a+100)*1.06, B = (b*1.06)-100
They differ!
22
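The comparison between the serial orders and the two interleavings can be replayed numerically. The sketch below is illustrative only: the step functions, the run helper and the starting balances of 1000 are assumptions, and checking one starting state does not prove equivalence for every database state as the definition requires.

```python
def t1_a(s): s["A"] += 100      # T1: A = A + 100
def t1_b(s): s["B"] -= 100      # T1: B = B - 100
def t2_a(s): s["A"] *= 1.06     # T2: A = 1.06 * A
def t2_b(s): s["B"] *= 1.06     # T2: B = 1.06 * B

def run(schedule, a=1000.0, b=1000.0):
    """Apply the steps in schedule order and return the final (A, B)."""
    state = {"A": a, "B": b}
    for step in schedule:
        step(state)
    return (round(state["A"], 2), round(state["B"], 2))

serial_t1_t2 = run([t1_a, t1_b, t2_a, t2_b])
serial_t2_t1 = run([t2_a, t2_b, t1_a, t1_b])
first_interleaving = run([t1_a, t2_a, t1_b, t2_b])
second_interleaving = run([t1_a, t2_a, t2_b, t1_b])

def matches_a_serial_order(result):
    return result in (serial_t1_t2, serial_t2_t1)
```

With these starting balances the first interleaving ends in the same state as T1 then T2, while the second ends with A as in one serial order and B as in the other, so it matches neither.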
What else can go wrong?
Anomalies
Often Referred to by name
23
Anomalies with Interleaved
Execution
Reading Uncommitted Data (WR Conflicts, “dirty reads”):
T1: R(A), W(A),                    R(B), W(B), Abort
T2:              R(A), W(A), C
Unrepeatable Reads (RW Conflicts):
T1: R(A),                    R(A), W(A), C
T2:        R(A), W(A), C
24
Anomalies (Continued)
Overwriting Uncommitted Data (WW Conflicts):
T1: W(A),                    W(B), C
T2:        W(A), W(B), C
25
Famous anomalies Summary
Dirty read
– T reads data written by T’ while T’ is running
– Then T’ aborts
Lost update
– Two tasks T and T’ both modify the same data
– T and T’ both commit
– Final state shows effects of only T, but not of T’
Inconsistent read
– Task T sees some but not all changes made by T’
26
Transaction Statements
- Isolation Levels
27
Example: Interacting Processes
Assume the usual Sells(bar,beer,price)
relation, and suppose that Joe’s Bar sells
only Bud for $2.50 and Miller for $3.00.
Sally is querying Sells for the highest and
lowest price Joe charges.
Joe decides to stop selling Bud and
Miller, but to sell only Heineken at $3.50.
28
Sally’s Program
Sally executes the following two SQL
statements called (min) and (max) to
help us remember what they do.
(max) SELECT MAX(price) FROM Sells
WHERE bar = 'Joe''s Bar';
(min) SELECT MIN(price) FROM Sells
WHERE bar = 'Joe''s Bar';
29
Joe’s Program
At about the same time, Joe executes the
following steps: (del) and (ins).
(del) DELETE FROM Sells
WHERE bar = 'Joe''s Bar';
(ins) INSERT INTO Sells
VALUES('Joe''s Bar', 'Heineken', 3.50);
30
Interleaving of Statements
Although (max) must come before
(min), and (del) must come before
(ins), there are no other constraints on
the order of these statements, unless
we group Sally’s and/or Joe’s
statements into transactions.
31
Example: Strange Interleaving
Suppose the steps execute in the order (max)(del)(ins)(min).
Statement:     (max)          (del)          (ins)     (min)
Joe’s Prices:  {2.50,3.00}    {2.50,3.00}              {3.50}
Result:        3.00                                    3.50
Sally sees MAX < MIN!
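The order (max)(del)(ins)(min) can be replayed with Python's sqlite3 module. This sketch simply runs the four statements one after another on a single connection, which reproduces the statement order but is not real concurrency; the helper name one is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sells(bar TEXT, beer TEXT, price REAL)")
conn.executemany(
    "INSERT INTO Sells VALUES (?, ?, ?)",
    [("Joe's Bar", "Bud", 2.50), ("Joe's Bar", "Miller", 3.00)])

def one(sql):
    """Run a query and return the single value it produces."""
    return conn.execute(sql).fetchone()[0]

# Sally's and Joe's statements, interleaved in the "strange" order.
max_price = one("SELECT MAX(price) FROM Sells WHERE bar = 'Joe''s Bar'")   # (max)
conn.execute("DELETE FROM Sells WHERE bar = 'Joe''s Bar'")                  # (del)
conn.execute("INSERT INTO Sells VALUES ('Joe''s Bar', 'Heineken', 3.50)")   # (ins)
min_price = one("SELECT MIN(price) FROM Sells WHERE bar = 'Joe''s Bar'")   # (min)
```

Here max_price comes from the old price list and min_price from the new one, so the maximum Sally sees is smaller than the minimum.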
32
Fixing the Problem by Using
Transactions
If we group Sally’s statements
(max)(min) into one transaction, then
she cannot see this inconsistency.
She sees Joe’s prices at some fixed
time.
 Either before or after he changes prices, or
in the middle, but the MAX and MIN are
computed from the same prices.
33
Another Problem: Rollback
Suppose Joe executes (del)(ins), not as
a transaction, but after executing these
statements, thinks better of it and
issues a ROLLBACK statement.
If Sally executes her statements after
(ins) but before the rollback, she sees a
value, 3.50, that never existed in the
database.
34
Solution
If Joe executes (del)(ins) as a
transaction, its effect cannot be seen by
others until the transaction executes
COMMIT.
 If the transaction executes ROLLBACK
instead, then its effects can never be
seen.
35
START
The SQL statement START
TRANSACTION begins a transaction.
36
COMMIT
The SQL statement COMMIT causes a
transaction to complete.
– Its database modifications are now
permanent in the database.
37
ROLLBACK
The SQL statement ROLLBACK also
causes the transaction to end, but by
aborting.
 No effects on the database.
Failures like division by 0 or a
constraint violation can also cause
rollback, even if the programmer does
not request it.
38
Isolation Levels
SQL defines four isolation levels =
choices about what interactions are
allowed by transactions that execute at
about the same time.
Only one level (“serializable”) = ACID
transactions.
Each DBMS implements transactions in
its own way.
39
Choosing the Isolation Level
 Within a transaction, we can say:
SET TRANSACTION ISOLATION LEVEL X
where X =
1. SERIALIZABLE
2. REPEATABLE READ
3. READ COMMITTED
4. READ UNCOMMITTED
40
Serializable Transactions
If Sally = (max)(min) and Joe =
(del)(ins) are each transactions, and
Sally runs with isolation level
SERIALIZABLE, then she will see the
database either before or after Joe
runs, but not in the middle.
41
Isolation Level Is Personal Choice
Your choice, e.g., run serializable,
affects only how you see the database,
not how others see it.
Example: If Joe runs serializable, but
Sally doesn’t, then Sally might see no
prices for Joe’s Bar.
 i.e., it looks to Sally as if she ran in the
middle of Joe’s transaction.
42
Read-Committed Transactions
If Sally runs with isolation level READ
COMMITTED, then she can see only
committed data, but not necessarily the
same data each time.
Example: Under READ COMMITTED,
the interleaving (max)(del)(ins)(min) is
allowed, as long as Joe commits.
 Sally sees MAX < MIN.
43
Repeatable-Read Transactions
Requirement is like read-committed,
plus: if data is read again, then
everything seen the first time will be
seen the second time.
 But the second and subsequent reads may
see more tuples as well.
44
Example: Repeatable Read
Suppose Sally runs under REPEATABLE
READ, and the order of execution is
(max)(del)(ins)(min).
 (max) sees prices 2.50 and 3.00.
 (min) can see 3.50, but must also see 2.50
and 3.00, because they were seen on the
earlier read by (max).
45
Read Uncommitted
A transaction running under READ
UNCOMMITTED can see data in the
database, even if it was written by a
transaction that has not committed (and
may never).
Example: If Sally runs under READ
UNCOMMITTED, she could see a price
3.50 even if Joe later aborts.
46
Implementation Mechanisms
47
Motivation
• Atomicity: Transactions may abort (“Rollback”).
• Durability: What if the DBMS crashes?
Desired behavior after system restarts:
– T1, T2 & T3 should be durable.
– T4 & T5 should be aborted (effects not seen).
[Figure: timeline of transactions T1 to T5 with a crash marked on it.]
48
Two High-level Mechanisms
1. Use Logging to make sure we can undo operations.
2. Use Locking to make sure that each transaction “sees” a consistent view of the world.
Recovery uses 1. & 2. to ensure that the database stays
consistent even after crashes or transactions abort.
49
The Log
• Is a list of modifications
• Log is duplexed and archived on stable storage.
– Assume we don’t lose it!
• Can force write entries to disk
– A page goes to disk.
50
Basic Idea: (Physical) Logging
Record UNDO information for every update!
– Sequential writes to log
– Minimal info (diff) written to log
Log: An ordered list of actions
– Log record contains: <XID, location, old data, new data>
– Sufficient to UNDO any transaction!
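A small sketch of why such records are sufficient for UNDO. The LogRecord class and the write and undo helpers are hypothetical names, and the database is just a Python dict; a real system logs page-level diffs rather than whole values.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    xid: int        # transaction id
    location: str   # which item was changed
    old: int        # value before the update
    new: int        # value after the update

def write(db, log, xid, location, new):
    """Append the log record first, then apply the update."""
    log.append(LogRecord(xid, location, db[location], new))
    db[location] = new

def undo(db, log, xid):
    """Walk the log backwards and restore the old data written by xid."""
    for rec in reversed(log):
        if rec.xid == xid:
            db[rec.location] = rec.old

db = {"A": 0, "B": 5}
log = []
write(db, log, 1, "A", 1)
write(db, log, 2, "B", 7)
write(db, log, 1, "A", 3)
undo(db, log, 1)   # transaction 1 aborts; transaction 2 is untouched
```

Scanning backwards matters: transaction 1 wrote A twice, and undoing the later write first leaves A at its original value.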
51
A picture of logging
[Figure: T: R(A), W(A). Main memory: log buffer, A=0, B=5. Data on disk: A=0. Log on disk.]
52
Write-Ahead Logging (WAL)
DB uses Write-Ahead Logging Protocol.
Each update is logged! Why not reads?
1. Must force log record for an update before the corresponding data page goes to storage.
2. Must write all log records for a TX before commit.
53
A picture of logging
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
54
A picture of logging
[Figure: same state as on the previous slide.]
NB: Logging can happen after the modification, but must happen before the data page goes to disk!
55
What can go wrong?
56
DBMS Writes Back A
57
A picture of logging
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
What if we crash now? Or T aborts?
58
With WAL!
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
Now, if we crash, we can recover the correct value of A!
59
TX commit
60
Transaction Commit
FORCE write commit record to log
All log records up to the last update from this TX are FORCED
Commit() returns
Transaction is committed once the commit record is on stable storage
61
Incorrect Commit Protocol
[Figure: T: R(A), W(A). T asks “Commit?” and is told “Ok, Commit!”. Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
If we crash now, is T durable?
Lost T’s update!
62
Improved Commit Protocol
[Figure: T: R(A), W(A). T asks “Commit?” and is told “Ok, Commit!”. Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
If we crash now, is T durable?
65
Wrong Commit Protocol
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
Crash … MM is wiped! Are T’s effects visible?
66
Commit Protocol
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=1 (was A=0). Log on disk.]
On crash, MM is wiped! Are T’s effects visible?
67
With WAL!
[Figure: T: R(A), W(A). Log record A=0→1. Main memory: A=1, B=5. Data on disk: A=0. Log on disk.]
Now, if we crash, we can recover A!
68
Write-Ahead Logging (WAL)
DB uses Write-Ahead Logging Protocol.
Each update is logged! Why not reads?
1.
Must force log record for an update before
the corresponding data page goes to storage.
2.
Must write all log records for a TX before
commit.
#1 guarantees Atomicity.
#2 guarantees Durability.
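The two WAL rules can be exercised with a toy model. Everything here is an assumption made for illustration: the WalStore class, its method names, the single item A, and a simplified recovery that redoes committed updates and undoes uncommitted ones (it assumes transactions never overlap on the same item, and it is not ARIES).

```python
class WalStore:
    """Toy store with volatile memory and crash-surviving 'disk' structures."""

    def __init__(self):
        self.disk_data = {"A": 0}
        self.disk_log = []                     # survives a crash
        self.mem_data = dict(self.disk_data)   # lost on a crash
        self.mem_log = []                      # lost on a crash

    def update(self, xid, key, new):
        self.mem_log.append((xid, key, self.mem_data[key], new))
        self.mem_data[key] = new

    def force_log(self):
        self.disk_log.extend(self.mem_log)
        self.mem_log.clear()

    def flush_page(self, key):
        self.force_log()                       # rule 1: log before data page
        self.disk_data[key] = self.mem_data[key]

    def commit(self, xid):
        self.mem_log.append((xid, "COMMIT", None, None))
        self.force_log()                       # rule 2: log forced before commit returns

    def crash_and_recover(self):
        committed = {r[0] for r in self.disk_log if r[1] == "COMMIT"}
        for xid, key, old, new in self.disk_log:            # redo committed TXs
            if key != "COMMIT" and xid in committed:
                self.disk_data[key] = new
        for xid, key, old, new in reversed(self.disk_log):  # undo the rest
            if key != "COMMIT" and xid not in committed:
                self.disk_data[key] = old
        self.mem_data = dict(self.disk_data)
        self.mem_log = []
        return dict(self.disk_data)

s1 = WalStore()
s1.update(1, "A", 1)
s1.flush_page("A")            # dirty page reaches disk before T1 commits
undone = s1.crash_and_recover()

s2 = WalStore()
s2.update(2, "A", 1)
s2.commit(2)                  # commit returns, data page never flushed
redone = s2.crash_and_recover()
```

In the first run the old value in the forced log record lets recovery undo the uncommitted write (atomicity); in the second the forced records let recovery redo the committed write (durability).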
69
Logging Summary
• If DB says TX commits, TX effect remains after database crash
• DB can undo actions and help us with atomicity.
• It’s only half the story…
70
Lock-Based Concurrency Control
Strict Two-phase Locking (Strict 2PL) Protocol:
1. Each Xact must obtain an S (shared) lock on an object before reading, and an X (exclusive) lock on an object before writing.
2. All locks held by a transaction are released when the transaction completes.
3. If an Xact holds an X lock on an object, no other Xact can get a lock (S or X) on that object.
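The three rules can be sketched as a minimal lock table. The LockTable class and its methods are hypothetical; a request that cannot be granted simply returns False here, whereas a real lock manager would queue the transaction and block it.

```python
class LockTable:
    def __init__(self):
        self.locks = {}   # object -> (mode, set of holding transactions)

    def acquire(self, xact, obj, mode):
        """Return True if the S or X lock is granted, False if xact must wait."""
        held = self.locks.get(obj)
        if held is None:
            self.locks[obj] = (mode, {xact})
            return True
        held_mode, holders = held
        if holders == {xact}:
            # Sole holder: keep the current lock, upgrading S to X if asked.
            if mode == "X":
                self.locks[obj] = ("X", holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(xact)   # shared locks are compatible with each other
            return True
        return False            # an X lock conflicts with any other holder

    def release_all(self, xact):
        """Strict 2PL: every lock is released only when the transaction completes."""
        for obj in list(self.locks):
            _, holders = self.locks[obj]
            holders.discard(xact)
            if not holders:
                del self.locks[obj]

lt = LockTable()
r1 = lt.acquire("T1", "A", "S")   # granted
r2 = lt.acquire("T2", "A", "S")   # granted, shared with T1
r3 = lt.acquire("T2", "A", "X")   # refused while T1 still holds S
lt.release_all("T1")              # T1 completes
r4 = lt.acquire("T2", "A", "X")   # upgrade now succeeds
r5 = lt.acquire("T1", "A", "S")   # refused: T2 holds X
```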
71
Picture of 2-Phase Locking (2PL)
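[Figure: number of locks held over time, starting from 0 locks, with a lock acquisition phase followed by a lock release phase; the Strict 2PL curve is marked separately.]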
72
Need to be able to perform
Two Related Tasks
1. Abort a transaction. Ability to abort and clean up a task’s impact on the system’s state.
2. Recover from a crash. Main idea is called “repeating history” in the ARIES algorithm.
73
2PL Locking & Serializability
74
Conflict Serializable Schedules
Two schedules are conflict equivalent if:
– They involve the same actions of the same transactions
– Every pair of conflicting actions of two transactions is ordered in the same way
Two actions conflict if they are in different TXs, on the same object, and one of them is a write.
Schedule S is conflict serializable if S is conflict equivalent to some serial schedule
75
Example
A schedule that is not conflict serializable:
T1: R(A), W(A),                          R(B), W(B)
T2:              R(A), W(A), R(B), W(B)
Dependency graph: T1 → T2 (on A), T2 → T1 (on B)
76
Dependency Graph
Dependency graph:
– One node for each committed Xact T1…TN
– Edge from Ti to Tj if an action in Ti precedes and conflicts with an action in Tj.
Theorem: Schedule is conflict serializable if and
only if its dependency graph is acyclic
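The theorem suggests a direct test: build the dependency graph and look for a cycle. The sketch below assumes every transaction in the schedule commits, and it encodes a schedule as a list of (transaction, action, object) triples; that encoding and the function names are illustrative choices.

```python
def dependency_edges(schedule):
    """Edge (Ti, Tj) if an action of Ti precedes and conflicts with one of Tj."""
    edges = set()
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            # Conflict: different TXs, same object, at least one write.
            if ti != tj and oi == oj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph[u]:
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return any(n not in done and dfs(n) for n in graph)

def conflict_serializable(schedule):
    return not has_cycle(dependency_edges(schedule))

# T2 runs entirely between T1's work on A and T1's work on B.
bad = [("T1", "R", "A"), ("T1", "W", "A"),
       ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B"),
       ("T1", "R", "B"), ("T1", "W", "B")]
# T1 touches each object before T2 does.
good = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
        ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
bad_edges = dependency_edges(bad)
```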
77
Strict 2PL
Thm: Strict 2PL allows only schedules whose
dependency (precedence) graph is acyclic
Therefore, Strict 2PL only allows serializable
schedules
Are all serializable schedules allowed by Strict 2PL?
78
Serializable but not Conflict
Serializable
T1: R(A),                    W(A), C
T2:        W(A), C
T3:                                   W(A), C
This is equivalent to T1 T2 T3, so serializable.
But not conflict equivalent: the writes of T1 and T2 are ordered differently.
79
Summary So far
• If a schedule follows strict 2PL, it is serializable
• Not all serializable schedules are allowed by strict 2PL.
• So let’s use strict 2PL, what could go wrong?
80
Deadlocks
• Deadlock: Cycle of transactions waiting for locks to be released by each other.
• Two ways of dealing with deadlocks:
– Deadlock prevention
– Deadlock detection
81
Deadlock Prevention
• Assign priorities based on timestamps.
• Assume Ti wants a lock that Tj holds. Two policies are possible:
– Wait-Die: If Ti has higher priority, Ti waits for Tj; otherwise Ti aborts
– Wound-Wait: If Ti has higher priority, Tj aborts; otherwise Ti waits
Issue: What if a transaction never makes progress?
If a transaction re-starts, make sure it has its original timestamp
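The two policies reduce to a comparison of timestamps. This sketch assumes the usual convention that a smaller (older) timestamp means higher priority, which the slide does not state explicitly; the function names and return strings are made up for illustration.

```python
def wait_die(ts_requester, ts_holder):
    """Ti (requester) wants a lock held by Tj (holder)."""
    if ts_requester < ts_holder:    # Ti is older: higher priority
        return "requester waits"
    return "requester aborts"       # younger requester dies

def wound_wait(ts_requester, ts_holder):
    if ts_requester < ts_holder:    # Ti is older: it wounds the holder
        return "holder aborts"
    return "requester waits"

older, younger = 10, 20
decisions = {
    "wait-die, old asks young": wait_die(older, younger),
    "wait-die, young asks old": wait_die(younger, older),
    "wound-wait, old asks young": wound_wait(older, younger),
    "wound-wait, young asks old": wound_wait(younger, older),
}
```

Under either policy waiting only ever goes in one direction of the priority order, so a cycle of waiting transactions cannot form. Keeping the original timestamp on restart means a transaction eventually becomes the oldest and stops being aborted.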
82
Deadlock Detection
• Create a waits-for graph:
– Nodes are transactions
– There is an edge from Ti to Tj if Ti is waiting for Tj to release a lock
• Periodically check for cycles in the waits-for graph
83
Deadlock Detection (Continued)
In general, must search through this
big graph. Sounds expensive! Is it?
Example:
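T1: S(A), R(A),                          S(B)
T2:              X(B), W(B)                     X(C)
T3:                           S(C), R(C)               X(A)
T4:                                                          X(B)
[Figure: two waits-for graphs over T1, T2, T3, T4. Before T3 requests X(A): T1 → T2, T2 → T3, T4 → T2. After it: the edge T3 → T1 is added, closing the cycle T1 → T2 → T3 → T1.]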
84
Locking Summary
• Locks must be atomic, primitive operation
• 2PL does not avoid deadlock
• Deadlock detection sounds more expensive than it is…
85
Multiple-Granularity Locks
• Hard to decide what granularity to lock (tuples vs. pages vs. tables).
• Shouldn’t have to decide!
• Data “containers” are nested: Database contains Tables, which contain Pages, which contain Tuples
86
Solution: New Lock Modes,
Protocol
• Allow Xacts to lock at each level, but with a special protocol using new “intention” locks:
– Before locking an item, Xact must set “intention locks” on all its ancestors.
– For unlock, go from specific to general (i.e., bottom-up).
– SIX mode: Like S & IX at the same time.
Lock compatibility matrix (Y = compatible, -- = no lock held):
      --   IS   IX   S    X
--    Y    Y    Y    Y    Y
IS    Y    Y    Y    Y
IX    Y    Y    Y
S     Y    Y         Y
X     Y
87
Multiple Granularity Lock Protocol
• Each Xact starts from the root of the hierarchy.
• To get S or IS lock on a node, must hold IS or IX on parent node.
– What if Xact holds SIX on parent? S on parent?
• To get X or IX or SIX on a node, must hold IX or SIX on parent node.
• Must release locks in bottom-up order.
Protocol is correct in that it is equivalent to directly setting locks at the leaf levels of the hierarchy.
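The protocol rules can be written down as two lookup tables. This is a sketch: the table and function names are made up, the compatibility entries for SIX follow the standard textbook matrix rather than anything shown on the slides, and the parent rule is encoded literally as stated above.

```python
# Modes that may be held by other transactions on the same node at the same time.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},          # assumption: standard matrix entry for SIX
    "X":   set(),
}

# Modes the requesting transaction must already hold on the parent node.
REQUIRED_ON_PARENT = {
    "IS": {"IS", "IX"},
    "S": {"IS", "IX"},
    "IX": {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
    "X": {"IX", "SIX"},
}

def can_grant(requested, held_by_others):
    """True if the requested mode is compatible with every lock others hold."""
    return all(requested in COMPATIBLE[h] for h in held_by_others)

def parent_ok(requested, own_mode_on_parent):
    """True if the lock held on the parent permits requesting this child lock."""
    return own_mode_on_parent in REQUIRED_ON_PARENT[requested]

# A scanner holds SIX on a table R:
reader_via_index = can_grant("IS", ["SIX"])   # another Xact wants IS on R
full_table_reader = can_grant("S", ["SIX"])   # another Xact wants S on R
tuple_write_allowed = parent_ok("X", "SIX")   # the scanner upgrades a tuple to X
tuple_write_under_is = parent_ok("X", "IS")   # an IS holder may not write tuples
```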
89
• T1 scans R, and updates a few tuples:
– T1 gets an SIX lock on R, then repeatedly gets an S lock on tuples of R, and occasionally upgrades to X on the tuples.
• T2 uses an index to read only part of R:
– T2 gets an IS lock on R, and repeatedly gets an S lock on tuples of R.
• T3 reads all of R:
– T3 gets an S lock on R.
– OR, T3 could behave like T2; can use lock escalation to decide which.
90
Distributed Databases
91
Distributed Databases Desiderata
Think: 10s of distributed DBMSs
Distributed Data Independence: Users do not
need to know on which machine their data sits
Distributed TX Isolation: Users write TXs that
touch many sites just as they would for a single
site.
93
Distributed Data
Data may be horizontally or vertically
fragmented
To ensure that vertical fragments can be
decomposed losslessly, may assign a tuple id.
94
Replication
We store several copies or replicas of a fragment
of data redundantly.
1. Increased Availability of Data
2. Increased Performance of Queries.
95
Two-Phase Commit (2PC, not 2PL) guarantees
ACID in distributed RDBMSs.
96
Challenges: What can go wrong
in a Distributed Database?
• Everything that can go wrong in a single-node RDBMS (we’ll refer to this as site failure)
• Link failure. Two machines may both be running, but are unable to communicate.
Protocol must cope with both types of errors
97
The Characters in our Story
• Each site maintains:
– A log of the actions that are taken locally (by subtransactions)
– We need the ability to force write records.
• For any TX, the site where it originates is called the coordinator. Other sites are called subordinates.
98
FW = Force Write
The 2PC Protocol
NB: Log records are always written before messages are sent.
1. User decides to commit.
2. Coordinator → Subordinates: Prepare
3. Each subordinate decides:
– Commit: FW a prepare record
– Abort: FW an abort record
4. Subordinates → Coordinator: Votes: Yes or No
5. Coordinator: If all Yes, FW commit. If one No, FW abort.
6. Coordinator → Subordinates: Outcome of vote
7. Each subordinate force writes the outcome
8. Subordinates → Coordinator: ACK
9. Coordinator: After all ACKs, write an end record
99
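The message flow of 2PC can be walked through in a failure-free sketch. Everything here is a simplification for illustration: the function name two_phase_commit, the use of plain callables as subordinates, and a single shared list standing in for each site's separate log. Messages are ordinary function calls, so no site or link failures are modeled.

```python
def two_phase_commit(subordinates, log):
    """subordinates: dict mapping a site name to a callable returning 'yes' or 'no'."""
    # Phase 1: coordinator sends Prepare; each subordinate logs, then votes.
    votes = {}
    for name, vote_fn in subordinates.items():
        vote = vote_fn()
        # The subordinate force-writes its record before sending the vote.
        log.append((name, "prepare" if vote == "yes" else "abort"))
        votes[name] = vote

    # Phase 2: coordinator decides and force-writes before sending the outcome.
    decision = "commit" if all(v == "yes" for v in votes.values()) else "abort"
    log.append(("coordinator", decision))
    for name in subordinates:
        if votes[name] == "yes":
            # A Yes-voter force-writes the outcome, then ACKs.
            log.append((name, decision))
    log.append(("coordinator", "end"))   # written after all ACKs arrive
    return decision

log_ok = []
outcome_ok = two_phase_commit({"s1": lambda: "yes", "s2": lambda: "yes"}, log_ok)

log_no = []
outcome_no = two_phase_commit({"s1": lambda: "yes", "s2": lambda: "no"}, log_no)
```

In the second run one No vote is enough to abort, and only the site that voted Yes needs to be told the outcome, since the other has already aborted locally.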
The 2PC Protocol
[Figure: the same 2PC message diagram as on the first 2PC Protocol slide.]
When is the TX officially committed?
100
Restart after Node Failure
• At each node: a recovery process. For each TX T:
• If we have a commit/abort record for T: did we send the ACK? Why does this matter or not?
Case I: We are a subordinate for T
• If there is a prepare record for T, but no commit/abort, then we need to contact the coordinator. Why? Then, complete the TX.
• If we have no prepare record, then T could not have voted to commit, so we may abort T.
101
Restart after Node Failure
• Come back up.
• See the log.
• Have to redo/undo all the TXs
• Focus on the messages exchanged.
102
Case II: Rise of the Coordinator
• If we have a commit/abort record, then we send all the subordinates the status until we get back ACKs. We then write an END record.
• If we have no prepare, commit, or abort for T, then it could not have committed. The coordinator should respond by saying T is aborted if asked.
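The restart cases for subordinates and the coordinator can be summarized as one decision function. This is a sketch under assumptions: the function name restart_action, the set-of-record-types input, and the wording of the returned actions are all illustrative, and the case of a subordinate that already has an outcome record is left simply as "outcome known locally".

```python
def restart_action(role, records):
    """Pick a restart action for TX T from the record types found in the local log."""
    outcome = records & {"commit", "abort"}
    if role == "coordinator":
        if "end" in records:
            return "done"
        if outcome:
            return "resend outcome to subordinates until all ACK, then write end"
        return "abort T"          # no decision was logged, so T cannot have committed
    # role == "subordinate"
    if outcome:
        return "outcome known locally"
    if "prepare" in records:
        return "ask coordinator for outcome"   # voted Yes, must not decide alone
    return "abort T"              # never voted Yes, safe to abort

sub_prepared = restart_action("subordinate", {"prepare"})
sub_nothing = restart_action("subordinate", set())
coord_decided = restart_action("coordinator", {"commit"})
coord_nothing = restart_action("coordinator", set())
```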
103
The 2PC Protocol
[Figure: the same 2PC message diagram as on the first 2PC Protocol slide.]
Convince ourselves that failures are handled
104
Some Observations
about the Protocol
105
[Figure: the same 2PC message diagram as on the first 2PC Protocol slide.]
If a subordinate observes that the coordinator has failed after the subordinate sent a YES vote, what action must it take next?
106
[Figure: the same 2PC message diagram as on the first 2PC Protocol slide.]
Why are the ACK messages useful?
107
Many optimizations possible!
• CW: biggest problem with 2PC is blocking, which reduces availability.
– There are variants to reduce blocking and to reduce # of Force Writes.
• Was decidedly out of fashion.
– Then, Google wrote Spanner, back in the big time.
– Not really, but (nerdy) public perception…
108
[Figure: the same 2PC message diagram as on the first 2PC Protocol slide.]
If the coordinator crashes right after sending out a prepare, should it abort or commit? Can it?
109