3_concurrency


Locks
• What do we want:
– if one transaction is accessing a shared resource
– another transaction cannot simultaneously access it.
• Locks: mechanism for enforcing this
– When a transaction wants to access an item x, it asks for a lock on x
– If somebody else has the lock on x, then what ?
• Wait.
– If nobody else has the lock on x ?
• Go ahead and access x.
– When finished with x, release the lock. Why ?
• So other transactions can access the item.
1
Locks basics
• First look at simpler binary locks (yes/no)
– i.e. either you have a lock or you don’t
• Then look at more realistic shared/exclusive
locks
– Aka read/write locks
– Multiple transactions can simultaneously have a
shared lock. Why is this OK ?
• If two transactions are simultaneously reading, they are not interfering with one another.
2
Binary Locks Eg
• Eg: T1 and T2 both trying to write y
T1: wants to write y
T1: asks for a lock on y
T1: gets the lock on y
T2: wants to write y
T1: write(y)
T2: asks for a lock on y
T2: waits
T1: unlock(y)
T2: gets the lock on y
T2: write(y)
3
Binary Locks basics
• When are locks released?
• Could be immediately after read/write
• Will see that this can lead to problems
– So could instead release at the end of the transaction, when either committing or aborting : synchpoints
– But for now assume release immediately after read/write
• How locks work:
– LOCK = 0 : available
– LOCK = 1 : not available
– When a transaction wants to access an item: Lock_item(Y)
– When done with the item: Unlock_item(Y)
• How to implement Lock_item(Y), Unlock_item(Y) ?
4
Locking/unlocking: proposed solution
Lock_item(Y):
    if LOCK(Y) = 0 (*item is unlocked*) then
        LOCK(Y) ← 1 (*lock the item*)
    else
        go to sleep;

Unlock_item(Y):
    LOCK(Y) ← 0 (*unlock the item*)
    if any transactions are waiting then
        wake up one of the waiting transactions;

• Will this work?
5
Problems with proposed solution
1. Suppose T1 and T2 are both trying to get the lock
   • Could happen with multiple processors, or with interleaved execution on a single processor
2. T1 checks the value of LOCK, finds it = 0
3. Before T1 can change the lock value to 1, control passes to T2
4. T2 checks the value of LOCK, also finds it = 0 !
• Testing LOCK = 0 and setting LOCK to 1 has to be one atomic, indivisible operation. How? (a sketch follows below)
• Semaphores, TSL etc.: done by the OS: CS644/320
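For illustration only (not part of the slides), here is a minimal Python sketch of Lock_item/Unlock_item in which the test-and-set of LOCK(Y) is made atomic by a condition variable's internal mutex; a real system gets this atomicity from OS/hardware primitives such as semaphores or a test-and-set instruction, as noted above:

```python
import threading
from collections import defaultdict

class BinaryLockTable:
    """Toy binary lock manager: LOCK(Y) is 0 (available) or 1 (not available)."""
    def __init__(self):
        # The Condition's internal lock is what makes "test LOCK == 0 and set
        # LOCK = 1" a single atomic step in this sketch.
        self._cond = threading.Condition()
        self._lock = defaultdict(int)      # item -> 0 or 1

    def lock_item(self, y):
        with self._cond:
            while self._lock[y] == 1:      # somebody else holds the lock on y
                self._cond.wait()          # "go to sleep"
            self._lock[y] = 1              # test + set happen atomically here

    def unlock_item(self, y):
        with self._cond:
            self._lock[y] = 0
            self._cond.notify()            # wake up one waiting transaction
```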
6
Shared/Exclusive Locks
• Two lock modes : shared (S) and exclusive (X)
– To read, need a shared (S) lock
– To write, need an exclusive (X) lock
• If T1 holds an S lock and T2 wants an S lock ? T2 will get it.
• If T1 holds an S lock and T2 wants an X lock ? T2 will have to wait.
• If T1 holds an X lock and T2 wants an X lock ? T2 will have to wait.
• If T1 holds an X lock and T2 wants an S lock ? T2 will have to wait.
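The four cases above are just the usual S/X compatibility matrix; a tiny sketch (mine, not from the slides) of how a scheduler could check it:

```python
# A request is compatible only if it is compatible with every lock
# currently held on the item by other transactions.
COMPATIBLE = {
    ("S", "S"): True,   # two readers can share the item
    ("S", "X"): False,  # a reader blocks a writer
    ("X", "S"): False,  # a writer blocks a reader
    ("X", "X"): False,  # a writer blocks another writer
}

def can_grant(held_modes, requested_mode):
    return all(COMPATIBLE[(held, requested_mode)] for held in held_modes)

# The cases from the slide:
assert can_grant(["S"], "S") is True    # T1 holds S, T2 wants S -> granted
assert can_grant(["S"], "X") is False   # T1 holds S, T2 wants X -> wait
assert can_grant(["X"], "X") is False   # wait
assert can_grant(["X"], "S") is False   # wait
```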
7
Who deals with the locking ?
• All of the locking is transparent to the user
– User programs don’t have to request locks
– User programs just try to read and write values
– DBMS inserts lock requests automatically into the
transaction
• Why not let user programs deal with locks?
• Don’t want to rely on user programs to behave correctly
– Eg: suppose the user program is in charge of getting locks
– but it does a write without getting a lock. Problem ?
• Inconsistent data
8
Eg: Shared/Exclusive Locks
• T1: is going to do r(A), r(B), display (A+B)
get S-lock (A);
read (A);
unlock (A);
get S-lock (B);
read (B);
unlock (B);
display (A+B)
• Does this guarantee serializability ?
9
Lock-Based Protocols
• Locking as above is not sufficient to guarantee serializability. Why ?
• If A and B get updated in between the read of A and the read of B, the displayed sum will be wrong (see the trace sketched below).
– Eg T2, a transfer: adding 5 to A, subtracting 5 from B
• Locking protocol : a set of rules followed by all transactions while requesting or releasing locks.
• Locking protocols
– Restrict the set of possible schedules – don’t allow some
– Control the concurrency
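Here is the trace referred to above, a tiny sketch (the starting values are hypothetical, chosen only to make the anomaly concrete) of how T1's displayed sum goes wrong when each lock is released immediately after the read:

```python
# Hypothetical starting values; the invariant we care about is A + B == 100.
A, B = 50, 50

# T1: get S-lock(A); read(A); unlock(A)
t1_a = A                 # T1 reads A = 50, then releases its lock on A

# T2 runs in between: add 5 to A, subtract 5 from B (A + B is still 100)
A, B = A + 5, B - 5      # A = 55, B = 45

# T1: get S-lock(B); read(B); unlock(B); display(A + B)
t1_b = B                 # T1 reads B = 45
print(t1_a + t1_b)       # displays 95, not 100 -> matches no serial order
```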
10
Deadlock Eg
• If transactions don’t release locks till commit:
– Executing lock-X on X causes T2 to wait for T1 to release its lock on X
– Executing lock-X on Y causes T1 to wait for T2 to release its lock on Y
• Deadlock: neither T1 nor T2 can make progress

T1: lock-X on X
T1: write (X)
T2: lock-X on Y
T2: write (Y)
T1: wait for lock-X on Y
T2: wait for lock-X on X
11
Deadlocks
• Deadlock: cycle of transactions waiting for locks to be released by each other.
• Two main ways of dealing with deadlocks:
• Deadlock detection :
– allow deadlock to happen
– figure out when it has happened
– deal with it by rolling back one or more transactions
• Deadlock prevention : a protocol ensures that the system will never enter a deadlock
12
Deadlock detection/Rollback
• After deadlock detected, rollback one of the
transactions. Problems ?
• Wasted effort
• Another possible issue is starvation : one transaction never gets served
– Possible if the concurrency control manager is badly designed. In the example above ?
• The same transaction is repeatedly rolled back
due to deadlocks.
• Concurrency control manager can be designed
to prevent starvation.
13
Detection vs Prevention
• There are 2 ways to think about deadlock:
– Blocking : forcing a T to wait
– Aborting : killing a T
• Both involve performance penalties:
– Blocking : if T has a lock on Y, and T is forced to
wait, then all other transactions that need Y will
also have to wait.
– Aborting : wasted work, plus the overhead of restarting.
• Deadlock prevention: more blocking
• Deadlock detection: more aborting
14
Granularity:what gets locked
• Complex topic, might cover later
• What can be locked ?
– A single record (a row in a table)
– A page or several pages of the file storing the table
– The whole table
– An index or part of an index
• The coarser (the bigger) the structure being locked:
– Less concurrency
– Less overhead
– Greater safety
15
Lock Conversion
• Lock upgrade: convert an existing read lock to a write lock. When should this be allowed ? (see the sketch below)
If Ti has a read-lock(y) and no other Tj has a read-lock(y) then
    convert read-lock(y) to write-lock(y)
else
    force Ti to wait until Tj unlocks y
• Lock downgrade: convert existing write lock
to read lock
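The sketch referred to above (my own, simplified so that we only track which transactions currently hold S or X locks on y):

```python
def try_upgrade(holders, ti):
    """holders: dict mapping transaction id -> 'S' or 'X' for locks held on y.
    Returns True if ti's S lock can be upgraded to X right now."""
    assert holders.get(ti) == "S"            # ti must already hold a read lock
    others = [t for t in holders if t != ti]
    if not others:                           # nobody else holds any lock on y
        holders[ti] = "X"                    # upgrade in place
        return True
    return False                             # ti must wait until the others unlock y

# T1 alone holds S(y): upgrade succeeds.
locks_on_y = {"T1": "S"}
print(try_upgrade(locks_on_y, "T1"))   # True
# T1 and T2 both hold S(y): T1 has to wait.
locks_on_y = {"T1": "S", "T2": "S"}
print(try_upgrade(locks_on_y, "T1"))   # False
```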
16
Issues with Lock Conversion
• If T1 holds S lock, T2 waiting for X lock, T3
comes and asks for S lock, what to do?
• Won’t give T3 S lock. Why ?
• o/w T2 might be waiting forever - starvation
• T1 has an S lock, T2 is waiting for an X lock. T1 wants to upgrade to an X lock. Allow ?
• Depends on the protocol. Suppose T1 is not allowed to upgrade. What now?
• T1 won’t give up its S lock, so T2 keeps waiting for T1 before it can get the X lock; meanwhile T1 is waiting for T2 to finish before it can get the X lock. Deadlock !
17
Implementation of Locking
• lock manager: implemented as a separate process
– to which transactions send lock and unlock requests
• The lock manager replies to a lock request by
– either granting the lock,
– or asking the requesting transaction to wait till the lock is available,
– or asking the transaction to roll back, in case of a deadlock
• lock table : data-structure maintained by lock
manager to record granted locks and pending requests
– usually implemented as hash table
– indexed on the name of the data item being locked
18
Lock Table
[Figure: a lock table, showing granted and waiting requests queued per data item]
• Black rectangles indicate granted
locks, white ones indicate waiting
requests
• Lock table also records the type of
lock granted or requested
• New request is added to the end of
the queue of requests for the data
item, and granted if it is compatible
with all earlier locks: FCFS
• Unlock requests result in the
request being deleted, and later
requests are checked to see if they
can now be granted
• If transaction aborts, all waiting or
granted requests of the transaction
are deleted
– lock manager may keep a list of
locks held by each transaction,
to implement this efficiently
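To make this concrete, here is a minimal sketch (mine, not the actual DBMS data structure) of such a lock table: a hash map from item name to a FCFS queue of requests, where a request is granted only if it is compatible with every earlier request on that item:

```python
from collections import defaultdict, deque

# S/X compatibility, as in the shared/exclusive slides
COMPAT = {("S", "S"): True, ("S", "X"): False, ("X", "S"): False, ("X", "X"): False}

class LockTable:
    """Toy lock table: hash map keyed by item name; each bucket is a FCFS queue
    of [txn, mode, granted] entries. (A real lock manager would also keep, per
    transaction, the list of its locks so an abort can remove them efficiently,
    as the slide notes.)"""
    def __init__(self):
        self.table = defaultdict(deque)

    def request(self, txn, item, mode):
        queue = self.table[item]
        # FCFS: grant only if compatible with every earlier request on this item
        granted = all(COMPAT[(earlier[1], mode)] for earlier in queue)
        queue.append([txn, mode, granted])
        return granted                      # False means the transaction waits

    def release(self, txn, item):
        remaining = deque(e for e in self.table[item] if e[0] != txn)
        # re-check waiters in FCFS order; a waiter is woken up only if it is
        # compatible with every request still ahead of it in the queue
        earlier_modes = []
        for entry in remaining:
            if not entry[2] and all(COMPAT[(m, entry[1])] for m in earlier_modes):
                entry[2] = True
            earlier_modes.append(entry[1])
        self.table[item] = remaining

# Example: two readers share 'y'; a later writer must wait for both to release.
lt = LockTable()
print(lt.request("T1", "y", "S"), lt.request("T2", "y", "S"), lt.request("T3", "y", "X"))
# -> True True False
lt.release("T1", "y"); lt.release("T2", "y")
print(lt.table["y"][0])                     # ['T3', 'X', True] : T3 is now granted
```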
19
Deadlock Detection
• Create a waits-for graph: nodes are transactions
• When Ti requests a data item currently held by Tj
– the edge Ti → Tj is inserted into the waits-for graph.
– This edge is removed only when Tj is no longer holding a data item needed by Ti.
• If there is deadlock, what can we say about the graph ?
• Deadlock if and only if there is a cycle
• Periodically check for cycles in the waits-for graph. How? (a sketch follows below)
– Done in CSCI 6632/3326
– Saw something similar with serializability
– Not serializable if there is a cycle in the precedence graph
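For the "How?" above, a small sketch (not from the slides) of cycle detection on the waits-for graph using depth-first search:

```python
def has_cycle(waits_for):
    """waits_for: dict {Ti: set of Tj that Ti is waiting for}.
    Returns True iff the waits-for graph has a cycle, i.e. the system is deadlocked."""
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)
        for u in waits_for.get(t, ()):
            if u in on_stack:                 # back edge -> cycle
                return True
            if u not in visited and dfs(u):
                return True
        on_stack.discard(t)
        return False

    return any(t not in visited and dfs(t) for t in waits_for)

# Example: T1 -> T2 -> T3 -> T1 is a cycle, so this reports a deadlock.
print(has_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}, "T4": {"T2"}}))  # True
```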
20
Two Wait-for Graph Egs
[Figure: a wait-for graph on transactions T1, T2, T3, T4 that contains a cycle]
• Wait-for graph for
schedule on the right ?
– Is there deadlock
21
Wait-for Graph Eg
22
Another Wait-for Graph Eg
• T1: S(A), R(A),
• T2: X(B), W(B), then X(C)
• T3: S(C), R(C), then X(A)
• T4: X(B)
• What is the wait-for graph for this ?
– We assume transactions do not give up any locks till they commit
– Is there deadlock ?
23
Deadlock Detection – when to do?
• Do it either periodically at fixed intervals, or when a lot of transactions are waiting.
• If done periodically, what if it is done too often ?
• Expensive: a lot of overhead
• What if it is done infrequently ?
• Lots of transactions may stay blocked for a long time
• Deadlock detection works well if there is little interference among transactions
– Less chance of deadlock
24
Deadlock Recovery
• When a deadlock is detected, some transaction will have to be rolled back (made a victim) to break the deadlock. Which one ?
• One strategy: select as victim the transaction that will incur minimum cost.
25
Deadlock Recovery
• Rollback -- determine how far to roll back
transaction
• Total rollback: Abort the transaction and then
restart it.
• More effective to roll back transaction only as
far as necessary to break deadlock.
• Starvation happens if same transaction is
always chosen as victim
– Include the number of rollbacks in the cost factor
to avoid starvation
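As a rough illustration (the cost function below is my own, purely for the sketch), victim selection that folds the number of previous rollbacks into the cost, so the same transaction is not chosen forever:

```python
def choose_victim(transactions):
    """transactions: list of dicts like
       {"id": "T1", "work_done": 120, "locks_held": 3, "rollbacks": 0}.
    Pick the transaction whose rollback costs least, but make each previous
    rollback raise its cost so it is not picked every time (avoids starvation)."""
    def cost(t):
        base = t["work_done"] + 10 * t["locks_held"]   # illustrative weights
        return base + 1000 * t["rollbacks"]            # starvation penalty
    return min(transactions, key=cost)["id"]

# T2 has done the least work, so it is the victim the first time; once its
# rollback count grows, some other transaction becomes cheaper to pick.
print(choose_victim([
    {"id": "T1", "work_done": 500, "locks_held": 5, "rollbacks": 0},
    {"id": "T2", "work_done": 50,  "locks_held": 1, "rollbacks": 0},
]))
```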
26
Deadlock Prevention
• Use some protocol which will never allow deadlock. A couple of simple examples:
• No waiting: if T can’t get a lock immediately, abort T. No deadlock – why ? Problem ?
• Leads to lots of transactions getting aborted; can do better
• Cautious waiting: Ti wants a lock that Tj holds.
if Tj is blocked then
    abort (Ti)
else
    Ti gets blocked
• Why no deadlock ?
27
Transaction Timestamp
• Each transaction has a unique timestamp
– TS(T1) < TS (T2) means T1 is older than T2
• Assign priorities based on timestamps
– Older T get higher priority. Why ?
• Has done more work, so more expensive to kill
• Timestamps can be used in different ways to prevent deadlock
– We look at 2 algorithms that use timestamps to determine when to roll back, and whom to roll back
• Wait-Die
• Wound-Wait
28
Deadlock Prevention Wait-Die
• Wait-Die scheme
• If older T2 wants resource held by younger T1,
T2 waits for T1
• If younger T1 wants resource held by older T2,
T1 won’t wait for T2
– T1 will be rolled back instead.
• Why no deadlock ?
• Can’t get a cycle. Why ?
• If cycle, somewhere there would have to be an
edge from younger to older transaction.
29
Deadlock Prevention Wait-Die
• A transaction may die several times before
acquiring needed data item.
• Each time, it is re-started with same timestamp.
– Why same timestamp ?
– What would be the problem if given new timestamp
each time?
• It would always be young. Problem ?
• It would be getting rolled back all the time :
starvation
30
Deadlock Prevention: Wound-Wait
• Wound-Wait scheme :
• If younger T1 wants resource held by older T2
– T1 can wait for T2.
• If older T2 wants resource held by younger T1
– T2 wounds (forces rollback of ) T1
• instead of waiting for T1
– T1 restarted with original timestamp
• Why no deadlock?
• Older T never waits for younger T
• Preemptive: running T can be preempted
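To make the two rules concrete, here is a small sketch (mine, not from the slides) of the decision each scheme makes when a requesting transaction conflicts with a lock holder, assuming a smaller timestamp means an older transaction:

```python
def on_conflict(requester_ts, holder_ts, scheme):
    """requester_ts / holder_ts: transaction timestamps (smaller = older).
    Returns what happens when the requester conflicts with the holder."""
    requester_is_older = requester_ts < holder_ts
    if scheme == "wait-die":
        # older waits for younger; younger dies (is rolled back) instead of waiting
        return "wait" if requester_is_older else "abort requester"
    if scheme == "wound-wait":
        # older wounds (rolls back) the younger holder; younger waits for older
        return "abort holder" if requester_is_older else "wait"
    raise ValueError("unknown scheme")

# T2 is older (ts = 5) than T1 (ts = 9):
print(on_conflict(5, 9, "wait-die"))    # 'wait'            (older T2 waits for T1)
print(on_conflict(9, 5, "wait-die"))    # 'abort requester' (younger T1 dies)
print(on_conflict(5, 9, "wound-wait"))  # 'abort holder'    (older T2 wounds T1)
print(on_conflict(9, 5, "wound-wait"))  # 'wait'            (younger T1 waits)
```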
31
2-phase locking (2PL)
• What might we want from a locking protocol ?
• Serializable: equivalent to some serial schedule
• Strict: easy to abort, no cascading rollbacks
• No deadlock
32
Figure 22.3
T1:
    read_lock (Y);
    read_item (Y);
    unlock (Y);
    write_lock (X);
    read_item (X);
    X := X + Y;
    write_item (X);
    unlock (X);

T2:
    read_lock (X);
    read_item (X);
    unlock (X);
    write_lock (Y);
    read_item (Y);
    Y := X + Y;
    write_item (Y);
    unlock (Y);

Result:
    Initial values: X = 20, Y = 30
    Serial execution T1 followed by T2: X = 50, Y = 80
    Serial execution T2 followed by T1: X = 70, Y = 50

• If T1 and T2 behave as above, can we be sure that an interleaved schedule will be serializable?
• No – an example that is not serializable ?
– next slide
33
Figure 22.3
Initial values: X = 20, Y = 30

T1: read_lock (Y);
T1: read_item (Y);
T1: unlock (Y);
T2: read_lock (X);
T2: read_item (X);
T2: unlock (X);
T2: write_lock (Y);
T2: read_item (Y);
T2: Y := X + Y;
T2: write_item (Y);
T2: unlock (Y);
T1: write_lock (X);
T1: read_item (X);
T1: X := X + Y;
T1: write_item (X);
T1: unlock (X);

Result: X = 50, Y = 50 – nonserializable

• Why did this happen ?
• The lock on Y should not have been given up by T1
– till the lock on X had been obtained.
34
2-phase locking
• Two-phase locking: A transaction can not
request additional locks once it releases any
locks.
• Phase 1: Growing Phase
– transaction may obtain locks
– transaction may not release locks
• Phase 2: Shrinking Phase
– transaction may release locks
– transaction may not obtain locks
• Doesn’t mean a transaction must get all its locks at the beginning
– That is enough to guarantee 2PL, but it is not the only way
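As an illustration (not from the slides), a small checker that tests whether one transaction's sequence of lock/unlock operations obeys the two-phase rule, i.e. no lock is requested after any lock has been released:

```python
def is_two_phase(ops):
    """ops: list of ("lock", item) / ("unlock", item) pairs for ONE transaction
    ("lock" here covers both read and write lock requests).
    Returns True iff no lock request appears after the first unlock."""
    released_something = False
    for action, _item in ops:
        if action == "unlock":
            released_something = True        # shrinking phase has started
        elif action == "lock" and released_something:
            return False                     # growing again after shrinking: violates 2PL
    return True

# T1 from Figure 22.3 (unlocks Y before locking X): not two-phase.
print(is_two_phase([("lock", "Y"), ("unlock", "Y"), ("lock", "X"), ("unlock", "X")]))  # False
# T1 from Figure 22.4 (gets both locks before any unlock): two-phase.
print(is_two_phase([("lock", "Y"), ("lock", "X"), ("unlock", "Y"), ("unlock", "X")]))  # True
```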
35
Figure 22.4 2-phase locking Eg
T1:
    read_lock (Y);
    read_item (Y);
    write_lock (X);
    unlock (Y);
    read_item (X);
    X := X + Y;
    write_item (X);
    unlock (X);

T2:
    read_lock (X);
    read_item (X);
    write_lock (Y);
    unlock (X);
    read_item (Y);
    Y := X + Y;
    write_item (Y);
    unlock (Y);

• In this eg, T1 and T2 follow 2PL:
– After giving up a lock, they never ask for another lock
• Two questions
– Can deadlock happen ?
– If no deadlock, can a non-serializable schedule happen ?
36
2-phase locking, deadlock
• Does 2-phase locking guarantee freedom from deadlock?
• Look at Figure 22.4 again:

T1: read_lock (Y);
T1: read_item (Y);
T2: read_lock (X);
T2: read_item (X);
T1: write_lock (X);   (waits for X)
T2: write_lock (Y);   (waits for Y)

• Did T1, T2 follow 2PL ?
• Yes. Are T1, T2 in deadlock ?
• Yes.
37
2-phase locking, serializability
• If every transaction follows 2PL, are we
guaranteed serializability ?
• Yes. What we will show
– non-serializable → deadlock
– How does this help ?
• Contrapositive of “non-serializable →
deadlock” ?
• No deadlock → serializable
• We will now prove
– non-serializable → deadlock
38
2-phase locking, serializability
• Suppose non-serializable.
– What can we say about precedence graph ?
• It has to have a cycle: say T1 → T2 → T1
• What does the T1 → T2 edge in the precedence graph tell us ?
• T1 accessed some resource Y before T2 did
– So T1 had the lock on Y before T2 got its lock on Y
• T1 won’t give up the lock on Y till it has acquired all its locks
• What does the T2 → T1 edge in the precedence graph tell us ?
• T2 accessed some resource X before T1 did
– So T2 had the lock on X before T1 got its lock on X
• T2 won’t give up the lock on X till it has acquired all its locks
39
2-phase locking, serializability
• T1 got the lock on Y first
• T2 has to wait for T1 to release the lock on Y
• T2 got the lock on X first
• T1 has to wait for T2 to release the lock on X
• Result ?
• Deadlock !
• We have now proved:
– non-serializable → deadlock
• Contrapositive
– No deadlock → serializable
40
Conservative 2-phase locking
• Conservative (aka Static) 2PL = 2PL + additional
requirement:
– Predeclare all locks: T has to get all locks it is ever going to
need before it starts executing
– All locks obtained together at the start
– i.e. if T doesn’t get one lock, won’t get any locks
• Eg:
– Suppose T wants to first read y and then read z
– T’ has a write lock on z; nobody has a lock on y
– T will predeclare read locks on y and z.
– Will T get the read lock on y ?
• No : if it can’t get all, it can’t get any.
• Advantage of this compared to regular 2PL ?
41
Conservative 2-phase locking
• No deadlock. Why ? Suppose there are only T1 and T2.
• If there is an edge from T2 to T1 in the wait-for graph
– This means T1 got in before T2. So …
• T1 got all of its locks before T2 got any locks. So …
• There can be no edge from T1 to T2. So …
• No cycle, and hence no deadlock. Downside ?
• Reduces concurrency. Why ?
• If T can’t get the lock on y, it can’t proceed
– even if y is not needed for a long time
• How do you know which locks you will need ?
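To make the all-or-nothing rule concrete, a sketch (my own; the names are hypothetical) of the lock acquisition conservative 2PL requires, reusing an S/X compatibility check like the one shown earlier:

```python
def acquire_all_or_none(lock_table, txn, predeclared):
    """predeclared: list of (item, mode) pairs T declares before it starts.
    Grant every lock, or grant none of them and make T wait.
    lock_table: dict item -> dict {txn: mode} of currently granted locks."""
    COMPAT = {("S", "S"): True, ("S", "X"): False, ("X", "S"): False, ("X", "X"): False}

    def compatible(item, mode):
        return all(COMPAT[(held, mode)] for held in lock_table.get(item, {}).values())

    if all(compatible(item, mode) for item, mode in predeclared):
        for item, mode in predeclared:
            lock_table.setdefault(item, {})[txn] = mode
        return True          # T may start executing
    return False             # T gets no locks at all and must wait

# The slide's example: T' holds a write lock on z, nobody holds a lock on y.
table = {"z": {"T_prime": "X"}}
print(acquire_all_or_none(table, "T", [("y", "S"), ("z", "S")]))  # False: T gets neither lock
```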
42
2-phase locking: bad Eg

T1: write_lock (Y);
T1: read_item (Y);
T1: Y := Y + 2;
T1: write_item (Y);
T1: give up write lock on Y
T2: read_lock (Y);
T2: read_item (Y);
T1: abort

• Problem ?
• T2 read from a write done by the uncommitted T1
– Cascading rollback: T2 has to be rolled back.
• Did we follow 2PL?
• Yes, but this still happened : why ?
43
Strict 2-phase locking
• The lock on Y was given up by T1 too early. To prevent the problem, when should it have been given up?
• Only when T1 committed or aborted
• Strict 2PL : T does not release any X locks till
it either commits or aborts
• Leads to strict schedules. Why ?
• No T’ can read or write any item written by T
till T has committed. Why ?
• T’ can’t get locks :T only releases on commit
44
Strict, Rigorous, Conservative 2PL
• Rigorous (aka strong strict) 2PL : T does not release any X or S locks till it either commits or aborts
– Easier to implement than strict 2PL since in this case no locks are released till commit/abort
– Has some advantages in distributed databases
• Strict/rigorous 2PL : both commonly used.
• Does a conservative schedule have to be strict?
• No : HW problem
• Does a strict schedule have to be conservative?
• No : HW problem
45
SQL Server Locking
•
Locking in Microsoft SQL Server, Alexander Chigrik
– http://www.mssqlcity.com/Articles/Adm/SQL70Locks.htm
• “There are three main types of locks that SQL Server uses:
• Shared locks are used for operations that do not change or
update data, such as a SELECT statement.
• Update locks are used when SQL Server intends to modify a
page, and later promotes the update page lock to an exclusive
page lock before actually making the changes.
• Exclusive locks are used for the data modification operations,
such as UPDATE, INSERT, or DELETE.
• Shared locks are compatible with other Shared locks or
Update locks.
• Update locks are compatible with Shared locks only.
• Exclusive locks are not compatible with other lock types.”
46
SQL Server Locking Eg
• “Let me to describe it on the real example. There are four processes, which
attempt to lock the same page of the same table. These processes start one
after another, so Process1 is the first process, Process2 is the second process
and so on.
Process1 : SELECT, Process2 : SELECT
Process3 : UPDATE, Process4 : SELECT
• Process1 sets the Shared lock on the page, because there are no other locks on
this page.
Process2 sets the Shared lock on the page, because Shared locks are
compatible with other Shared locks.
Process3 wants to modify data and wants to set Exclusive lock, but it cannot
make it before Process1 and Process2 will be finished, because Exclusive lock
is not compatible with other lock types. So, Process3 sets Update lock.
Process4 cannot set Shared lock on the page before Process3 will be finished.
• So, there is no Lock starvation. Lock starvation occurs when read transactions
can monopolize a table or page, forcing a write transaction to wait
indefinitely. So, Process4 waits before Process3 will be finished.
After Process1 and Process2 were finished, Process3 transfer Update lock into
Exclusive lock to modify data. After Process3 was finished, Process4 sets the
Shared lock on the page to select data.”
47
Oracle locks
Oracle® Database Concepts, 10g Release 2
http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/consist.htm
• “Oracle uses two modes of locking in a multiuser
database:
• Exclusive lock mode prevents the associated resource
from being shared. This lock mode is obtained to
modify data. The first transaction to lock a resource
exclusively is the only transaction that can alter the
resource until the exclusive lock is released.
• Share lock mode allows the associated resource to be
shared, depending on the operations involved. Multiple
users reading data can share the data, holding share
locks to prevent concurrent access by a writer (who
needs an exclusive lock). Several transactions can
acquire share locks on the same resource.”
48
More on Oracle locks
• “Types of Locks: Oracle automatically uses different types of locks to
control concurrent access to data and to prevent destructive interaction
between users. Oracle automatically locks a resource on behalf of a
transaction to prevent other transactions from doing something also requiring
exclusive access to the same resource. The lock is released automatically
when some event occurs so that the transaction no longer requires the
resource.
• Throughout its operation, Oracle automatically acquires different types of
locks at different levels of restrictiveness depending on the resource being
locked and the operation being performed.
• Oracle locks fall into one of three general categories.
• DML locks (data locks) DML locks protect data. For example, table locks
lock entire tables, row locks lock selected rows.
• DDL locks (dictionary locks) DDL locks protect the structure of schema
objects—for example, the definitions of tables and views.
• Internal locks and latches protect internal database structures such as data
files.”
49
More on Oracle locks
• “Lock Duration : All locks acquired by statements within a transaction are held
for the duration of the transaction, preventing destructive interference including
dirty reads, lost updates, and destructive DDL operations from concurrent
transactions. The changes made by the SQL statements of one transaction become
visible only to other transactions that start after the first transaction is committed.
• Oracle releases all locks acquired by the statements within a transaction when you
either commit or undo the transaction. Oracle also releases locks acquired after a
savepoint when rolling back to the savepoint. However, only transactions not
waiting for the previously locked resources can acquire locks on the now available
resources. Waiting transactions will continue to wait until after the original
transaction commits or rolls back completely.
• Data Lock Conversion Versus Lock Escalation
• A transaction holds exclusive row locks for all rows inserted, updated, or deleted
within the transaction. Because row locks are acquired at the highest degree of
restrictiveness, no lock conversion is required or performed.
• Oracle automatically converts a table lock of lower restrictiveness to one of higher
restrictiveness as appropriate. For example, assume that a transaction uses a
SELECT statement with the FOR UPDATE clause to lock rows of a table. As a
result, it acquires the exclusive row locks and a row share table lock for the table.
If the transaction later updates one or more of the locked rows, the row share table
lock is automatically converted to a row exclusive table lock.”
50
More on Oracle locks
• “Lock escalation occurs when numerous locks are held at one level of
granularity (for example, rows) and a database raises the locks to a higher
level of granularity (for example, table). For example, if a single user locks
many rows in a table, some databases automatically escalate the user's row
locks to a single table. The number of locks is reduced, but the restrictiveness
of what is being locked is increased.
• Oracle never escalates locks. Lock escalation greatly increases the likelihood
of deadlocks. Imagine the situation where the system is trying to escalate
locks on behalf of transaction T1 but cannot because of the locks held by
transaction T2. A deadlock is created if transaction T2 also requires lock
escalation of the same data before it can proceed.
• Deadlock Detection : Oracle automatically detects deadlock situations and
resolves them by rolling back one of the statements involved in the deadlock,
thereby releasing one set of the conflicting row locks. A corresponding
message also is returned to the transaction that undergoes statement-level
rollback. The statement rolled back is the one belonging to the transaction
that detects the deadlock. Usually, the signaled transaction should be rolled
back explicitly, but it can retry the rolled-back statement after waiting.
• Deadlocks most often occur when transactions explicitly override the default
locking of Oracle. Because Oracle itself does no lock escalation and does not
use read locks for queries, but does use row-level locking (rather than page-level locking), deadlocks occur infrequently in Oracle.”
51
Elmasri Company Database
• The company is organized into
DEPARTMENTS.
• Each department has a name, number and an
employee who manages the department
– We keep track of the start date of the department
manager.
• Each department controls a number of
PROJECTS.
• Each project has a name, number and is
located at a single location.
52
Elmasri Company Database
• For each EMPLOYEE, we store the social security
number, address, salary, sex, and birthdate.
• Employees may have a supervisor
• Each employee works for one department but may
work on several projects.
• We keep track of the number of hours per week that
an employee currently works on each project.
• Each employee may have a number of
DEPENDENTS.
– For each dependent, we keep track of their name, sex,
birthdate, and relationship to employee.
53
The ER diagram for the COMPANY database.
54
Elmasri database with FK: Figure 3.7
55
Elmasri Figure 3.6: Relational Instance
56
Review of Database Programming
• Done in CS 622
• Accessing database from application program
– as opposed to interactive interfaces. Why?
• Interactive interface convenient but not enough
– Majority of database ops done through application
programs
• written by application programmer
• We do quick review of database programming
– To help understand SQL support for transactions
57
Database Programming Approaches
• Embedded commands:
– Database commands embedded in a general-purpose programming language
• Library of database functions:
– Available to the host language for database calls
– Application Program Interface: Eg: JDBC/ODBC
• Design brand new, special database language
– Minimizes impedance mismatch.
– Eg: Oracle’s PL/SQL
• We will only review embedded SQL
58
Embedded SQL approach
• Approach: Embed SQL in the host language.
– Eg: Pro*C, Pro*COBOL, Pro*FORTRAN,
• DBMS-specific preprocessor transforms
embedded SQL statements into function calls in
the host language
• Then a regular compiler is used to compile the
code.
• Details of translation vary across DBMSs
• Source code possibly same for different DBMS
– Final executable will work only with one DBMS
59
Embedded SQL : PRO*C
• Embedded SQL programming language
• Used by Oracle, Sybase SQL server
• Pro*C uses either C or C++ as its host language
• During compilation
– embedded SQL statements are interpreted by the precompiler
– and replaced by C or C++ function calls
• Output from the Pro*C precompiler is standard C or C++ code
– This is compiled by a C or C++ compiler into an executable
60