PPT - NYU Stern School of Business
Download
Report
Transcript PPT - NYU Stern School of Business
C20.0046: Database
Management Systems
Lecture #26
Matthew P. Johnson
Stern School of Business, NYU
Spring, 2004
M.P. Johnson, DBMS, Stern/NYU, Sp2004
1
Agenda
Previously: Indices
Next:
Finish Indices, advanced indices
Failure/recovery
Data warehousing & mining
Websearch
Hw3 due today
no extensions!
1-minute responses
Review: clustered, dense, primary, #/tbl, syntax
M.P. Johnson, DBMS, Stern/NYU, Sp2004
2
User/
Application
Transaction
commands
Let’s get physical
Query
update
Query compiler/optimizer
Record,
index
requests
Transaction manager:
•Concurrency control
•Logging/recovery
Read/write
pages
Execution engine
Query execution
plan
Index/record mgr.
Page
commands
Buffer manager
Storage manager
storage
M.P. Johnson, DBMS, Stern/NYU, Sp2004
3
BSTs
Very simple data structure in CS: BSTs
Each node has two children:
Binary Search Trees
Keep balanced
Each node ~ one item
Left subtree: <
Right subtree: >=
Can search, insert, delete in log time
log2(1MB = 220) = 20
M.P. Johnson, DBMS, Stern/NYU, Sp2004
4
Search for DBMS
Big improvement: log2(1MB) = 20
Each op divides remaining range in half!
But recall: all that matters is #disk accesses
20 is better than 220 but:
Can we do better?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
5
BSTs B-trees
Like BSTs except each node ~ one block
Branching factor is >> 2
Data stored only in leaves
Each access divides remaining range by, say, 300
B-trees = BSTs + blocks
B+ trees are a variant of B-trees
Leaves form a (sorted) linked list
Better supports range queries
Consequences:
Much shorter depth Many fewer disk reads
Must find element within node
Trades CPU/RAM time for disk time
M.P. Johnson, DBMS, Stern/NYU, Sp2004
6
B+ Trees
Parameter n branching factor is n+1
Largest number s.t. one block can contain n
search-key values and n+1 pointers
30
Keys k < 30
120
Keys 30<=k<120
240
Keys 120<=k<240
Keys 240<=k
Each node (except root) has at least n/2 keys
40
50
60
Next leaf
40
50
60
M.P. Johnson, DBMS, Stern/NYU, Sp2004
7
Searching a B+ Tree
Exact key values:
Start at the root
If we’re in leaf, walk through its key values;
If not, look at keys K1..Kn
If Ki <= K <= Ki+1, look in child i
Range queries:
Select name
From people
Where age = 25
As above
Then walk left until test fails
M.P. Johnson, DBMS, Stern/NYU, Sp2004
Select name
From people
Where 20 <= age
and age <= 30
8
B+ Tree Example
Find the key 40
n=4
80
40 80
20
60
100
12
0
140
20 < 40 60
10
15
18
20
30
40
50
60
65
80
85
90
30 < 40 40
10
15
18
20
30
40
50
60
65
80
85
90
NB: Leaf keys are sorted; data pointed to is only if clustered
M.P. Johnson, DBMS, Stern/NYU, Sp2004
9
Clustered & unclustered B-trees
Data entries
Data entries
(Index
File)
(Data file)
Data Records
CLUSTERED
Data Records
UNCLUSTERED
B+ trees, and, or
Assume index on a,b,c
Intuition: phone book
WHERE a = ‘x’ and b = ‘y’
WHERE b = ‘y’ and c = ‘z’
WHERE a = ‘a’ and c = ‘z’
WHERE a = ‘x’ or b = ‘y’ or c = ‘z’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
11
B+ trees and LIKE
Supports only hard-coded prefix LIKE checks
Intuition: phone book
Select * from T where a like ‘xyz%’
Select * from T where a like ‘%xyz’
Select * from T where a like ‘xyz%zyx%’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
12
B-tree search efficiency
With params:
the largest n satisfying 4n+8(n+1) <= 4096 is n=340
block=4k
integer = 4b,
pointer = 8b
Each node has 170..340 keys
assume on avg has (170+340)/2=255
Then:
255 rows depth = 1
2552 = 64k rows depth = 2
2553 = 16M rows depth = 3
2554 = 4G rows depth = 4
M.P. Johnson, DBMS, Stern/NYU, Sp2004
13
B-trees in practice
Most DBMSs use B-trees for most indices
Speeds up
Default in MySQL
Default in Oracle
where clauses
Some like checks
Min or max functions
joins
Limitation: fields used must
Be a prefix of indexed fields
Be ANDed together
M.P. Johnson, DBMS, Stern/NYU, Sp2004
14
Next topic: Advanced types of indices
Spatial indices based on R-trees (R = region)
Support multi-dimensional searches on
“geometry” fields
2-d not 1-d ranges
Oracle:
CREATE INDEX geology_rtree_idx ON
geology_tab(geometry) INDEXTYPE IS
MDSYS.SPATIAL_INDEX;
MySQL:
CREATE TABLE geom (g GEOMETRY NOT
NULL, SPATIAL INDEX(g));
M.P. Johnson, DBMS, Stern/NYU, Sp2004
15
Advanced types of indices
Inverted indices for web doc search
First, think of each webpage as a tuple
One column for every possible word
True means the word appears on the page
Index on all columns
Now can search: you’re fired
select * from T where youre=T and fired=T
M.P. Johnson, DBMS, Stern/NYU, Sp2004
16
Advanced types of indices
Can simplify somewhat:
1.
2.
For each field index, delete False entries
True entries for each index become a bucket
Create “inverted index”:
One entry for each search word
Search word entry points to corresponding
bucket
Bucket points to pages with its word
Amazon
M.P. Johnson, DBMS, Stern/NYU, Sp2004
17
Advanced types of indices
Function-based indices
Speeds up WHERE upper(name)=‘BUSH’, etc.
create index on T(my_soundex(name));
create index on T(substr(DOB),4,5));
Now supported in Oracle 8, not MySQL
Bitmap indices
Speeds up arbitrary combination of reqs
Not limited to prefixes or conjunctions
Now supported in Oracle 9, not MySQL
M.P. Johnson, DBMS, Stern/NYU, Sp2004
18
Bitmap indices
Assume table has n records
Assume F is a field with m different values
Bitmap index on F: m length-n bitstrings
One bitstring for each value of F
Each one says which rows have that value for F
Example:
n = , mF =
1
, mG =
2
3
Q: find rows where
4
F=50 or (F=30 and G=‘Baz’)
5
6
M.P. Johnson, DBMS, Stern/NYU, Sp2004
F
30
30
40
50
40
30
G
Foo
Bar
Baz
Foo
Bar
Baz
19
Bitmap index search
Larger example: (age,salary) of jewelry buyers:
Sal.
Age
Sal.
1
25
60
2
45
3
4
Age
Sal.
5
50
120
9
25
400
60
6
70
110
10
45
350
50
75
7
85
140
11
50
275
50
100
8
30
260
12
50
260
Bitmaps for age:
Age
25:100000001000, 30:000000010000, 45:01000000100,
50:001110000010, 60:000000000001, 70:000001000000,
85:000000100000
Bitmaps for salary:
60:110000000000, 75:001000000000, 100:000100000000,
110:000001000000, 120:000010000000, 140:000000100000,
260:000000010001, 275:000000000010,
20
M.P. Johnson, DBMS, Stern/NYU, Sp2004
350:000000000100, 400:000000001000
Bitmap index search
Query: find buyers of age 45-55 with salary
100-200
Age range: 010000000100 (45) |
001110000010 (50) = 011110000110
Bitwise or of Salary range: 000111100000
AND together: 011110000110 &
000111100000 = 000110000000
What does this mean?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
21
Bitmap index search
Once we have row numbers, then what?
Get rows with those numbers (How?)
Bitmap indices in Oracle:
CREATE BITMAP INDEX ON T(F,G);
Best for low-cardinality fields
Boolean, enum, gender
lots of 0s in our bitmaps
Compress: 000000100001 6141
“run-length encoding”
M.P. Johnson, DBMS, Stern/NYU, Sp2004
22
New topic: Recovery
Type of Crash
Prevention
Wrong data entry
Constraints and
Data cleaning
Disk crashes
Redundancy:
e.g. RAID, archive
Fire, theft,
bankruptcy…
Buy insurance,
Change jobs…
System failures:
e.g. blackout
DATABASE
RECOVERY
M.P. Johnson, DBMS, Stern/NYU, Sp2004
23
System Failures
Each transaction has internal state
When system crashes, internal state is lost
Don’t know which parts executed and which didn’t
Remedy: use a log
A file that records each action of each xact
Trail of breadcrumbs
M.P. Johnson, DBMS, Stern/NYU, Sp2004
24
Media Failures
Rule of thumb: Pr(hard drive has head crash
within 10 years) = 50%
Simpler rule of thumb: Pr(hard drive has head
crash within 1 years) = 10%
Serious problem
Soln: different RAID strategies
RAID: Redundant Arrays of Independent
Disks
M.P. Johnson, DBMS, Stern/NYU, Sp2004
25
RAID levels
RAID level 1: each disk gets a mirror
RAID level 4: one disk is xor of all others
E.g.:
Each bit is sum mod 2 of corresponding bits
Disk 1: 11110000
Disk 2: 10101010
Disk 3: 00111000
Disk 4:
How to recover?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
26
Transactions
Transaction: unit of code to be executed
atomically
In ad-hoc SQL
one command = one transaction
In embedded SQL
Transaction starts = first SQL command issued
Transaction ends =
COMMIT
ROLLBACK (=abort)
Can turn off/on autocommit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
27
Primitive operations of transactions
Each xact reads/writes rows or blocks: elms
INPUT(X)
READ(X,t)
copy transaction local variable t to element X
OUTPUT(X)
copy element X to transaction local variable t
WRITE(X,t)
read element X to memory buffer
write element X to disk
LOG RECORD
M.P. Johnson, DBMS, Stern/NYU, Sp2004
28
Transaction example
Xact: Transfer $100 from savings to checking
A = A+100;
B = B-100;
READ(A,t);
t
:= t+100;
WRITE(A,t);
READ(B,t);
t
:= t-100;
WRITE(B,t)
M.P. Johnson, DBMS, Stern/NYU, Sp2004
29
Transaction example
READ(A,t); t := t+100;WRITE(A,t); READ(B,t); t := t-100;WRITE(B,t)
Action
t
INPUT(A)
Mem A
Mem B
Disk A
Disk B
1000
1000
1000
READ(A,t)
1000
1000
1000
1000
t:=t+100
1100
1000
1000
1000
WRITE(A,t)
1100
1100
1000
1000
INPUT(B)
1100
1100
1000
1000
1000
READ(B,t)
1000
1100
1000
1000
1000
t:=t-100
900
1100
1000
1000
1000
WRITE(B,t)
900
1100
900
1000
1000
OUTPUT(A)
900
1100
900
1100
1000
OUTPUT(B)
900
1100
900
1100
900
M.P. Johnson, DBMS, Stern/NYU, Sp2004
30
The log
An append-only file containing log records
Note: multiple transactions run concurrently,
log records are interleaved
After a system crash, use log to:
Redo some transaction that didn’t commit
Undo other transactions that didn’t commit
Three kinds of logs: undo, redo, undo/redo
We’ll discuss only Undo
M.P. Johnson, DBMS, Stern/NYU, Sp2004
31
Undo Logging
Log records
<START T>
<COMMIT T>
T has committed
<ABORT T>
transaction T has begun
T has aborted
<T,X,v>
T has updated element X, and its old value was v
M.P. Johnson, DBMS, Stern/NYU, Sp2004
32
Undo-Logging Rules
U1: Changes logged (<T,X,v>) before being
written to disk
U2: Commits logged (<COMMIT T>) after being
written to disk
Results:
May forget we did whole xact (and so wrongly undo)
Will never forget did partial xact (and so leave)
Log-change, change, log-change, change,
Commit, log-commit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
33
Undo-Logging e.g. (inputs omitted)
Action
T
Mem A
Mem B
Disk A
Disk B
Log
<START T>
READ(A,t)
1000
1000
1000
1000
t:=t+100
1100
1000
1000
1000
WRITE(A,t)
1100
1100
1000
1000
READ(B,t)
1000
1100
1000
1000
1000
t:=t-100
900
1100
1000
1000
1000
WRITE(B,t)
900
1100
900
1000
1000
OUTPUT(A)
900
1100
900
1100
900
OUTPUT(B)
900
1100
900
1100
900
<T,A,8>
<T,B,8>
COMMIT
<COMMIT T>
M.P. Johnson, DBMS, Stern/NYU, Sp2004
34
Recovery with Undo Log
After system’s crash, run recovery manager
1.
Decide for each xact T whether it was completed
<START T>….<COMMIT T> yes
<START T>….<ABORT T>
yes
<START T>…………………… no
2.
Undo all modifications from incomplete xacts, in
reverse order (why?) and abort each
M.P. Johnson, DBMS, Stern/NYU, Sp2004
35
Recovery with Undo Log
Read log from the end; cases:
<COMMIT T>: mark T as completed
<ABORT T>: mark T as completed
<T,X,v>:
if T is not completed then
write X=v to disk
else
ignore
<START T>: ignore
M.P. Johnson, DBMS, Stern/NYU, Sp2004
36
Recovery with Undo Log
Start:
…
…
<T2,X2,v2>
…
…
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
Q: Which updates are
undone?
Crash!
M.P. Johnson, DBMS, Stern/NYU, Sp2004
37
Recovery with Undo Log
Note: undo commands are idempotent
How far back in the log do we go?
No harm done if we repeat them
Q: What if system crashes during recovery?
Don’t go all the way back to the start
May be very large
Better idea: use checkpointing
M.P. Johnson, DBMS, Stern/NYU, Sp2004
38
Checkpointing
Checkpoint the database periodically
Stop accepting new transactions
Wait until all current xacts complete
Flush log to disk
Write a <CKPT> log record, flush log
Resume accepting new xacts
M.P. Johnson, DBMS, Stern/NYU, Sp2004
39
Undo Recovery with Checkpointing
During recovery,
can stop at first
<CKPT>
…
…
<T1,X1,v1>
…
…
(all completed)
<CKPT>
<START T2>
<START T3
<START T5>
<START T4>
<T4,X4,v4>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>
other xacts
xacts T2,T3,T4,T5
M.P. Johnson, DBMS, Stern/NYU, Sp2004
40
Non-quiescent Checkpointing
Problem: database must freeze during
checkpoint
Would like to checkpoint while database is
operational
Idea: non-quiescent checkpointing
Quiescent: quiet, still, at rest; inactive
M.P. Johnson, DBMS, Stern/NYU, Sp2004
41
Next time
Next: Data warehousing mining!
For next time: reading online
Proj5 due next Thursday
no extensions!
Now: one-minute responses
Relative weight: warehousing, mining, websearch
Data mining techniques
NNs
GAs
kNN
Decision Trees
M.P. Johnson, DBMS, Stern/NYU, Sp2004
42