Database Availability Benchmark

Initial Availability Benchmarking
of a Database System
Aaron Brown
[email protected]
2001 Winter ISTORE Retreat
Slide 1
Motivation
• Extend availability benchmarks to new areas
– explore generality and limitations of approach
– gain more understanding of system failure modes
• Why look at database availability?
– databases hold the critical hard state for most
enterprise and e-business applications
» the most important system component to keep available
– we trust databases to be highly reliable. Should we?
» how do DBMSs react to hardware faults/failures?
» what is the user-visible impact of such failures?
Slide 2
Approach
• Use our availability benchmarking methodology
to evaluate database robustness
– focus on storage system failures
– study 3-tier OLTP workload
» back-end: commercial database
» middleware: transaction monitor & business logic
» front-end: web-based form interface
– measure availability in terms of performance
» also possible to look at consistency of data
Slide 3
Refresher: availability benchmarks
• Goal: quantify variation in quality of service
as system availability is compromised
• Leverage existing performance benchmark
– to measure & trace quality of service metrics
– to generate fair workloads
• Use fault injection to compromise system
• Observe results graphically
[Figure: QoS metric vs. time. A band marks normal behavior (99% conf); the injected fault is marked on the time axis, followed by the region where the system handles the fault.]
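The "normal behavior (99% conf)" band could be computed along the following lines. This is a minimal sketch, not the procedure used for these slides: it assumes fault-free QoS samples are roughly normally distributed, and the names `normal_band` and `deviations` are hypothetical.

```python
import statistics

def normal_band(baseline, z=2.576):
    """Return (low, high) around the mean of fault-free QoS samples.

    z=2.576 is the two-sided 99% point of the standard normal;
    the normal approximation is an assumption of this sketch.
    """
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return mean - z * sd, mean + z * sd

def deviations(samples, band):
    """Indices of samples that fall outside the normal-behavior band."""
    low, high = band
    return [i for i, v in enumerate(samples) if not (low <= v <= high)]

# Example: throughput samples (txn/min) from a fault-free run,
# then a run in which a fault was injected.
band = normal_band([100, 102, 98, 101, 99])
outside = deviations([100, 97, 60, 0, 101], band)
```

Samples that land outside the band are the ones worth examining on the graph; here `outside` holds the positions of the two low readings.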
Slide 4
Availability metrics for databases
• Possible OLTP quality of service metrics
– transaction throughput
– transaction response time
» better: % of transactions longer than a fixed cutoff
– rate of transactions aborted due to errors
– consistency of database
– fraction of database content available
• Our experiments focused on throughput
– rates of normal and failed transactions
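As an illustration of the metrics above, the sketch below derives per-minute rates of successful and failed transactions, and the percentage of transactions slower than a fixed cutoff, from a list of transaction records. The `Txn` record layout and the function names are assumptions made for this example; they are not part of the benchmark kit.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Txn:
    end_time_s: float   # completion time, seconds since start of run
    response_s: float   # response time in seconds
    ok: bool            # False if the transaction aborted with an error

def per_minute_rates(txns):
    """Map minute index -> (successful count, failed count)."""
    buckets = defaultdict(lambda: [0, 0])
    for t in txns:
        minute = int(t.end_time_s // 60)
        buckets[minute][0 if t.ok else 1] += 1
    return {m: tuple(c) for m, c in sorted(buckets.items())}

def pct_slower_than(txns, cutoff_s):
    """Percentage of successful transactions slower than cutoff_s."""
    done = [t for t in txns if t.ok]
    if not done:
        return 0.0
    return 100.0 * sum(t.response_s > cutoff_s for t in done) / len(done)
```

Reporting the share of transactions over a cutoff, rather than a mean response time, keeps a few very slow transactions from being averaged away.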
Slide 5
Fault injection
• Disk subsystem faults only
– realistic fault set based on Tertiary Disk study
» correctable & uncorrectable media errors,
hardware errors, power failures, disk hangs/timeouts
» both transient and “sticky” faults
» note: similar fault set to RAID benchmarks
– injected via an emulated SCSI disk (~0.5ms overhead)
– faults injected in one of two partitions:
» database data partition
» database’s write-ahead log partition
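The difference between transient and "sticky" faults can be sketched with a toy in-memory disk. This is only an illustration under stated assumptions: the real experiments used an emulated SCSI disk, and the `EmulatedDisk`, `Fault`, and `DiskError` names here are hypothetical, as is the choice to model every fault as an uncorrectable error on one block.

```python
from dataclasses import dataclass

class DiskError(IOError):
    """Raised when the toy disk reports an uncorrectable error."""

@dataclass
class Fault:
    block: int
    op: str        # "read" or "write"
    sticky: bool   # sticky faults persist; transient ones fire once

class EmulatedDisk:
    def __init__(self, nblocks, block_size=512):
        self.blocks = [bytes(block_size)] * nblocks
        self.faults = []

    def inject(self, fault):
        self.faults.append(fault)

    def _check(self, block, op):
        # Fail the request if a matching fault is armed.
        for f in self.faults:
            if f.block == block and f.op == op:
                if not f.sticky:
                    self.faults.remove(f)  # transient: clears after one hit
                raise DiskError(f"uncorrectable {op} error at block {block}")

    def read(self, block):
        self._check(block, "read")
        return self.blocks[block]

    def write(self, block, data):
        self._check(block, "write")
        self.blocks[block] = data
```

A retry succeeds after a transient fault but keeps failing after a sticky one, which is the distinction the fault set is meant to exercise.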
Slide 6
Experimental setup
• Database
– Microsoft SQL Server 2000, default configuration
• Middleware/front-end software
– Microsoft COM+ transaction monitor/coordinator
– IIS 5.0 web server with Microsoft’s tpcc.dll HTML
terminal interface and business logic
– Microsoft BenchCraft remote terminal emulator
• TPC-C-like OLTP order-entry workload
– 10 warehouses, 100 active users, ~860 MB database
• Measured metrics
– throughput of correct NewOrder transactions/min
– rate of aborted NewOrder transactions (txn/min)
Slide 7
Experimental setup (2)
[Diagram: three machines and their disks]
• Front End
– MS BenchCraft RTE, IIS + MS tpcc.dll, MS COM+
– Intel P-III/450, 256 MB DRAM, Windows 2000 AS
– IDE system disk
– connected to the DB Server by 100 Mb Ethernet
• DB Server
– SQL Server 2000
– AMD K6-2/333, 128 MB DRAM, Windows 2000 AS
– SCSI system disk
– DB data/log disks: IBM 18 GB 10k RPM, Emulated Disk
• Disk Emulator
– ASC VirtualSCSI lib.
– Intel P-II/300, 128 MB DRAM, Windows NT 4.0
– SCSI system disk
– emulator backing disk (NTFS): IBM 18 GB 10k RPM
• SCSI adapters shown: Adaptec 3940, AdvStor ASC-U2W, Adaptec 2940
• SCSI links: Fast/Wide SCSI bus, 20 MB/sec
• Database installed in one of two configurations:
– data on emulated disk, log on real (IBM) disk
– data on real (IBM) disk, log on emulated disk
Slide 8
Results
• All results are from single-fault microbenchmarks
• 14 different fault types
– injected once for each of data and log partitions
• 4 categories of behavior detected
1) normal
2) transient glitch
3) degraded
4) failed
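The four categories were assigned by inspecting the throughput graphs. A rough automated approximation might look like the sketch below; the function name, the `tail` window, and the 10% threshold are all assumptions chosen for illustration and do not come from the experiments.

```python
def classify_run(success, failed, fault_minute, band_low, tail=3):
    """Classify a run from per-minute txn counts (heuristic sketch).

    success, failed: per-minute counts of successful / failed txns
    fault_minute:    index of the minute in which the fault was injected
    band_low:        lower edge of the normal-behavior throughput band
    """
    post_ok = success[fault_minute:]
    post_failed = failed[fault_minute:]
    last = post_ok[-tail:]
    tail_mean = sum(last) / len(last)
    if tail_mean < 0.1 * band_low:      # assumed cutoff for "failed"
        return "failed"
    if tail_mean < band_low:            # never returns to normal band
        return "degraded"
    if any(post_failed) or any(v < band_low for v in post_ok):
        return "transient glitch"       # brief disturbance, then recovery
    return "normal"
```

For example, a run whose throughput drops to zero after the fault and stays there would be labeled "failed", while a single aborted transaction followed by normal throughput would be labeled "transient glitch".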
Slide 9
Type 1: normal behavior
[Figure: throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15), showing successful txns and failed txns, with the fault injection point marked.]
• System tolerates fault
• Demonstrated for all sector-level faults except:
– sticky uncorrectable read, data partition
– sticky uncorrectable write, log partition
Slide 10
Type 2: transient glitch
[Figure: throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15), showing successful txns and failed txns, with the fault injection point marked. An inset magnifies the region around the fault.]
• One transaction is affected, aborts with error
• Subsequent transactions using same data would fail
• Demonstrated for one fault only:
– sticky uncorrectable read, data partition
Slide 11
Type 3: degraded behavior
[Figure: throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15), showing successful txns and failed txns.]
• DBMS survives error after running log recovery
• Middleware partially fails, results in degraded perf.
• Demonstrated for one fault only:
– sticky uncorrectable write, log partition
Slide 12
Type 4: failure
• Example behaviors (10 distinct variants observed)
[Figure: two panels, each plotting throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15) for successful txns and failed txns, with the fault injection point marked.
Panel 1: Disk hang during write to data disk
Panel 2: Simulated log disk power failure]
• DBMS hangs or aborts all transactions
• Middleware behaves erratically, sometimes crashing
• Demonstrated for all fatal disk-level faults
– SCSI hangs, disk power failures
Slide 13
Results: summary
• DBMS was robust to a wide range of faults
– tolerated all transient and recoverable errors
– tolerated some unrecoverable faults
» transparently (e.g., uncorrectable data writes)
» or by reflecting fault back via transaction abort
» these were not tolerated by the SW RAID systems
• Overall, DBMS is significantly more robust to
disk faults than software RAID systems!
Slide 14
Results: discussion
• DBMS’s extra robustness comes from:
– redundant data representation in form of log
– transactions
» standard mechanism for reporting errors (txn abort)
» encapsulate meaningful unit of work, providing
consistent rollback upon failure
» compare RAID: blocks don’t let you do this
• But, middleware was not robust, compromising
overall system availability
– crashed or behaved erratically when DBMS recovered
or returned errors
– user cannot distinguish DBMS and middleware failure
– system is only as robust as its weakest component!
Slide 15
Discussion of methodology
• General availability benchmarking methodology
does work on more than just RAID systems
• Issues in adapting the methodology
– defining appropriate metrics
– measuring non-performance availability metrics
– understanding layered (multi-tier) systems with only
end-to-end instrumentation
Slide 16
Future directions
• Last retreat: James Hamilton proposed
availability/maintainability extensions to TPC
• This work is a (small) step toward that goal
– exposed limitations, capabilities of disk fault injection
– revealed importance of middleware, which clearly
must be considered as part of the benchmark
– hints at poor state-of-the-art in TPC-C benchmark
middleware fault handling
• Next:
– expand metrics, including tests of ACID properties
– consider other fault injection points besides disks
– investigate clustered database designs
– study issues in benchmarking layered systems
Slide 18
Thanks!
• Microsoft SQL Server group
– for generously providing access to SQL Server 2000
and the Microsoft TPC-C Benchmark Kit
– James Hamilton
– Jamie Redding and Charles Levine
Slide 19
Backup slides
Slide 20
Example results: failing data disk
[Figure: four panels, each plotting throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15) for successful txns and failed txns, with the fault injection point marked.
Panel 1: Transient, correctable read fault (system tolerates fault)
Panel 2: Sticky, uncorrectable read fault (transaction is aborted with error)
Panel 3: Disk hang between SCSI commands (DBMS hangs, middleware returns errors)
Panel 4: Disk hang during a data write (DBMS hangs, middleware crashes)]
Slide 21
Example results: failing log disk
[Figure: four panels, each plotting throughput (txn/min, 0 to 140) vs. time (minutes, 0 to 15) for successful txns and failed txns, with the fault injection point marked.
Panel 1: Transient, correctable write fault (system tolerates fault)
Panel 2: Sticky, uncorrectable write fault (DBMS recovers, middleware degrades)
Panel 3: Simulated disk power failure (DBMS aborts all txns with errors)
Panel 4: Disk hang between SCSI commands (DBMS hangs, middleware hangs)]
Slide 22
