TPC Benchmarks - Microsoft Research
Download
Report
Transcript TPC Benchmarks - Microsoft Research
TPC Benchmarks
Charles Levine
Microsoft
[email protected]
Modified by Jim Gray
Gray @ Microsoft.com
March 1997
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
Benchmarks: What and Why
What is a benchmark?
Domain specific
No single metric possible
The more general the benchmark, the less useful it is for
anything in particular.
A benchmark is a distillation of the essential attributes of a
workload
Benchmarks: What and Why
Desirable attributes
Relevant meaningful within the target domain
Understandable
Good metric(s) linear, orthogonal, monotonic
Scaleable applicable to a broad spectrum of
hardware/architecture
Coverage does not oversimplify the typical
environment
Acceptance Vendors and Users embrace it
Portable Not limited to one hardware/software
vendor/technology
Benefits and Liabilities
Good benchmarks
Define the playing field
Accelerate progress
Engineers do a great job once objective is
measurable and repeatable
Set the performance agenda
Measure release-to-release progress
Set goals (e.g., 10,000 tpmC, < 100 $/tpmC)
Something managers can understand (!)
Benchmark abuse
Benchmarketing
Benchmark wars
more $ on ads than development
Benchmarks have a Lifetime
Good benchmarks drive industry and technology forward.
At some point, all reasonable advances have been made.
Benchmarks can become counter productive by encouraging
artificial optimizations.
So, even good benchmarks become obsolete over time.
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
What is the TPC?
TPC = Transaction Processing Performance Council
Founded in Aug/88 by Omri Serlin and 8 vendors.
Membership of 40-45 for last several years
Everybody who’s anybody in software & hardware
De facto industry standards body for OLTP performance
Administered by:
Shanley Public Relations
777 N. First St., Suite 600
San Jose, CA 95112-6311
ph: (408) 295-8894
fax: (408) 295-9768
email: [email protected]
Most TPC specs, info, results on web page: www.tpc.org
TPC database (unofficial): www.microsoft.com/sql/tpc/
News: Omri Serlin’s FT Systems News (monthly magazine)
Two Seminal Events Leading to TPC
Anon, et al, “A Measure of Transaction Processing
Power”, Datamation, April fools day, 1985.
Anon, Et Al = Jim Gray (Dr. E. A. Anon) and 24 of his
closest friends
Sort: 1M 100 byte records
Mini-batch: copy 1000 records
DebitCredit: simple ATM style transaction
Tandem TopGun Benchmark
DebitCredit
212 tps on NonStop SQL in 1987 (!)
Audited by Tom Sawyer of Codd and Date (A first)
Full Disclosure of all aspects of tests (A first)
1987: 256 tps Benchmark
14 M$ computer (Tandem)
A dozen people
False floor, 2 rooms of machines
Admin expert
Hardware experts
A 32 node processor array
Simulate 25,600 clients
Network expert
Manager
Performance
expert
DB expert
A 40 GB disk array (80 drives)
Auditor
OS expert
1988: DB2 + CICS Mainframe 65 tps
IBM 4391
Simulated network of 800 clients
2m$ computer
Staff of 6 to do benchmark
2 x 3725
network controllers
Refrigerator-sized
CPU
16 GB disk farm
4 x 8 x .5GB
1997: 10 years later
1 Person and 1 box = 1250 tps
1 Breadbox ~ 5x 1987 machine room
23 GB is hand-held
One person does all the work
Cost/tps is 1,000x less
25 micro dollars per transaction
Hardware expert
OS expert
Net expert
DB expert
App expert
4x200 Mhz cpu
1/2 GB DRAM
12 x 4GB disk
3 x7 x 4GB
disk arrays
What Happened?
Moore’s law:
Things get 4x better every 3 years
(applies to computers, storage, and networks)
New Economics: Commodity
class
mainframe
minicomputer
microcomputer
price/mips software
$/mips
k$/year
10,000
100
100
10
10
1
price
GUI: Human - computer tradeoff
optimize for people, not computers
time
TPC Milestones
1989: TPC-A ~ industry standard for Debit Credit
1990: TPC-B ~ database only version of TPC-A
1992: TPC-C ~ more representative, balanced OLTP
1994: TPC requires all results must be audited
1995: TPC-D ~ complex decision support (query)
1995: TPC-A/B declared obsolete by TPC
Non-starters:
TPC-E ~ “Enterprise” for the mainframers
TPC-S ~ “Server” component of TPC-C
Both failed during final approval in 1996
TPC vs. SPEC
SPEC (System Performance Evaluation Cooperative)
SPEC ships code
Unix centric
CPU centric
TPC ships specifications
SPECMarks
Ecumenical
Database/System/TP centric
Price/Performance
The TPC and SPEC happily coexist
There is plenty of room for both
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
TPC-A Overview
Transaction is simple bank account debit/credit
Database scales with throughput
Transaction submitted from terminal
TPC-A Transaction
Read 100 bytes including Aid, Tid, Bid, Delta from terminal (see Clause 1.3)
BEGIN TRANSACTION
Update Account where Account_ID = Aid:
Read Account_Balance from Account
Set Account_Balance = Account_Balance + Delta
Write Account_Balance to Account
Write to History:
Aid, Tid, Bid, Delta, Time_stamp
Update Teller where Teller_ID = Tid:
Set Teller_Balance = Teller_Balance + Delta
Write Teller_Balance to Teller
Update Branch where Branch_ID = Bid:
Set Branch_Balance = Branch_Balance + Delta
Write Branch_Balance to Branch
COMMIT TRANSACTION
Write 200 bytes including Aid, Tid, Bid, Delta, Account_Balance to terminal
TPC-A Database Schema
Branch
B
Teller
B*10
10
100K
Account
History
B*100K
B*2.6M
10 Terminals per Branch row
10 second cycle time per terminal
1 transaction/second per Branch row
Legend
Table Name
<cardinality>
one-to-many
relationship
TPC-A Transaction
Workload is vertically aligned with Branch
15% of accounts non-local
Produces cross database activity
What’s good about TPC-A?
Makes scaling easy
But not very realistic
Easy to understand
Easy to measured
Stresses high transaction rate, lots of physical IO
What’s bad about TPC-A?
Too simplistic! Lends itself to unrealistic optimizations
TPC-A Design Rationale
Branch & Teller
Account
in cache, hotspot on branch
too big to cache requires disk access
History
sequential insert
hotspot at end
90-day capacity ensures reasonable ratio of disk to cpu
RTE SUT
RTE - Remote Terminal Emulator
SUT - System Under Test
Emulates real user behavior
Submits txns to SUT, measures RT
Transaction rate includes think time
Many, many users (10 x tpsA)
All components except for terminal
Model of system:
SUT
RTE
Host System(s)
T
T-C
Network*
T
C
L
I
E
N
T
Response Time Measured Here
C-S
Network*
S
E
R
V
E
R
S-S
Network*
TPC-A Metric
tpsA = transactions per second,
average rate over 15+ minute interval,
at which 90% of txns get <= 2 second RT
Av erage Response Time
Number of Transactions
90th Percentile
Response Time
0
1
2
3
4
5
6
7
8
9
10 11 12 13 14 15 16 17 18 19 20
Response time (seconds)
TPC-A Price
Price
5 year Cost of Ownership:
hardware,
software,
maintenance
Does not include development, comm lines,
operators, power, cooling, etc.
Strict pricing model one of TPC’s big contributions
List prices
System must be orderable & commercially available
Committed ship date
Differences between TPC-A and TPC-B
TPC-B is database only portion of TPC-A
TPC-B reduces history capacity to 30 days
No terminals
No think times
Less disk in priced configuration
TPC-B was easier to configure and run, BUT
Even though TPC-B was more popular with vendors,
it did not have much credibility with customers.
TPC Loopholes
Pricing
Client/Server
Package pricing
Price does not include cost of five star wizards needed to
get optimal performance, so performance is not what a
customer could get.
Offload presentation services to cheap clients, but report
performance of server
Benchmark specials
Discrete transactions
Custom transaction monitors
Hand coded presentation services
TPC-A/B Legacy
First results in 1990: 38.2 tpsA, 29.2K$/tpsA (HP)
Last results in 1994: 3700 tpsA, 4.8 K$/tpsA (DEC)
WOW! 100x on performance & 6x on price in 5 years !!
TPC cut its teeth on TPC-A/B; became functioning,
representative body
Learned a lot of lessons:
If benchmark is not meaningful, it doesn’t matter how many
numbers or how easy to run (TPC-B).
How to resolve ambiguities in spec
How to police compliance
Rules of engagement
TPC-A Established OLTP Playing Field
TPC-A criticized for being irrelevant,
unrepresentative, misleading
But, truth is that TPC-A drove performance, drove
price/performance, and forced everyone to clean up
their products to be competitive.
Trend forced industry toward one price/performance,
regardless of size.
Became means to achieve legitimacy in OLTP for
some.
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
TPC-C Overview
Moderately complex OLTP
The result of 2+ years of development by the TPC
Application models a wholesale supplier managing orders.
Order-entry provides a conceptual model for the benchmark;
underlying components are typical of any OLTP system.
Workload consists of five transaction types.
Users and database scale linearly with throughput.
Spec defines full-screen end-user interface.
Metrics are new-order txn rate (tpmC) and price/performance
($/tpmC)
Specification was approved July 23, 1992.
TPC-C’s Five Transactions
OLTP transactions:
New-order: enter a new order from a customer
Payment: update customer balance to reflect a payment
Delivery: deliver orders (done as a batch transaction)
Order-status: retrieve status of customer’s most recent order
Stock-level: monitor warehouse inventory
Transactions operate against a database of nine tables.
Transactions do update, insert, delete, and abort;
primary and secondary key access.
Response time requirement:
90% of each type of transaction
must have a response time 5 seconds,
except (queued mini-batch) stock-level which is 20 seconds.
TPC-C Database Schema
Warehouse
W
Stock
100K
W*100K
Item
W
100K (fixed)
Legend
10
Table Name
District
one-to-many
relationship
<cardinality>
W*10
secondary index
3K
Customer
W*30K
Order
1+
W*30K+
1+
10-15
History
Order-Line
W*30K+
W*300K+
New-Order
0-1
W*5K
TPC-C Workflow
1
Select txn from menu:
1. New-Order
2. Payment
3. Order-Status
4. Delivery
5. Stock-Level
45%
43%
4%
4%
4%
2
Input screen
3
Output screen
Cycle Time Decomposition
(typical values, in seconds,
for weighted average txn)
Measure menu Response Time
Menu = 0.3
Keying time
Keying = 9.6
Measure txn Response Time
Think time
Txn RT = 2.1
Think = 11.4
Average cycle time = 23.4
Go back to 1
Data Skew
NURand - Non Uniform Random
NURand(A,x,y) =
(((random(0,A) | random(x,y)) + C) % (y-x+1)) + x
Customer Last Name: NURand(255, 0, 999)
Customer ID: NURand(1023, 1, 3000)
Item ID: NURand(8191, 1, 100000)
bitwise OR of two random values
skews distribution toward values with more bits on
75% chance that a given bit is one (1 - ½ * ½)
data skew repeats with period “A” (first param of NURand())
NURand Distribution
TPC-C NURand function: frequency vs 0...255
0.09
0.08
0.07
0.06
0.05
cumulative
distribution
0.04
0.03
0.02
0.01
Record Identitiy [0..255]
250
240
230
220
210
200
190
180
170
160
150
140
130
120
110
100
90
80
70
60
50
40
30
20
10
0
0
Relative Frequency of Access
to This Record
0.1
ACID Tests
TPC-C requires transactions be ACID.
Tests included to demonstrate ACID properties met.
Atomicity
Consistency
Isolation
Verify that all changes within a transaction commit or abort.
ANSI Repeatable reads for all but Stock-Level transactions.
Committed reads for Stock-Level.
Durability
Must demonstrate recovery from
Loss of power
Loss of memory
Loss of media (e.g., disk crash)
Transparency
TPC-C requires that all data partitioning be fully
transparent to the application code. (See TPC-C Clause
1.6)
Both horizontal and vertical partitioning is allowed
All partitioning must be hidden from the application
Most DB do single-node horizontal partitioning.
Much harder: multiple-node transparency.
For example, in a two-node cluster:
Any DML operation must be
able to operate against the
entire database, regardless of
physical location.
Warehouses:
Node A
select *
from warehouse
where W_ID = 150
1-100
Node B
select *
from warehouse
where W_ID = 77
101-200
Transparency (cont.)
How does transparency affect TPC-C?
Payment txn: 15% of Customer table records are non-local
to the home warehouse.
New-order txn: 1% of Stock table records are non-local to
the home warehouse.
In a cluster,
cross warehouse traffic cross node traffic
2 phase commit, distributed lock management, or both.
For example, with distributed txns:
Number of nodes
1
2
3
n
% Network Txns
0
5.5
7.3
10.9
TPC-C Rules of Thumb
1.2 tpmC per User/terminal (maximum)
10 terminals per warehouse (fixed)
65-70 MB/tpmC priced disk capacity (minimum)
~ 0.5 physical IOs/sec/tpmC (typical)
300-700 KB main memory/tpmC
So use rules of thumb to size 10,000 tpmC system:
How many terminals?
How many warehouses?
How much memory?
How much disk capacity?
How many spindles?
Typical TPC-C Configuration (Conceptual)
Hardware
Emulated User Load
Driver System
Presentation Services
Term.
LAN
Client
Software
C/S
LAN
Database
Server
...
Response Time
measured here
RTE, e.g.:
Empower
preVue
LoadRunner
Database Functions
TPC-C application +
Txn Monitor and/or
database RPC library
e.g., Tuxedo, ODBC
TPC-C application
(stored procedures) +
Database engine +
Txn Monitor
e.g., SQL Server, Tuxedo
Competitive TPC-C Configuration Today
7,128 tpmC; $89/tpmC; 5-yr COO= 569 K$
2 GB memory, 85x9-GB disks (733 GB total)
6500 users
Demo of SQL Server + Web interface
User interface implemented w/ Web browser via
HTML
Client to Server via ODBC
SQL Server database engine
All in one nifty little box!
TPC-C Current Results
Best Performance is 30,390 tpmC @ $305/tpmC (Oracle/DEC)
Best Price/Perf. is 6,712 tpmC @ $65/tpmC (MS SQL/DEC/Intel)
graphs show
tpmC & Price Performance
high price of UNIX
diseconomy of UNIX scaleup
(only "best" data shown for each vendor)
DB2
400
DB2
Informix
MS SQL Server
Oracle
400
Sybase
350
Informix
tpmC vs $/tpmC
Oracle
300
Sybase
250
$/tpmC
300
$/tpmC
MS SQL Server
350
(ignores Oracle 30ktpmC on 4x12 Alpha)
250
200
150
200
150
100
100
50
0
0
5000
10000
tpmC
15000
20000
50
0
0
5000
10000
tpmC
15000
20000
Compare SMP Performance
tpm C vs CPS
SUN Scaleability
20,000
20,000
18,000
SUN Scaleability
16,000
15,000
SQL Server
14,000
tpmC
tpmC
12,000
10,000
10,000
8,000
6,000
5,000
4,000
2,000
0
0
0
5
10
CPUs
15
20
0
5
10
cpus
15
20
TPC-C Summary
Balanced, representative OLTP mix
Five transaction types
Database intensive; substantial IO and cache load
Scaleable workload
Complex data: data attributes, size, skew
Requires Transparency and ACID
Full screen presentation services
De facto standard for OLTP performance
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
TPC-D Overview
Complex Decision Support workload
The result of 5 years of development by the TPC
Benchmark models ad hoc queries
Workload consists of 17 queries and 2 update streams
extract database with concurrent updates
multi-user environment
SQL as written in spec
Database load time must be reported
Database is quantized into fixed sizes
Metrics are Power (QppD), Throughput (QthD), and
Price/Performance ($/QphD)
Specification was approved April 5, 1995.
TPC-D Schema
Customer
Nation
Region
SF*150K
25
5
Order
Supplier
Part
SF*1500K
SF*10K
SF*200K
Time
2557
LineItem
PartSupp
SF*6000K
SF*800K
Legend:
• Arrows point in the direction of one-to-many relationships.
• The value below each table name is its cardinality. SF is the Scale Factor.
• The Time table is optional. So far, not used by anyone.
TPC-D Database Scaling and Load
Database size is determined from fixed Scale Factors (SF):
Database is generated by DBGEN
1, 10, 30, 100, 300, 1000, 3000 (note that 3 is missing, not a typo)
These correspond to the nominal database size in GB.
(I.e., SF 10 is approx. 10 GB, not including indexes and temp tables.)
Indices and temporary tables can significantly increase the total disk
capacity. (3-5x is typical)
DBGEN is a C program which is part of the TPC-D spec.
Use of DBGEN is strongly recommended.
TPC-D database contents must be exact.
Database Load time must be reported
Includes time to create indexes and update statistics.
Not included in primary metrics.
TPC-D Query Set
17 queries written in SQL92 to implement business questions.
Queries are pseudo ad hoc:
Substitution parameters are replaced with constants by QGEN
QGEN replaces substitution parameters with random values
No host variables
No static SQL
Queries cannot be modified -- “SQL as written”
There are some minor exceptions.
All variants must be approved in advance by the TPC
TPC-D Update Streams
Update 0.1% of data per query stream
Implementation of updates is left to sponsor, except:
ACID properties must be maintained
Update Function 1 (UF1)
About as long as a medium sized TPC-D query
Insert new rows into ORDER and LINEITEM tables
equal to 0.1% of table size
Update Function 2 (UF2)
Delete rows from ORDER and LINEITEM tables
equal to 0.1% of table size
TPC-D Execution
Power Test
Queries submitted in a single stream (i.e., no concurrency)
Sequence:
Cache
Flush
Query
Set 0
UF1
Query
Set 0
UF2
(optional)
Warm-up, untimed
Throughput Test
Multiple concurrent query streams
Single update stream
Query Set 1
Sequence:
Query Set 2
...
Timed Sequence
Query Set N
Updates:
UF1 UF2 UF1 UF2 UF1 UF2
TPC-D Metrics
Power Metric (QppD)
Geometric queries per hour times SF
3600 SF
QppD@ Size
19
i 17
j 2
i 1
j 1
QI (i ,0)UI ( j ,0)
where
QI(i,0) Timing Interval for Query i, stream 0
UI(j,0) Timing Interval for Update j, stream 0
SF Scale Factor
Throughput (QthD)
Linear queries per hour times SF
QthD@ Size
S 17
TS
3600
SF
where:
S number of query streams
TS elapsed time of test (in seconds)
TPC-D Metrics (cont.)
Composite Query-Per-Hour Rating (QphD)
The Power and Throughput metrics are combined to get
the composite queries per hour.
QphD@ Size QppD@ Size QthD@ Size
Reported metrics are:
Power: QppD@Size
Throughput: QthD@Size
Price/Performance: $/QphD@Size
Comparability:
Results within a size category (SF) are comparable.
Comparisons among different size databases are strongly discouraged.
TPC-D Current Results
TPC-D 100 GB Results
$14,000
HP/Oracle
$12,000
NCR/Teradata
Sun/Oracle
$10,000
Tandem/NonStop SQL
$8,000
Digital/Oracle (9/96)
Digital/Oracle (11/96)
$6,000
IBM/DB2
$4,000
$2,000
$0
200
400
600
800
1000
1200
1400
1600
TPC-D 300 GB Results
14,000
12,000
Price/Performance
0
IBM/DB2
NCR/Teradata
10,000
Pyramid/Oracle
8,000
Sun/Oracle
Sun/Oracle
6,000
4,000
2,000
0
0
200
400
600
800
1000
1200
1400
1600
1800
2000
Example TPC-D Results
Want to learn more about TPC-D?
TPC-D Training Video
Six hour video by the folks who wrote the spec.
Explains, in detail, all major aspects of the benchmark.
Available from the TPC:
Shanley Public Relations
777 N. First St., Suite 600
San Jose, CA 95112-6311
ph: (408) 295-8894
fax: (408) 295-9768
email: [email protected]
Outline
Introduction
History of TPC
TPC-A and TPC-B
TPC-C
TPC-D
TPC Futures
TPC Future Direction
TPC-Web
The TPC is just starting a Web benchmark effort.
TPC’s focus will be on database and transaction
characteristics.
The interesting components are:
Web Server
Web
Server
Appl.
DBMS Server
File
System
SQL
Engine
TCP/IP
...
Browser
Browser
Stored
Procs
Data
base
Rules of Thumb
Answer Set for TPC-C rules of Thumb (slide 38)
a 10 ktpmC system
» 8340
terminals ( = 5000 / 1.2)
» 834 warehouses ( = 8340 / 10)
» 3GB-7GB DRAM of memory (10,000 * [3KB..7KB])
» 650 GB disk space = 10,000 * 65
» # Spindles depends on MB capacity vs. physical IO.
Capacity: 650 / 4GB = 162 spindles
IO: 10,000*.5 / 140 = 31 IO/sec (OK!)
but 9GB or 23GB disks would be TOO HOT!
Reference Material
Jim Gray, The Benchmark Handbook for Database and
Transaction Processing Systems, Morgan Kaufmann, San
Mateo, CA, 1991.
Raj Jain, The Art of Computer Systems Performance Analysis:
Techniques for Experimental Design, Measurement, Simulation,
and Modeling, John Wiley & Sons, New York, 1991.
William Highleyman, Performance Analysis of Transaction
Processing Systems, Prentice Hall, Englewood Cliffs, NJ, 1988.
TPC Web site: www.tpc.org
Microsoft db site: www.microsoft.com/sql/tpc/
IDEAS web site: www.ideasinternational.com