TPC Benchmarks


Benchmarks: What and Why

What is a benchmark?
• Domain specific
• No single metric possible
• The more general the benchmark, the less useful it is for anything in particular.
• A benchmark is a distillation of the essential attributes of a workload.

Desirable attributes:
• Relevant: meaningful within the target domain
• Understandable
• Scaleable: applicable to a broad spectrum of hardware/architectures
• Coverage: does not oversimplify the typical environment
• Acceptance: vendors and users embrace it
Benefits and Liabilities

Good benchmarks:
• Define the playing field
• Accelerate progress: engineers do a great job once the objective is measurable and repeatable
• Set the performance agenda
  - Measure release-to-release progress
  - Set goals (e.g., 10,000 tpmC, < 50 $/tpmC)
  - Something managers can understand (!)

Benchmark abuse:
• Benchmarketing
• Benchmark wars: more $ on ads than development
Benchmarks have a Lifetime

• Good benchmarks drive industry and technology forward.
• At some point, all reasonable advances have been made.
• Benchmarks can become counterproductive by encouraging artificial optimizations.
• So, even good benchmarks become obsolete over time.
What is the TPC?

• TPC = Transaction Processing Performance Council
• Founded in Aug/88 by Omri Serlin and 8 vendors.
• Membership of 40-45 for the last several years: everybody who's anybody in software & hardware.
• De facto industry standards body for OLTP performance.
• Administered by:
  Shanley Public Relations
  777 N. First St., Suite 600
  San Jose, CA 95112-6311
  ph: (408) 295-8894
  fax: (408) 295-9768
  email: [email protected]
• Most TPC specs, info, and results are on the web page: http://www.tpc.org
TPC-C Overview

• Moderately complex OLTP
• The result of 2+ years of development by the TPC
• Application models a wholesale supplier managing orders.
• Order-entry provides a conceptual model for the benchmark; underlying components are typical of any OLTP system.
• Workload consists of five transaction types.
• Users and database scale linearly with throughput.
• Spec defines a full-screen end-user interface.
• Metrics are new-order txn rate (tpmC) and price/performance ($/tpmC).
• Specification was approved July 23, 1992.
TPC-C's Five Transactions

OLTP transactions:
• New-order: enter a new order from a customer
• Payment: update customer balance to reflect a payment
• Delivery: deliver orders (done as a batch transaction)
• Order-status: retrieve status of customer's most recent order
• Stock-level: monitor warehouse inventory

• Transactions operate against a database of nine tables.
• Transactions do update, insert, delete, and abort; primary and secondary key access.
• Response time requirement: 90% of each type of transaction must have a response time ≤ 5 seconds, except Stock-Level, which is ≤ 20 seconds.
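The 90th-percentile requirement lends itself to a simple check. A minimal sketch in Python (the helper name is mine, not from the spec):

```python
def meets_rt_requirement(response_times, limit_seconds):
    """True if at least 90% of the measured response times are <= the limit."""
    within = sum(1 for rt in response_times if rt <= limit_seconds)
    return within >= 0.9 * len(response_times)

# e.g., 95 fast and 5 slow New-Order transactions pass the 5-second limit:
sample = [1.0] * 95 + [9.0] * 5
meets_rt_requirement(sample, 5)  # True
```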
TPC-C Database Schema

The nine tables and their cardinalities (W = number of warehouses):

  Table       Cardinality
  Warehouse   W
  District    W*10
  Customer    W*30K
  History     W*30K+
  Order       W*30K+
  New-Order   W*5K
  Order-Line  W*300K+
  Stock       W*100K
  Item        100K (fixed)

One-to-many relationships (with cardinality): each Warehouse has 10 Districts and 100K Stock rows; each District has 3K Customers; each Customer has 1+ Orders and 1+ History rows; each Order has 10-15 Order-Lines and 0-1 New-Order rows; each Item has W Stock rows. The original diagram's legend also marks secondary indexes.
TPC-C Workflow

1. Select a txn from the menu (mix shown):
   1. New-Order     45%
   2. Payment       43%
   3. Order-Status   4%
   4. Delivery       4%
   5. Stock-Level    4%
2. Input screen
3. Output screen; go back to 1.

Cycle time decomposition (typical values, in seconds, for the weighted-average txn):
  Menu response time (measured):  0.3
  Keying time:                    9.6
  Txn response time (measured):   2.1
  Think time:                    11.4
  Average cycle time:            23.4
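The menu selection and per-cycle arithmetic above can be sketched in a few lines of Python (values taken from the slide; the driver loop itself is omitted):

```python
import random

# Menu mix from the workflow above, in percent.
MIX = {"New-Order": 45, "Payment": 43, "Order-Status": 4, "Delivery": 4, "Stock-Level": 4}

def pick_txn(rng=random):
    """Select a transaction type according to the TPC-C menu weights."""
    return rng.choices(list(MIX), weights=list(MIX.values()))[0]

# Typical per-cycle times, in seconds, for the weighted-average transaction:
menu_rt, keying, txn_rt, think = 0.3, 9.6, 2.1, 11.4
avg_cycle = menu_rt + keying + txn_rt + think  # 23.4 seconds per cycle
```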
Data Skew

NURand: Non-Uniform Random
• NURand(A, x, y) = (((random(0, A) | random(x, y)) + C) % (y - x + 1)) + x
  - Customer Last Name: NURand(255, 0, 999)
  - Customer ID: NURand(1023, 1, 3000)
  - Item ID: NURand(8191, 1, 100000)
• Bitwise OR of two random values skews the distribution toward values with more bits on.
  - 75% chance that a given bit is one (1 - ½ * ½)
• The skewed data pattern repeats with the period of the smaller random number.
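The NURand definition above translates directly into code. A sketch in Python; C is the run-time constant the spec fixes per field, shown here defaulting to 0 for simplicity:

```python
import random

def nurand(A, x, y, C=0):
    """TPC-C non-uniform random over [x, y]: the bitwise OR of two
    uniform draws skews toward values with more bits set, and the
    modulo folds the result back into the target range."""
    return (((random.randint(0, A) | random.randint(x, y)) + C) % (y - x + 1)) + x

# The spec's three skewed fields:
last_name_idx = nurand(255, 0, 999)      # customer last name
customer_id   = nurand(1023, 1, 3000)    # customer ID
item_id       = nurand(8191, 1, 100000)  # item ID
```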
NURand Distribution

[Chart: TPC-C NURand function, relative frequency of access vs. record identity 0..255. Per-record frequencies range from roughly 0.01 to 0.09, with a cumulative-distribution overlay.]
ACID Tests

TPC-C requires transactions to be ACID. Tests are included to demonstrate that the ACID properties are met.
• Atomicity: verify that all changes within a transaction commit or abort.
• Consistency
• Isolation: ANSI repeatable reads for all but the Stock-Level transaction; committed reads for Stock-Level.
• Durability: must demonstrate recovery from
  - loss of power
  - loss of memory
  - loss of media (e.g., disk crash)
Transparency

TPC-C requires that all data partitioning be fully transparent to the application code. (See TPC-C Clause 1.6)
• Both horizontal and vertical partitioning are allowed.
• All partitioning must be hidden from the application.
  - Most DBMSs do this today for single-node horizontal partitioning.
  - Much harder: multiple-node transparency.

For example, in a two-node cluster with warehouses 1-100 on Node A and warehouses 101-200 on Node B, any DML operation must be able to operate against the entire database, regardless of physical location:

  On Node A:  select * from warehouse where W_ID = 150
  On Node B:  select * from warehouse where W_ID = 77
Transparency (cont.)

How does transparency affect TPC-C?
• Payment txn: 15% of Customer table records are non-local to the home warehouse.
• New-order txn: 1% of Stock table records are non-local to the home warehouse.
• In a distributed cluster, the cross-warehouse traffic causes cross-node traffic and either two-phase commit, distributed lock management, or both.

For example, with distributed txns:

  Number of nodes   % Network Txns
        1                 0
        2                 5.5
        3                 7.3
        n → ∞            10.9
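The table's figures can be roughly reproduced from the 43%/45% transaction mix and the remote-access rates above, assuming about ten order lines per New-Order and warehouses spread evenly across nodes. A back-of-the-envelope sketch (my arithmetic, not from the spec):

```python
def network_txn_pct(nodes, items_per_order=10):
    """Approximate % of transactions that cross nodes when warehouses
    are spread evenly over `nodes` nodes (rough estimate, not from the spec)."""
    remote_node = (nodes - 1) / nodes     # a remote warehouse lands on another node
    payment = 0.43 * 0.15                 # 15% of Payments touch a remote customer
    new_order = 0.45 * (1 - 0.99 ** items_per_order)  # chance of >= 1 remote stock item
    return 100 * (payment + new_order) * remote_node
```

With these assumptions, network_txn_pct(2), network_txn_pct(3), and the large-n limit land within rounding distance of the table's 5.5, 7.3, and 10.9.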
TPC-C Rules of Thumb

• 1.2 tpmC per user/terminal (maximum)
• 10 terminals per warehouse (fixed)
• 65-70 MB/tpmC priced disk capacity (minimum)
• ~0.5 physical IOs/sec/tpmC (typical)
• 250-700 KB main memory/tpmC (how much $ do you have?)

Using the rules of thumb to size a 10,000 tpmC system:
• How many terminals?      8340 = 10000 / 1.2
• How many warehouses?     834 = 8340 / 10
• How much memory?         2.5 - 7 GB
• How much disk capacity?  650 GB = 10000 * 65
• How many spindles?       Depends on MB capacity vs. physical IO.
  - Capacity: 650 / 8 = 82 spindles (8 GB disks)
  - IO: 10000 * 0.5 / 82 = 61 IO/sec per spindle. TOO HOT!
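The sizing walk-through above is easy to script. A sketch using the same rules of thumb (note the slide rounds 10000/1.2 up to 8340):

```python
def size_tpcc(tpmC):
    """Back-of-envelope TPC-C sizing from the rules of thumb above."""
    terminals  = tpmC / 1.2          # 1.2 tpmC per terminal, maximum
    warehouses = terminals / 10      # 10 terminals per warehouse, fixed
    disk_gb    = tpmC * 65 / 1000    # 65 MB/tpmC priced disk, minimum
    mem_gb_lo  = tpmC * 250 / 1e6    # 250-700 KB/tpmC main memory
    mem_gb_hi  = tpmC * 700 / 1e6
    return terminals, warehouses, disk_gb, (mem_gb_lo, mem_gb_hi)

terms, whs, disk, mem = size_tpcc(10_000)
# ~8333 terminals, ~833 warehouses, 650 GB disk, 2.5-7 GB memory
```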
Typical TPC-C Configuration (Conceptual)

  Driver System --[Terminal LAN]--> Client --[C/S LAN]--> Database Server

• Driver System: emulated user load, generated by an RTE (e.g., Empower, preVue, LoadRunner). Response time is measured here.
• Client: presentation services; TPC-C application + txn monitor and/or database RPC library (e.g., Tuxedo, ODBC).
• Database Server: database functions; TPC-C application (stored procedures) + database engine + txn monitor (e.g., SQL Server, Tuxedo).
Competitive TPC-C Configuration Today

• 8,070 tpmC; $57.66/tpmC; 5-yr COO = 465 K$
• 2 GB memory; disks: 37 x 4 GB + 48 x 9.1 GB (560 GB total)
• 6,700 users
TPC-C Current Results

• Best performance is 30,390 tpmC @ $305/tpmC (Digital)
• Best price/perf. is 7,693 tpmC @ $42.53/tpmC (Dell)

[Scatter chart: price/performance ($/tpmC, 0-350) vs. throughput (tpmC, 5,000-35,000) for Compaq, Dell, Digital, HP, IBM, NCR, SGI, and Sun.]
TPC-C Results (by OS)

[Scatter chart: price/performance ($/tpmC, up to 400) vs. throughput (tpmC, 5,000-30,000), Unix vs. Windows NT. TPC-C results as of 5/9/97.]
TPC-C Results (by DBMS)

[Scatter chart: price/performance ($/tpmC, up to 400) vs. throughput (tpmC, 5,000-30,000) for Informix, Microsoft, Oracle, and Sybase. TPC-C results as of 5/9/97.]
Analysis from 30,000 ft.

• Unix results are 2-3x more expensive than NT.
  - Doesn't matter which DBMS.
• Unix results are more scalable.
  - Unix: 10-, 12-, 16-, 24-way SMPs
  - NT: 4-way SMP on Intel & 8-way SMP on Digital Alpha
• Highest performance is on clusters.
  - Only a few results (trophy numbers?)
TPC-C Summary

• Balanced, representative OLTP mix
  - Five transaction types
  - Database intensive; substantial IO and cache load
• Scaleable workload
• Complex data: data attributes, size, skew
• Requires transparency and ACID
• Full-screen presentation services
• De facto standard for OLTP performance
Reference Material

• TPC web site: www.tpc.org
• TPC Results Database: www.microsoft.com/sql/tpc
• IDEAS web site: www.ideasinternational.com
• Jim Gray, The Benchmark Handbook for Database and Transaction Processing Systems, Morgan Kaufmann, San Mateo, CA, 1991.
• Raj Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, John Wiley & Sons, New York, 1991.
• William Highleyman, Performance Analysis of Transaction Processing Systems, Prentice Hall, Englewood Cliffs, NJ, 1988.
TPC-D
The Industry Standard Decision Support Benchmark

TPC-D Overview

• Complex decision support workload
• The result of 5 years of development by the TPC
• Benchmark models ad hoc queries:
  - DSS queries
  - extract database with concurrent updates
  - multi-user environment
• Specification was approved April 5, 1995.
[Diagram: TPC-D covers business analysis (decision support), while TPC-A, TPC-B, and TPC-C cover business operations (OLTP transactions).]
TPC-D Schema

  Table     Cardinality
  Customer  SF*150K
  Order     SF*1500K
  LineItem  SF*6000K
  Supplier  SF*10K
  Part      SF*200K
  PartSupp  SF*800K
  Nation    25
  Region    5
  Time      2557

Legend:
• Arrows in the original diagram point in the direction of one-to-many relationships.
• The value for each table is its cardinality. SF is the Scale Factor.
• The Time table is optional. So far, not used by anyone.
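The cardinalities above scale linearly with SF except for the fixed tables; a small sketch:

```python
def tpcd_cardinalities(sf):
    """Base-table row counts for a TPC-D database at scale factor `sf`."""
    return {
        "Customer": sf * 150_000,
        "Order":    sf * 1_500_000,
        "LineItem": sf * 6_000_000,
        "Supplier": sf * 10_000,
        "Part":     sf * 200_000,
        "PartSupp": sf * 800_000,
        "Nation":   25,    # fixed
        "Region":   5,     # fixed
        "Time":     2557,  # optional table, fixed
    }

# e.g., at SF 10 (nominally 10 GB), LineItem holds 60 million rows.
```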
TPC-D Database Scaling and Load

• Database size is determined from fixed Scale Factors (SF):
  - 1, 10, 30, 100, 300, 1000, 3000, 10000 (note that 3 is missing; not a typo)
  - These correspond to the nominal database size in GB.
    (I.e., SF 10 is approx. 10 GB, not including indexes and temp tables.)
  - Indexes and temporary tables can significantly increase the total disk capacity (3-5x is typical).
• Database is generated by DBGEN:
  - DBGEN is a C program which is part of the TPC-D spec.
  - Use of DBGEN is strongly recommended; TPC-D database contents must be exact.
• Database load time must be reported:
  - Includes time to create indexes and update statistics.
  - Not included in the primary metrics.
TPC-D Query Set

• 17 queries written in SQL-92 to implement business questions.
• Queries are pseudo ad hoc:
  - QGEN replaces each substitution parameter with a randomly chosen constant.
  - No host variables
  - No static SQL
• Queries cannot be modified -- "SQL as written":
  - There are some minor exceptions.
  - All variants must be approved in advance by the TPC.
Sample Query Definition

2.3 Forecasting Revenue Query (Q6)
This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. Asking this type of "what if" query can be used to look for ways to increase revenues.

2.3.1 Business Question
The Forecasting Revenue Change Query considers all the lineitems shipped in a given year with discounts between DISCOUNT-0.01 and DISCOUNT+0.01. The query lists the amount by which the total revenue would have decreased if these discounts had been eliminated for lineitems with item quantities less than QUANTITY. Note that the potential revenue increase is equal to the sum of (L_EXTENDEDPRICE * L_DISCOUNT) for all lineitems with quantities and discounts in the qualifying range.

2.3.2 Functional Query Definition
SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE
FROM LINEITEM
WHERE L_SHIPDATE >= DATE '[DATE]'
AND L_SHIPDATE < DATE '[DATE]' + INTERVAL '1' YEAR
AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01
AND L_QUANTITY < [QUANTITY]

2.3.3 Substitution Parameters
Values for the following substitution parameters must be generated and used to build the executable query text.
1. DATE is the first of January of a randomly selected year within [1993-1997]
2. DISCOUNT is randomly selected within [0.02 .. 0.09]
3. QUANTITY is randomly selected within [24 .. 25]
Sample Query Definition (cont.)

2.3.4 Query Validation
For validation against the qualification database, the query must be executed using the following values for the substitution parameters and must produce the following output:

Values for substitution parameters:
1. DATE = 1994-01-01
2. DISCOUNT = 0.06
3. QUANTITY = 24

Query validation output data:
1 row returned
| REVENUE     |
| 11450588.04 |

• Query validation demonstrates the integrity of an implementation:
  - Query phrasings are run against a 100 MB data set.
  - The data set must mimic the design of the test database.
  - Answer sets must match those in the specification almost exactly.
  - If the answer sets don't match, the benchmark is invalid!
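QGEN-style parameter substitution for Q6 can be sketched as follows. This is a simplified illustration, not the actual QGEN tool; the template and parameter ranges are taken from the query definition above:

```python
import random

Q6_TEMPLATE = (
    "SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE FROM LINEITEM "
    "WHERE L_SHIPDATE >= DATE '{date}' "
    "AND L_SHIPDATE < DATE '{date}' + INTERVAL '1' YEAR "
    "AND L_DISCOUNT BETWEEN {discount} - 0.01 AND {discount} + 0.01 "
    "AND L_QUANTITY < {quantity}"
)

def gen_q6(rng=random):
    """Pick Q6 substitution parameters within the spec's ranges."""
    date = f"{rng.randint(1993, 1997)}-01-01"   # first of January, 1993-1997
    discount = rng.randint(2, 9) / 100          # 0.02 .. 0.09
    quantity = rng.randint(24, 25)              # 24 .. 25
    return Q6_TEMPLATE.format(date=date, discount=discount, quantity=quantity)

# The validation phrasing uses DATE=1994-01-01, DISCOUNT=0.06, QUANTITY=24:
validation_q6 = Q6_TEMPLATE.format(date="1994-01-01", discount=0.06, quantity=24)
```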
Query Variations

• Formal Query Definitions are ISO SQL-92.
• The executable query text must match, except for Minor Query Modifications:
  - Date/time syntax
  - Table naming conventions
  - Statement terminators
  - AS clauses
  - Ordinal Group By/Order By
  - Coding style (i.e., white space)
• Any other phrasing must be a Pre-Approved Query Variant:
  - Variants must be justifiable based on criteria similar to those in Clause 0.2.
  - Approved variants are included in the specification.
• An implementation may use any combination of Pre-Approved Variants, Formal Query Definitions, and Minor Query Modifications.
TPC-D Update Functions

• Update 0.1% of the data per query stream.
• Implementation of updates is left to the sponsor, except:
  - ACID properties must be maintained.
  - The update functions must be a set of logically consistent transactions.
• New Sales Update Function (UF1):
  - Insert new rows into the ORDER and LINEITEM tables equal to 0.1% of table size.
  - About as long as a medium-sized TPC-D query.
• Old Sales Update Function (UF2):
  - Delete rows from the ORDER and LINEITEM tables equal to 0.1% of table size.
TPC-D Execution Rules

• Power Test
  - Queries submitted in a single stream (i.e., no concurrency)
  - Each query set is a permutation of the 17 read-only queries
  - Sequence: cache flush and an optional untimed warm-up run of Query Set 0, followed by the timed sequence: UF1, Query Set 0, UF2.
• Throughput Test
  - Multiple concurrent query streams (Query Set 1 ... Query Set N)
  - Single update stream running the update pairs UF1 UF2, UF1 UF2, ...
  - The N query streams and the update stream together form the timed sequence.
TPC-D Execution Rules (cont.)

• Load Test
  - Measures the time to go from an empty database to reproducible query runs.
  - Not a primary metric; appears on the executive summary.
  - Sequence: DBMS initialized, DBGEN run (preparation, untimed); then the timed sequence: data loaded, indexes built, stats gathered; then ready for queries.
Disclosure Requirements

• All results must comply with standard TPC disclosure policies:
  - Results must be reviewed by a TPC auditor certified for TPC-D.
  - A Full Disclosure Report and Executive Summary must be on file with the TPC before a result is publicly announced.
• All results are subject to standard TPC review policies:
  - Once filed, results are "In Review" for sixty days.
  - While in review, any member company may file a challenge against a result that they think failed to comply with the specification.
  - All challenges and compliance issues are handled by the TPC's judiciary, the Technical Advisory Board (TAB), and affirmed by the membership.
Do Good, Do Well, and TO-DO

First, the good news...
• TPC-D has improved products.
  - First real quantification of optimizer performance for some vendors.
• TPC-D has increased competition.

Then some areas that bear watching...
• The workload is maturing; indexing and query fixes are giving way to engineering.
• The SMP/MPP price barrier is disappearing, but so is some of the performance difference.
• Meta-knowledge of the data is becoming critical: better stats, smarter optimizers, wiser data placement.

And finally the trouble spots, the things we missed...
• No metric will please customers, engineers, and marketing managers alike.
• TPC-D has failed to showcase multi-user decision support.
• No results yet on 10 GB or 30 GB.
• Decision support is moving faster than the TPC: OLAP, data marts, data mining, SQL3, ADTs, Universal {IBM, Informix, Oracle}.
TPC-D, Version 2: Overview

• Goal: define a workload to "take over" from TPC-D 1.x in time with its lifecycle (~2 years from now).
• Two areas of focus:
  - Address the known deficiencies of the 1.x specification:
    - Introduce data skew
    - Require multi-user executions
      (What number of streams is interesting? Should updates scale with users? With data volume?)
  - Broaden the scope of the query set and data set:
    - "Snowflake" schema
    - Larger query set
    - Batch and trickle update models
An Extensible TPC Workload?

Make TPC-D extensible:
• Three types of extensions:
  - Query: new questions on the same schema
  - Schema: new representations and queries on the same data
  - Data: new data types and operators
• Simpler adoption model than a full specification:
  - Mini-spec presented by three sponsors
  - Evaluation period for prototype/refinement (Draft status)
  - Acceptance as an extension
  - Periodic review for renewal, removal, or promotion to the base workload
• The goal is an adaptive workload: more responsive to the market and more inclusive of new technology, without losing comparability or relevance.
Want to Learn More about TPC-D?

• TPC web site: www.tpc.org
  - The latest specification, tools, and results
  - The version 2 white paper
• TPC-D Training Video
  - Six-hour video by the folks who wrote the spec.
  - Explains, in detail, all major aspects of the benchmark.
  - Available from the TPC:
    Shanley Public Relations
    777 N. First St., Suite 600
    San Jose, CA 95112-6311
    ph: (408) 295-8894
    fax: (408) 295-9768
    email: [email protected]