SIGMOD '97 Industrial Session 5
Standard Benchmarks for Database Systems
Chair: Dave DeWitt (Jim Gray, in absentia)

TPC-C: The OLTP Benchmark
Charles Levine, Microsoft
clevine@microsoft.com

Benchmarks: What and Why
What is a benchmark?
- Domain specific: no single metric is possible. The more general the benchmark, the less useful it is for anything in particular.
- A benchmark is a distillation of the essential attributes of a workload.
Desirable attributes:
- Relevant: meaningful within the target domain
- Understandable
- Good metric(s): linear, orthogonal, monotonic
- Scalable: applicable to a broad spectrum of hardware and architectures
- Coverage: does not oversimplify the typical environment
- Acceptance: vendors and users embrace it

Benefits and Liabilities
Good benchmarks:
- Define the playing field
- Accelerate progress: engineers do a great job once the objective is measurable and repeatable
- Set the performance agenda: measure release-to-release progress; set goals (e.g., 10,000 tpmC, < 50 $/tpmC); something managers can understand (!)
Benchmark abuse:
- Benchmarketing
- Benchmark wars: more $ spent on ads than on development

Benchmarks Have a Lifetime
- Good benchmarks drive industry and technology forward.
- At some point, all reasonable advances have been made.
- Benchmarks can become counterproductive by encouraging artificial optimizations.
- So, even good benchmarks become obsolete over time.

What is the TPC?
- TPC = Transaction Processing Performance Council
- Founded in Aug. 1988 by Omri Serlin and 8 vendors.
- Membership of 40-45 for the last several years: everybody who's anybody in software and hardware.
- De facto industry standards body for OLTP performance.
- Administered by: Shanley Public Relations, 777 N. First St., Suite 600, San Jose, CA 95112-6311; ph: (408) 295-8894; fax: (408) 295-9768; email: td@tpc.org
- Most TPC specs, info, and results are on the web page: http://www.tpc.org

TPC-C Overview
- Moderately complex OLTP; the result of 2+ years of development by the TPC.
- The application models a wholesale supplier managing orders.
- Order-entry provides a conceptual model for the benchmark; the underlying components are typical of any OLTP system.
- Workload consists of five transaction types.
- Users and database scale linearly with throughput.
- Spec defines a full-screen end-user interface.
- Metrics are new-order transaction rate (tpmC) and price/performance ($/tpmC).
- Specification was approved July 23, 1992.

TPC-C's Five Transactions
OLTP transactions:
- New-order: enter a new order from a customer
- Payment: update customer balance to reflect a payment
- Delivery: deliver orders (done as a batch transaction)
- Order-status: retrieve status of the customer's most recent order
- Stock-level: monitor warehouse inventory
Transactions operate against a database of nine tables. Transactions do update, insert, delete, and abort, with both primary and secondary key access.
Response time requirement: 90% of each type of transaction must have a response time of at most 5 seconds, except stock-level, which is allowed 20 seconds.

TPC-C Database Schema
[Schema diagram: nine tables, with cardinalities scaled by the number of warehouses W. Warehouse (W rows) has one-to-many relationships to District (W*10) and Stock (W*100K); Item is fixed at 100K rows. Each District has 3K Customers (Customer: W*30K, with a secondary index on last name); each Customer has 1+ History records (W*30K+) and 1+ Orders (W*30K+); each Order has 10-15 Order-Lines (W*300K+) and 0-1 New-Order records (W*5K).]

TPC-C Workflow
1. Select a transaction from the menu: New-Order 45%, Payment 43%, Order-Status 4%, Delivery 4%, Stock-Level 4%
2. Input screen (measure menu response time)
3. Output screen (measure transaction response time); go back to 1.
Cycle time decomposition (typical values, in seconds, for the weighted-average transaction):
- Menu response time = 0.3
- Keying time = 9.6
- Transaction response time = 2.1
- Think time = 11.4
- Average cycle time = 23.4

Data Skew
NURand: Non-Uniform Random
NURand(A, x, y) = (((random(0, A) | random(x, y)) + C) % (y - x + 1)) + x
- Customer last name: NURand(255, 0, 999)
- Customer ID: NURand(1023, 1, 3000)
- Item ID: NURand(8191, 1, 100000)
The bitwise OR of two random values skews the distribution toward values with more bits on: a given bit is one with probability 75% (1 - 1/2 * 1/2). The skewed data pattern repeats with the period of the smaller random number.

NURand Distribution
[Chart: relative frequency of access for record identities 0..255 under the TPC-C NURand function, with the cumulative distribution overlaid; per-record frequencies range up to about 0.1.]

ACID Tests
TPC-C requires that transactions be ACID. Tests are included to demonstrate that the ACID properties are met:
- Atomicity: verify that all changes within a transaction commit or abort.
- Consistency and Isolation: ANSI repeatable reads for all but the Stock-Level transaction; committed reads for Stock-Level.
- Durability: must demonstrate recovery from loss of power, loss of memory, and loss of media (e.g., disk crash).

Transparency
TPC-C requires that all data partitioning be fully transparent to the application code. (See TPC-C Clause 1.6.)
- Both horizontal and vertical partitioning are allowed.
- All partitioning must be hidden from the application.
- Most DBMSs do this today for single-node horizontal partitioning.
- Much harder: multiple-node transparency. For example, in a two-node cluster: any DML operation must be able to operate against the entire database, regardless of physical location.
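The NURand skew function described on the Data Skew slide can be sketched directly from its formula. This is a minimal illustration, not the spec's implementation: the spec additionally fixes C as a per-field runtime constant in [0, A], and the C values below are hypothetical placeholders.

```python
import random

def nurand(A, x, y, C):
    """TPC-C non-uniform random over [x, y]: the bitwise OR of two
    uniform draws skews the result toward values with more one-bits
    (each bit is set with probability 3/4)."""
    return (((random.randint(0, A) | random.randint(x, y)) + C) % (y - x + 1)) + x

# C is a per-field constant chosen once per run; fixed values are assumed here.
C_LAST, C_ID, C_ITEM = 123, 259, 7911

customer_last_key = nurand(255, 0, 999, C_LAST)      # customer last name
customer_id       = nurand(1023, 1, 3000, C_ID)      # customer ID
item_id           = nurand(8191, 1, 100000, C_ITEM)  # item ID
```

Because the OR term can exceed y, the modulo wraps the sum back into the range, which is what makes the skew pattern repeat with the period of the smaller random number.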
For example, with warehouses 1-100 stored on Node A and warehouses 101-200 on Node B:
- "select * from warehouse where W_ID = 150" must work when issued from Node A
- "select * from warehouse where W_ID = 77" must work when issued from Node B

Transparency (cont.)
How does transparency affect TPC-C?
- Payment txn: 15% of Customer table accesses are non-local to the home warehouse.
- New-order txn: 1% of Stock table accesses are non-local to the home warehouse.
In a distributed cluster, the cross-warehouse traffic causes cross-node traffic and requires two-phase commit, distributed lock management, or both. For example, with distributed transactions:
  Number of nodes:   1     2     3     n
  % network txns:    0     5.5   7.3   ~10.9

TPC-C Rules of Thumb
- 1.2 tpmC per user/terminal (maximum)
- 10 terminals per warehouse (fixed)
- 65-70 MB/tpmC priced disk capacity (minimum)
- ~0.5 physical IOs/sec/tpmC (typical)
- 250-700 KB main memory/tpmC (how much $ do you have?)
Using the rules of thumb to size a 10,000 tpmC system:
- How many terminals? 8,340 = 10,000 / 1.2
- How many warehouses? 834 = 8,340 / 10
- How much memory? 2.5 - 7 GB
- How much disk capacity? 650 GB = 10,000 * 65 MB
- How many spindles? Depends on MB capacity vs. physical IO. By capacity alone: 650 / 8 = 82 spindles. But then each spindle must sustain 10,000 * 0.5 / 82 = 61 IO/sec. TOO HOT!

Typical TPC-C Configuration (Conceptual)
- Driver system: a remote terminal emulator (RTE, e.g., Empower, preVue, LoadRunner) generates the emulated user load; response time is measured here.
- Client (presentation services), connected over a terminal LAN: TPC-C application plus a transaction monitor and/or database RPC library (e.g., Tuxedo, ODBC).
- Database server (database functions), connected over a client/server LAN: TPC-C application (stored procedures) plus the database engine and transaction monitor (e.g., SQL Server, Tuxedo).

Competitive TPC-C Configuration Today
- 8,070 tpmC; $57.66/tpmC; 5-year cost of ownership = 465 K$
- 2 GB memory; disks: 37 x 4 GB + 48 x 9.1 GB (560 GB total)
- 6,700 users

TPC-C Current Results
- Best performance: 30,390 tpmC @ $305/tpmC (Digital)
- Best price/performance: 7,693 tpmC @ $42.53/tpmC (Dell)
[Scatter chart: price/performance ($/tpmC, up to 350) vs. throughput (tpmC, up to 35,000) for Compaq, Dell, Digital, HP, IBM, NCR, SGI, and Sun results.]

TPC-C Results (by OS)
[Chart: price/performance vs. throughput, Unix vs. Windows NT. TPC-C results as of 5/9/97.]

TPC-C Results (by DBMS)
[Chart: price/performance vs. throughput for Informix, Microsoft, Oracle, and Sybase. TPC-C results as of 5/9/97.]

Analysis from 30,000 ft.
- Unix results are 2-3x more expensive than NT; this holds regardless of which DBMS is used.
- Unix results are more scalable: Unix runs on 10-, 12-, 16-, and 24-way SMPs, while NT runs on 4-way Intel SMPs and 8-way Digital Alpha SMPs.
- The highest performance is on clusters, but there are only a few such results (trophy numbers?).

TPC-C Summary
- Balanced, representative OLTP mix: five transaction types; database intensive, with substantial IO and cache load.
- Scalable workload.
- Complex data: data attributes, size, skew.
- Requires transparency and ACID properties.
- Full-screen presentation services.
- De facto standard for OLTP performance.

Reference Material
- TPC web site: www.tpc.org
- TPC results database: www.microsoft.com/sql/tpc
- IDEAS web site: www.ideasinternational.com
- Jim Gray, The Benchmark Handbook for Database and Transaction Processing Systems, Morgan Kaufmann, San Mateo, CA, 1991.
- Raj Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, John Wiley & Sons, New York, 1991.
- William Highleyman, Performance Analysis of Transaction Processing Systems, Prentice Hall, Englewood Cliffs, NJ, 1988.
TPC-D: The Industry Standard Decision Support Benchmark
Jack Stephens, Informix
jms@informix.com

Outline
- Overview
- The Database
- The Queries
- The Execution Rules
- The Results
- Early Lessons
- The Future

TPC-D Overview
- Complex decision support workload; the result of 5 years of development by the TPC.
- The benchmark models ad hoc DSS queries against a large extract database, with concurrent updates, in a multi-user environment.
- Specification was approved April 5, 1995.
[Positioning chart: TPC-A, TPC-B, and TPC-C cover OLTP transactions and business operations; TPC-D covers business analysis.]

TPC-D Schema
[Schema diagram: Customer (SF*150K), Order (SF*1500K), LineItem (SF*6000K), PartSupp (SF*800K), Part (SF*200K), Supplier (SF*10K), Nation (25), Region (5), and an optional Time table (2557 rows). Arrows point in the direction of one-to-many relationships; the value below each table name is its cardinality, where SF is the Scale Factor. The Time table is optional and, so far, not used by anyone.]

Schema Usage
[Table: for each of the 17 queries, which tables (NATION, REGION, ORDER, LINEITEM, CUSTOMER, PARTSUPP, SUPPLIER, PART) and which of their columns the query touches; the per-column detail is given in the specification.]

TPC-D Database Scaling and Load
- Database size is determined from fixed Scale Factors (SF): 1, 10, 30, 100, 300, 1000, 3000, 10000 (note that 3 is missing; that is not a typo).
- These correspond to the nominal database size in GB. (I.e., SF 10 is approximately 10 GB, not including indexes and temporary tables.)
- The database is generated by DBGEN.
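The scaling rules above make it easy to compute nominal table sizes for any Scale Factor. A small sketch (row counts are the nominal values from the schema slide; actual LineItem counts vary slightly because orders carry a variable number of line items):

```python
# Nominal rows per unit of Scale Factor, from the TPC-D schema slide.
ROWS_PER_SF = {
    "customer": 150_000,
    "order":    1_500_000,
    "lineitem": 6_000_000,
    "partsupp": 800_000,
    "part":     200_000,
    "supplier": 10_000,
}
# Nation and Region are fixed-size regardless of SF.
FIXED = {"nation": 25, "region": 5}

def table_rows(sf):
    """Nominal cardinality of each TPC-D table at Scale Factor sf."""
    rows = {t: n * sf for t, n in ROWS_PER_SF.items()}
    rows.update(FIXED)
    return rows

# At SF = 10 (~10 GB of raw data, excluding indexes and temp tables):
print(table_rows(10)["lineitem"])  # → 60000000
```

Remember that the SF value is only the raw data size; the slides note that indexes and temporary tables typically multiply total disk needs by 3-5x.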
- Indexes and temporary tables can significantly increase the total disk capacity (3-5x is typical).
- DBGEN is a C program which is part of the TPC-D spec. Use of DBGEN is strongly recommended: TPC-D database contents must be exact.
- Database load time must be reported. It includes the time to create indexes and update statistics, but it is not included in the primary metrics.

Outline: The Queries

TPC-D Query Set
- 17 queries, written in SQL-92, implement business questions.
- Queries are pseudo ad hoc: QGEN replaces the substitution parameters with random values; no host variables; no static SQL.
- Queries cannot be modified: "SQL as written." There are some minor exceptions, and all variants must be approved in advance by the TPC.

Sample Query Definition
2.3 Forecasting Revenue Query (Q6)
This query quantifies the amount of revenue increase that would have resulted from eliminating certain company-wide discounts in a given percentage range in a given year. Asking this type of "what if" query can be used to look for ways to increase revenues.

2.3.1 Business Question
The Forecasting Revenue Change Query considers all the lineitems shipped in a given year with discounts between DISCOUNT-0.01 and DISCOUNT+0.01. The query lists the amount by which the total revenue would have decreased if these discounts had been eliminated for lineitems with item quantities less than QUANTITY. Note that the potential revenue increase is equal to the sum of (L_EXTENDEDPRICE * L_DISCOUNT) for all lineitems with discounts and quantities in the qualifying range.
2.3.2 Functional Query Definition

SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE
FROM LINEITEM
WHERE L_SHIPDATE >= DATE '[DATE]'
  AND L_SHIPDATE < DATE '[DATE]' + INTERVAL '1' YEAR
  AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01
  AND L_QUANTITY < [QUANTITY]

2.3.3 Substitution Parameters
Values for the following substitution parameters must be generated and used to build the executable query text:
1. DATE is the first of January of a randomly selected year within [1993 .. 1997]
2. DISCOUNT is randomly selected within [0.02 .. 0.09]
3. QUANTITY is randomly selected within [24 .. 25]

Sample Query Definition (cont.)
2.3.4 Query Validation
For validation against the qualification database, the query must be executed using the following values for the substitution parameters and must produce the following output.
Values for substitution parameters:
1. DATE = 1994-01-01
2. DISCOUNT = 0.06
3. QUANTITY = 24
Query validation output data: 1 row returned
| REVENUE     |
| 11450588.04 |

Query validation demonstrates the integrity of an implementation:
- Query phrasings are run against a 100 MB qualification data set, which must mimic the design of the test database.
- Answer sets must match those in the specification almost exactly. If the answer sets don't match, the benchmark is invalid!

Query Variations
- Formal Query Definitions are written in SQL-92.
- The executable query text must match, except for Minor Query Modifications: date/time syntax, table naming conventions, statement terminators, AS clauses, ordinal GROUP BY/ORDER BY, and coding style (i.e., white space).
- Any other phrasing must be a Pre-Approved Query Variant. Variants must be justifiable based on criteria similar to those of Clause 0.2, and approved variants are included in the specification.
- An implementation may use any combination of Pre-Approved Variants, Formal Query Definitions, and Minor Query Modifications.
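The substitution rules for Q6 can be illustrated with a short sketch. This mimics what QGEN does for this one query; it is not QGEN itself, and the helper names are invented for illustration.

```python
import random

def q6_parameters(rng=random):
    """Draw Q6 substitution parameters per the rules above."""
    date = f"{rng.randint(1993, 1997)}-01-01"     # first of January, 1993-1997
    discount = rng.randint(2, 9) / 100            # 0.02 .. 0.09
    quantity = rng.randint(24, 25)                # 24 or 25
    return date, discount, quantity

def q6_text(date, discount, quantity):
    """Build the executable query text by substituting constants."""
    return (
        "SELECT SUM(L_EXTENDEDPRICE * L_DISCOUNT) AS REVENUE "
        "FROM LINEITEM "
        f"WHERE L_SHIPDATE >= DATE '{date}' "
        f"AND L_SHIPDATE < DATE '{date}' + INTERVAL '1' YEAR "
        f"AND L_DISCOUNT BETWEEN {discount - 0.01:.2f} AND {discount + 0.01:.2f} "
        f"AND L_QUANTITY < {quantity}"
    )

# The validation parameters from the spec: DATE=1994-01-01, DISCOUNT=0.06, QUANTITY=24.
print(q6_text("1994-01-01", 0.06, 24))
```

Note how the randomly drawn text becomes plain constants in the query: this is why the queries are "pseudo ad hoc" with no host variables and no static SQL.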
TPC-D Update Functions
- Updates touch 0.1% of the data per query stream.
- The implementation of the updates is left to the sponsor, except that ACID properties must be maintained and the update functions must be a set of logically consistent transactions.
- New Sales Update Function (UF1): insert new rows into the ORDER and LINEITEM tables, equal to 0.1% of table size. About as long as a medium-sized TPC-D query.
- Old Sales Update Function (UF2): delete rows from the ORDER and LINEITEM tables, equal to 0.1% of table size.

Outline: The Execution Rules

TPC-D Execution Rules
Power Test
- Queries are submitted in a single stream (i.e., no concurrency).
- Each query set is a permutation of the 17 read-only queries.
- Sequence: cache flush and an optional, untimed warm-up run of query set 0; then the timed sequence: UF1, query set 0, UF2.
Throughput Test
- Multiple concurrent query streams: query sets 1 through N run in parallel in the timed sequence.
- A single update stream runs the updates UF1, UF2, UF1, UF2, ... (one UF1/UF2 pair per query stream).

TPC-D Execution Rules (cont.)
Load Test
- Measures the time to go from an empty database to reproducible query runs.
- Not a primary metric, but it appears on the executive summary.
- Sequence: DBMS initialized and DBGEN run (preparation, untimed); then the timed sequence: data loaded, indexes built, statistics gathered; then ready for queries.

TPC-D Metrics
Power Metric (QppD), a geometric mean:

  QppD@Size = (3600 * SF) / (product[i=1..17] QI(i,0) * product[j=1..2] UI(j,0)) ^ (1/19)

where:
- QI(i,0) is the timing interval for query i in stream 0
- UI(j,0) is the timing interval for update function j in stream 0
- SF is the Scale Factor
Throughput Metric (QthD), an arithmetic mean:

  QthD@Size = (S * 17 * 3600 / TS) * SF

where:
- S is the number of query streams
- TS is the elapsed time of the test (in seconds)
Both metrics represent "queries per gigabyte-hour."

TPC-D Metrics (cont.)
Composite Query-per-Hour Rating (QphD): the power and throughput metrics are combined to get the composite queries per hour.
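The three metrics can be sketched as straightforward functions of the measured timing intervals. This is a minimal sketch of the formulas as stated here, not an official reporting tool:

```python
from math import prod, sqrt

def qppd(query_times, update_times, sf):
    """Power metric: 3600*SF over the geometric mean of the 19
    stream-0 timing intervals (17 queries + 2 update functions)."""
    assert len(query_times) == 17 and len(update_times) == 2
    geo_mean = prod(query_times + update_times) ** (1 / 19)
    return 3600 * sf / geo_mean

def qthd(num_streams, elapsed_secs, sf):
    """Throughput metric: total queries run across all streams,
    normalized to queries per gigabyte-hour."""
    return num_streams * 17 * 3600 / elapsed_secs * sf

def qphd(qppd_value, qthd_value):
    """Composite rating: geometric mean of power and throughput."""
    return sqrt(qppd_value * qthd_value)
```

The geometric mean in QppD keeps one very fast (or very slow) query from dominating the power rating, while QthD rewards sustained multi-stream throughput; the composite balances the two.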
  QphD@Size = sqrt(QppD@Size * QthD@Size)

Reported metrics are:
- Power: QppD@Size
- Throughput: QthD@Size
- Price/Performance: $/QphD@Size
Comparability: results within a size category (SF) are comparable. Comparisons among different size databases are strongly discouraged.

Outline: The Results

Disclosure Requirements
- All results must comply with standard TPC disclosure policies.
- Results must be reviewed by a TPC auditor certified for TPC-D.
- A Full Disclosure Report and Executive Summary must be on file with the TPC before a result is publicly announced.
- All results are subject to standard TPC review policies. Once filed, results are "in review" for sixty days. While in review, any member company may file a challenge against a result that it believes failed to comply with the specification.
- All challenges and compliance issues are handled by the TPC's judiciary, the Technical Advisory Board (TAB), and affirmed by the membership.

TPC-D Current Results
[Chart: performance (QppD and QthD) and price/performance ($/QphD) for results at 100 GB, 300 GB, and 1000 GB, filed between 15-Apr-96 and 2-May-97. TPC-D results as of 5/9/97.]

Outline: Early Lessons

Do Good, Do Well, and To-Do
First, the good news...
- TPC-D has improved products: the first real quantification of optimizer performance for some vendors.
- TPC-D has increased competition.
Then some areas that bear watching...
- The workload is maturing: indexing and query fixes are giving way to engineering.
- The SMP/MPP price barrier is disappearing, but so is some of the performance difference.
- Meta-knowledge of the data is becoming critical: better statistics, smarter optimizers, wiser data placement.
- Things we missed...
And finally, the trouble spots...
- No metric will please customers, engineers, and marketing managers alike.
- TPC-D has failed to showcase multi-user decision support.
- No results yet on 10 GB or 30 GB.
- Decision support is moving faster than the TPC: OLAP, data marts, data mining, SQL3, ADTs, Universal {IBM, Informix, Oracle}.

Outline: The Future

TPC-D, Version 2: Overview
Goal: define a workload to "take over" from TPC-D 1.x in time with its lifecycle (~2 years from now). Two areas of focus:
- Address the known deficiencies of the 1.x specification: introduce data skew; require multi-user executions. (What number of streams is interesting? Should updates scale with users? With data volume?)
- Broaden the scope of the query set and data set: "snowstorm" schema; larger query set; batch and trickle update models.

An Extensible TPC Workload?
Make TPC-D extensible, with three types of extensions:
- Query: a new question on the same schema
- Schema: new representations and queries on the same data
- Data: new data types and operators
A simpler adoption model than a full specification: a mini-spec presented by three sponsors; an evaluation period for prototyping and refinement (draft status); acceptance as an extension; periodic review for renewal, removal, or promotion to the base workload.
The goal is an adaptive workload: more responsive to the market and more inclusive of new technology, without losing comparability or relevance.

Want to Learn More about TPC-D?
- TPC web site: www.tpc.org — the latest specification, tools, and results, plus the version 2 white paper.
- TPC-D Training Video: a six-hour video by the folks who wrote the spec, explaining all major aspects of the benchmark in detail.
- Available from the TPC: Shanley Public Relations, 777 N. First St., Suite 600, San Jose, CA 95112-6311; ph: (408) 295-8894; fax: (408) 295-9768; email: td@tpc.org