The Making of TPC-DS
Meikel Poess, Oracle Corporation
Raghunath Othayoth Nambiar, Hewlett-Packard Company

Agenda
1. Industry standard benchmark development
2. Limitations of TPC-H
3. Key elements of TPC-DS
4. Current state of the specification
5. Q&A

August 31, 2006, 32nd International Conference on Very Large Data Bases

Benchmark Categories
• Industry standard benchmarks
− Transaction Processing Performance Council (TPC)
− Standard Performance Evaluation Corporation (SPEC)
• Application benchmarks
− SAP, Oracle Apps, JD Edwards, Exchange, Domino
• Special purpose benchmarks
− Dhrystone, Whetstone, Linpack, Iozone, Netperf, Stream

Industry Standard Benchmarks
− Broad industry representation (all decisions taken by the board)
− Verifiable (audit process)
− Domain-specific standard tests
− Resolution of disputes and challenges
• TPC benchmarks
− TPC-C (OLTP), TPC-E (new OLTP)
− TPC-H (DSS), TPC-DS (new DSS)
− TPC-App (dynamic web)
• SPEC benchmarks
− SPEC CPU – integer and floating point
− SPEC SFS – system file server
− SPECweb – web server
− SPECpower – power consumption (new)

Why Benchmarks Are Important
• Vendor point of view
− Define the playing field (measurable, repeatable)
− Enable competitive analysis
− Monitor release-to-release progress
− Results understood by engineering, sales and customers
− Accelerate focused technology development
• Customer point of view
− Cross-vendor comparisons (performance, TCO)
− Evaluate new technologies
− Eliminate costly in-house characterization

Tracking Release-to-Release Progress, Example
SPEC CPU2000 benchmark results on HP ProLiant DL380, 2002 to date.
(Chart: SPECint®_rate2000 and SPECfp®_rate2000 scores rising steadily from the DL380 G2 (01/02) through the DL380 G3 (07/03) and DL380 G4 (08/04, 09/05) to the DL380 G4 dual-core (10/05) and DL380 G5 dual-core (07/06).)
All SPEC® CPU2000 benchmark results stated above reflect results published as of July 25, 2006. For the latest SPEC® CPU2000 benchmark results, visit www.spec.org/cpu2000/.

Tracking Product-Line Progress, Example
TPC-C benchmark results on HP ProLiant servers over 10 years.
(Chart: twelve publications from Oct-95 to Oct-05. Throughput rises from a few thousand tpmC on the early ProLiant 4500/5500/7000 systems to 202,551 tpmC on the ProLiant DL585 dual-core, while price/performance drops from $241.64/tpmC to under $3/tpmC.)

Competitive Analysis, Example
Top ten 3000 GB TPC-H results by performance, as of 12-Sept-2006.
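The two price/performance views above (the TPC-C $/tpmC trend and the TPC-H rankings) rest on the same simple metric shape: total system price divided by measured performance. A minimal sketch of the TPC-C case; the dollar figure below is illustrative only, not taken from an actual disclosure:

```python
# Minimal sketch of TPC-C's price/performance metric: total priced
# configuration cost divided by throughput in tpmC.
def price_per_tpmc(total_system_price_usd: float, tpmc: float) -> float:
    """Return the TPC-C price/performance metric in $/tpmC."""
    return total_system_price_usd / tpmc

# A hypothetical ~$500,000 configuration at the chart's peak of 202,551 tpmC:
print(round(price_per_tpmc(500_000, 202_551), 2))  # 2.47
```

This is why, over the decade shown, $/tpmC can fall two orders of magnitude even as absolute system prices stay in the same range: the denominator grew that much faster.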
Transaction Processing Performance Council (TPC)

What Makes the TPC Unique
• TPC is the only benchmark organization that requires price-performance scores across all of its benchmarks
• All tests require full documentation of the components and applications under test, so that the test can be replicated
• The TPC requires an independent audit of results prior to publication
• TPC tests whole-system performance, not just a piece
• TPC is database agnostic: Oracle, IBM DB2, Sybase, Microsoft SQL Server, NonStop SQL/MX and other databases
• TPC provides cross-platform performance comparisons, a view of processor versus real performance, technology comparisons and actual cost-of-performance comparisons

TPC Business Model
• TPC follows the philosophy of real-world benchmarks, so that its customers can:
− relate their business to the benchmark business model
− relate their workload to the workload of the benchmark
− understand the benchmark

TPC Members
• 20 member companies
• 4 associate members

TPC Organization
− Subcommittees
• OLTP
• DSS
• Web
• Pricing
− Technical Advisory Board
− Steering Committee
• Benchmark development/maintenance
− 6 face-to-face meetings per year
− Weekly conference calls

New Benchmark Development

Industry Standard Benchmark Development Activities
• Development of a new benchmark in a new domain
• Refinement of existing benchmarks
• New benchmark in an existing domain
− Good benchmarks drive industry and technology forward
− At some point, all reasonable advances have been made
− Benchmarks can become counterproductive by encouraging artificial optimizations
− So, even good benchmarks become obsolete over time
− As technology and user environments evolve, so should the benchmark

Benchmark Lifecycle
new idea → Requirements → (feasibility) forms Subcommittee → (debate) defines Draft Spec → (feasibility) implements Prototypes → (evaluation) publishes Benchmark Spec → (refinement) vendors publish benchmark results → benchmark becomes obsolete and results in new requirements

Industry Standard Benchmark Development, Challenges
• Development cycle
− Benchmark development can take years
• Technology and business could change significantly
• Members have their own agendas
− Hardware vs. software
− Scale-out vs. scale-up
− Proprietary vs. industry standard

New Industry Standard Benchmarks, Challenges
• Unknowns
− How does my product perform under a new load and metric?
• Risk factors
− There are failed benchmarks
• Investments
− TPC benchmark publications are expensive
− Vendors want to keep their existing publications
− Need to train engineers, sales and marketing, customers

TPC-H
• Measures generally applicable aspects of a decision support system
• Its basic ideas have been a standard since 1994
• Currently about 100 results from 14 vendors (system/database) on the website
• Has served industry and academia very well
• Shortcomings in:
− Data model
− Workload model
− Metric

Data Model Shortcomings
• Database schema
− 3rd normal form
− 8 tables
− On average 10 columns per table
− Commonly used database technologies are restricted
• Dataset
− Uniform data distributions
− Synthetic data
− No null values
− Linear scaling with scale factor for almost all tables
− Unrealistic table sizing
• At scale factor 100,000: 20 billion parts sold to 15 billion customers at a rate of 150 billion orders a year

Workload Model Shortcomings
• Query workload
− 22 SQL-92 queries
− Simple structure
− Only ad-hoc queries
• Update workload
− Simple insert and delete operations
− No data transformations
− Only two tables are maintained
− Random inserts and deletes based on non-contiguous keys in the dataset

Metric Shortcomings
• Primary performance metric
− Very complex
− Mix of geometric mean and arithmetic mean

QphH = sqrt( Power × Throughput ), where
Power      = (3600 × SF) / ( ∏_{i=1}^{22} Q_i × ∏_{j=1}^{2} RF_j )^(1/24)
Throughput = (S × 22 × 3600 × SF) / max(T_1 … T_S)

Objectives for TPC-DS
• Realistic data model
• Complex workload
− Large query set
− ETL-like update model
• Simple and comprehensible metric
• Understandable business model

Data Model

Fact Tables
• 3 sales channels: Catalog – Web – Store
• 7 fact tables: Store Sales, Store Returns, Catalog Sales, Catalog Returns, Web Sales, Web Returns, Inventory
• 2 fact tables for each sales channel
• 24 tables total
• Basic auxiliary data structures are allowed on all tables
• Complex auxiliary data structures are only allowed on Catalog Sales and Catalog Returns

Snowflake Schema: Store Channel with Dimensions
(Diagram, shown across three slides: the Store_Sales fact table joined to the dimensions Date_Dim, Time_Dim, Store, Item, Promotion, Customer, Customer_Address, Customer_Demographics, Household_Demographics and Income_Band.)

Data Model Advantages
• Complex relationships → complex queries
• Fact-to-fact table relations
• Large number of tables (24)
• Large number of columns (18 on average)
• Auxiliary data structures are allowed on a subset of the schema → star and "traditional" executions → satisfies hardware and software vendors
• Extends the lifetime of the benchmark due to complexity

Dataset

Database Scaling
• Database size is defined in scale factors
• The scale factor indicates the raw data size in GB
• Auxiliary data structures and temporary storage are not included

Scale Factor | Database Size
           1 | 1 GB
         100 | 100 GB
         300 | 300 GB
        1000 | 1 TB
        3000 | 3 TB
       10000 | 10 TB
       30000 | 30 TB
      100000 | 100 TB

Fact Table Scaling
• Fact tables scale linearly with the scale factor
(Chart: row counts for Store Sales, Catalog Sales, Web Sales and Inventory growing linearly from roughly 10^6 rows at scale factor 1 to roughly 10^12 rows at scale factor 100,000.)

Database Scaling (Dimensions)
• Dimension tables scale sub-linearly
• Account for only a fraction of the fact table sizes
(Chart: row counts for the Date, Item and Stores tables growing sub-linearly with scale factor.)

Table Sizes at SF 100 (100 GB)

Table           | #Rows        | Percent of Total
Store Sales     | 288 million  | 39
Store Returns   | 28.8 million | 3.4
Catalog Sales   | 144 million  | 30
Catalog Returns | 14.4 million | 2.4
Web Sales       | 72 million   | 15
Web Returns     | 7.2 million  | 1
Inventory       | 390 million  | 9
Customer        | 2 million    | 0.5
Item            | 100,000      | 0.1
Catalog Page    | 24,000       | 0.002
Remaining       | 3.3 million  | 0.005

Data Content
• Some data has "real world" content:
− Last names: "Sanchez", "Ward", "Roberts"
− Addresses: "630 Railroad, Woodbine, Sullivan County, MO-64253"
• Data is skewed
− Sales are modeled after US census data
− More green items than red
− Small and large cities

Sales Distribution
(Chart: distribution of store sales over the months of the year. 14% of all sales happen between January and July, 28% between August and October, and 58% in November and December.)

Color Distribution
(Chart: number of occurrences per item color, in three groups: 8% of all colors are in group 1, 24% in group 2 and 68% in group 3.)

Dataset Advantages
• Realistic table scaling
• Real-world data content
• Non-uniform distributions, challenging for:
− statistics collection
− the query optimizer

Query Model
• Queries are designed to be realistic.
They:
− Answer real-world questions
− Cover the system functionality relevant to decision support applications
− Only allow tuning methods available to a DBA
− Cover all data, so that unrealistic creation of auxiliary data structures is unlikely
− Yet impose a controlled and repeatable workload

Query Templates
• TPC-DS requires a large query set
− E.g. a 100 TB benchmark runs 1089 queries
• Queries are written in a query template language
• Each query template is unique
• Queries are automatically generated from the query templates
• More information about the query generator: Meikel Poess, John M. Stephens: Generating Thousand Benchmark Queries in Seconds. VLDB 2004: 1045-1053

Query Model
• Query language: SQL99 + OLAP extensions
• Queries must be executed "as is"
− No hints or rewrites allowed, except when approved by the TPC
• 99 different query templates
• 4 different query types:

Type        | Simulates                                  | Implemented via                                        | Templates
Reporting   | Finely tuned, reoccurring queries          | Access catalog sales channel tables                    | 38
Ad-hoc      | Sporadic queries, minimal tuning           | Access store and web sales channel tables              | 47
Iterative   | Users issuing sequences of queries         | Sequence of queries where each query adds SQL elements | 4
Data mining | Queries feeding data mining tools          | Return a large number of rows                          | 10

Ad Hoc Query
select i_item_id, s_state, grouping(s_state) g_state,
       avg(ss_quantity) agg1, avg(ss_list_price) agg2,
       avg(ss_coupon_amt) agg3, avg(ss_sales_price) agg4
from store_sales, customer_demographics, date_dim, store, item
where ss_sold_date_sk = d_date_sk
  and ss_item_sk = i_item_sk
  and ss_store_sk = s_store_sk
  and ss_cdemo_sk = cd_demo_sk
  and cd_gender = '[GEN]'
  and cd_marital_status = '[MS]'
  and cd_education_status = '[ES]'
  and d_year = [YEAR]
  and s_state in ('[STATE_A]','[STATE_B]','[STATE_C]',
                  '[STATE_D]','[STATE_E]','[STATE_F]')
group by rollup (i_item_id, s_state);

Reporting Query
select count(distinct cs_order_number) as "order count",
       sum(cs_ext_ship_cost) as "total shipping cost",
       sum(cs_net_profit) as "total net profit"
from catalog_sales cs1, date_dim, customer_address, call_center
where d_date between '[YEAR]-[MONTH]-01'
      and (cast('[YEAR]-[MONTH]-01' as date) + 60)
  and cs1.cs_ship_date_sk = d_date_sk
  and cs1.cs_ship_addr_sk = ca_address_sk
  and ca_state = '[STATE]'
  and cs1.cs_call_center_sk = cc_call_center_sk
  and cc_county in ('[COUNTY_A]','[COUNTY_B]','[COUNTY_C]',
                    '[COUNTY_D]','[COUNTY_E]')
  and exists (select * from catalog_sales cs2
              where cs1.cs_order_number = cs2.cs_order_number
                and cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk)
  and not exists (select * from catalog_returns cr1
                  where cs1.cs_order_number = cr1.cr_order_number);

Iterative Query, Part 1
with frequent_ss_items as
 (select substr(i_item_desc,1,30) itemdesc, i_item_sk item_sk,
         d_date solddate, count(*) cnt
  from store_sales, date_dim, item
  where ss_sold_date_sk = d_date_sk
    and ss_item_sk = i_item_sk
    and d_year in ([YEAR],[YEAR]+1,[YEAR]+2,[YEAR]+3)
  group by substr(i_item_desc,1,30), i_item_sk, d_date
  having count(*) > 4),
max_store_sales as
 (select max(csales) cmax
  from (select c_customer_sk, sum(ss_quantity*ss_sales_price) csales
        from store_sales, customer, date_dim
        where ss_customer_sk = c_customer_sk
          and ss_sold_date_sk = d_date_sk
          and d_year in ([YEAR],[YEAR]+1,[YEAR]+2,[YEAR]+3)
        group by c_customer_sk) x),
best_ss_customer as
 (select c_customer_sk, sum(ss_quantity*ss_sales_price) ssales
  from store_sales, customer
  where ss_customer_sk = c_customer_sk
  group by c_customer_sk
  having sum(ss_quantity*ss_sales_price) >
         0.95 * (select * from max_store_sales))
select sum(sales)
from ((select cs_quantity*cs_list_price sales
       from catalog_sales, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and cs_sold_date_sk = d_date_sk
         and cs_item_sk in (select item_sk from frequent_ss_items)
         and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer))
      union all
      (select ws_quantity*ws_list_price sales
       from web_sales, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and ws_sold_date_sk = d_date_sk
         and ws_item_sk in (select item_sk from frequent_ss_items)
         and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer))) y;

Iterative Query, Part 2
select c_last_name, c_first_name, sales
from ((select c_last_name, c_first_name,
              sum(cs_quantity*cs_list_price) sales
       from catalog_sales, customer, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and cs_sold_date_sk = d_date_sk
         and cs_item_sk in (select item_sk from frequent_ss_items)
         and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer)
         and cs_bill_customer_sk = c_customer_sk
       group by c_last_name, c_first_name)
      union all
      (select c_last_name, c_first_name,
              sum(ws_quantity*ws_list_price) sales
       from web_sales, customer, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and ws_sold_date_sk = d_date_sk
         and ws_item_sk in (select item_sk from frequent_ss_items)
         and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer)
         and ws_bill_customer_sk = c_customer_sk
       group by c_last_name, c_first_name)) y;

Iterative Query, Part 3
select c_last_name, c_first_name, sales
from ((select c_last_name, c_first_name,
              sum(cs_quantity*cs_list_price) sales
       from catalog_sales, customer, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and cs_sold_date_sk = d_date_sk
         and cs_item_sk in (select item_sk from frequent_ss_items)
         and cs_bill_customer_sk in (select c_customer_sk from best_ss_customer)
         and cs_bill_customer_sk = c_customer_sk
         and cs_bill_customer_sk = cs_ship_customer_sk
       group by c_last_name, c_first_name)
      union all
      (select c_last_name, c_first_name,
              sum(ws_quantity*ws_list_price) sales
       from web_sales, customer, date_dim
       where d_year = [YEAR] and d_moy = [MONTH]
         and ws_sold_date_sk = d_date_sk
         and ws_item_sk in (select item_sk from frequent_ss_items)
         and ws_bill_customer_sk in (select c_customer_sk from best_ss_customer)
         and ws_bill_customer_sk = c_customer_sk
         and ws_bill_customer_sk = ws_ship_customer_sk
       group by c_last_name, c_first_name)) y;

Query Model Advantages
• SQL99 + OLAP extensions
• Query templates allow for the generation of thousands of different queries
• Combines different query classes:
− Ad-hoc
− Reporting
− Iterative
− Data mining
• Star-schema and "traditional" query execution

Execution Rules

Benchmark Execution
Phases: System Setup → Database Setup → Database Load → Query Run #1 → Data Maintenance → Query Run #2 (the setup phases are un-timed; the load and subsequent phases enter the metric)
• System Setup: servers/operating system, storage arrays including RAID, networks, database software
• Database Setup: creation of system tables, table spaces, file groups, log files
• Database Load: load the raw data into the fact tables, create auxiliary data structures, gather statistics; flat files (optional)
• Query Runs: n streams run concurrently, each executing the 99 queries with different random substitutions; simulates n concurrent users
• Data Maintenance: insert into and delete from the fact tables; maintain slowly changing dimensions

Minimum Number of Query Streams
Scale Factor | Number of Streams
           1 | n/a
         100 | 3
         300 | 5
        1000 | 7
        3000 | 9
       10000 | 11
       30000 | 13
      100000 | 15

Each stream runs all 99 queries in its own order, e.g.:
Stream 1: Q3, Q21, Q11, Q8, …, Q47, Q99
Stream 2: Q1, Q55, Q4, Q14, Q9, …, Q12, Q3
Stream n: Q94, Q3, Q1, Q8, Q84, …, Q34, Q23
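The stream permutations above can be sketched as a seeded shuffle per stream, which yields repeatable but distinct orderings. This is only an illustration of the idea: the actual per-stream orderings are fixed by the TPC-DS tools, not by a shuffle like this.

```python
import random

# Sketch: each query stream executes all 99 templates in a different,
# reproducible order. Seeding by stream id makes the order deterministic.
def stream_order(stream_id: int, n_templates: int = 99) -> list:
    """Return an illustrative query-template ordering for one stream."""
    order = list(range(1, n_templates + 1))
    random.Random(stream_id).shuffle(order)  # deterministic per stream
    return order

orders = [stream_order(s) for s in range(1, 4)]
assert all(sorted(o) == list(range(1, 100)) for o in orders)  # every stream runs all 99
assert orders[0] != orders[1]  # but in different orders
```

The point the slide makes survives the simplification: every stream covers the full template set, while the differing orders prevent trivial result caching across streams.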
Benchmark Execution (continued)
• Data Maintenance dataflow: flat files are read, transformed and loaded into the data warehouse tables within the DBMS
• Query Run #2: the n query streams are executed again

Database Load
• Simulates a data warehouse reload
• Measures the system's ability to:
− Load data
− Create auxiliary data structures
− Gather statistics
• Is part of the metric because:
− Data warehouses get recreated
− It prevents the use of unrealistic auxiliary data structures

Execution Query Run #1
• Simulates the execution of queries by multiple concurrent users
• Measures the system's ability to:
− Process concurrent query executions in the least amount of time
− Allocate resources efficiently among multiple concurrent users
• A minimum number of streams is required
• The maximum number of streams is not limited

Data Maintenance
• Simulates incremental raw data feeds from an OLTP system
• Includes the cost of maintaining auxiliary data structures
• The amount of data loaded is linear in the number of streams → guarantees the significance of data maintenance
• The approach is database centric → no ETL tools

Execution Query Run #2
• Rerun of Query Run #1
• Measures the system's ability to repeat the results of Query Run #1 after Data Maintenance

Metric

Primary Metrics
• Three primary metrics:
− Queries per hour
− Price per query
− System availability
• Queries per hour:

QphDS = (S × 198 × 3600 × SF) / (S × 0.01 × T_LOAD + T_Q1 + T_DM + T_Q2)

− S: number of query streams
− SF: scale factor
− T_Q1 and T_Q2: elapsed times to complete Query Run #1 and #2
− T_DM: elapsed time to complete the data maintenance
− T_LOAD: total elapsed time to complete the database load

Metric Explanation
• Numerator:
− S × 198 normalizes the result to queries (99 queries × 2 runs per stream)
− 3600 normalizes the result to hours
− SF normalizes the result to the scale factor
• Denominator:
− 0.01 costs the load at 1%
− S avoids diminishing the costing of the load as the stream count grows

Current Status of the TPC-DS Spec

Status TPC-DS
(Lifecycle diagram, as on the Benchmark Lifecycle slide: new idea → Requirements → Subcommittee → Draft Spec → Prototypes → Benchmark Spec → vendors publish benchmark results → benchmark becomes obsolete, resulting in new requirements.)

More Information
• Specification: http://www.tpc.org/tpcds/default.asp
• Benchmark tools (will be available on the website soon):
− Dbgen
− Qgen
− Query templates

Q&A
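As a closing note on the metric slides above, the QphDS arithmetic is simple enough to sketch directly. The timing figures in the example are invented purely for illustration:

```python
# Sketch of the draft TPC-DS primary metric from the slides:
#   QphDS = (S * 198 * 3600 * SF) / (S * 0.01 * T_LOAD + T_Q1 + T_DM + T_Q2)
# All times are in seconds; 198 = 99 query templates * 2 query runs.
def qph_ds(s, sf, t_load, t_q1, t_dm, t_q2):
    """Queries per hour, normalized to the scale factor SF."""
    return (s * 198 * 3600 * sf) / (s * 0.01 * t_load + t_q1 + t_dm + t_q2)

# Illustrative (invented) numbers: 3 streams at SF 100, a 10-hour load,
# two ~5.5-hour query runs and a ~1-hour data maintenance phase.
print(round(qph_ds(s=3, sf=100, t_load=36_000, t_q1=20_000, t_dm=4_000, t_q2=20_000)))  # 4744
```

Note how the load term behaves as the slides explain: it enters at only 1% of its elapsed time, but is scaled by S so that adding streams cannot dilute its share of the denominator.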