Uploaded by Weining Qian

PeakBench

advertisement
Chunxi Zhang, Yuming Li, Rong Zhang, Weining Qian, Aoying Zhou
School of Data Science and Engineering,
East China Normal University
{cxzhang, liyuming}@stu.ecnu.edu.cn
{rzhang,wnqian,ayzhou}@dase.ecnu.edu.cn
Motivation and Background
Architecture
Implementation
Results
2
Phenomenal workloads: Single’s Day Promotion
3
 12306 (train ticket buying during the Spring Festival period)
 Black Friday
 Banking systems to support various promotion and second-
kill apps.
 ...
 The key problem is transaction processing in backend
paying systems
4
 Transaction processing
 Guarantees ACID properties
 Scalability
 Peak workload >> standard workload in terms of
throughput (80x ~ 100x is common)
 Usually depends on scaling-out architecture in
stead of scaling-up architecture
5
Homepage: https://github.com/daseECNU/CEDAR
Email: cedar.tp@gmail.com
v0.5 with HA
2010
OceanBase
project
started
2012.11
v0.4
with SQL
(limited
transaction
support)
2015.3
v0.4.2
open
source
(GPLv2)
2016.1
CEDAR 0.1
v1 for Aliyun (cloud)
2016.9
2017.9
CEDAR 0.2 CEDAR 0.3
new
new
Transaction
Query
Engine
Engine
2018.9
CEDAR
Milestone
6
History repository
Class-A app.
Bank notes recording
Massive datasets
Supply-chain finance
Replacing IBM DB2
Netpay
Debt-Credit
Internet-scale
HA
7
Database systems
• DB2, Oracle, PostgreSQL, ...
TP Monitors
• CICS, Tuxedo, HP NonStop, ...
NewSQL systems
• H-Store, HANA, MS Hekaton, OceanBase
8
Benchmarks
Year
People/Organizations
Descriptions
DebitCredit
1985
David DeWitt et al.
TPC-C
1992
TPC
TPC-E
2007
TPC
TATP
2009
IBM Software Group
TP benchmark, Caller location system
SmallBank
2009
University of Sydney
TP benchmark, Bank application
YCSB/YCSB++
CH-benCHmark
Yahoo! Research/
2010/2011 Carnegie Mellon University,
National Security Agency
2011
Dagstuhl Germany
TP benchmark, Bank application
TP benchmark, Warehouse-centric
application
TP benchmark, stock brokerage
application
TP benchmark
Hybrid workload, TPC-C & TPC-H
9
 Architecture
 Scalable TP is not achieved only by the TP system
 Workloads
 Should be configurable and scalable
 Should be: Concurrency, Contention, Skewness, and
Dynamics
 Measurements
 Scalability is a new measure
10
 SecKill is an application where a large number of users snap up scarce
resources in a short period of time, such as "11.11" of Tmall, "ticket
booking" during China Spring Festival and "Stock Exchange" applications.
 Three phase of SecKill
 ReadPhase: Before SecKill start time point, all SecKill items can only be browsed
(Read) since customers are searching or keeping monitoring the interested ones.
It will be read-heavy.
 CriticalityPhase: It reaches read peak closing SecKill time. Almost all customers
start focusing on the status changing of items and continuously refreshing, when
SecKill time approaches.
 KillPhase: It turns almost all browsers (R) into buy (W) at the point of kill start. It
will be write-heavy and contention-intensive, especially for hot items.
11
Access
Generation
control
Control
Client Thread
DB data Parameters
-Item quantity
-Record size
-Table separation
- ...
DB Generator
Workload Parameters
-R/W quantity
-Workload distribution
-Workload contension
-Workload dynamics
- ...
Adapter
Workload
Generator
Workload
Executor
DB Layer
Performance
Evaluator
Configuration
Configuration
Generator
Generator
Storage
Storage
12
 Modeling of a SecKill
application
 Key: separation of Read
DB and Write DB
13
Generation Control for Extensibility
• Our data/workload generators can run on threads, supporting distributed deployment.
Configuration for Factuality
• Our allows to declare both the static and dynamic demands on data and workloads
characteristics.
Executor for Adaptability
• Data generator and workload generator are designed based on requirements assigned.
• We design new evaluation metrics to show system stability under dynamic environment.
Storage for Generality:
• Our tool can apply, run and test on different databases with general interfaces.
14
Queues of Read DBs
Queue
Queue
Info query
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item1
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
Queue
Control flow of user
Control flow of system
15
Queues of Read DBs
Queue
Queue
Info query
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
SecKill
Queue of item1
Hot
Queue of item2
Refresh
...
Set
Write DB
Cold
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
Queue
Control flow of user
Control flow of system
16
Queues of Read DBs
Queue
Queue
Info query
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item1
Queue of item2
Refresh
Set
...
Separation of Read/Write
Write DB
SecKill
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
Queue
Control flow of user
Control flow of system
17
Queues of Read DBs
Queue
DBMS
...
Queue
...
Read DBs
Queue
Info query
Scaling-out
...
DBMS
Queues for SecKill
DBMS
Queue of item1
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
Queue
Control flow of user
Control flow of system
18
Queues of Read DBs
Queue
...
Queue
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item1
Queue of item2
Refresh
...
Set
Write DB
Queue of itemk
...
Read DBs
Corelated transactions
Queue
Info query
SecKill
Cancel
Order query
Pay
DBMS
Queue of Write DB
Queue
Control flow of user
Control flow of system
19
Queues of Read DBs
Queue
Queue
Info query
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of item1
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
orderitem
oi_orderkey
oi_itemkey
oi_count
oi_price
...
int
<pk,fk1>
int
<pk,fk2>
int
decimal(10, 2)
FK_ORDERITE_REFERENCE_ITEM
FK_SECKILLP_REFERENCE_ITEM
FK_ORDERITE_REFERENCE_ORDERS
orders
o_orderkey
o_custkey
o_skpkey
o_price
o_orderdate
o_paydate
o_state
...
int
<pk>
int
<fk>
int
decimal(10, 2)
datetime
datetime
int
Queue
seckillplan
item
i_itemkey
i_suppkey
i_name
i_count
i_price
i_detail
i_type
i_popularity
...
int
<pk>
int
<fk>
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
sl_skpkey
sl_itemkey
sl_price
sl_starttime
sl_endtime
sl_plancount
sl_skpcount
sl_popularity
...
int
<pk>
int
<fk>
decimal(10, 2)
datetime
datetime
int
int
int
FK_SECKILLP_REFERENCE_SECKILLP
FK_ORDERS_REFERENCE_CUSTOMER
seckillpay
customer
c_custkey int
<pk>
c_name
varchar(50)
c_address varchar(100)
...
s_suppkey int
<pk>
s_name
varchar(50)
s_address varchar(100)
...
DB for Write
Control flow of system
item
FK_ITEM_REFERENCE_SUPPLIER
supplier
Control flow of user
sa_skpkey
sa_paycount
sa_starttime
sa_endtime
...
int
<pk,fk>
int
datetime
datetime
itemkey
suppkey
name
count
price
detail
type
popularity
skpkey
paycount
starttime
endtime
...
DB for Read
int
<pk>
int
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
int
int
datetime
datetime
20
Queues of Read DBs
Queue
Info query
Queue
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of item1
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
orderitem
oi_orderkey
oi_itemkey
oi_count
oi_price
...
int
<pk,fk1>
int
<pk,fk2>
int
decimal(10, 2)
FK_ORDERITE_REFERENCE_ITEM
FK_SECKILLP_REFERENCE_ITEM
FK_ORDERITE_REFERENCE_ORDERS
orders
o_orderkey
o_custkey
o_skpkey
o_price
o_orderdate
o_paydate
o_state
...
int
<pk>
int
<fk>
int
decimal(10, 2)
datetime
datetime
int
Queue
seckillplan
item
i_itemkey
i_suppkey
i_name
i_count
i_price
i_detail
i_type
i_popularity
...
int
<pk>
int
<fk>
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
sl_skpkey
sl_itemkey
sl_price
sl_starttime
sl_endtime
sl_plancount
sl_skpcount
sl_popularity
...
int
<pk>
int
<fk>
decimal(10, 2)
datetime
datetime
int
int
int
FK_SECKILLP_REFERENCE_SECKILLP
FK_ORDERS_REFERENCE_CUSTOMER
seckillpay
customer
c_custkey int
<pk>
c_name
varchar(50)
c_address varchar(100)
...
s_suppkey int
<pk>
s_name
varchar(50)
s_address varchar(100)
...
DB for Write
Control flow of system
item
FK_ITEM_REFERENCE_SUPPLIER
supplier
Control flow of user
sa_skpkey
sa_paycount
sa_starttime
sa_endtime
...
int
<pk,fk>
int
datetime
datetime
itemkey
suppkey
name
count
price
detail
type
popularity
skpkey
paycount
starttime
endtime
...
DB for Read
int
<pk>
int
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
int
int
datetime
datetime
21
Queues of Read DBs
Queue
Info query
Queue
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of item1
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
orderitem
oi_orderkey
oi_itemkey
oi_count
oi_price
...
int
<pk,fk1>
int
<pk,fk2>
int
decimal(10, 2)
FK_ORDERITE_REFERENCE_ITEM
FK_SECKILLP_REFERENCE_ITEM
FK_ORDERITE_REFERENCE_ORDERS
orders
o_orderkey
o_custkey
o_skpkey
o_price
o_orderdate
o_paydate
o_state
...
int
<pk>
int
<fk>
int
decimal(10, 2)
datetime
datetime
int
Queue
seckillplan
item
i_itemkey
i_suppkey
i_name
i_count
i_price
i_detail
i_type
i_popularity
...
int
<pk>
int
<fk>
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
sl_skpkey
sl_itemkey
sl_price
sl_starttime
sl_endtime
sl_plancount
sl_skpcount
sl_popularity
...
int
<pk>
int
<fk>
decimal(10, 2)
datetime
datetime
int
int
int
FK_SECKILLP_REFERENCE_SECKILLP
FK_ORDERS_REFERENCE_CUSTOMER
seckillpay
customer
c_custkey int
<pk>
c_name
varchar(50)
c_address varchar(100)
...
s_suppkey int
<pk>
s_name
varchar(50)
s_address varchar(100)
...
DB for Write
Control flow of system
item
FK_ITEM_REFERENCE_SUPPLIER
supplier
Control flow of user
sa_skpkey
sa_paycount
sa_starttime
sa_endtime
...
int
<pk,fk>
int
datetime
datetime
itemkey
suppkey
name
count
price
detail
type
popularity
skpkey
paycount
starttime
endtime
...
DB for Read
int
<pk>
int
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
int
int
datetime
datetime
22
Queues of Read DBs
Queue
Info query
Queue
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of item1
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
orderitem
oi_orderkey
oi_itemkey
oi_count
oi_price
...
int
<pk,fk1>
int
<pk,fk2>
int
decimal(10, 2)
FK_ORDERITE_REFERENCE_ITEM
FK_SECKILLP_REFERENCE_ITEM
FK_ORDERITE_REFERENCE_ORDERS
orders
o_orderkey
o_custkey
o_skpkey
o_price
o_orderdate
o_paydate
o_state
...
int
<pk>
int
<fk>
int
decimal(10, 2)
datetime
datetime
int
Queue
seckillplan
item
i_itemkey
i_suppkey
i_name
i_count
i_price
i_detail
i_type
i_popularity
...
int
<pk>
int
<fk>
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
sl_skpkey
sl_itemkey
sl_price
sl_starttime
sl_endtime
sl_plancount
sl_skpcount
sl_popularity
...
int
<pk>
int
<fk>
decimal(10, 2)
datetime
datetime
int
int
int
FK_SECKILLP_REFERENCE_SECKILLP
FK_ORDERS_REFERENCE_CUSTOMER
seckillpay
customer
c_custkey int
<pk>
c_name
varchar(50)
c_address varchar(100)
...
s_suppkey int
<pk>
s_name
varchar(50)
s_address varchar(100)
...
DB for Write
Control flow of system
item
FK_ITEM_REFERENCE_SUPPLIER
supplier
Control flow of user
sa_skpkey
sa_paycount
sa_starttime
sa_endtime
...
int
<pk,fk>
int
datetime
datetime
itemkey
suppkey
name
count
price
detail
type
popularity
skpkey
paycount
starttime
endtime
...
DB for Read
int
<pk>
int
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
int
int
datetime
datetime
23
Queues of Read DBs
Queue
Info query
Queue
...
Queue
...
Read DBs
Queues for SecKill
DBMS
...
DBMS
DBMS
Queue of item2
Refresh
...
Set
Write DB
SecKill
Queue of item1
Queue of itemk
Cancel
Order query
Pay
DBMS
Queue of Write DB
orderitem
oi_orderkey
oi_itemkey
oi_count
oi_price
...
int
<pk,fk1>
int
<pk,fk2>
int
decimal(10, 2)
FK_ORDERITE_REFERENCE_ITEM
FK_SECKILLP_REFERENCE_ITEM
FK_ORDERITE_REFERENCE_ORDERS
orders
o_orderkey
o_custkey
o_skpkey
o_price
o_orderdate
o_paydate
o_state
...
int
<pk>
int
<fk>
int
decimal(10, 2)
datetime
datetime
int
Queue
seckillplan
item
i_itemkey
i_suppkey
i_name
i_count
i_price
i_detail
i_type
i_popularity
...
int
<pk>
int
<fk>
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
sl_skpkey
sl_itemkey
sl_price
sl_starttime
sl_endtime
sl_plancount
sl_skpcount
sl_popularity
...
int
<pk>
int
<fk>
decimal(10, 2)
datetime
datetime
int
int
int
FK_SECKILLP_REFERENCE_SECKILLP
FK_ORDERS_REFERENCE_CUSTOMER
seckillpay
customer
c_custkey int
<pk>
c_name
varchar(50)
c_address varchar(100)
...
s_suppkey int
<pk>
s_name
varchar(50)
s_address varchar(100)
...
DB for Write
Control flow of system
item
FK_ITEM_REFERENCE_SUPPLIER
supplier
Control flow of user
sa_skpkey
sa_paycount
sa_starttime
sa_endtime
...
int
<pk,fk>
int
datetime
datetime
itemkey
suppkey
name
count
price
detail
type
popularity
skpkey
paycount
starttime
endtime
...
DB for Read
int
<pk>
int
varchar(30)
int
decimal(10, 2)
varchar(1000)
int
int
int
int
datetime
datetime
24
Non-SecKill Tx.
SecKill Tx.
 Submit a transaction
 Submit a transaction
 Payment
 Payment
 Cancel a transaction
 Query a transaction
 Sync. the ReadDB with the
WriteDB
 Cancel a transaction
 Query a transaction
 Expiring a SecKill transaction
 Sync. the ReadDB with the
WriteDB
 Clean the SecKill DBs
25
 Read Workload
 Customers can search based on item id (Q1), item categories
(Q2), keywords (Q3), range of prices (Q4) or shops for
viewing the detailed item information (Q5).
 Q6 is applied only on R/W separation architecture model for
synchronizing between RDB and WDB guaranteeing
consistency.
 We create the following secondary index table on the Item (or
RItem) table to improve query efficiency and optimize query
response delay.
26
 Update workload
 Submit Order Transaction
 Pay Order Transaction
 Cancel Order Transaction
 Select Order Transaction
 Overdue Order Transaction
 Our workloads are all abstracted from Alibaba's
actual seconds kill business.
27
 Set 20% items in the Table Item_ID as hot items, while other 80%
ones are common items.
 80% workloads on those 20% hot items are hot ones.
 Workloads satisfies Zipfian distribution.
 Warm the ReadDB 5min before the SecKill. Throw the peak workload
to the ReadDB when SecKill begins, and then decay it (update every
5sec).
 The WriteDB is only accessed after the SecKill begins. The
distribution of workload is similar to that for the ReadDB.
28
QPS
TPS
Stability
29
mysqld CPU
450%-550%
400%-500%
300%-400%
每次更新数据量
0
500
1000
1500
2000
2500
3000
5000
10000
mysqld CPU
400% - 500%
400% - 500%
500% - 600%
mysqld CPU
100% - 200%
250% - 350%
300% - 400%
300% - 400%
300% - 400%
300% - 400%
300% - 400%
300% - 400%
300% - 400%
Update time
207ms
198ms
200ms
QPS
4135
4133
4128
4118
4124
4117
4113
4100
avg time2
6222 us
6324 us
6485us
7155 us
avg time1
3252us
3150us
3216us
Update time
110ms
200ms
291ms
387ms
479ms
573ms
954ms
Update time
198ms
199ms
187ms
QPS
2105
3448
4071
4132
4139
4081
4149
4140
4146
avg time1
3216us
3229 us
2916us
avg time1
3213 us
3213us
3216us
3229 us
3238 us
3243us
3258 us
3252us
已无法完成数据更新
avg time2
6485us
6405 us
5530us
avg time1
388us
635us
1509us
3210us
4574us
5460us
6739us
8089us
9480us
avg time5
5926 us
5988 us
6022us
6010 us
avg time2
6498us
6556us
6485us
avg time5
6049us
6087us
6022us
avg time2
6496 us
6475us
6485us
6489 us
6473 us
6484us
6493 us
6489us
avg time5
6044us
6026us
6022us
6028 us
6033 us
6050us
6048 us
6060us
avg time3
avg time4
55033us
10571 us
9852us
avg time2
1522us
2241us
3287us
6462us
9953us
18565us
35342us
85098us
121549us
avg time5
6022us
5967 us
5165us
avg time5
1417us
2073us
3095us
6054us
9246us
16978us
31965us
76409us
108623us
5
读负载线程数
2
5
10
20
30
50
100
200
300
QPS
4128
4115
4184
QPS
4103
4115
4128
avg time1
3218 us
3214 us
3216us
3321 us
4
读负载类型
三种
四种
五种
mysqld CPU
300%-400%
300%-400%
400% - 500%
400% - 500%
400% - 500%
400% - 500%
400% - 500%
550% - 650%
Update time
196 ms
195 ms
200ms
196 ms
3
Some results
更新周期
0.5s
1s
2s
QPS
4222
4216
4128
4189
2
mysqld CPU
300%-400%
300%-400%
300%-400%
300%-400%
1
item表规模
200万
500万
1000万
2000万
30
 Clusters with 8 nodes configured in RAID-5 on CentOS v.6.5,
which are equipped with 2 Intel Xeon E5-2620 @ 2.13 GHz
CPUs, 130GB memory and 3 TB HDD disk.
 Database: MySQL14.14、PostgreSQL 9.5.1 and VoltDB 7.8.2.
 Configurations of PeakBench
31
MySql
PostgreSql
VoltDB
10000
100000
8000
2200
90000
2000
80000
1800
TPS
6000
TPS
TPS
MySql
PostgreSql
VoltDB(3)
VoltDB(20)
70000
MySql
PostgreSql
VoltDB
1600
4000
60000
1400
2000
50000
1200
40000
0
0
10
20
30
40
50
60
70
80
1000
90
0
Warehouse count
100
200
300
400
500
600
700
800
0
900
200
400
TPS@TPCC
TPS@YCSB
MySql
PostgreSql
VoltDB
120000
1600
60000
40000
MySql
PostgreSql
VoltDB
35000
Latency (us)
1800
80000
1000
45000
140000
100000
800
TPS@Smallbank
MySql
PostgreSql
VoltDB
VoltDB(20)
2000
Latency (us)
Latency (us)
160000
600
Record size
Table size (10000)
1400
1200
30000
25000
1000
20000
800
40000
600
15000
20000
400
0
10
20
30
40
50
60
70
80
Warehouse count
Latency@TPCC
90
0
100
200
300
400
500
600
700
800
Table size (10000)
Latency@YCSB
900
0
200
400
600
800
1000
Record size
Latency@Smallbank
32
Contention Rate
Contention Intensity
MySql
PostgreSql
VoltDB
22000
20000
25
18000
MySql
PostgreSql
VoltDB
20
16000
TPS (103)
TPS
14000
12000
10000
15
10
8000
6000
5
4000
2000
0
0
2
4
6
8
0
10
15
20
25
CI
MySql
PostgreSql
VoltDB
50
35000
30000
40
Latency (ms)
Latency (us)
5
MySql
PostgreSql
VoltDB
CR (%)
40000
10
25000
20000
15000
10000
30
20
10
5000
0
2
4
6
CR (%)
8
10
0
0
5
10
15
CI
20
25
33
34
35
36
MySql
PostgreSql
VoltDB
25000
MySql
PostgreSql
VoltDB
20000
18000
16000
14000
Latency (us)
TPS
20000
15000
10000
12000
10000
8000
6000
4000
5000
2000
0
No optimization
cache
0
cashe+hash
No optimization
Category
200000
180000
160000
140000
cache
cashe+hash
Category
MySql TPS
PostgreSql TPS
VoltDB TPS
MySql write TPS
PostgreSql write TPS
VoltDB write TPS
MySql
PostgreSql
VoltDB
40000
35000
30000
Latency
TPS
120000
100000
80000
25000
20000
15000
60000
40000
10000
20000
5000
0
0
one database
one machine of R/W two machine of R/W
Category
one database
one machine of R/W two machine of R/W
Category
37
 Scalable TP benchmark is needed for various new
TP applications
 Not only for single systems, but also for
applications/architectures
 We present a benchmark proposal for such
applications
 https://github.com/daseECNU/DB-Testing
38
39
Download