Benchmarking Cloud Serving Systems with YCSB
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears
Yahoo! Research
Presenter: Duncan
Benchmarking vs Testing
• Any difference?
• My opinion:
– Benchmarking: measures performance
– Testing: usability tests, security tests, performance tests, etc.
Motivation
• Many new cloud systems for data storage and management
– MongoDB, MySQL, AsterixDB, etc.
• Every design involves tradeoffs
– E.g., appending updates to a sequential disk log
• good for writes, bad for reads
– E.g., synchronous replication
• keeps copies up to date, but adds write latency
• How to choose?
– Use a benchmark that models your scenario!
What does it mean to evaluate performance?
• Latency
– Users don’t want to wait!
• Throughput
– Want to serve more requests!
• Inherent tradeoff between latency and throughput
– More requests => more resource contention => higher latency
Which system is better?
• “Typically application designers must decide on an acceptable latency, and provision enough servers to achieve the desired throughput”
• The better system is the one that achieves the desired latency and throughput with fewer servers.
– E.g., desired latency 0.1 sec at 100 requests/sec
– MongoDB: 10 servers
– AsterixDB: 15 servers
– => MongoDB is the better choice in this scenario
What else to evaluate?
• Cloud platforms add two more criteria:
• Scalability
– Good scalability => performance grows in proportion to the number of servers
• Elasticity
– Good elasticity => servers can be added to a running system with only a small disruption
A Short Summary
• Evaluating performance = evaluating latency, throughput, scalability, and elasticity
• A better system = fewer machines needed to achieve the same performance goal
YCSB
• Data generator
• Workload generator
• YCSB client
– The interface used to communicate with the DB
YCSB Data Generator
• A table with F fields and N records
• Each field => a random string
• E.g., 1,000-byte records: F = 10, 100 bytes per field
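A minimal sketch of what such a generator might look like, assuming nothing about YCSB's real code (the class and method names below are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Minimal sketch of a YCSB-style record generator (not YCSB's actual code).
// Builds one record with fieldCount fields, each a random ASCII string.
public class RecordSketch {
    private static final Random RNG = new Random();

    private static String randomString(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + RNG.nextInt(26))); // 1 byte per char in ASCII
        }
        return sb.toString();
    }

    // F = 10 fields x 100 bytes per field = a 1,000-byte record, as on the slide.
    static Map<String, String> makeRecord(int fieldCount, int fieldLength) {
        Map<String, String> record = new HashMap<>();
        for (int f = 0; f < fieldCount; f++) {
            record.put("field" + f, randomString(fieldLength));
        }
        return record;
    }

    public static void main(String[] args) {
        System.out.println(makeRecord(10, 100).keySet());
    }
}
```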
Workload Generator
• Basic operations
– Insert, update, read, scan
– No joins, aggregates, etc.
• Able to control the distribution of:
• Which operation to perform
– E.g., 0.95 read, 0.05 update, 0 scan => a read-heavy workload
• Which record to read or write
– Uniform: every record equally likely
– Zipfian: some records are extremely popular
– Latest: recently inserted records are more popular
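As a rough sketch of how these two choices combine, the snippet below draws an operation from the 0.95/0.05 read-heavy mix above and a key from a naive Zipfian distribution. This is not YCSB's actual code; the names are invented, and theta = 0.99 is an assumption (the constant commonly cited for YCSB's Zipfian generator):

```java
import java.util.Random;

// Sketch of the two workload choices: which operation, and which record.
public class WorkloadSketch {
    private static final Random RNG = new Random();

    // Operation mix for the read-heavy example: 0.95 read, 0.05 update, 0 scan.
    static String nextOperation() {
        return (RNG.nextDouble() < 0.95) ? "READ" : "UPDATE";
    }

    // Naive Zipfian key chooser: P(key i) ~ 1 / i^theta, so low-ranked keys
    // are "extremely popular". O(n) per draw; YCSB uses a faster algorithm.
    static int nextZipfianKey(int n, double theta) {
        double norm = 0;
        for (int i = 1; i <= n; i++) norm += 1.0 / Math.pow(i, theta);
        double r = RNG.nextDouble() * norm, sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += 1.0 / Math.pow(i, theta);
            if (sum >= r) return i - 1; // 0-based record index
        }
        return n - 1;
    }

    public static void main(String[] args) {
        System.out.println(nextOperation() + " on user" + nextZipfianKey(1000, 0.99));
    }
}
```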
YCSB Client
• A script
– Used to launch the benchmark run
• Workload parameter files
– Edit the parameters (operation mix, request distribution, record count, etc.) to define your own workload
• A Java program
• DB interface layer
– Implement this interface to plug in your own DB system
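The shape of that interface layer might look roughly like the sketch below. The real class is com.yahoo.ycsb.DB and its exact signatures differ across YCSB versions, so treat these methods as illustrative assumptions, not the real API:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative shape of a DB interface layer (not the real com.yahoo.ycsb.DB).
public abstract class SimpleDB {
    /** Read one record; fill result with the requested fields. Return 0 on success. */
    public abstract int read(String table, String key,
                             Set<String> fields, Map<String, String> result);

    /** Update the given fields of an existing record. */
    public abstract int update(String table, String key, Map<String, String> values);

    /** Insert a new record with the given field values. */
    public abstract int insert(String table, String key, Map<String, String> values);

    /** Scan recordCount records starting from startKey, appending them to result. */
    public abstract int scan(String table, String startKey, int recordCount,
                             Set<String> fields, List<Map<String, String>> result);
}
```

A concrete subclass would wrap the target system's client library (e.g., a MongoDB driver) behind these four calls, so the workload generator never needs to know which store it is driving.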
Experiments
• Experimental setup:
– 6 storage servers
– YCSB client running on a separate server
– Systems compared: Cassandra, HBase, MySQL, PNUTS
• Workloads: update-heavy, read-heavy, read-only, read-latest, and short-range-scan
Future Work
• Availability
– Impact of failures on system performance
• Replication
– Impact on performance as the replication factor increases
4 Criteria
• The authors’ 4 criteria for a good benchmark:
– Relevance to an application
– Portability
• Not just for one system!
– Scalability
• Not just for small systems and small data!
– Simplicity
References
• Benchmarking Cloud Serving Systems with YCSB, Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, Russell Sears, SOCC ’10
• BG: A Benchmark to Evaluate Interactive Social Networking Actions, Sumita Barahmand, Shahram Ghandeharizadeh, CIDR ’13
• http://en.wikipedia.org/wiki/Software_testing
• http://en.wikipedia.org/wiki/Benchmark_(computing)
• Thank You!
• Questions?
Why a new benchmark?
• Most cloud systems do not have a SQL interface => complex queries are hard to implement
• Existing benchmarks target only specific applications
– TPC-W for e-commerce
– TPC-C for apps that manage, sell, and distribute products/services