slides - PDCC - Nanyang Technological University

When Data Management Systems Meet Approximate
Hardware: Challenges and Opportunities
Author: Bingsheng He (Nanyang
Technological University, Singapore)
Speaker: Jiong He (Nanyang
Technological University, Singapore)
1
What is Approximate Hardware?
• Approximate hardware can trade off the
accuracy of results for increased performance,
reduced energy consumption, or both.
• Computer architecture researchers have
proposed approximate designs for various
hardware components, such as CPUs, main
memory and storage.
2
What is Approximate Hardware?
Precise CPU vs. Approximate CPU
• A multiplier using 8-bit mantissas consumes over 78%
less energy than a full 24-bit multiplier.
Precise SSD vs. Approximate SSD
• Approximate SSD improves write latencies by 1.7× on
average by trading off less than 10% of accuracy.
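The reduced-mantissa multiplier can be imitated in software to get a feel for the accuracy loss. The sketch below is only an illustration, not the hardware design cited on the slide: it assumes IEEE 754 single precision and assumes that "8-bit mantissa" means keeping the top 8 explicit mantissa bits. The function names `truncate_mantissa` and `approx_multiply` are hypothetical.

```python
import struct


def truncate_mantissa(x: float, kept_bits: int = 8) -> float:
    """Convert x to float32 and zero all but the top `kept_bits` mantissa bits."""
    if not 0 <= kept_bits <= 23:
        raise ValueError("float32 stores 23 explicit mantissa bits")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Keep sign, exponent and the top mantissa bits; clear the rest.
    mask = 0xFFFFFFFF ^ ((1 << (23 - kept_bits)) - 1)
    (approx,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return approx


def approx_multiply(a: float, b: float, kept_bits: int = 8) -> float:
    """Software stand-in (assumption) for a multiplier with short mantissas."""
    product = truncate_mantissa(a, kept_bits) * truncate_mantissa(b, kept_bits)
    return truncate_mantissa(product, kept_bits)


exact = 3.14159 * 2.71828
approx = approx_multiply(3.14159, 2.71828)
relative_error = abs(approx - exact) / exact
```

Each truncation loses less than 2^-8 in relative terms, so the relative error of one multiplication stays around one percent or below in this model.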
3
A Radical Vision: Data Management on
Approximate Hardware
• ApproxiDB is a radical data management system
whose design, implementation and optimization
are aware of approximate hardware.
– ApproxiDB will run on a hybrid machine consisting of
both approximate hardware and precise hardware.
– Enabling techniques: approximate query processing
and precise query processing.
– Several examples demonstrate the effectiveness of
ApproxiDB in performance and energy consumption.
4
Outline
• Motivations
• Our proposal: ApproxiDB
• Open problems
• Summary
5
Hardware Evolution Drives Database
Architectural Innovations
Disk-based databases
• Open-source: PostgreSQL, MySQL
• Most commercial databases
• Taught in current textbooks
Main memory databases
• Built on large-sized main memory
• More efficient in data accessing
and algorithm optimization
• Hot research topic
6
Hardware Evolution Drives Database
Architectural Innovations (cont'd)
(new) Parallel databases
• Emerge with CMP, SMP and SMT
• Hardware-conscious optimizations
Query co-processor
• Hardware: GPUs, FPGA, etc.
• Efficient query processing with
massively parallel processors
• Examples: GPUQP, Ocelot, etc.
7
Prediction is Always Difficult,
especially about the Future
What is next?
Approximate hardware!
8
Approximate Hardware: An Example
with Solid State Storage
(a) Precise MLC
1. Guard bands separate different analog values so that they can
safely represent digital values.
2. Solid-state storage adopts iterative program-and-verify (P&V).
Figures are reproduced from [14] in our paper.
9
Approximate Hardware: An Example
with Solid State Storage
(a) Precise MLC
(b) Approximate MLC
The guard band is reduced, so fewer P&V iterations are needed
to achieve acceptable accuracy.
Figures are reproduced from [14] in our paper.
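A toy model can show why a narrower guard band means fewer P&V iterations. This is a simplified sketch under stated assumptions, not the device model from [14]: voltages are integers in arbitrary units, each iteration applies one fixed pulse followed by a verify, and the pulse is sized to the acceptance window around the target level. The function name and all numbers are hypothetical.

```python
def program_and_verify(target: int, half_window: int):
    """Toy P&V loop: pulse, then verify, until the cell lands in the window.

    The acceptance window is [target - half_window, target + half_window).
    A narrower guard band leaves a wider window for each level.
    """
    step = 2 * half_window  # largest pulse that cannot jump over the window
    value, iterations = 0, 0
    while value < target - half_window:  # verify: still below the window
        value += step                    # program: apply one more pulse
        iterations += 1
    return value, iterations


# Wide guard band (precise MLC): narrow window, small pulses.
precise_value, precise_iters = program_and_verify(target=600, half_window=5)
# Reduced guard band (approximate MLC): wide window, larger pulses.
approx_value, approx_iters = program_and_verify(target=600, half_window=20)
```

In this model the iteration count is roughly inversely proportional to the window width, which is the intuition behind the write-latency gain.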
10
What’s New?
• O1: The data are inherently imprecise, and thus can
tolerate loss of accuracy.
– For example, a reading from a temperature sensor may
not need to be accurate to the last decimal place.
• O2: The query processing itself can tolerate loss of
accuracy, and the result can be imprecise/approximate.
– Approximate query processing
• O3: Although the query processing requires precise
final result, a hybrid execution on precise hardware
and approximate hardware could have better
performance/energy consumption than the execution
with precise hardware only.
– The focus of this talk.
11
Outline
• Motivations
• Our proposal: ApproxiDB
• Open problems
• Summary
12
Our Proposal: Approximate and Refine
• Design for hybrid hardware (including both
precise hardware and approximate hardware).
• Approximate-and-refine consists of two steps:
– Step 1: use approximate hardware to obtain
intermediate results (superset of the final results)
within some query processing steps.
– Step 2: use precise hardware to refine the
intermediate results and obtain the final precise
results.
• We show two examples (selection and merge
sort) to illustrate this paradigm.
13
Example 1: Selection
Select tuples where:
R.x > 4.5 and R.x < 5.9
[Figure legend: precise storage, precise execution, approximate execution, approximate storage]
R.x: 10.5 10.1 1.1 12.5 5.7 8.5 4.3 8.2
(a) Selection on precise CPU
cost: 1 * 8 = 8
(b) Selection on hybrid processor
cost: 0.5 * 8 + 1 * 2 = 6
Result: 5.7
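The selection example can be sketched in code as an approximate filter followed by a precise refinement. This is a minimal sketch under assumptions the slide does not state: the approximate hardware is modeled as snapping values to a grid of 0.5 (so the error is at most 0.25), and the predicate is widened by that error bound so the candidate set is guaranteed to be a superset of the true result. The names `approximate_read` and `approximate_and_refine_select` are hypothetical; the unit costs 0.5 and 1 are the ones shown on the slide.

```python
VALUES = [10.5, 10.1, 1.1, 12.5, 5.7, 8.5, 4.3, 8.2]  # R.x from the slide
GRID = 0.5             # assumed precision of the approximate hardware
MAX_ERROR = GRID / 2   # worst-case |approximate - exact| under that assumption


def approximate_read(x: float) -> float:
    """Stand-in for approximate hardware: snap the value to a coarse grid."""
    return round(x / GRID) * GRID


def approximate_and_refine_select(values, low, high,
                                  approx_unit=0.5, precise_unit=1.0):
    # Step 1 (approximate hardware): test a widened predicate on approximate
    # values, which yields a superset of the qualifying tuples.
    candidates = [
        i for i, x in enumerate(values)
        if low - MAX_ERROR <= approximate_read(x) <= high + MAX_ERROR
    ]
    # Step 2 (precise hardware): re-check only the candidates exactly.
    result = [values[i] for i in candidates if low < values[i] < high]
    cost = approx_unit * len(values) + precise_unit * len(candidates)
    return result, candidates, cost


result, candidates, cost = approximate_and_refine_select(VALUES, 4.5, 5.9)
```

With this error model the candidates are the tuples holding 5.7 and 4.3, so the cost works out to 0.5 * 8 + 1 * 2 = 6, matching panel (b), and the refined result is 5.7.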
Example 2: Merge Sort
Sort input in ascending order
[Figure legend: precise storage, precise execution, approximate execution, approximate storage]
Input: 10.5 10.1 1.1 12.5 5.7 8.5 4.3 8.2 (records r1 to r8)
(c) Sort on precise storage
Sorted output: 1.1 4.3 5.7 8.2 8.5 10.1 10.5 12.5
(d) Sort on hybrid storage
[Merge-tree figure; intermediate runs hold approximate values, e.g. 10.0 10.4 1.2 12.4 5.8 8.4]
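One possible way to apply approximate-and-refine to sorting is sketched below. It is an assumption-laden illustration rather than the algorithm from the figure: the merge sort runs on approximate keys (modeled here as values snapped to a grid of 0.5), and the refinement is an insertion sort on the exact keys, chosen because it is cheap on nearly sorted input. The record-to-value mapping follows the input row of the slide; the function names are hypothetical.

```python
def merge_sort(items, key):
    """Plain top-down merge sort on key(item)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], key)
    right = merge_sort(items[mid:], key)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if key(left[i]) <= key(right[j]):
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def refine(order, exact):
    """Insertion sort on exact keys; returns the order and the shift count."""
    order = list(order)
    shifts = 0
    for k in range(1, len(order)):
        rid, pos = order[k], k
        while pos > 0 and exact[order[pos - 1]] > exact[rid]:
            order[pos] = order[pos - 1]
            pos -= 1
            shifts += 1
        order[pos] = rid
    return order, shifts


exact = {"r1": 10.5, "r2": 10.1, "r3": 1.1, "r4": 12.5,
         "r5": 5.7, "r6": 8.5, "r7": 4.3, "r8": 8.2}
# Assumed error model: approximate storage keeps keys on a 0.5 grid.
approx = {rid: round(v / 0.5) * 0.5 for rid, v in exact.items()}

approx_order = merge_sort(list(exact), key=approx.get)  # step 1: approximate
final_order, shifts = refine(approx_order, exact)       # step 2: precise
```

The number of shifts in the refinement step measures how far the approximate order was from the exact one, so it is a rough proxy for the precise-hardware work that remains.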
Initial Design of ApproxiDB
[Architecture diagram with three layers]
• CPU: precise CPU, approximate CPU
• DBMS (ApproxiDB): query optimizer, cost estimation,
query operators (e.g., joins and sort), access methods
(e.g., scans), other components in DBMS
• Storage: precise storage, approximate storage
16
Extensions to Existing DBMS
• Allow users to specify what should be stored
in approximate storage, as well as the accuracy
requirement.
• We propose four query processing modes:
– Precise storage + precise query processing
– Approximate storage + precise query processing
– Precise storage + approximate query processing
– Approximate storage + approximate query
processing
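The four modes are the cross product of two independent choices, which can be written down compactly. The sketch below is purely illustrative; the type names `Storage` and `Processing` and the helper `describe` are hypothetical and not part of ApproxiDB.

```python
from enum import Enum
from itertools import product


class Storage(Enum):
    PRECISE = "precise storage"
    APPROXIMATE = "approximate storage"


class Processing(Enum):
    PRECISE = "precise query processing"
    APPROXIMATE = "approximate query processing"


# Every combination of a storage choice and a processing choice.
MODES = list(product(Storage, Processing))


def describe(storage: Storage, processing: Processing) -> str:
    """Render a mode in the same wording as the list above."""
    return f"{storage.value} + {processing.value}"


descriptions = [describe(s, p) for s, p in MODES]
```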
17
Extensions to Existing DBMS (cont'd)
• The cost model needs to consider new factors
like the tradeoff between performance/energy
consumption and accuracy.
• The query optimizer should revisit physical
operator implementations and query execution
strategies to make the best use of the
hybrid system.
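A very small cost model illustrates the kind of decision the optimizer faces. The sketch below reuses the unit costs from the selection example (0.5 per tuple on approximate hardware, 1 per tuple on precise hardware) and assumes the optimizer can estimate how many candidates survive the approximate step. The function names are hypothetical, and accuracy is not modeled here because the hybrid plan is assumed to return precise results.

```python
def precise_cost(n: int, precise_unit: float = 1.0) -> float:
    """Cost of processing all n tuples on precise hardware."""
    return precise_unit * n


def hybrid_cost(n: int, candidates: int,
                approx_unit: float = 0.5, precise_unit: float = 1.0) -> float:
    """Approximate pass over n tuples, then precise pass over the candidates."""
    return approx_unit * n + precise_unit * candidates


def break_even_candidates(n: int,
                          approx_unit: float = 0.5,
                          precise_unit: float = 1.0) -> float:
    """Candidate count at which both plans cost the same."""
    return n * (precise_unit - approx_unit) / precise_unit


def choose_plan(n: int, expected_candidates: int) -> str:
    """Pick the cheaper plan under the assumed unit costs."""
    if hybrid_cost(n, expected_candidates) < precise_cost(n):
        return "hybrid"
    return "precise"


plan_selective = choose_plan(8, 2)    # few candidates survive
plan_unselective = choose_plan(8, 6)  # most tuples survive
```

Under these unit costs the hybrid plan only pays off when fewer than half of the tuples survive the approximate step, which is why the cost model needs candidate-set estimates as a new input.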
18
Outline
• Motivations
• Our proposal: ApproxiDB
• Open problems
• Summary
19
Open Problems
• ApproxiDB raises many open problems.
– Automatic physical design to ease the burden on
users.
– Problems in multi-level approximate hardware.
– Integrating techniques from probabilistic databases
into ApproxiDB.
– Query-level tradeoff between accuracy and
performance/energy.
20
Outline
• Motivations
• Our proposal: ApproxiDB
• Open problems
• Summary
21
Summary
• We sketch a radical vision of ApproxiDB on
hybrid hardware with both approximate
hardware and precise hardware.
• We demonstrate our initial design of
ApproxiDB to exploit those optimization
opportunities.
• We conjecture that approximate hardware can
be one of the interesting driving forces in
the database community in the future.
22
Q&A
• Thank you.
• Our research group: Xtra Computing Group
http://pdcc.ntu.edu.sg/xtra/
23