Computer Science iBigTable: Practical Data Integrity for BigTable in
Public Cloud
CODASPY 2013
Wei Wei, Ting Yu , Rui Xue
1/40
iBigTable – Overview
BigTable – Scalable Storage System o Store large data sets with petabytes or even more
Business transactions, software logs, social network messages o Benefits from processing large data sets
Identify business opportunities, find software bugs, mine social relationship o Widely used in Google, Facebook, Twitter
However, small companies and researchers usually lack of capabilities to deploy BigTable o Large cluster required o Technical difficulties o High maintenance cost
Deploying BigTable in a public cloud is an economic solution.
However, one may not always trust the public cloud provider.
Computer Science
2/40
iBigTable – Overview
Our Focus o Provide integrity assurance for BigTable in public cloud
Basic Idea o Build Merkle Hash Tree based Authenticated Data Structure o Decentralize integrity verification across multiple nodes
Computer Science
3/40
Agenda
Introduction
System Model
System Design
Experimental Evaluation
Related Work
Conclusion
Computer Science
4/40
Merkle Hash Tree (MHT)
• s root
=S(h root
)
• h root
=H(h
12
|h
34
)
• h
12
=H(h
1
|h
2
) • h
34
=H(h
3
|h
4
)
• h
1
=H(d
1
) • h
2
=H(d
2
) • h
1
=H(d
3
) • h
1
=H(d
4
)
Verification Object (VO) o Data returned along with result and used to authenticate the result
Example o Authenticate data d
1
, and the VO for d
1 is {h
2 and h
34
}
Computer Science
5/40
BigTable – Data Model
A table is a sparse, distributed, persistent multidimensional sorted map (OSDI 2006).
Data Model o Table schema only defines its column families
Each family consists of any number of columns
Each column consists of any number of versions
Columns only exist when inserted, NULLs are free
Columns within a family are sorted and stored together o Table contains a set of rows sorted based on row key
Row: a set of column families
Column Family: a set of columns
Cell: arbitrary string (uninterpreted string)
Computer Science
6/40
BigTable – Data Organization
Tablet o Root tablet o Metadata tablet o User tablet
Tablet Server o Each tablet is only stored in a tablet server o Multiple tablets can be stored in a tablet server
Master
Responsible for load balancing and assigning tablets
Computer Science
7/40
BigTable – Data Operations
Queries o Single row query by specify the row key o Range query by specifying start and end row keys o Projection query to retrieve specific column, column family
Changes o Data insert, update, and delete o Tablet split & merge
Computer Science
8/40
System Model
Similar to Database Outsourcing o Host data in untrusted party and support data retrieval o Principle ideas of integrity verification
Different from Database Outsourcing o Distributed data among large number of nodes
How to handle authenticated data structures during tablet merging or splitting
Impractical to store authenticated structures in a single node
Not scalable to adopt a centralized integrity verification scheme at a single point o Simple data model and query interfaces
Design much simpler and efficient authenticated structures and protocols to verify data integrity
The actual design and deployment of authentication schemes are significantly different
Computer Science
9/40
System Model
Assumptions o The public cloud is not trusted, and BigTable is deployed in the public cloud, including the master and tablet servers o The data owner has a public/private key pair, and public key is known to all o The data owner is the only party who can update data o Public communications are through a secure channel
Attacks from The Public Cloud o Return incorrect data by tampering some data o Return incomplete data result by discarding some data o Report that data doesn’t exist or return old data
Computer Science
10/40
System Model cont’d
Goal o Deploy BigTable over Public Cloud with Practical Integrity Assurance
Design Goals o Security (Integrity)
Correctness, completeness, freshness o Practicability
Simplicity, flexibility, efficiency
Computer Science
11/40
System Design
Basic Idea o Embed a MHT-based Authenticated Data Structure in each tablet
Computer Science
12/40
Distributed Merkle Hash Tree
Data Owner
Root hash
Root Tablet
Pros
Authenticated data distributed across nodes
Only maintain one hash for all data
Cons
Require update propagation
Meta Tablet
Concurrent update could cause issues
Hard to synchronize hash tree update
Complicate protocols between tablet servers
User Tablet User Tablet
• …
• …
Computer Science
13/40
Our Design
Data Owner
Root hash
Root Tablet
User Tablet
Computer Science
Meta Tablet
• …
User Tablet
14/40
• …
Our Design
Root Tablet
Data Owner
Root hash
Root hash
Root hash
Root hash
User Tablet
Computer Science
Meta Tablet
• …
User Tablet
15/40
• …
System Design
Basic Idea o Embed a MHT-based Authenticated Data Structure in each tablet o Store the root hash of each MHT in a trusted party (e.g., data owner) o Decentralize the integrity verification across multiple tablet servers
Data integrity is guaranteed by the correctness of the root hash of the MHT in each tablet.
Computer Science
16/40
Decentralized Integrity Verification
•
1.2 generate VO
•
1.1 meta key (root, meta, table name, start row key)
•
Client
• Client
•
Client
•
1.3 meta row (meta tablet location, start and end keys)
•
,
•
Tablet Server
• serving ROOT tablet
•
2.2 generate VO
•
2.1 meta key (meta, table name, start row key)
•
2.3 meta row (user tablet location, start and end keys)
•
, VO
• Tablet Server
• serving META tablet
•
2.2 generate VO
•
3.1 start and end row keys
•
3.3 rows within the start and end row keys
•
, VO
Computer Science
•
Tablet Server
• serving USER tablet
17/40
iBigTable – Authenticated Data Structure
Signature Aggregation Compared with Merkle Hash Tree o Both of them can guarantee correctness and completeness o Incur significant computation cost in client side and large storage cost in server side o Not clear how to address freshness
MHT-based Authenticated Data Structure o SL-MBT: A single-level Merkle B+ tree
Build a Merkle B+ tree based on all key value pairs in a tablet
Each leaf is a hash of a key value pair o ML-MBT: A multi-level Merkle B+ tree
Builds multiple Merkle B+ trees in three different levels o TL-MBT: A two-level Merkle B+ tree (adopted)
Computer Science
18/40
iBigTable – TL-MBT
Index Level o Only one tree – index tree o Each leaf points to a data tree
Data Level o Row Tree: generate hashes for all rows and each leaf is a hash of a row
Computer Science o Column Family Tree: generate hashes for a column family of all rows and each leaf is a hash of a column family of a row o Column Tree: generate hashes for a column of all rows and each leaf is a hash of a column of a row
19/40
iBigTable – TL-MBT
Verification Object Generation o Find the data tree(s) based on the specific query o Use the data tree(s) to generate VO based on the query range
Pros o Performance is comparable to ML-MBT for row-based query o Much more efficient than SL-MBT and ML-MBT for projection query o Flexible authenticated data structure
Cons o Update cost may increase by 3 times o Large storage cost if column trees are created
Computer Science
20/40
iBigTable – Data Access
Range query within tablet o Find metadata tablet, user tablet, data through specific tablet server
Range query across tablets o Break a large range into small sub-ranges
Based on the end key of each tablet
Sub-range falls in a tablet o Execute the sub-range queries
Computer Science
21/40
iBigTable – Single Row Update
Partial Tree Verification Object (VO) o Data included
Only keys and hashes of data for two boundaries
Hashes of nodes for computing the root hash
Keys in related inner nodes o Used for direct update within the range of partial tree
•
3.4 verify and update tablet root hash
•
3.1 new row
•
3.3 partial tree VO
•
Data Owner
Computer Science
•
3.2 generate VO
•
Tablet Server
• serving USER tablet
22/40
iBigTable – Single Row Update cont’d
30 60
50 10 70 80
0 10 20 30 40 50 60 70 80 90
Initial MB+ row tree of a tablet in a tablet server.
Computer Science
23/40
iBigTable – Single Row Update cont’d
30 60
50
30 40 50
45
•
New Key 45
Insert a row with key 45 into partial tree VO
Computer Science
30 60
40 50
30 40 45 50
•
New Key 45
•
Partial tree VO after 45 is inserted
24/40
iBigTable – Efficient Batch Update
Single row update is inefficient o one verification for single row
Range query is efficient o One verification for multiple rows
How can we do batch update like range query?
•
3.4 verify and update tablet root hash
•
3.1 request partial tree VO for a range
•
Data Owner
•
3.3 partial tree VO
•
3.4 new rows
• … … …
•
3.n new rows
•
3.2 generate VO
•
Tablet Server
• serving USER tablet
Computer Science
25/40
iBigTable – Tablet Changes
Tablet split o Grow too large o Load balancing o Better management
Tablet merge o Only a few data in a tablet o Improve query efficiency
How to guarantee data integrity?
o Make sure the root hash of each tablet is correctly updated
Computer Science
26/40
iBigTable – Tablet Split
30 60
50 10 70 80
0 10 20 30 40 50 60 70 80 90
•
(a) A MBT of a tablet in a tablet server, and split tablet at key 45.
Computer Science
27/40
iBigTable – Tablet Split cont’d
30 60
50 10 70 80
10 20
•
Left boundary node
30 40 50 60
•
Two boundary keys
•
Right boundary node
•
(b) Partial tree returned to the data owner.
Computer Science
28/40
iBigTable – Tablet Split cont’d
10
30 60
50
•
Split
30 60
50 70 80
10 20 30 40
•
Left Partial Tree
50 60
•
Right Partial Tree
•
(c) Split it into two partial trees by data owner.
Computer Science
29/40
iBigTable – Tablet Split cont’d
10
30 60
50
10 20 30 40
10 30
10 20 30 40
•
(d) Data owner adjusts left partial tree and computes the new root hash for the new tablet.
Computer Science
30/40
iBigTable – Tablet Split cont’d
30 60
50
50 60
70 80 60
50 60
70
80
•
(e) Data owner adjusts right partial tree and computes the new root hash for the new tablet.
Computer Science
31/40
iBigTable – Tablet Merge
70 50 70
•
Merge
10 30 60 10 30 60
30 40
•
Left Partial Tree
50
•
Right Partial Tree
30 40 50
•
Merged Tree
•
Data owner merges two partial trees sent from tablet servers into one for the new merged tablet
Computer Science
32/40
iBigTable – Experimental Evaluation
System Implementation o Implementation based on HBase o Extend some interfaces to specify integrity options o Add new interfaces to support efficient batch updates
Experiment Setup o 5 hosts in Virtual Computing Lab (VCL) o Intel(R) Xeon(TM) CPU 3.00GHz
o Red Hat Enterprise 5.1, Hadoop-0.20.2, and HBase-0.90.4
o Client network with 30Mbps download and 4Mbps upload
Computer Science
33/40
iBigTable – Baseline
Observations o It almost takes the same time to transmit data less than 4k o Time is doubled from 4k to 8k till around 64k.
o After 64k, the time dramatically increases.
o The VO size increases as the range increases, but the VO size per row actually decreases.
•
Ex 1. Time to receive data from server
Computer Science
•
Ex 2. VO size vs # of rows
34/40
iBigTable – Write
Observations o The performance overhead ranges from 10% to 50%.
o iBigTable with Efficient Batch Update only causes a performance overhead about 1.5%.
o Communication cost is high, but computation cost is small about 2~5%.
•
Ex 3. Write performance.
Computer Science
•
Ex 4. The breakdown of write cost
35/40
iBigTable – Read
Observations o The read performance overhead is small, which ranges from 1% to 8%.
o The total computation cost of both client and servers is about 1%.
o The major part of performance downgrade is caused by communication.
•
Ex 5. Read performance
Computer Science
•
Ex 6. The breakdown of read cost
36/40
iBigTable – TL-MBT
Observations o As the number of trees that need to be updated increases, the performance decreases dramatically.
o For different data size, we see the large performance variation for different cases.
•
Ex 7. TL-MBT update performance.
Computer Science
•
Ex 8. Projection query with TL-MBT
37/40
iBigTable – Related Work
Research related to BigTable o Performance evaluation [Carstoiu et al., NISS 2010] o High performance OLAP analysis [You et al., IMSCCS 2008] o BigTable in a hybrid cloud [Ko et al., HotCloud 2011] o Integrity layer for cloud storage [Kevin et al., CCS 2009]
Outsourcing Database o Different authenticated data structures [DASFAA 2006] o Probabilistic approaches [Xie et al.VLDB 2007] o Approaches to address complex queries [Yang et al., SIGMOD 2009] o Partitioned MHT (P-MHT) [Zhou et al., MS-CIS 2010]
Computer Science
38/40
iBigTable – Conclusion
Contributions o Explore the practicability of different authenticated data structures
Focus on Merkle Hash Tree based authenticated data structures o Design a set of efficient mechanisms to handle authenticated data structure changes
Efficient data batch update
Handle tablet split and merge o Implement a prototype of iBigTable based on Hbase, an open source implementation of BigTable o Conduct experimental evaluation of performance overhead
Computer Science
39/40
Computer Science
40/40