Megastore
Providing Scalable, Highly Available Storage for Interactive Services
Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin,
James Larson, Jean-Michel Léon, Yawei Li, Alexander Lloyd, Vadim Yushprakh
Presented By:
Hamid Seyedmoradi
Ayoub Hamidi
Ehsan Mohamad Nezamian
Advanced Database Systems
SRBIAU, Kurdistan Campus
10 May 2012
Megastore
 Motivation
 Introduction
 NoSQL & RDBMS
 Megastore
 Paxos
Megastore
More than 3 billion write and 20 billion read transactions daily
 Key Contributions
 Data Model and Storage System
 Paxos Replication
 Report on Experience
AVAILABILITY & SCALE
 Replication
For availability, we implemented a synchronous, fault-tolerant log replicator optimized for long-distance links.
 Partitioning and Locality
For scale, we partitioned data into a vast space of small databases.
AVAILABILITY & SCALE
 Replication
 Strategies
 Asynchronous Master/Slave
 Synchronous Master/Slave
 Optimistic Replication
We decided to use Paxos
Technology Options
AVAILABILITY & SCALE
 Partitioning and Locality
 Replication
 Entity groups partition the datastore
 ACID semantics within an entity group
 Looser consistency across entity groups
 Each entity group is synchronously replicated across datacenters
 Entity group data and replication metadata are stored in scalable NoSQL datastores
AVAILABILITY & SCALE
 Partitioning and Locality
 Operations:
 Entities are the units of data
 Most transactions are within a single entity group
 Cross-entity-group transactions are supported via two-phase commit
 Asynchronous communication between entity groups is supported by queues
 Global indexes span entity groups but have weaker consistency
[Figure: Entity Group 1 and Entity Group 2, each with a local index, connected by send and receive queues]
AVAILABILITY & SCALE
 Partitioning and Locality
 Entity Groups
 Selecting Entity Group Boundaries
 Examples
• Email
• Blogs
 Physical Layout
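The email example can be made concrete with a small sketch. It assumes, purely for illustration, that every entity key is a tuple whose first component names the account, so each account is one entity group. The key shape and the helper names `entity_group` and `is_single_group` are hypothetical, not part of Megastore's API.

```python
from typing import Iterable, Tuple

Key = Tuple[str, ...]  # hypothetical key shape: (account_id, kind, item_id)


def entity_group(key: Key) -> str:
    """Return the entity group a key belongs to (its root component)."""
    return key[0]


def is_single_group(keys: Iterable[Key]) -> bool:
    """True if every key falls inside the same entity group."""
    return len({entity_group(k) for k in keys}) <= 1


# Labelling a message inside one account touches a single entity group.
same_account = [("alice", "message", "m1"), ("alice", "label", "inbox")]
# Sending mail from one account to another crosses entity groups.
cross_account = [("alice", "message", "m1"), ("bob", "message", "m7")]

print(is_single_group(same_account))   # True
print(is_single_group(cross_account))  # False
```

Operations of the first kind can use the ACID semantics of one entity group; operations of the second kind would need queues or two-phase commit.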
Megastore
 API Design Philosophy
 Data Model
 Pre-Joining with Keys
 SCATTER
 Indexes
 STORING Clause
 Repeated Indexes
 Inline Indexes
 Mapping to Bigtable
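A minimal sketch of pre-joining with keys follows. It assumes a sorted key space in which a child row's key begins with its parent's key, so a user and that user's photos end up adjacent. The in-memory `rows` dictionary, the helper functions, and the sample values are illustrative stand-ins, not the real storage layer.

```python
from typing import Dict, List, Tuple

RowKey = Tuple[int, ...]   # concatenated primary-key columns
Row = Dict[str, object]    # column name -> value

rows: Dict[RowKey, Row] = {}   # stand-in for one sorted table


def put_user(user_id: int, name: str) -> None:
    # Root-table row: the key is (user_id,).
    rows[(user_id,)] = {"User.name": name}


def put_photo(user_id: int, photo_id: int, url: str) -> None:
    # Child-table row: the key starts with the parent's key,
    # so it sorts directly after the parent row.
    rows[(user_id, photo_id)] = {"Photo.full_url": url}


def scan_entity_group(user_id: int) -> List[RowKey]:
    """Return the user row followed by all of that user's photo rows."""
    return [k for k in sorted(rows) if k[0] == user_id]


# Insert in arbitrary order; sorting by key restores the pre-joined layout.
put_photo(102, 600, "http://example.com/c.jpg")
put_user(101, "John")
put_photo(101, 501, "http://example.com/b.jpg")
put_user(102, "Mary")
put_photo(101, 500, "http://example.com/a.jpg")

print(sorted(rows))
# [(101,), (101, 500), (101, 501), (102,), (102, 600)]
print(scan_entity_group(101))
# [(101,), (101, 500), (101, 501)]
```

Because related rows are contiguous, reading a user together with its photos becomes one range scan rather than a join.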
Megastore
 Transactions and Concurrency Control
 Read
 current
 snapshot
 inconsistent
Transaction Lifecycle
1-Read
2-Application logic
3-Commit
4-Apply
5-Clean up
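The lifecycle above can be sketched on a single in-memory entity group. This is a simplification under stated assumptions: a plain check on the log length stands in for the Paxos round that decides who wins a log position, the clean-up step is omitted, and the `EntityGroup` class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EntityGroup:
    log: List[Dict[str, str]] = field(default_factory=list)  # committed entries
    data: Dict[str, str] = field(default_factory=dict)       # applied state
    applied: int = 0                                          # entries applied so far


def apply(group: EntityGroup) -> None:
    """4-Apply: write committed mutations into the entity state."""
    while group.applied < len(group.log):
        group.data.update(group.log[group.applied])
        group.applied += 1


def read(group: EntityGroup) -> Tuple[int, Dict[str, str]]:
    """1-Read: note the last committed log position and read the state."""
    apply(group)  # make all committed entries visible first
    return len(group.log), dict(group.data)


def commit(group: EntityGroup, read_pos: int, mutations: Dict[str, str]) -> bool:
    """3-Commit: append at the next log position, or fail if it is taken."""
    if len(group.log) != read_pos:
        return False  # another transaction won this position; re-read and retry
    group.log.append(mutations)
    return True


group = EntityGroup()
pos_a, _ = read(group)   # two transactions both read at position 0
pos_b, _ = read(group)
# 2-Application logic: each transaction builds its own log entry.
ok_a = commit(group, pos_a, {"balance": "10"})
ok_b = commit(group, pos_b, {"balance": "99"})  # conflicts with A
apply(group)
print(ok_a, ok_b, group.data)  # True False {'balance': '10'}
```

Only one writer can claim a given log position, which is what makes the concurrency control optimistic: the loser learns of the conflict at commit time.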
Megastore
 Queues
 Two-Phase Commit
REPLICATION
 Brief Summary of Paxos
 Megastore’s Approach
 Fast Reads
 Fast Writes
 Replica Types
 Witness Replica
Architecture
Data Structures and Algorithms
 Replicated Logs
Data Structures and Algorithms
 Reads
 Query Local
 Find Position
 Local read
 Majority read
 Catchup
 Validate
 Query Data
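The read steps listed above can be walked through with a toy model. The sketch assumes each replica holds only committed values per log position and that the coordinator's answer is passed in as a flag. The `Replica` class and `current_read` function are hypothetical names, and timeouts and failures are ignored.

```python
from typing import Dict, List


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.log: Dict[int, str] = {}  # log position -> committed value

    def highest_position(self) -> int:
        return max(self.log, default=0)


def current_read(local: Replica, replicas: List[Replica],
                 local_up_to_date: bool) -> List[str]:
    # Query Local: the caller supplies what the local coordinator would answer.
    # Find Position:
    if local_up_to_date:
        # Local read: trust the local replica's highest position.
        position = local.highest_position()
    else:
        # Majority read: ask a majority and take the highest position seen.
        majority = replicas[: len(replicas) // 2 + 1]
        position = max(r.highest_position() for r in majority)
    # Catchup: fill gaps in the local log from a replica that knows the value.
    for pos in range(1, position + 1):
        if pos not in local.log:
            source = next(r for r in replicas if pos in r.log)
            local.log[pos] = source.log[pos]
    # Validate: the coordinator would be told the replica is current (omitted).
    # Query Data: read the log prefix up to the selected position.
    return [local.log[p] for p in range(1, position + 1)]


a, b, c = Replica("a"), Replica("b"), Replica("c")
for r in (a, b, c):
    r.log[1] = "x"
b.log[2] = "y"  # replica a missed the second write
c.log[2] = "y"
result = current_read(a, [a, b, c], local_up_to_date=False)
print(result)  # ['x', 'y']
```

Any majority overlaps the majority that accepted the latest write, which is why the majority read is enough to find the highest position.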
Data Structures and Algorithms
 Writes
 Accept Leader
 Prepare
 Accept
 Invalidate
 Apply
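The write steps listed above can be sketched for a single log position. This is a simplified model under stated assumptions: one attempt with no retry or backoff, proposal numbers chosen by inspecting replica state directly, and a hypothetical `Replica` class and `write` function rather than Megastore's actual interfaces.

```python
from typing import List, Optional, Tuple


class Replica:
    """Acceptor state for one log position (simplified)."""

    def __init__(self, up: bool = True) -> None:
        self.up = up
        self.promised = -1                                # highest number promised
        self.accepted: Optional[Tuple[int, str]] = None   # (number, value)
        self.coordinator_valid = True
        self.applied: Optional[str] = None

    def prepare(self, n: int) -> Tuple[bool, Optional[Tuple[int, str]]]:
        if not self.up or n <= self.promised:
            return False, None
        self.promised = n
        return True, self.accepted

    def accept(self, n: int, value: str) -> bool:
        if not self.up or n < self.promised:
            return False
        self.promised = n
        self.accepted = (n, value)
        return True


def write(replicas: List[Replica], leader: int, value: str) -> bool:
    majority = len(replicas) // 2 + 1
    n, chosen = 0, value
    # Accept Leader: proposal number 0 lets the leader skip Prepare.
    fast_path = replicas[leader].accept(0, value)
    if not fast_path:
        # Prepare: use a number above any seen (read directly, a shortcut).
        n = max(max(r.promised for r in replicas), 0) + 1
        replies = [r.prepare(n) for r in replicas]
        if sum(ok for ok, _ in replies) < majority:
            return False
        prior = [acc for ok, acc in replies if ok and acc is not None]
        if prior:
            chosen = max(prior)[1]  # adopt the highest-numbered accepted value
    # Accept: ask the remaining replicas to accept the value.
    accepted = []
    for i, r in enumerate(replicas):
        already = fast_path and i == leader
        accepted.append(already or r.accept(n, chosen))
    if sum(accepted) < majority:
        return False  # the full protocol would return to Prepare after a backoff
    for r, ok in zip(replicas, accepted):
        if ok:
            r.applied = chosen            # Apply
        else:
            r.coordinator_valid = False   # Invalidate
    return chosen == value  # False signals a conflict with another writer


replicas = [Replica(), Replica(), Replica(up=False)]
ok = write(replicas, leader=0, value="v1")
print(ok)                                       # True
print([r.coordinator_valid for r in replicas])  # [True, True, False]
```

The replica that missed the write has its coordinator invalidated, so later reads there fall back to a majority read and catch up.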
Feedback
END
With Thanks
Questions?