Google File System

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Google
Overview
• NFS
• Introduction & Design Overview
• Architecture
• System Interactions
• Master Operations
• Fault Tolerance
• Conclusion
NFS
• Built on RPCs
• Low performance
• Security issues
Introduction
Need for GFS:
• Large data files
• Scalability
• Reliability
• Automation
• Replication of data
• Fault tolerance
Design Overview:
Assumptions:
• Component failures are the norm, so constant monitoring is required
• Storage of huge files
• Workloads dominated by large streaming reads and writes
• Well-defined semantics for concurrent access by multiple clients
• High sustained bandwidth matters more than low latency
Interface:
• Not POSIX compliant
• Additional operations:
o Snapshot
o Record append
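
The paper never publishes the client library's signatures, so the following Python sketch of the interface is purely illustrative; every name and parameter here is an assumption:

# Hypothetical sketch of the GFS client interface (the real client
# library is C++ and its exact API is not given in the paper).

class GFSClient:
    # Usual file operations.
    def create(self, path: str) -> None: ...
    def delete(self, path: str) -> None: ...
    def read(self, path: str, offset: int, length: int) -> bytes: ...
    def write(self, path: str, offset: int, data: bytes) -> None: ...

    # GFS-specific additions.
    def snapshot(self, src: str, dst: str) -> None:
        """Cheap copy-on-write copy of a file or directory tree."""

    def record_append(self, path: str, data: bytes) -> int:
        """Append atomically at an offset GFS chooses and return that
        offset; the record is written at least once."""
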
Architecture:
• Cluster computing
• Single master
• Multiple chunk servers
o Store file chunks, each identified by a 64-bit chunk handle
• Multiple clients
Single Master, Chunk Size & Metadata
Single Master:
• Minimal master load
• Fixed chunk size
• The master also predictively provides the locations (and unique handles) of chunks immediately following those requested, so sequential readers rarely have to ask again
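
A rough Python sketch of that read path, with invented RPC names (find_chunk, read_chunk); only the division of labor is from the paper — the master serves metadata, bulk data flows from a chunk server:

# Sketch of a client read under a single master.

CHUNK_SIZE = 64 * 2**20            # 64 MB chunks

def pick_replica(replicas):
    # Placeholder policy: real clients prefer a nearby replica.
    return replicas[0]

def read(master, path, offset, length):
    chunk_index = offset // CHUNK_SIZE
    # One small metadata RPC; the master's reply can also carry the
    # locations of the chunks that follow the one requested.
    handle, replicas = master.find_chunk(path, chunk_index)
    server = pick_replica(replicas)
    return server.read_chunk(handle, offset % CHUNK_SIZE, length)
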
Chunk Size:
• 64 MB per chunk
• Many reads and writes operate on the same chunk
• Reduces network overhead and the size of metadata in the master
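
The arithmetic behind that choice, assuming the paper's figure of under 64 bytes of master metadata per chunk and an illustrative 1 PB of file data:

# Why 64 MB chunks keep the master's metadata small. The 1 PB
# workload is an assumed example; the <64 bytes/chunk bound is the paper's.

CHUNK_SIZE = 64 * 2**20                 # 64 MB
data = 2**50                            # 1 PB of file data (assumption)
chunks = data // CHUNK_SIZE             # 16,777,216 chunks
metadata_bytes = chunks * 64            # upper bound on master metadata
print(chunks, metadata_bytes / 2**30)   # -> 16777216 1.0 (about 1 GiB)
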
Metadata:
• Types of metadata:
o File and chunk namespaces
o Mapping from files to chunks
o Location of each chunk's replicas
• In-memory data structures:
o Master operations are fast
o Periodically scanning the entire state is easy and efficient
• Chunk locations:
o The master polls chunk servers for this information
o Clients request data directly from chunk servers
• Operation log:
o Keeps a historical record of critical metadata changes
o It is central to GFS
o It is replicated to multiple remote locations
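
A toy sketch of the distinction just listed, with invented field and record names: namespaces and file-to-chunk mappings are made durable through the operation log, while chunk locations are simply re-learned from the chunk servers:

import collections

class MasterState:
    def __init__(self):
        self.chunks = {}                                  # path -> [chunk handles]
        self.locations = collections.defaultdict(list)    # handle -> [servers]

    def replay(self, log):
        # Recover durable metadata by replaying the operation log,
        # which is kept locally and on multiple remote machines.
        for op, path, handle in log:
            if op == "create":
                self.chunks[path] = []
            elif op == "add_chunk":
                self.chunks[path].append(handle)

    def heartbeat(self, server, handles):
        # Chunk locations are never logged: chunk servers report what
        # they hold, at master startup and in regular heartbeats.
        for h in handles:
            self.locations[h].append(server)
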
System Interactions:
• Leases and Mutation Order:
o Leases maintain a consistent mutation order across the replicas
o The master picks one replica as the primary
o The primary defines a serial order for mutations
o All replicas follow the same serial order
o Leases minimize management overhead at the master
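
A minimal sketch of the mechanism (class and method names are invented): the primary holding the lease stamps each mutation with a serial number, and every replica applies mutations in that order:

class Replica:
    def __init__(self):
        self.log = []

    def apply(self, serial, data):
        self.log.append((serial, data))    # applied in serial order

class Primary(Replica):
    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries
        self.next_serial = 0

    def mutate(self, data):
        serial = self.next_serial          # the primary picks the order once
        self.next_serial += 1
        self.apply(serial, data)
        for s in self.secondaries:         # secondaries follow the same order
            s.apply(serial, data)
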
• Atomic Record Appends:
o GFS offers a record append operation
o Clients on different machines can append to the same file concurrently
o The data is written at least once as an atomic unit
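
A sketch of the primary-side logic, under assumed names; on failure at any replica the client simply retries, which is why a record may land more than once:

CHUNK_SIZE = 64 * 2**20

def record_append(chunk: bytearray, record: bytes):
    # The record must not straddle a chunk boundary: pad the current
    # chunk and signal the client to retry on a fresh chunk.
    if len(chunk) + len(record) > CHUNK_SIZE:
        chunk.extend(b"\0" * (CHUNK_SIZE - len(chunk)))
        return None                  # means "retry on the next chunk"
    offset = len(chunk)              # the primary chooses the offset
    chunk.extend(record)             # all replicas write at this offset
    return offset                    # returned to the client
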
• Snapshot:
o Creates a quick copy of a file or directory tree
o The master revokes outstanding leases on the affected chunks
o The master duplicates the metadata
o On the first write to a chunk after the snapshot operation, the chunk servers holding it create a new chunk
o The data can be copied locally
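
The copy-on-write idea in miniature, with invented structures: the snapshot itself only duplicates metadata and bumps reference counts, and chunk data is cloned lazily on the first later write:

refcount = {}     # chunk handle -> number of files referencing it
files = {}        # path -> list of chunk handles

def snapshot(src, dst):
    files[dst] = list(files[src])              # duplicate metadata only
    for h in files[dst]:
        refcount[h] = refcount.get(h, 1) + 1

def write_chunk(path, index, fresh_handle):
    h = files[path][index]
    if refcount.get(h, 1) > 1:
        # Shared with a snapshot: the chunk server copies the chunk
        # locally (no network transfer) and the writer gets the copy.
        refcount[h] -= 1
        refcount[fresh_handle] = 1
        files[path][index] = fresh_handle
    # ...normal write path continues on files[path][index]
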
Master Operations
• Namespace Management and Locking:
o GFS maps each full pathname to its metadata in a lookup table
o Each master operation acquires a set of locks before it runs
o The locking scheme allows concurrent mutations in the same directory
o Locks are acquired in a consistent total order to prevent deadlock
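
A sketch of that locking discipline (the scheme is the paper's; the code's names are invented): read-lock every ancestor directory, write-lock the target, and take all locks in one global order:

import collections, threading

# Plain mutexes stand in for the reader-writer locks GFS uses.
locks = collections.defaultdict(threading.Lock)

def acquire_for(path):
    parts = path.strip("/").split("/")
    ancestors = ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
    needed = ancestors + [path]      # read locks + one write lock
    # Consistent total order (depth, then name) prevents deadlock.
    for name in sorted(needed, key=lambda p: (p.count("/"), p)):
        locks[name].acquire()
    return needed                    # release in reverse when done
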
• Replica Placement:
o Maximizes reliability, availability, and network bandwidth utilization
o Spreads chunk replicas across racks
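
A toy placement policy in that spirit; the selection heuristic is invented, only the rack-spreading goal is the paper's:

def place_replicas(servers_by_rack, n=3):
    # One replica per rack, so a single rack failure cannot destroy
    # every copy; prefer racks with more servers to choose from.
    racks = sorted(servers_by_rack.items(), key=lambda kv: -len(kv[1]))
    return [servers[0] for rack, servers in racks[:n]]

print(place_replicas({"r1": ["s1", "s2"], "r2": ["s3"], "r3": ["s4"]}))
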
Creation, Re-replication, Rebalancing
• Creation:
o Equalize disk utilization
o Limit the number of recent creations on each chunk server
o Spread replicas across racks
• Re-replication:
o Chunks are re-replicated, in priority order, as soon as the replica count falls below the goal (see the sketch after this list)
• Rebalancing:
o Move replicas for better disk space and load balancing
o Prefer removing replicas on chunk servers with below-average free space
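
A sketch of how such a priority might be computed; the factors are the paper's (distance from the replication goal, whether a client is blocked), the weights are invented:

def priority(live_replicas, goal=3, blocking_client=False):
    missing = goal - live_replicas       # losing 2 of 3 beats losing 1 of 3
    return missing + (1 if blocking_client else 0)

chunks = [("a", 2), ("b", 1), ("c", 2)]
chunks.sort(key=lambda c: priority(c[1]), reverse=True)
print(chunks)    # chunk "b", two replicas short, is re-replicated first
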
• Garbage Collection:
o Makes the system simpler and more reliable
o The master logs the deletion and renames the file to a hidden name
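
Lazy deletion in miniature, with invented structures; the three-day grace period is the paper's default:

import time

files = {}        # visible namespace: name -> chunk handles
hidden = {}       # hidden name -> (deletion time, chunk handles)
GRACE = 3 * 24 * 3600     # three days

def delete(name):
    # Deletion is just a rename; the file can still be read under the
    # hidden name and undeleted until the scan reclaims it.
    hidden[".deleted." + name] = (time.time(), files.pop(name))

def namespace_scan():
    for name, (stamp, handles) in list(hidden.items()):
        if time.time() - stamp > GRACE:
            del hidden[name]    # chunks become orphans; chunk servers
                                # learn this in heartbeats and drop them
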
• Stale Replica Detection:
o Chunk version numbers identify stale replicas
o The client or chunk server verifies the version number
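
The check itself is a single comparison: the master increments a chunk's version whenever it grants a new lease, so a replica that was down during a mutation is left behind (values below are invented):

master_version = {"chunk-1": 7}
replica_version = {"chunk-1": 6}    # this replica missed a lease grant

def is_stale(handle):
    return replica_version[handle] < master_version[handle]

assert is_stale("chunk-1")   # reported to the master and garbage collected
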
Fault Tolerance
• High Availability:
o Fast recovery
o Chunk replication
o Shadow masters
• Data Integrity:
o Checksums for every 64 KB block in each chunk
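
A sketch of block-level checksumming; the paper specifies 32-bit checksums over 64 KB blocks but not the algorithm, so CRC-32 here is an assumption:

import zlib

BLOCK = 64 * 1024      # 64 KB blocks within each 64 MB chunk

def block_checksums(chunk: bytes):
    return [zlib.crc32(chunk[i:i + BLOCK]) for i in range(0, len(chunk), BLOCK)]

def verify(chunk: bytes, stored):
    # Checked on every read; on a mismatch the chunk server returns an
    # error, reports to the master, and the client reads another replica.
    return block_checksums(chunk) == stored
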
Conclusion
GFS meets Google's storage requirements:
• Incremental growth
• Constant monitoring of component failures
• Data optimization from special operations
• Simple architecture
• Fault tolerance