Google File System Simulator
Pratima Kolan, Vinod Ramachandran

Google File System
• The master manages metadata
• Data transfer happens directly between the client and the chunk server
• Files are broken into 64 MB chunks
• Chunks are replicated across three machines for safety

Event Based Simulation
[Diagram: simulator components place events into a priority queue; the simulator repeatedly takes the next highest-priority event from the queue, simulates it, and emits the output of the simulated event.]

Simplified GFS Architecture
[Diagram: the client connects through a switch with infinite bandwidth to the master server and to five network disks; the switches represent network queues.]

Data Flow
1. The client queries the master server for the Chunk ID it wants to read.
2. The master server returns the set of disk IDs that contain the chunk.
3. The client requests the chunk from one of those disks.
4. The disk transfers the data to the client.

Experiment Setup
• We have a client whose bandwidth can be varied from 0 to 1000 Mbps
• We have 5 disks, each with a per-disk bandwidth of 40 Mbps
• We have 3 chunk replicas per chunk of data as a baseline
• Each client request is for 1 chunk of data from a disk

Simplified GFS Architecture
[Diagram: the same architecture annotated with the experiment parameters. Client bandwidth is varied from 0 to 1000 Mbps; each disk has a bandwidth of 40 Mbps; the chunk ID ranges shown on the five disks are 0-1000, 0-1000, 0-2000, 1001-2000, and 1001-2000.]

Experiment 1
• Disk requests served without load balancing
  – In this case we pick the first chunk server from the list of available chunk servers that contain the chunk.
• Disk requests served with load balancing
  – In this case we apply a greedy algorithm and balance the load of incoming requests across the 5 disks (sketched below).

Expectation
• In the non-load-balancing case we expect the effective request/data rate to peak at the bandwidth of 2 disks (80 Mbps), since only the first replica in each chunk's list is ever used.
• In the load-balancing case we expect the effective request/data rate to peak at the bandwidth of all 5 disks (200 Mbps).

Load Balancing Graph
This graph plots the data rate at the client vs. the client bandwidth.

Experiment 2
• Disk requests served with no dynamic replication
  – In this case we have a fixed number of replicas (3 in our case) and the server does not create more replicas based on read-request statistics.
• Disk requests served with dynamic replication
  – In this case the server replicates certain chunks based on the frequency of requests for each chunk.
  – We define a replication factor, which is a fraction < 1.
  – Number of replicas for a chunk = (replication factor) × (number of requests for the chunk); the rule is sketched below.
  – We cap the maximum number of replicas at the number of disks.

Expectation
• Our requests are all aimed at the chunks placed on disk 0, disk 1, and disk 2.
• In the non-replication case we expect the effective data rate at the client to be limited by the bandwidth provided by 3 disks (120 Mbps).
• In the replication case we expect the effective data rate at the client to be limited by the bandwidth provided by 5 disks (200 Mbps).

Replication Graph
This graph plots the data rate at the client vs. the client bandwidth.
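To make the event-based simulation component above concrete, here is a minimal sketch of an event loop driven by a priority queue. The Event and Simulator names, and the use of simulated time as the priority, are illustrative assumptions for this sketch, not the simulator's actual classes.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass(order=True)
class Event:
    time: float                               # simulated time doubles as the priority
    handler: Callable = field(compare=False)  # component callback that processes the event
    payload: Any = field(default=None, compare=False)

class Simulator:
    def __init__(self):
        self.queue = []   # min-heap ordered by event time
        self.now = 0.0

    def schedule(self, event):
        heapq.heappush(self.queue, event)

    def run(self):
        # Get the next highest-priority event from the queue and let its
        # component process it; processing may schedule follow-up events
        # (the output of the simulated event).
        while self.queue:
            event = heapq.heappop(self.queue)
            self.now = event.time
            event.handler(self, event.payload)

# Tiny usage example (hypothetical handler):
# sim = Simulator()
# sim.schedule(Event(time=0.0, handler=lambda s, p: print("request issued"), payload=None))
# sim.run()
```

Each handler stands in for a simulator component; a component reacts to an event and schedules whatever follow-up events model its output.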
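The four read steps in the Data Flow slide can likewise be sketched as plain function calls. Master, Disk, chunk_to_disks, and read_chunk below are illustrative names, not the simulator's API; the 64 MB chunk size and 40 Mbps disk bandwidth are the values from the slides.

```python
class Master:
    def __init__(self, chunk_to_disks):
        self.chunk_to_disks = chunk_to_disks   # chunk ID -> list of disk IDs holding a replica

    def locate(self, chunk_id):
        # Step 2: return the set of disk IDs that contain the chunk
        return self.chunk_to_disks[chunk_id]

class Disk:
    def __init__(self, disk_id, bandwidth_mbps=40):
        self.disk_id = disk_id
        self.bandwidth_mbps = bandwidth_mbps

    def read_chunk(self, chunk_id, chunk_mb=64):
        # Step 4: transfer the chunk; returns the transfer time in seconds
        return (chunk_mb * 8) / self.bandwidth_mbps

def client_read(master, disks, chunk_id):
    replica_ids = master.locate(chunk_id)   # Steps 1-2: ask the master where the chunk lives
    disk = disks[replica_ids[0]]            # Step 3: pick one replica disk (the first one here)
    return disk.read_chunk(chunk_id)        # Step 4: the disk transfers the data
```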
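A minimal sketch of the two replica-selection policies compared in Experiment 1. The pending_requests mapping (outstanding requests per disk) is an assumed bookkeeping structure; sending each request to the least-loaded replica is one straightforward reading of "balance the load of incoming requests across the 5 disks."

```python
def pick_replica_first(replica_ids):
    # No load balancing: always take the first chunk server in the list.
    return replica_ids[0]

def pick_replica_greedy(replica_ids, pending_requests):
    # Greedy load balancing: send the request to the replica disk that
    # currently has the fewest outstanding requests.
    return min(replica_ids, key=lambda disk_id: pending_requests[disk_id])
```

Under the chunk layout above, the first-replica policy steers every request to at most two distinct disks, which is why its data rate saturates around 80 Mbps, while the greedy policy can keep all five disks busy.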
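A minimal sketch of the replica-count rule from Experiment 2. The factor-times-requests formula and the cap at the number of disks are from the slide; flooring the result at the 3-replica baseline is an assumption of this sketch.

```python
def target_replicas(request_count, replication_factor, num_disks, baseline=3):
    """Number of replicas for a chunk = replication_factor * number of requests
    for the chunk, capped at the number of disks (and, as an assumption here,
    never below the 3-replica baseline)."""
    desired = int(replication_factor * request_count)
    return max(baseline, min(desired, num_disks))
```

For example, with a replication factor of 0.01 and 600 requests for a hot chunk, the desired count of 6 is capped at 5, so the chunk ends up replicated on every disk.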
Experiment 3
• Disk requests served with no rebalancing
  – In this case we do not implement any rebalancing of read requests based on the frequency of chunk requests.
• Disk requests served with rebalancing
  – In this case we rebalance read requests by picking the request with the highest frequency and transferring it to a disk with a lower load (a sketch of this policy follows the conclusion).

Graph 3
[Request distribution graph: number of requests served by each disk (Disk 0 through Disk 4) under three configurations: no rebalancing and no replication, no rebalancing with replication, and rebalancing with no replication.]

Conclusion and Future Work
• GFS is a simple file system for large, data-intensive applications
• We studied the behavior of certain read workloads on this file system
• In the future we would like to come up with optimizations that could fine-tune GFS
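A minimal sketch of the rebalancing policy described in Experiment 3, assuming each disk keeps a queue of pending read requests and per-chunk request frequencies are tracked. The names disk_queues and request_frequency, and the omitted replica-availability check, are simplifications of this sketch.

```python
def rebalance_once(disk_queues, request_frequency):
    """Move one pending read from the most loaded disk to the least loaded one.
    disk_queues: dict mapping disk ID -> list of chunk IDs waiting to be read.
    request_frequency: dict mapping chunk ID -> how often it has been requested."""
    busiest = max(disk_queues, key=lambda d: len(disk_queues[d]))
    idlest = min(disk_queues, key=lambda d: len(disk_queues[d]))
    if len(disk_queues[busiest]) - len(disk_queues[idlest]) <= 1:
        return  # queues are already (nearly) balanced
    # Pick the pending request for the most frequently requested chunk...
    hottest = max(disk_queues[busiest], key=lambda cid: request_frequency[cid])
    # ...and transfer it to the disk with the lesser load.  A full version would
    # also check that the target disk holds a replica of the chunk.
    disk_queues[busiest].remove(hottest)
    disk_queues[idlest].append(hottest)
```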