8803 AIAD Project Report Chetna Kaur (GT ID: 902428550)

advertisement
8803 AIAD Project Report
Chetna Kaur (GT ID: 902428550)
Contents:
1.
Abstract
2.
Introduction
1.
Objectives
2.
Motivation
3.
Our approach
3.
Design Principles and Challenges
4.
Architecture
5.
System Interface & Functionality
6.
Detailed System Design
7.
Evaluation
8.
Strengths & Limitations
9.
Main Contributions
10. Future Work
Acknowledgements
I would like to thank Dr. Ling Liu for her guidance and for several
useful suggestions that have helped shape this work.
I also take this opportunity to thank Mahesh S.B., my ex-System
Architect at Huawei Technologies, for helping me find answers to the
various challenges that arose during the course of this project.
Introduction
1.1.
Objective
With the popularity of mobile internet devices growing exponentially every year,
mobile computing is emerging as a new platform in the evolution of internet
computing. A key area that needs to evolve in this paradigm shift is the
development of efficient storage services that meets the demands of a modern
mobile internet device user.
While many corporations have focused on providing explicit storage space to the
traditional end user PC applications, most of them do not provide means to
extend this available storage to the new generation mobile internet applications.
Therefore, what is required is a storage service available on the internet that can
be used to deliver content to the end user mobile internet device keeping in mind
the limitations that are inherent to the mobile internet device platform.
The service thus provided could then be used in such a way so as to extend the
storage capability of internet enabled devices and will pave the way for new
generations of applications to be written that will provide rich content to all
Internet user regardless of their memory limitations and geographical location.
1.2.
Approach
In this work we propose an approach that will let the user access files from the
remote store without having to download the file locally in its entirety.
We
propose the use of a block level filing system to meet these requirements. A
block level filing system, serves to maintain a logical view of each of the user’s
files in the form of several smaller sized Blocks. The use of this approach allows
for easy download/upload of blocks of files, as needed by the user, consuming
both less bandwidth and memory. Further, we propose architecture for a
distributed storage platform, the detailed design of the block level file system,
and the API’s that need to be provided by such a platform.
Requirements and Challenges
Efficiency: Currently, data rate dictates the cost of internet access from a mobile
device. Though this trend will change in the future, efficient techniques for
accessing the storage must be designed. An example would be to provide APIs
that will let the user access only part of the file from the remote store without
having to download the file locally in its entirety.
Client Memory and Bandwidth Limitations: The main requirement was to be able
to cater to the clients with resource constraints. The storage service APIs had to
be designed bearing in mind this limitation while providing seamless access to
storage that will enhance the user experience of operating in a mobile
environment.
Scalability: The storage service must be scalable to meet the requirements of
ever increasing mobile internet users. The systems that provide the actual
storage could be servers residing in different physical locations. The storage
service must be capable of running on this distributed system of servers.
Simple API interface: The APIs should follow a simple model of JSON over HTTP
protocol.
System Architecture
Meta File System Server
WWW
Distributed Storage Server
Access Server
Fig: System Architecture
Detailed System Design
1.1.
Block Level File System
The Meta File System and Access Server together form the Block Level Filing
System.
A block level file system serves two purposes. First, it represents data to end
users and applications. This data is organized in directories or folders typically in
some hierarchical fashion. The second thing it does is organize where data is
placed in storage. These filing systems have to scatter the data around the
storage servers to make sure that all data can be accessed with reasonable
performance. They do this by directing the storage block addresses where the
data is going to be placed. These are actually all logical block addresses as the
disk drives keep their own internal block translation tables.
So the filing system sends commands to "slave" storage to write data to certain
blocks and retrieve it from certain blocks. This is what is commonly called blocklevel storage. Storing functions are based on master/slave relationships, not
client server.
It is also possible for systems to request data using the user-level data
representation interfaces (File level storage). This is done by the client using the
data's filename, its directory location, URL, or whatever. This is a client/server
model of communicating. The access server in this case receives the filing
request and then looks up the data storage locations where the data is stored
and retrieves it using storing level functions (block level storage).
1.2.
Meta File System
Meta File System abstracts the User created directories and the files. This is the
central data store that maintains information such as user currently connected,
files are being used and the blocks are being written, Storage Servers on which
file are being created. Meta File System runs as a separate component on a
dedicated server.
Following Diagram show the basic design of Meta File System,
1.3.
Access Server
Access Sever acts as user gateway to internet file system. This is responsible for
controlling the flow of all user action and the file system function like file open,
close, read, write, create file/directory etc. Access Server makes use of Meta File
System which mimics the file system. When a User file needs to be created, it
gets Meta File System for the Storage Sever information on which it has to be
created and requests Storage Server to create the file. The Access Server also
contains a file cache, and carries out caching of blocks that a client requests or
may request in the near future.
Fig: Access Server Design
1.4.
Storage Server
The Storage Server is the data store where all the files are stored. It is essentially
has a flat hierarchy. And all the meta-data for each file is maintain by the Meta
File System. A single file, may be split up and stored in several different actual
files on the storage server.
1.5.
System Interface
The system interface essentially consists of a set of APIs to provide access to
the functions provided by the storage server. Some of the interfaces provided by
this platform have been listed below.
1. Simple API for user login (password based authentication).
2. APIs to create, and delete files and directories.
3. APIs to read and write to files at a block level granularity.
4. API to read a directory, and display it contents.
1.6.
Web Client
This is essentially a reference implementation using the API set will be
developed on an internet platform. It has primarily been developed to
demonstrate the capabilities of this storage solution.
Screenshots from the Web Client:
Fig: User chetna’s Root directory.
Fig:
File Editor to edit files. The links, previous and next are to get the previous & next
blocks of data if more data is available on the server.
Important Design Decisions
This section discusses the important design decisions made during the course of
this project to meet the design challenges outlined in Section x.x.
1. Scalability – This was one of the most interesting requirements to design for. I
took several decisions based on an attempt to make my system more
scalable. One of these was to separate the Meta File System from the Access
server and to allow support for several access servers to be able to access
the meta file system. While this is not something that I have tested, my design
was trying to ensure that this was a requirement that was easy to add.
2. What Meta-Data do I capture? The Data Structures selected were guided by
the following considerations. The Meta file system should support
a. Fast Indexing
b. User Based Views (of directory structure)
c. Ease of building a file sharing application (or other simple applications)
with minimal effort.
3. Efficiency – To ensure that the file upload and download time is reasonable,
a. Introduced a File caching mechanism at the Access Server.
b. Designed Efficient Algorithms for creation of new files and allocation of
new data blocks.
System Interface – Designing the interfaces required for block level
4.
access by a client as well as for regular file operations such as create,
delete etc.
Technology – What technology to use was an important question that
5.
I had to spend quite some time figuring out. After considering several
alternatives, I choose to use JSON over HTTP for communication between
the various sub-systems. This is because it is one of the most light weight of
data encoding formats.
Evaluation
1.7.
Evaluation and Testing Method
1. Functional test: This covers the basic functionality of the product as a
whole. The tester was able to provision storage space on the server,
create, store and retrieve data files onto the web client application.
2. Performance test: Response time have been measured for file read
and write access. These have been compared against the file access
times of another standard internet file storage application, the Gmail
drive, to show the distinct speed and memory gains achieved with the
block level approach as opposed to the traditional file level approach.
Main Contributions
1. Proposed a Scalable, Highly-Available Architecture for a Distributed
Internet File System.
2. Design and implement an efficient Meta-File structure for a block level file
system implementation.
3. Implemented a simple file caching and pre-fetching mechanism, to
improve efficiency of block level file access.
4. Design and implement APIs for accessing files block by block for client
with resource constraints.
5. Implement a proof of concept system, and built a basic web based file
storage application on this platform to demonstrate its capabilities.
Future Work
1. Persistence of Meta File System structure
2. Improved File Caching Mechanism in the Access Server
3. Improve Block Allocation Algorithm in MFS.
4. Move towards a Google Docs type of implementation (Multiple users can
write different blocks of data in the same file simultaneously.)
Bibliography
1. Introduction
to
the
Open
XDrive
API
(http://dev.aol.com/article/2007/xdrive_api)
2. The Storage Delivery Network (http://nirvanix.com/platform.aspx)
3. IFS – an Internet File System implementation based on Web services and
peer-to-peer
technology
by
Stoyan
(http://www.codeproject.com/KB/webservices/ifs.aspx)
4. OpenDHT: http://www.opendht.org/
Damov
Download