Department of Computer Science and Engineering
University of Notre Dame
Notre Dame, IN 46556, USA
Email: crobins9@nd.edu, ykim15@nd.edu
I. INTRODUCTION
Cloud and distributed storage may be appealing for small enterprises due to the scalable nature of the cloud. One particularly interesting aspect of the move to the cloud is storage. Is it possible to provide a shared storage system without the expense of running a server locally? Some of the major issues with the cloud are reliability and network speed.
In this project we will investigate the feasibility of cloud storage for Goliath National Bank [1]. GNB employs 500 people who require only 1 GB of space each, since they are only there to fill up space, and 50 people who require 50 GB each to wrangle their legal documents. We explore three types of solutions: a local storage server, pre-existing cloud solutions, and a custom solution.
II. LOCAL SERVER
The traditional, and most obvious, solution is to put a server in the building. This would remove the concern of network link speed and allow for stricter security. The specifications most relevant to a local server are total capacity and price.
Our system requires a minimum capacity of 3 TB at minimal price. One other item of concern is the party responsible for maintenance.
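The 3 TB minimum follows directly from the staffing figures in the introduction. A minimal sketch of that arithmetic, where the constant and function names are illustrative and decimal terabytes (1 TB = 1000 GB) are assumed:

```python
# Staff counts and per-user quotas as stated in the introduction.
GENERAL_STAFF = 500
GENERAL_QUOTA_GB = 1
LEGAL_STAFF = 50
LEGAL_QUOTA_GB = 50


def required_capacity_gb() -> int:
    """Total space needed if every user fills their quota."""
    return GENERAL_STAFF * GENERAL_QUOTA_GB + LEGAL_STAFF * LEGAL_QUOTA_GB


total_gb = required_capacity_gb()
# Assumes decimal units (1 TB = 1000 GB) for the conversion.
print(f"{total_gb} GB, about {total_gb / 1000:.0f} TB")
```

Note that the 50 legal users account for 2500 GB of the total, so their quota dominates the sizing.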
There are two approaches for obtaining a server: buy or build. Buying provides peace of mind for the administrator, since the vendor is responsible for hardware failures. Building the server can save money but places the responsibility for maintenance on the system administrator. A 4 TB RAID 5² server with three-year on-site service can be purchased from a vendor³, while similar components cost $1,400 in parts. The components all have three-year warranties, so the only additional cost is paying the system administrator to build the machine and replace broken parts.
¹ http://netscale.cse.nd.edu/twiki/bin/view/Edu/GradOSF12MiniProjects
² No explicitly JBOD servers available
³ Prices updated as of 13 December 2012
One component that should be added to the local server is backup. This will increase reliability and provide some level of fault tolerance for user errors. There are many different methods for backing up data. We could buy two servers and run a time-based job to copy the changed files. We could also upload the data to a cloud storage system like Amazon’s S3. A third solution is to purchase external hard drives to store copies of the data. Solution number three is the cheapest, and arguably the simplest, choice. Since we have a maximum of 3 TB of data we can use single hard drives for backup. A 3 TB drive costs $150 and our backup strategy requires two drives. We will also need a method for attaching the drives to the machine. The simplest solution is to use a USB 3.0 hard drive dock, which can be purchased for $50. We recommend that the drives be rotated for maximum data protection; one should be in the dock copying data and the other should be stored offsite in case of emergency. Rotating backups to external hard drives would add $350 to the total cost of a local storage server.
By adding the cost of backup, a local server will cost a maximum of $3,350. This will serve as a baseline for comparison with other solutions.
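The baseline can be reproduced from the component prices quoted in this section. The sketch below is illustrative only. The $3,000 vendor price is an assumption inferred from the stated $3,350 maximum minus the $350 backup cost, and administrator labor for the build option is not included.

```python
# Backup hardware prices quoted in the text.
DRIVE_PRICE = 150      # one 3 TB external drive
DRIVES_NEEDED = 2      # one in the dock, one stored offsite
DOCK_PRICE = 50        # USB 3.0 hard drive dock

# Server prices.
BUILD_PARTS_PRICE = 1400
# ASSUMPTION: inferred as 3350 - 350; not stated directly in the text.
VENDOR_SERVER_PRICE = 3000


def backup_cost() -> int:
    """Cost of the rotating external-drive backup scheme."""
    return DRIVES_NEEDED * DRIVE_PRICE + DOCK_PRICE


def local_total(server_price: int) -> int:
    """Server price plus backup hardware, excluding labor."""
    return server_price + backup_cost()


print("backup:", backup_cost())
print("build :", local_total(BUILD_PARTS_PRICE))
print("buy   :", local_total(VENDOR_SERVER_PRICE))
```

Under these assumptions the build option comes to $1,750 in hardware and the buy option to the $3,350 maximum used as the baseline.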
III. PRE-EXISTING SOLUTIONS
Several cloud storage services are currently available that improve the reliability of remote storage. Dropbox, Google Drive, and SkyDrive are the most popular. They work across various platforms and provide online storage that keeps users’ files safe and remotely accessible from various mobile devices. These services continually synchronize the files in a folder, allowing for off-site backup and remote access. Dropbox, SkyDrive, and Google Drive are superficially similar. Each offers a native app for Mac and Windows as well as a free web portal that is integrated with the provider’s representation of the stored files. Users are provided several gigabytes of free storage and can selectively synchronize that storage with the hard drive on their PC, managing it over the Internet.
Users can synchronize various types of files, such as images and documents, and folders with other PCs and Macs, and can share them with other people publicly or privately. However, there are functional differences between the three services.
Dropbox offers less free storage (2 GB) than the other services and charges higher prices for advanced options. In the Dropbox app, users can navigate through folders and mark items as favorites, which stores a file locally for quick viewing later, even when offline. Users can also set Dropbox to cache a certain amount of storage locally for quicker access to regularly viewed files. Users can view many common file types on an iPhone or iPad, such as Word, Excel, PDF, and many image formats. Dropbox can throttle its file transfer speed so that other running programs do not lose bandwidth [3]. The clients support a feature called LAN Sync, which synchronizes data between two systems on a local area network. While all systems still need Internet access, LAN Sync speeds up synchronization between computers significantly by transferring files locally instead of remotely. Users can share files and folders online, create public shared folders, and track document versions. Dropbox users can share links with non-Dropbox users, who can view the shared files without logging in. Dropbox has great apps, but it does not have online editors like Microsoft or Google. Dropbox can view common formats but requires third-party apps for editing. Dropbox can provide 500 users with 1 GB each free of charge and 50 users with 50 GB each for $9.99/month per user, or a total of around $15,000 over three years.
Google Drive is a combination of Google Docs and a web-based hard drive that might replace the need for desktop sync clients. Other than collections being replaced by folders and a new My Drive link for browsing files, everything looks the same as Google Docs. For existing Google Docs users, Google Drive is a convenient way to add backup and sync features to a service they are already using. Files created and stored in Google Docs can also be synced to Drive folders, although users can elect not to sync selected subfolders. Google Drive is ideal for backup but has limited sharing outside of Google Docs. Google Drive offers 5 GB of free storage and 25 GB of storage for any paid account. With its online editing, people can make changes and add comments together on a shared file. One thing Google Drive lacks is an application for iPhone or iPad. Google has the best desktop clients, but the lack of an iOS app damages the overall workflow model. Google Drive can provide 500 users with 1 GB each free of charge and 50 users with 50 GB each for $4.99/month per user, or a total of around $9,000 over three years.
SkyDrive is a Microsoft offering with the most free storage space, 7 GB, compared to Google Drive and Dropbox. It can fetch any un-synced file from outside the synced folders on connected PCs. For users who work with Microsoft Office files regularly and need 100% compatibility, SkyDrive may be the best choice due to its tight connection with the Office Web Apps. Users can edit files online via the Microsoft Office web apps or on their own computers via the traditional Microsoft Office suite. SkyDrive also allows users to connect remotely to a PC where the SkyDrive PC client is installed and fetch files that are not in the SkyDrive folder. Any changes made to a file are propagated to the same file online and across the user’s PCs. The SkyDrive page also displays a list of PCs that the user has connected to their account. Clicking on a powered-on PC gives users a view of, and access to, the files on its hard drive and in its libraries. Files can also be mass-distributed easily to users in Hotmail contact lists.
Yet SkyDrive has several things to improve. It treats automatically synced files in folders differently from files that users upload directly. SkyDrive also does not seem to perform well during peak times. The only way to share files with other users is through the SkyDrive website. SkyDrive can provide 500 users with 1 GB of storage each free of charge and 50 users with 50 GB each for $25/year per user, or a total of around $3,750 over three years.
When the reliability of Dropbox, SkyDrive, and Google Drive was directly compared, all three showed very high reliability [7]. Google Drive showed 100% uptime, while Dropbox and SkyDrive showed almost perfect reliability at 99.97% uptime. Overall, all three had 99.97% uptime or above in this survey. The small amount of downtime is acceptable. No service will have perfect availability in the long run. The important thing is that outages are dealt with swiftly and do not happen too often.
TABLE I
PAID STORAGE OPTIONS FOR ONE YEAR

Service        Free    Add 20GB    Add 50GB    Add 100GB
Dropbox        2GB     -           $99         $199
Google Drive   5GB     -           -           $60
SkyDrive       7GB     $10         $25         $50
These cloud storage services have no system administrators, since they are user-centric services. Adding a system administrator may provide more protection for the stored data. Because these services offer storage to millions of users, the number of users is not an issue. All three can handle 500 users with 1 GB of storage each without charge. For the 50 users who need 50 GB each, SkyDrive is the cheapest choice from the users’ point of view, setting file transfer speed aside, since reliability was about the same for all three services. If file transfer speed is the priority, Google Drive provides the fastest transfers at a lower cost than Dropbox.
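The approximate totals quoted for each service can be reproduced from the annual prices in Table I. The sketch below assumes a three-year horizon, which matches the quoted totals, and assumes each of the 50 heavy users buys the smallest plan in Table I that covers 50 GB. For Google Drive that is the 100 GB plan. All names are illustrative.

```python
USERS_NEEDING_50GB = 50
YEARS = 3  # ASSUMPTION: horizon consistent with the totals quoted in the text

# Annual price per user of the smallest Table I plan covering 50 GB.
ANNUAL_PRICE = {
    "Dropbox": 99,       # Add 50GB
    "Google Drive": 60,  # Add 100GB (no 50GB plan listed)
    "SkyDrive": 25,      # Add 50GB
}


def total_cost(annual_price: int,
               users: int = USERS_NEEDING_50GB,
               years: int = YEARS) -> int:
    """Total paid-plan cost; the 500 users on free 1 GB plans cost nothing."""
    return annual_price * users * years


costs = {name: total_cost(price) for name, price in ANNUAL_PRICE.items()}
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name:<13} ${cost:,}")
```

Under these assumptions only SkyDrive comes close to the $3,350 local-server baseline.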
IV. CUSTOM SOLUTION
Now that we have explored purchasing a local server and using a pre-existing solution, let’s take a look at building a customized shared file system. We will break this option into three concerns: storage, upload and retrieve, and multiple access. This system will be a combination of AFS and Ceph.
Our custom solution has three major components: a local client, cloud-based storage, and a metadata server.
The local client installed on each machine will be a FUSE application that displays the user’s files as a volume mounted on the local workstation. The list of available files will be provided by the metadata server.
The cloud-based storage will be treated as a hard drive accessible over the network. There are options for the provider which we will explore later.
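The division of labor between the components can be illustrated with a small in-memory sketch of the metadata server's file listing. This is not the authors' design. The class names, fields, and the `object_key` naming are hypothetical, and a real deployment would need persistence and a network interface for the FUSE client to query.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    path: str         # path shown in the user's mounted volume
    size_bytes: int   # size of the stored object
    object_key: str   # hypothetical key locating the data in cloud storage


@dataclass
class MetadataServer:
    # Maps a user name to the records of the files that user may see.
    files: dict[str, list[FileRecord]] = field(default_factory=dict)

    def register(self, user: str, record: FileRecord) -> None:
        """Record that a file was uploaded to cloud storage for a user."""
        self.files.setdefault(user, []).append(record)

    def list_files(self, user: str) -> list[str]:
        """Paths the local client should display for this user."""
        return sorted(r.path for r in self.files.get(user, []))


server = MetadataServer()
server.register("alice", FileRecord("/legal/merger.pdf", 1024, "obj-0001"))
server.register("alice", FileRecord("/legal/contract.doc", 2048, "obj-0002"))
print(server.list_files("alice"))
```

In this sketch the client never scans cloud storage directly. It asks the metadata server which paths exist and uses the object key to fetch the data.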
It is possible to have a basic version of this system working with only the local client and cloud storage. However, we wish to provide more advanced features. To enable access control, per-user quota enforcement, and file sharing, we need to introduce a metadata server. See Figure 1 for a graphical representation of the system.
Fig. 1. Overview of custom solution
This framework provides the basis for implementing our custom storage solution. The exact implementation details can be resolved if the client decides this is the best option.
Now that we have discussed the basics of our solution, let’s investigate pricing.
For the cloud-based storage there are two main options: Amazon’s S3 and Google’s Cloud Storage. Both services offer similar reliability and features, while Cloud Storage is slightly cheaper. With Cloud Storage the first terabyte of data costs $0.085 per GB and the next two cost $0.076 per GB, for a monthly total of 1,024 × $0.085 + 2,048 × $0.076 = $242.69. Multiplied over 3 years (36 months), the total cost for storage is $8,736.77.
V. CONCLUSION
Each solution has unique benefits. With a custom solution it is possible to tailor the programs and services involved to the exact needs of GNB. In the pre-existing solutions the system administrator needs to provide minimal oversight, sharing files is trivial, and off-site access is included. The local solution allows for future expansion, maximal control over backups, and removes any questions about network speed.
While the possibilities of the custom solution are appealing, it has many negative traits. It is the most expensive, most time-consuming, and potentially most bug-ridden of the solutions. Even if we were to ignore the development costs, this is still the most expensive solution. The customized solution is only effective for extremely large and technologically focused companies, e.g. Amazon and Google⁴. We therefore do not recommend the customized solution for GNB.
After eliminating the custom solution we are left with two possibilities: local or pre-existing. The pre-existing solutions are more expensive and do not provide a method for verifying [...]⁵. That leaves a local solution.
⁴ The same companies that provide cloud storage
⁵ Or any other regulatory entity
REFERENCES
[1] (2012, Dec.) Goliath National Bank - Your Financial Wingman. [Online]. Available: http://goliathbank.com
[2] “The Dell Online Store,” Dec. 2012. [Online]. Available: http://configure.us.dell.com/dellstore/config.aspx?c=us&cs=04&l=en&model_id=powervault-nx200&oc=brctzy2&s=bsd&fb=1&vw=classic
[3] [Online]. Available: http://lifehacker.com/5904731/desktop-file-syncing-faceoff-dropbox-vs-google-drive
[4] [Online]. Available: https://www.dropbox.com/pricing
[5] [Online]. Available: https://support.google.com/drive/bin/answer.py?hl=en&answer=2375123
[6] [Online]. Available: http://windows.microsoft.com/en-US/skydrive/compare
[7] [Online]. Available: http://royal.pingdom.com/2012/06/21/cloud-storage-shoot-out-google-drive-vs-dropbox-vs-skydrive-vs-box-com