ronbun3

advertisement
平成 19 年度
筑波大学第三学群情報学類
卒業研究論文
An Internet File System for Random Accessing Protected Data
主専攻
計算機システム
著者名
AHMAD SYAHIR BIN CHE ABDULLAH
指導教員
板野肯三、新城 靖、佐藤聡、中井央
1i
Abstract
As the Internet become ubiquitous in1 these days, 2video collaboration over the Internet
becomes very common. For example, in the sports coaching 4area, video data is shared among
scattered organizations. Most of 5this video data are confidential and need to be protected.
Typically, to distribute protected video data, Digital Rights Management (DRM) is often used.
Representative DRM products include Windows Media DRM, Helix DRM, FairPlay DRM, and
DReaM.
Existing DRM mechanisms for video have a problem in the random access capability.
Random access is really important for video seeking. This problem is 7inherent for these
existing DRM mechanisms because they are optimized for major users who watch video
sequentially.
In this paper, we 8propose a new data protection mechanism to access protected data through
the Internet which allows random access usage.
3
i
Table of Contents
Chapter 1
Introduction............................................................................................................... 1
Chapter 2
Related Works ........................................................................................................... 3
2.1
Digital Right Management ............................................................................................ 3
2.1.1
Windows Media DRM ........................................................................................... 3
2.1.2
Apple’s FairPlay ..................................................................................................... 4
2.3
DirectShow API .............................................................................................................. 4
2.4
Shell Namespace Extension........................................................................................... 5
Chapter 3
Overview of our protected data distribution ............................................................ 7
Chapter 4
VFS Module ............................................................................................................... 9
4.1
Dokan Library ................................................................................................................ 9
4.1.1
4.2
Dokan’s component ............................................................................................ 10
User-level Module for Dokan ...................................................................................... 11
4.2.1
Global variables ................................................................................................... 11
4.2.1
Items class ........................................................................................................... 12
4.2.2
CreateFile() function............................................................................................ 12
4.2.3
OpenDirectory() function .................................................................................... 13
4.2.4
GetFileInformation() function ............................................................................. 13
4.2.5
ReadFile() function .............................................................................................. 15
4.3
Secure Channels .......................................................................................................... 16
4.3.1
Named Pipe ......................................................................................................... 16
4.3.2
Filename Extension ............................................................................................. 17
4.4
Performance Issues ..................................................................................................... 18
4.4.1
Pooled Connections............................................................................................. 18
4.4.2
Cache and Prefetch ............................................................................................. 19
Chapter 5
Server-side Programs .............................................................................................. 20
5.1
Persistent Connections................................................................................................ 20
5.2
Byte serving ................................................................................................................. 20
Chapter 6
Encryption and Decryption...................................................................................... 21
6.1
Advanced Encryption Standard ................................................................................... 21
6.2
Counter mode ............................................................................................................. 21
ii
6.1.1
Padding ................................................................................................................ 22
6.3
Token ........................................................................................................................... 22
6.4
Encryption software .................................................................................................... 23
6.3.1
6.5
Chapter 7
Multithreads ........................................................................................................ 24
Decryption ................................................................................................................... 25
Experiments............................................................................................................. 27
7.1
Experiment environment ............................................................................................ 27
7.2
Encryption Rate ........................................................................................................... 27
7.3
Transfer rate ................................................................................................................ 28
Chapter 9
Integration with Movie Database System of Japan Institute of Sports Sciences 31
9.1
Current data distribution mechanism in JISS movie database system ....................... 31
9.2
New data distribution mechanism propose to JISS movie database system .............. 31
9.3
The advantage compare to current mechanism ......................................................... 32
Chapter 10 Conclusions ............................................................................................................... 34
Acknowledgements ..................................................................................................................... 35
References................................................................................................................................... 36
iii
Table of Figures
Figure 1: Data flow and token distribution ........................................................................... 7
Figure 2: Dokan library working in Windows kernel ........................................................... 10
Figure 3: Global variables .................................................................................................... 11
Figure 4: Items class source code ........................................................................................ 12
Figure 5: CreateFile() function code .................................................................................... 13
Figure 6: GetFileInformation() code .................................................................................... 14
Figure 7: ReadFile() function code ...................................................................................... 15
Figure 8: Partial of GetNamedPipe() code .......................................................................... 16
Figure 9: CreateFile() function code when using filename extension................................. 17
Figure 10: Counter mode encryption .................................................................................. 22
Figure 11: Portion of Encrypt() function ............................................................................. 23
Figure 12: Graph showing encryption rate using different thread count ........................... 28
Figure 13: Graph of transfer rate using different methods ................................................ 29
Figure 14: Graph of CPU usage of different method .......................................................... 29
Figure 15: Current data distribution mechanism in movie database system of JISS .......... 32
Figure 16: New data distribution mechanism proposed to movie database system of JISS
..................................................................................................................................... 33
iv
Table of Tables
Table 1: Multithreads calculation........................................................................................ 24
v
Chapter 1 Introduction
Internet is ubiquitous on these days and has been essential for collaborating people. Therefore,
sharing data on the Internet 9is really common for easy access. For example, in the sports
coaching 10area, video data is shared with annotations among scattered organizations [1]. Most
of these video data are confidential 11which need to be protected. Typically, to distribute
12
protected video data, Digital Rights Management (DRM) is often used. Representative DRM
products include Windows Media DRM, Helix DRM, FairPlay DRM, and DReaM.
Existing DRM mechanisms for 13video have a problem in the random access capability.
Random access is really important for video seeking. This problem 14is inherent for these
existing DRM mechanisms because they are optimized for major users who watch video
sequentially. In concrete, these existing DRM mechanisms use a large buffer and 15often
download entire video data in advance. This feature is nice to mitigate jitters, high latency, and
slow throughput16. However, this feature is not suitable for applications that 17need random
access.
In this paper, we propose a new data protection mechanism to access protected data through the
Internet. 18In this mechanism, instead creating a whole new player or video format, we 19reuse
existing media players for the Windows operating system (OS) without 20large modifications to
applications and no modification the OS. Instead, we 21extend the OS in the Virtual File System
(VFS) layer. To simplify the implementation of a VFS module, we 22use a framework for userlevel file systems called Dokan [2]. Upon 23execution VFS module 24mount a web server as a
new logical drive. The logical drive acts as a pipe to access the protected file on the web server.
Some media players 25not support opening file on remote server. Even 26it supported, there are
no random access capability for the remote file. From the 27side of the application which
28
accessing the files on the logical drive, it only sees the files as local files and this enable
almost any application to open it with random access capability.
However, just to give random access capability is not the whole goal for this research. We also
want to provide access control to the accessible file in case the file is protected or classified. In
other words, we don’t want any malicious access 29the file on the mounted logical drive. This
can be done by 30hide the logical drive or 31reports there is no file in the drive except to the
legitimate application. To make it security comparable with other DRMs, the file placed on the
web server is 32decrypted. This will prevent malicious users to download the files on the web
server even he or she knows the file URL. 33Even malicious users can download it, he/she
cannot open the file without 34decrypt it first. The entire 35necessary job: accessing file from
web server, decrypt it and pass it to the legitimate application 36done seamlessly by the VFS
module that we will introduce in this paper.
To provide encryption while maintain the random access property, 37special mode called
Counter Mode with Advanced Encryption Standard (AES) method 38used in this research. 39It
not only fast to do the encryption process but also provides 41unbreakable result. The VFS
11
module obtains 41legit key from the legitimate application using 42secure channel and decrypts
the file, block by block.
Our protected data distribution mechanism 43also has the scalability. In case the protected data is
44
media file, 45it no longer limited to any proprietary file format which depend on proprietary
software of 46media player to play the media file. For example, when using Microsoft DRM,
47
user must use 48file with format WMV or WMA and can only be played on Windows Media
Player. Other media 49player also supported only if the developer got 50proper authorization
from the Microsoft. 51It goes same to Apple FairPlay which 52is need media player make use of
QuickTime. In other 53word most of this DRM mechanisms is closed source, limited and not
customizable.
The rest of the thesis is organized as follows. Chapter 2 54discuss about related works including
Digital Right Management, Microsoft DirectShow API, Shell Namespase Extension and File
system in User Space (FUSE). Chapter 3 discusses the overview of our protected data
distribution mechanism more detail. Chapter 4 elaborates the VFS module of our mechanism
and performance issues. Chapter 5 55is talking about server side software and their requirements.
Chapter 6 elaborates more about encryption and decryption used in this mechanism. Chapter 7
examines the system functionality and performance. In chapter 8 we propose the integration of
this mechanism into SmartSystem of Japan Institute of Sports Sciences. Chapter 9 concludes
this thesis and 56discusses future work of this thesis.
2
Chapter 2 Related Works
2.1
Digital Right Management
Digital Right Management (DRM) 57is always refers to access control technologies used by
publishers and copyright holders to limit usage of digital media or devices. It may also refer to
restrictions associated with specific instances of digital works or devices. To some extent, DRM
overlaps with copy protection, but DRM is usually applies to creative media (music, films, etc.)
whereas copy protection typically refers to software. In DRM, data files (usually audio and
video) are encrypted or just wrapped in encrypted containers. The authentic player will receive
the key to decrypt the data.
However there are several problems using this kind of DRM. Most of 59this DRM need
proprietary file format and special software or API to be used. For example Microsoft DRM
also known as Windows Media DRM need the user to use their property codec like WMV or
WMA and this protected media can only be played on 60their property media player. Other
media player may support the DRM if the developer got the authorization from Microsoft.
61
This also the same for Apple’s FairPlay, which 62limited the file format to MP4 and AAC
codec, and only allow to be played using Quicktime. 63This limited to the audio and video file,
and not to other format such as document file.
64
There also problem with the encryption. Most of these DRM mechanisms are targeting the
download data. Some of them also support direct streaming from the server, but usually use
large buffer to 66promise smooth playback. Each time 67user seeking the video, the media player
buffers the data before start playing. Also random access 68only supported for limited file format.
This is 69become disadvantage as user needs to convert the file to the specific file format. Some
of 70DRMs also only has 71one key or a limited set of keys for all file, which is really unsecure.
Typical DRM 72also built on purpose to restrict user playing the media 73not to protect it. For
example it prohibits user from playing the media on 74the unregistered machine, or limits play
count. Most well-known DRMs are Microsoft’s Windows Media DRM and Apple’s FairPlay:
65
2.1.1
Windows Media DRM
Windows Media DRM is a Digital Rights Management service for the Windows Media
platform. It is designed to provide secure delivery of audio and/or video content over an IP
network to a PC or other playback device in such a way that the distributor can control how that
content is used. It using a combination of elliptic curve cryptography key exchange, the DES
3
block cipher, a custom block cipher dubbed MultiSwap, the RC4 stream cipher, and the SHA-1
hashing function.
Windows Media DRM is designed to be renewable, that is, it is designed on the assumption that
it will be cracked and must be constantly updated by Microsoft. The result is that while the
scheme has been cracked several times, it has usually not remained cracked for long.
2.1.2
Apple’s FairPlay
FairPlay is a DRM technology created by Apple Inc., based on technology created by the
company Veridisc. FairPlay is built into the QuickTime multimedia software and used by the
iPhone, iPod, iTunes, and iTunes Store. Any protected song purchased from the iTunes Store
with iTunes is encoded with FairPlay. FairPlay digitally encrypts AAC audio files and prevents
users from playing these files on unauthorized computers.
FairPlay protected files are just regular MP4 container files with an encrypted AAC audio
stream. The audio stream is encrypted using the AES algorithm in combination with MD5
hashes. The master key required to decrypt the encrypted audio stream is also stored in
encrypted form in the MP4 container file. The key required to decrypt the master key is called
the "user key."
Each time a customer uses iTunes to buy a track a new random user key is generated and used
to encrypt the master key. The random user key is stored, together with the account information,
on Apple’s servers, and also sent to iTunes. iTunes stores these keys in its own encrypted key
repository. Using this key repository, iTunes is able to retrieve the user key required to decrypt
the master key. Using the master key, iTunes is able to decrypt the AAC audio stream and play
it.
When a user authorizes a new computer, iTunes sends a unique machine identifier to Apple’s
servers. In return it receives all the user keys that are stored with the account information. This
ensures that Apple is able to limit the number of computers that are authorized and makes sure
that each authorized computer has all the user keys that are needed to play the tracks that it
bought.
2.3
DirectShow API
DirectShow is a multimedia framework and API provided by Microsoft for software developers
to perform various operations with media files [3]. Based on the Microsoft Windows
Component Object Model (COM) framework, DirectShow provides a common interface for
4
media across many of Microsoft's programming languages, and is an extensible filter-based
framework that can render media files on demand by applications.
DirectShow divides the processing of multimedia tasks such as video playback into a set of
steps known as filters. Each filter represents a stage in the processing of the data. Filters have a
number of input and output pins which connect them together. The generic design of the
connection mechanism means that filters can be connected in many different ways for different
tasks to build a filter graph, and developers can add custom effects or other filters at any stage in
the graph then render the file, URL or camera.
Most video-related applications on Windows, not only Microsoft's Windows Media Player but
also most third-party applications use DirectShow to manage multimedia data. However,
DirectShow has a problem with a random access capability. Applications that use DirectShow
API can perform random access for local files but cannot when opening files over HTTP. In
other words, the video seeking does not work when applications open files over HTTP.
2.4
Shell Namespace Extension
In Windows environment, 75there several ways to implement a new file system. The easiest way
is through user land Shell namespace extensions [4]. With a namespace extension, software
developer can take any data and have Windows Explorer present it to the user as a virtual folder.
When a user browses into this folder, the data is presented as a tree-structured hierarchy of
folders and files, much like the rest of the Shell namespace. Users and applications are able to
interact with the contents of this virtual folder in much the same way as with any other
namespace object.
Behind the scenes, every folder that Windows Explorer displays is represented by a Component
Object Model (COM) object called a folder object. Each time the user interacts with a folder or
its contents, the Shell communicates with the associated folder object through one of a number
of standard interfaces. The folder object then does whatever is necessary to respond to the user's
action, and the Shell updates the Windows Explorer display. The majority of the files and
folders that users interact with are part of the file system or a system virtual folder such as the
Recycle Bin.
To implement a namespace extension, the information must be organized as a tree-structured
namespace. The namespace root is presented as a virtual folder in the Shell namespace. The root
folder, and all its subfolders and data items, becomes part of the Shell namespace, and Windows
Explorer becomes the user interface. Developer can thus present their information to the user in
a familiar and readily accessible way with much less UI programming than would be required
for a custom application.
5
The availability of shell namespace extension file system toolkits [5], lighten the process of
implementation file system using namespace extension. The most notable file system using
namespace extension is GMail Drive [6], which is a Namespace Extension that creates a virtual
file system around Google Mail account, allowing user to use Gmail as a storage medium.
However, despite of this easiness, file system that implemented using this method does not
support the lowest-level file system access API in Windows, including DirectShow API. The
file system cannot be mapped as a drive letter. The file system also inaccessible through
command line tools. It can be accessible using Windows Explorer. So not all applications are
able to access file systems that are implemented as namespace extensions.
2.5
File system in User Space
File system in User space (FUSE) [] is a free Unix kernel module, released under the GPL and
the LGPL, that allows non-privileged users to create their own file systems without editing the
kernel code. This is achieved by running the file system code in user space, while the FUSE
module only provides a "bridge" to the actual kernel interfaces. FUSE was officially merged
into the mainstream Linux kernel tree in kernel version 2.6.14.
FUSE is particularly useful for writing virtual file systems. Unlike traditional file systems which
essentially save data to and retrieve data from disk, virtual file systems do not actually store data
themselves. They act as a view or translation of an existing file system or storage device. In
principle, any resource available to FUSE implementation can be exported as a file system.
FUSE is available for Linux, FreeBSD, NetBSD (as PUFFS), OpenSolaris and Mac OS X.
76
Dokan library used in this mechanism is similar to FUSE, instead it running in Windows
environment.
6
Chapter 3 Overview of our protected data distribution
Authentication server
User PC
Windows
kernel
Kernel
level
VFS
module
Filename
Token
Application
Internet
Secure
channel
Web server
Admin PC
Encryption
software
User-level VFS
module
Encrypted files
Encrypted
file
Original
file
Figure 1: Data flow and token distribution
In this mechanism, we make use of existing application or media player 77with small
modification and no modification to OS. Instead, we extend the OS using virtual file system
(VFS) layer. We 78try to overcome the most problem exist in existing DRM mechanisms by
providing the protection on the file system level instead. We also try to make workaround for
random access problem when accessing remote file that exist in DirectShow API by not using
the URL accessibility API in DirectShow, instead 79make the remote file as 80local file virtually.
All the process to access the remote file is done by the VFS module. VFS module introduced in
this paper also 81not using the namespace extension approach. In its place, we use Installable
File System (IFS), a file system API in Microsoft Windows that enables the OS to recognize
and load drivers for file systems.
Our mechanism consists of several components: encryption software, authentication server, web
server and virtual file system module as shown in Figure 1. When the administrator wants to
upload a file to the web server, he/she need to use the encryption software. The encryption
software generates random key and nonce and encrypts the given file. After finish uploads, the
software registers the filename, original file, file’s key, and file’s nonce onto the authentication
server. All this information packs as a token stored in authentication server. We will elaborate
details of token in Chapter 6
Encryption and Decryption. When a user runs the legitimate
application, the application logs into the authentication server to receive the filename/token pair.
The applications also trigger the execution of VFS module, which is mounts a virtual logical
drive. Upon opening the file, the application passes the token to the VFS module. The VFS
module accesses the data from the web server and decrypts it using the received token from the
application through a secure channel before sends it back to the application using file system
API.
7
This paper focuses on the following programs: The VFS module, encryption software and
applications. We will not discuss about authentication server as we assume that we can use
some token distribution mechanism such as Smart System [1].
8
Chapter 4 VFS Module
The virtual file system (VFS) layer is an abstraction layer on top of more concrete file systems.
The purpose of VFS is to allow applications to access different types of concrete file systems in
a uniform way. VFS specifies an interface (or a "contract") between the kernel and a concrete
file system. Therefore, it enables to add new file system types to the kernel by fulfilling the
contract. For this mechanism, instead of using a limited shell namespace extension, we use a
more reliable approach, Installable File System (IFS). IFS is a file system API in Microsoft
Windows that enables the OS to recognize and load kernel module for file systems. IFS
implementation in Windows is really 83hard work because it involved kernel programming. To
simplify this we make use of Dokan library. IFS used in this research 84has a difference compare
to normal IFS as it separated into 2 components: a kernel level module and a user-level module.
4.1
Dokan Library
In this research, we use Dokan library [2] for simplifying kernel level programming. Dokan
library contains a user mode dynamic library link (DLL), dokan.dll and a kernel mode file
system driver, dokan.sys. Once Dokan file system driver is installed, user can create file systems
which is seen as normal file systems in Windows. In this paper, we refer the application that
creates file systems using Dokan library as user-level module for Dokan. File operation requests
from user programs (e.g., CreateFile, ReadFile, WriteFile, …) will be sent to the Windows I/O
subsystem (runs in kernel mode) which will subsequently forward the requests to the Dokan file
system driver (dokan.sys). By using functions provided by the Dokan user mode library
(dokan.dll), file system applications are able to register callback functions to the file system
driver. The file system driver will invoke these callback routines in order to response to the
requests it received. The results of the callback routines will be sent back to the user program.
For example, when a Windows application requests to open a directory, the OpenDirectory
request will be sent to Dokan file system driver and the driver will invoke the OpenDirectory
callback provided by the user-level module. The results of this routine are sent back to the
Windows application as the response to the OpenDirectory request. Therefore, the Dokan file
system driver acts as a proxy between user programs and file system applications. Dokan is
written in C and the user-level module can be written in C or Ruby and C# using provided
language binding support.
9
Application
User-level module
Windows kernel
Dokan file system driver
Figure 2: Dokan library working in Windows kernel
4.1.1
Dokan’s component
Dokan itself consists of several main component:







4.1.2
dokan.dll
Dokan user mode library. It provides functions to the user-level module.
dokan.sys
Dokan File System Driver. It stays in kernel-level to invoke call-back function provided
by user-level module.
mounter.exe
Dokan mounter service. It run as service to mount a virtual drive when the mount
function invoked.
dokanctl.exe
Dokan control program. User may use this program to dismount the mounted drive if
the user-level module ends unexpectedly.
dokan.lib
Dokan import library
dokan.h
Dokan library header
DokanNet.dll
Library for .NET binding. This is required to write user-level module in C#.
Callback function in Dokan library
Dokan library provide necessaries callback function to create a full features of file system,
however in this mechanism we only use several functions.
10
Function name
CreateFile
Parameters
string filename, FileAccess access, FileShare share, FileMode mode,
FileOptions options, DokanFileInfo info
string filename, DokanFileInfo info
string filename, DokanFileInfo info
string filename, DokanFileInfo info
string filename, byte[] buffer, ref uint readBytes, long offset,
DokanFileInfo info
string filename, byte[] buffer, ref uint writtenBytes, long offset,
WriteFile
DokanFileInfo info
string filename, DokanFileInfo info
FlushFileBuffers
GetFileInformation string filename, FileInformation fileinfo, DokanFileInfo info
string filename, ArrayList files, DokanFileInfo info
FindFiles
DokanFileInfo info
Unmount
OpenDirectory
Cleanup
CloseFile
ReadFile
4.2
User-level Module for Dokan
We implement the user-level module of Dokan in the C# language. As this file-system focuses
on read-only capability, we make use only several needed callback function. Functions related
with creating folder and writing file will return -1 or error. Typically when an application
opening a file from the file system, CreateFile(), OpenDirectory(), GetFileInformation(),
ReadFile(), Cleanup(), and CloseFile() in sequence, so we only focus on these callback
functions and some others specific functions.
4.2.1
1
2
3
4
5
6
7
8
Global variables
private int count_;
private string host_ = "http://server";
public static Hashtable filetable = new Hashtable();
public static TcpClient c = null;
public static Stream s = null;
public static SimpleEncoding se = null;
public static StreamReader r = null;
public static StreamWriter w = null;
Figure 3: Global variables
11
As shown in figure in line 1 variable count_ is counter for the Dokan file handler. It increase
each time new CreateFile() function invoked. The host_ variable in line 2 is the web server
hostname. filetable variable is hashtable which is store files meta information using filename as
the key and Items class as the value. Variable from line 5 to line 8 used to create pooled
connection. This will be elaborated more in ReadFile() function.
4.2.1
Items class
Item class functions to store file’s meta information such as the filenames, files attribute
(whether it a file or directory), file sizes, files key, files nonce and process ID (PID) of
application which opening the file. Details about key and nonce will be elaborated in chapter 5.
1
2
3
4
5
6
7
class Items {
public string name;
public int size;
public byte[] key;
public byte[] nonce;
public int PID;
}
Figure 4: Items class source code
4.2.2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
CreateFile() function
public int CreateFile(String filename, FileAccess access, FileShare share,
FileMode mode, FileOptions options, DokanFileInfo info)
{
string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
info.Context = count_++;
if (path.Equals("/"))
{
info.IsDirectory = true;
return 0;
}
if (!filetable.ContainsKey(path))
{
ulong PID = GetNamedPipe(path);
if (PID == 0 || PID !-= info.ProcessID)
return -DokanNet.ERROR_FILE_NOT_FOUND;
}
try
{
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://host_" +
path);
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
request.Accept = "text/plain";
if (response.StatusCode.ToString() == "OK") return 0;
else return -DokanNet.ERROR_FILE_NOT_FOUND;
12
23
24
25
26
27
28
}
catch
{
return -DokanNet.ERROR_FILE_NOT_FOUND;
}
}
Figure 5: CreateFile() function code
When the application creates a file handle it will invoke this function. The first thing this
function do is converts the backslash in the filename to slash. This is necessary as URL use
slash as delimiter instead of backslash like Windows do. In the line 4 on source code above the
function increase the fill the file Context with the counter and increase it. After that it checks
whether the application opening a file or a root directory. If it opening the root directory then it
just set Dokan file handler, IsDirectory as true and return 0. It continues to verify the filetable
whether the filename key exists or not. If the key does not exist it will invoke the
GetNamedPipe() method. GetNamedPipe() sends a request to the pipe server which is the
application that open the file. If the application returns with the proper token, GetNamedPipe()
fill the filetable with received data and returns the application PID. Otherwise it returns 0. More
details about GetNamedPipe() will be elaborated in Named Pipe section in Secure Channel
subchapter. If the returned PID is 0 or the PID is not the same with the PID of application which
opened the file, then CreateFile() returns a file not found error. After that the CreateFile()
functions accesses the file header from the web server to obtain the verification of the files’
existence on the web server. If the file exists it will return 0, otherwise it return error with file
not found status.
4.2.3
OpenDirectory() function
Virtually the file system is empty, since this file system does not allow access to any
applications other than legitimate application, and the legitimate application already knows
which files that exist. In other words, applications are not allowed to open directory. However,
to avoid problem upon opening file, OpenDirectory() always return 0 or true.
4.2.4
1
2
3
4
5
6
7
8
GetFileInformation() function
public int GetFileInformation(
String filename,
FileInformation fileinfo,
DokanFileInfo info)
{
string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
if (!table.ContainsKey(path)) return -1;
Items file = (Items)filetable[path];
13
9
10
11
12
13
14
15
fileinfo.Attributes = m.type;
fileinfo.CreationTime = DateTime.Now;
fileinfo.LastAccessTime = DateTime.Now;
fileinfo.LastWriteTime = DateTime.Now;
fileinfo.Length = file.size;
return 0;
}
Figure 6: GetFileInformation() code
This function is invoked when the application which opened the file try to get the file
information. GetFileInformation() draws the file information from the filetable hashtable and
fill it in the fileinfo parameter and returns 0. As the file time is not important it’s just filled with
the current time. If the requested filename is not available in the hashtable then it returns -1
(error).
14
4.2.5
1
2
3
4
5
6
7
8
9
10
11
12
13
.
.
.
24
25
26
27
28
29
ReadFile() function
public int ReadFile(String filename, Byte[] buffer, ref uint readBytes, long
offset, DokanFileInfo info)
{
string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
if (!table.ContainsKey(path)) return -1;
Items file = (Items)filetable[path];
int filesize = file.size;
byte[] key = file.key;
byte[] nonce = file.nonce;
if ((int)offset >= filesize)
{
readBytes = 0;
return 0;
}
... calculate block counts ...
... get position of block in file ...
... find new offset ...
byte[] encryptedData = AccessData(path, offset, bytesize);
byte[] decryptedData = Decrypt(encryptedData, firstBlock, key, nonce,
filesize, int offsetInFirstBlock)
Array.Copy(decryptedData, buffer, decryptedData.Length);
readBytes = (uint)decryptedData.Length;
return 0;
}
Figure 7: ReadFile() function code
This is the most important function as 85it almost the ideas of this research go into this function.
Like other necessary callback function, it converts backslash in the filename variable into slash.
After that it tries to draw the meta information from the hashtable. If it fails, it return -1 or error.
The meta information used in this function are file size, file key and file nonce. After that it
continues to verify the offset that whether or not it’s still in file size range. Otherwise it return 0
or success, but with 0 readBytes. Such verification is vital to avoid any error when accessing
data on the web server.
After doing necessary verifications, it starts to calculate the block number for the given offset
and how much data need to be accessed to decrypt it properly. After these mathematical process
are done, it uses the calculation result to access the data portion from the web server. Detail
about AccessData() function will be detailed in Chapter 4.4. AccessData() returns an array of
bytes and placed in encryptedData variable. Decrypt() function called to decrypt the gained data
portion by applying gained encryptedData, buffer length, drawn file key, file nonce and file
size as parameter. Details of Decrypt() will be elaborated more in Chapter 6.5.
Returned decrypted bytes array is copied into buffer with the length of decryptedData, which is
in some case shorter than buffer length. And decryptedData length also returns as readByte.
15
4.3
Secure Channels
It is vital to prevent malicious applications from opening the protected files. To realize this, a
file needs to be encrypted and the legitimate application must pass a token to the VFS module,
which is needed to decrypt the requested file. To pass the token, a secure channel between the
VFS module and the application needs to be established. We are considering two ways to
implement this: a named pipe and a filename extension.
4.3.1
Named Pipe
Named pipe is a named, one-way or duplex pipe (shared memory) for communication between
the pipe server and one or more pipe clients. All instances of a named pipe share the same pipe
name, but each instance has its own buffers and handles, and provides a separate conduit for
client/server communication. In our case, the application will act as pipe server and create a
named pipe which can be accessed by the VFS module.
1
2
3
4
5
6
7
8
9
.
.
17
18
19
20
21
22
23
public ulong GetFileInformation(String filename)
{
IInterProcessConnection clientConnection = null;
try {
clientConnection = new ClientPipeConnection("PipeName", ".");
clientConnection.Connect();
clientConnection.Write(filename);
string base64data = clientConnection.Read();
clientConnection.Close();
... omitted (decoding base64data, split it, decrypt and put it into hashtable)
...
return PID;
}
catch {
clientConnection.Dispose();
return 0;
}
}
Figure 8: Partial of GetNamedPipe() code
.NET (dot net) itself not has library for named pipe. So we make use of freely available named
pipe library for C# []. Upon execution, the legitimate application creates a named pipe and
listens to it. As shown in the source code above, user-level VFS module create connection to
pipe named “PipeName” and send request to the application with filename as parameter
whenever the application open a file on the file system. The application replies with self PID
and token which is: filename, file size, file key, file nonce, self PID. All this elements are
encrypted, convert to base64 string and concatenated with some delimiters. This is for easy
transfer across the named pipe. Upon receiving the response from the application this function
will reverse all the procedure done by application. It splits the string; convert it back to byte
16
arrays and decrypts it. After that it put the appropriate data into the hashtable of filetable. If the
data is successfully inserted into the hashtable, it returns the PID. If any error occurred, it
returns 0. The error may be the string failed to be splitted, insufficient data like there no nonce
or no key in the string, or the received key is not long enough.
4.3.2
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
.
.
24
25
26
Filename Extension
public int CreateFile(String filename, FileAccess access, FileShare share,
FileMode mode, FileOptions options, DokanFileInfo info)
{
string path = HttpUtility.UrlPathEncode(filename.Replace("\\", "/"));
info.Context = count_++;
if (path.Equals("/"))
{
info.IsDirectory = true;
return 0;
}
int semiColon = path.LastIndexOf(";");
if (semiColon > -1) path = path.Substring(0, semiColon);
if (!filetable.ContainsKey(path))
{
string token = filename.Substring(semiColon + 1);
... convert token to bytes arrays, decrypts it and put it into the hashtable ...
... omitted ...
}
try
{
... omitted ..
Figure 9: CreateFile() function code when using filename extension
In this method, a token is passed with the filename upon opening a file. For example, when
opening a file named “sports.wmv”, the legitimate application must open ‘sport.wmv;
MithJmRCTHVCNzIzcyhEZ3N0S0orWmNAdw==’
instead,
while
‘MithJmRCTHVCNzIzcyhEZ3N0S0orWmNAdw==’ is the base64 encoded token and the
semicolon is the delimiter.
Notice that when using this method CreateFile() function slightly different, as shown in Figure
9. Instead of calling GetNamedPipe(), it split the requested filename to real filename and token
as shown on line 10,11 in Figure 9. Other functions also need this line if the filename extension
method is used. After that, it do the same procedure as GetNamedPipe() does: split it using
special delimiter, convert base64 string to bytes arrays, decode it and put it into the hashtable.
However, there is a limitation using this method. In Windows, a filename is limited up to 255
characters long for Windows XP and 260 characters for Windows Vista. This problem prevents
the application from opening the file since the token itself is can exceed 40 characters.
17
4.4
Performance Issues
In file systems, performance always matter. To avoid high latency to the whole system, we
perform several optimizations.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
.
40
41
42
43
44
45
46
47
48
49
50
51
52
53
if (c == null || !c.Connected)
{
c = new TcpClient(host_, port_);
s = (Stream)c.GetStream();
se = new SimpleEncoding();
r = new StreamReader(s, se);
w = new StreamWriter(s, se);
}
w.WriteLine("GET " + HttpUtility.UrlPathEncode(path) + " HTTP/1.1");
w.WriteLine("HOST: " + host_);
w.WriteLine("Accept: */*");
w.WriteLine("Keep-Alive: 1000");
w.WriteLine("Connection: keep-alive");
w.WriteLine("range: bytes=" + offset + "-" + (offset + bytesize));
w.WriteLine("");
w.Flush();
WebHeaderCollection h = new WebHeaderCollection();
while (true)
{
try
{
string str = r.ReadLine();
if (str.Length == 0) break;
if (str.IndexOf(":") != -1) h.Add(str);
}
catch (Exception)
{
... omitted ... (reconnected if the connection closed)
}
}
int len = int.Parse(h[HttpRequestHeader.ContentLength]);
char[] tmp = new char[len];
int offset2 = 0;
while (true)
{
int ret = r.Read(tmp, offset2, tmp.Length - offset2);
if (ret <= 0) break;
offset2 += ret;
if (offset2 >= len) break;
}
byte[] data = se.GetBytes(tmp);
return data;
Figure 10: Portion of AccessData() function code
18
4.4.1
Pooled Connections
Upon accessing data by an application, data is usually accessed in small chunks from random
positions in the file. This leads to rapid ReadFile() requests in a very short time, and each time
ReadFile() is called, new connection is created to access the data portion from the internet. To
avoid delays to establish a new connection, we use pooled connections to the server. In other
words, the VFS module reuses the same connections to access several portions of data from the
same file. Since the no method to create pooled connection in available in .NET, we created the
method ourselves by using TcpClient() class. We make use of custom SimpleEncoding() class to
detect the stream encoding and returns the proper supported encoding. GetStream() method is
used to send request using StreamWriter() method and StreamReader() to get the response. Line
9 to 15 in Figure 10 indicates the function send request header to web server. After that the
AccessData() function read the response header from the web server as indicates in line 18 to
line 40. We use try and catch to work around with any error occur due to lost connection. If the
connection is lost due to timeout or the keep-alive connection limit reached, we would
reestablish the connection. Afterward the function read the response stream and return it.
4.4.2
Cache and Prefetch
Since we implement a decryption capability, data needs to be downloaded based on block sizes
instead of requested buffer sizes. The VFS module downloads data that is slightly bigger than
the requested buffer size. Data chunks are cached first then decrypted before the module sends
them to the application.
In addition, when playing audio and video files, either from beginning or after perform seeking
across the file, data are usually accessed sequentially. Therefore, we improve the performance
by using prefetching. The algorithm guesses the next data, fetches it and stores it in memory.
Each time a user performs skipping backward or forward command, a new prefetching session
is started. To support prefetch properly, we perform the prefetch function in a dedicated thread.
<<todo>>
19
Chapter 5 Server-side Programs
For the web server, any web server applications that supports HTTP version 1.1 are usable.
These include Apache, Microsoft ISS, lighttpd, etc. HTTP 1.1 is vital because it contains what
we need to make the random access idea comes true. Since version 1.1, it supports range
persistence connection (keep-alive) and request (byte serving) []. The range request is needed
for random access and persistence connection is essential for pooled connections.
5.1
Persistent Connections
In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair. In HTTP/1.1
a keep-alive-mechanism was introduced, where a connection could be reused for more than one
request. Such persistent connections reduce lag considerably, because the client does not need to
re-negotiate the TCP connection after the first request has been sent. The advantage of
persistence connections are:





5.2
less CPU and memory usage (because fewer connections are open simultaneously)
enables HTTP pipelining of requests and responses
reduced network congestion (fewer TCP connections)
reduced latency in subsequent requests (no handshaking)
errors can be reported without the penalty of closing the TCP connection
Byte serving
Byte serving is the process of sending only a portion of an HTTP/1.1 message from a server to a
client. Clients which request byte-serving might do so in cases in which a large file has been
only partially delivered and a limited portion of the file is needed in a particular range. Byte
Serving is therefore a method of bandwidth optimization. In the HTTP/1.0 standard, clients
were only able to request an entire document. By allowing byte-serving, clients may choose to
request any portion of the resource. One advantage of this capability is when a large media file
is being requested, and that media file is properly formatted, the client may be able to request
just the portions of the file known to be of interest. <<todo>>
20
Chapter 6 Encryption and Decryption
The encryption we use in our mechanism is Advanced Encryption Standard (AES) with Counter
(CTR) mode. We use CTR mode because it is suitable with random access.
6.1
Advanced Encryption Standard
Advanced Encryption Standard (AES), also known as Rijndael, is a block cipher adopted as an
encryption standard by the U.S. government. It has been analyzed extensively and is now used
widely worldwide as was the case with its predecessor, the Data Encryption Standard (DES).
AES was announced by National Institute of Standards and Technology (NIST) as U.S. FIPS
PUB 197 (FIPS 197) on November 26, 2001 after a 5-year standardization process. It became
effective as a standard May 26, 2002. As of 2006, AES is one of the most popular algorithms
used in symmetric key cryptography.
The cipher was developed by two Belgian cryptographers, Joan Daemen and Vincent Rijmen,
and submitted to the AES selection process under the name "Rijndael", a portmanteau of the
names of the inventors. AES is not precisely Rijndael (although in practice they are used
interchangeably) as Rijndael supports a larger range of block and key sizes; AES has a fixed
block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified
with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum
of 256 bits. Due to the fixed block size of 128 bits, AES operates on a 4×4 array of bytes,
termed the state (versions of Rijndael with a larger block size have additional columns in the
state).
6.2
Counter mode
Counter mode (CTR mode) is a block cipher mode operation which turns a block cipher into a
stream cipher. It generates the next keystream block by encrypting successive values of a
"counter". The counter can be any simple function which produces a sequence which is
guaranteed not to repeat for a long time, although an actual counter is the simplest and most
popular. CTR mode has similar characteristics to Output Feedback (OFB), but also allows a
random access property during decryption, and is believed to be as secure as the block cipher
being used. The initialization vector (IV) in this mode is the combination of nonce and the
counter. The nonce and the counter can be concatenated, added, or XORed together to produce
the actual unique counter block which we use as IV for encryption. CTR mode is well suited to
operation on a multi-processor machine where blocks can be encrypted in parallel, which is also
an advantage of CTR mode.
21
Nonce
c2b3f342…
Key
Counter
00000000
0
Block Cipher
Encryption
Original bytes
Ciphered bytes
Nonce
c2b3f342…
Key
Counter
00000001
10
Block Cipher
Encryption
Original bytes
Nonce
c2b3f342…
Key
Counter
00000002
0
Block Cipher
Encryption
Original bytes
Ciphered bytes
Ciphered bytes
Figure 11: Counter mode encryption
6.1.1
Padding
Because AES works on units of a fixed size; 16 bytes, but original data come in a variety of
lengths, this mechanism require that the final block be padded before encryption. Several
padding schemes exist. The simplest is to add null bytes to the original data to bring its length
up to a multiple of the block size, but care must be taken that the original length of the data can
be recovered; this is so, for example, if the original data is a C style string which contains no
null bytes except at the end.
6.3
Token
The token term in our mechanism is a combination of nonce, key and original file size. The
token also may contain any additional useful data. The reason why original file size is included
is because it needed when decrypting the file by VFS module. Both nonce and key generates
randomly by the encryption software for each file. This provides more secure protections
against malicious access. The token can be in any size, depend on original file size and used key
size. Supported key sizes are, 128 bits, 192 bits and 256 bits. The nonce size is 64 bits as it is
concatenated later by counter bytes.
22
6.4
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
Encryption software
string filename = args[0];
string filename2 = "enc_" + filename;
FileInfo info = new FileInfo(filename);
long size = info.Length;
int blockCount = (int) Math.Round((decimal)size /
16,MidpointRounding.AwayFromZero);
try
{
FileStream fs = File.OpenRead(filename);
FileStream fs2 = File.OpenWrite(filename2);
int offset = 0;
byte[] buffer = new byte[16];
byte[] encrypted = new byte[16];
byte[] nonce = new byte[8];
byte[] iv = new byte[16];
RijndaelManaged transform = new RijndaelManaged();
transform.Padding = PaddingMode.Zeros;
transform.GenerateIV();
transform.GenerateKey();
byte[] key = transform.Key;
Array.Copy(transform.IV, nonce, 8);
for (int i = 0; i < blockCount -1; i++)
{
byte[] counter = BitConverter.GetBytes((long)i);
Array.Copy(nonce,0,iv,0,8);
Array.Copy(counter, 0, iv, 8, 8);
fs.Seek(offset, SeekOrigin.Begin);
offset = fs.Read(buffer, 0, buffer.Length) + offset;
transform.IV = iv;
ICryptoTransform encrypt = transform.CreateEncryptor();
encrypt.TransformBlock(buffer, 0, 16, encrypted, 0);
fs2.Write(encrypted, 0, 16);
}
byte[] counter2 = BitConverter.GetBytes((long)blockCount-1);
Array.Copy(nonce, 0, iv, 0, 8);
Array.Copy(counter2, 0, iv, 8, 8);
fs.Seek(offset, SeekOrigin.Begin);
offset = fs.Read(buffer, 0, buffer.Length);
transform.IV = iv;
ICryptoTransform encrypt2 = transform.CreateEncryptor();
byte[] lastBLock = encrypt2.TransformFinalBlock(buffer, 0, offset);
fs2.Write(lastBLock, 0, 16);
fs.Close();
fs2.Close();
}
Figure 12: Portion of Encrypt() function
Our encryption software is really simple. We make use of RijndaelManaged class in the .NET
library to simplify the encryption process. When it encrypts a file, it generates a random nonce
and key by using GenerateIV() and GenerateKey() method respectively. Since there are no
nonce terms in RijndaelManaged class, we use the truncated IV. IV is truncated to 8 bytes to
produce the nonce. Since the .NET library 86not has the counter mode implementation, we must
23
implement the mode ourselves. The software divides the given file into blocks of 128 bit (16
bytes) and calculates the block counts. Typically the division result does not always return an
integer. In case it returns a floating number, it is rounded using “round away from zero”
rounding method. For example: a file with 1030 bytes size divides with 16 resulting 64.375.
However, instead rounded to 64 it’s rounded to 65. As shown in line 21 to 32 in Figure 12 the
function starts a loop to read the original file, decrypt and writes it to the destination file. The
loop starts with generating of a counter value. It cast i into long value and converts it into byte
array. The reason it cast the i into long is to make it 8 bytes long. The counter bytes array is
concatenated with the nonce bytes array to produce a temporary IV bytes array which is set into
transform.IV property. In each loop, 16 bytes data is read from the origin file and offset is
increased with the readbyte value. We make use of ICryptoTransform interface to perform the
block transformation by using TransformBlock() method. After it has been encrypted, it’s
immediately written into the destination file. This continues until it reach second last block.
When the loop finished the function perform transformation of last block.
After that, each block is fixed with the counter value incrementally. The software encrypts each
block using the key, the nonce and the counter. Since AES algorithm can encrypt 16 bytes
block only, the last block need to be padded in case it less than 16 bytes. In our mechanism we
use Zeros Padding which adds zero byte to fill the block. For example, a file with 1030 bytes
size split into 64 blocks of 16 bytes and a block of 6 bytes for the final block. The final block
will be padded by 10 of zero byte (0x0) to make it 16 bytes. The result file size will be 1040
bytes. After finishing the encoding the software uploads the encrypted file to web server and
register the filename, file size, file key and nonce to the authentication server.
6.3.1
Multithreads
We make use of the advantage of current technology and the suitability of CTR mode with
multithread. 87The whole file divided into blocks and the whole blocks set divided again into
several subsets as desired thread count. For example, in case we want to use 4 threads, 1030
bytes file divides into a set of 65 blocks. This block set will be divided again into 4 subsets. The
first, second, and third subset contain 16 blocks while the forth block contains 17 blocks. The
counter value for the first block for each subset is sum of previous subsets blocks. In this case:
Blocks subset
First
Second
Third
Forth
First counter value Blocks contained
1
16
17
16
33
16
49
17
Table 1: Multithreads calculation
Calculation
0+1
16 +1
16+17
16+33
All four threads execute simultaneously resulting in faster encrypting process.
24
6.5
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
Decryption
private byte[] Decrypt(byte[] sourceFile, int startBlock, byte[] key, byte[]
nonce, int filesize, int offsBl)
{
int blockCount = (int)sourceFile.Length / 16;
int lastBlock = 0;
int totalBlock = (int)Math.Round((decimal)size / 16,
MidpointRounding.AwayFromZero);
int padding = 0;
byte[] lastBlk = null;
if ((startBlock * 16 + sourceFile.Length) > filesize)
{
lastBlock = 1;
padding = totalBlock - size;
}
try
{
int offset = 0;
byte[] iv = new byte[16];
byte[] decrypted = new byte[sourceFile.Length – (lastBlock * 16)];
int i = startBlock;
RijndaelManaged transform = new RijndaelManaged();
transform.Padding = PaddingMode.Zeros;
transform.Key = key;
for (int k = 0; k < blockCount - lastBlock; k++)
{
byte[] counter = BitConverter.GetBytes((long)i);
Array.Copy(nonce, 0, iv, 0, 8);
Array.Copy(counter, 0, iv, 8, 8);
transform.IV = iv;
ICryptoTransform decrypt = transform.CreateDecryptor();
decrypt.TransformBlock(sourceFile, offset, 16, decrypted, offset);
offset = offset + 16;
i++;
}
int dataLen = sourceFile.Length – offsBl – padding;
byte[] data = new byte[dataLen];
Array.Copy(decrypted, offsBl, data, 0, decrypted.Length - offsBl);
if (last == 1)
{
byte[] counter = BitConverter.GetBytes((long)i);
Array.Copy(iv, 0, nonce, 0, 8);
Array.Copy(counter, 0, nonce, 8, 8);
transform.IV = nonce;
ICryptoTransform decryptLast = transform.CreateDecryptor();
lastBlk = decryptLast.TransformFinalBlock(sourceFile, offset, 16 padding);
Array.Copy(lastBlk, 0, data, decrypted.Length - offsBl, lastBlk.);
}
return data;
}
... omit ...
Figure 13: Decrypt() method code
25
Decryption process is done in user-level module of VFS module in ReadFile() function. Instead
88
we place the code in the ReadFile() function, we made an independent method; Decrypt()
which can be called from the ReadFile(). Parameter of Decrypt() consists of bytes array
sourceFile which represent encrypted data portion, key and nonce, integer startBlock which is
first block counter, filesize represent original file size and offsBl represent offset in first block.
On initialization, the function count the blocks of the given sourceFile, verify if the sourceFile
contains the last block and calculates the padding if the last block exists. Afterward it tries to
begin block transformation by initializing the RijndaelManaged class. Using for statement, it
loops in blockCount. In each loop, it generates the IV base on given nonce and incremental
counter begins with startBlock. After that it copies the decrypted data into the destination bytes
array; data.
If the last block exist, the loop count decrease by 1 and last block transformation perform after
the loop. The decrypted last block copied into data and the data returns.
26
Chapter 7 Experiments
Some experiments have been done to prove the ability of this mechanism. For the experiment
purpose, we use a PC to act as both administrator PC (use to upload file to web server and
register file to authentication server) and a user PC (used to access the file on web server).
7.1
Experiment environment

User PC/Admin PC:
o CPU: Intel Core2 Quad 2.4 GHz
o RAM: 4 GB RAM
o Ethernet: 100 Mbps connection
o OS: MS Windows XP SP2

Web Server:
o CPU: Intel Pentium 4 3.0 GHz with Hyper-Threading
o RAM: 1 GB
o Ethernet: 100 Mbps connection
o OS: Ubuntu Linux 7.10
o Software: Apache 2.2.4 as web server software and ProFTPd as FTP server
Both user PC/admin PC and the web server is located in same intranet. We used a quad core PC
for the admin PC to test the full ability of the encoding software.
7.2
Encryption Rate
To analyse the encryption rate, we experimented with several file with different sizes: 5, 10, 50
and 100 MB. Each file was encrypted using several thread counts: 1, 2 and 4 threads. We use
the user PC to run the experiment. As you can see from the result shown in table and graph the
encryption rate nearly double when we doubled the thread count. The content of the file
(whether it is media file, text file or compressed file) does not affect the encryption rates.
27
50
44.234375
45
40
Duration (seconds)
35
30
22.125
25
1 thread
24.0625
2 threads
20
4 threads
15
5
12.796875
12.109375
10
2.328125
4.40625
2.46875
0.625
1.265625
0
5
6.65625
1.28125
10
50
100
Filesize (MB)
Figure 14: Graph showing encryption rate using different thread count
7.3
Transfer rate
We benchmark the transfer rate by comparing several methods used to transfer data online. The
methods are: direct access HTTP/HTTPS by using web client, Server Message Block (SMB)
protocol, WebDAV protocol using Windows XP embedded WebDAV Mini Redirector (shell
namespace extension), our VFS module with and without decryption. The methods we used to
benchmark transfer rate for HTTP and HTTPS is using WGET [] software. For benchmark of
SMB, our VFS module and WebDAV we mount a logical drive in My Computer and use
FastCopy [] program to copy file on remote server to local drive.
As the result we can see from the graph in Figure 15, HTTP transfer rate is the fastest as
expected. Surprisingly our VFS module transfer rate is faster than the SMB/CIFS protocol
transfer rate. Our VFS module with prefetch does not improve the transfer rate as it use to
decrease request counts. As expected using decryption in VFS module decrease the transfer rate
to 25% of original transfer rate. However if we make this number as the maximum speed of our
mechanism we can still afford any intermediate level of video streaming.
For example, if we want to stream a video file the mechanism can still afford to stream up to 20
Mbits/s.
28
12
11.32
10.42
10.84
10.34
9.64
10
Transfer rate (MB/s)
10.51
8
6
4
2.64
2
0
Samba
VFS Module VFS Module VFS Module
(prefetch) (decryption)
HTTP
HTTPS
WebDAV
Figure 15: Graph of transfer rate using different methods
100%
90%
80%
CPU Usage
70%
60%
50%
40%
30%
20%
10%
0%
Samba
VFS Module VFS Module VFS Module
(prefetch) (decryption)
HTTP
Figure 16: Graph of CPU usage of different method
29
HTTPS
WebDAV
Random access
30
Chapter 8 Integration with Movie Database System of Japan
Institute of Sports Sciences
This mechanism is scheduled to replace current data distribution mechanism in movie database
system of Japan Institute of Sports Sciences (JISS).
8.1 Current data distribution mechanism in JISS movie database
system
As shown in Figure 17, the process starts with client logging in to the user administrator server
by sending his/her username and password and the client received list of movie IDs. Client can
access movies on the list by sending the movie ID back to user admin server to received URL.
User admin server logs the request and create random key and store it in a temporary table.
Beside the key, the table also contains client IP address, movie ID and timestamp. After that, the
user admin server responses to the client PC with the URL of proxy with key. Client creates
connection to the proxy with the key as filename. The proxy verifies the request by contacting
the user admin server and sends the requested key and the client IP address. User admin server
then, compares the key, IP address and timestamp with the temporary table. If they are valid and
table is still not expired, the user admin server responds with the real URL to the proxy server.
The proxy server would then access the real file on the streaming server and relay it to client’s
PC.
8.2 New data distribution mechanism propose to JISS movie
database system
The current mechanism which was developed 2 year ago, suffers from high latency due to
relaying data through a proxy. In our mechanism we simplify the system by removing the proxy
server. Comparing to the current mechanism which has 4 components, our mechanism only has
3 components. Based on Figure 18 the process start with client logging in to the authentication
server using the username and password and receive movie IDs list. When client want to access
a movie on the list, it requests the movie filename by sending the movie ID to the authentication
server and that authentication server replies with the movie’s filename and token. The client
application access the file directly from the local virtual drive. The VFS module in client PC
open connection to web server and access the wished file, decrypt it using the key and nonce
contained in the token and pass it to the application.
31
Streaming server
6. Access
http://server/moviefile
Movie files
publisher
(upload movie files)
(register movie ID)
7. Stream the file
4. key, client IP address
Proxy
5. http://server/moviefile
1. username, password
movieID
3. Access
http://proxy/key
User admin server
8. Relay the
streaming
2. http://proxy/key
Client PC
Figure 17: Current data distribution mechanism in movie database system of JISS
8.3
The advantages compared to current mechanism
As we can see from the explanation above there is several advantages of this new mechanism
compared to the current mechanism.


Less components
As the working component reduced, the latency of whole system is reduced. It also
saves the whole costs of the system as less hardware needed.
No proxy
As the proxy is not involved any more, movie file is downloaded directly from the web
server instead relaying it using the proxy. In this mechanism, VFS module acts as the
proxy but yielded much more performance due to its existence in client PC, instead of
located in a separate and remote server.
32

Encryption
There no encryption involved in current mechanism of data distribution in JISS movie
database system. Our new mechanism introduces an encryption property which
provides a much more secure system. For example in the current mechanism if the real
URL of the movie file is leaked, malicious user can gain access to the file and the file
would lose its confidential. On the other hand, in our mechanism, even if the file can be
accessed, malicious user can’t open it without the proper key.
Web server
(Streaming server)
(Encrypt and
upload movie file)
Movie files
publisher
(Register movie ID)
3. Access encrypted file
on web server
Authentication server
(user admin server)
4. Response the
encrypted file
1. Username, password,
movie ID
2. Movie filename, token
Client PC
Figure 18: New data distribution mechanism proposed to movie database system of JISS
33
Chapter 9 Conclusions
In the paper, we have proposed a new data protection mechanism and how it works. By using
this mechanism, users are able to access protected data over HTTP with a random access
capability while providing a tight security. In the future, we evaluate the use of multiple pooled
connections for each file to relieve stress on a single connection. We are also considering
implementing support to FTP servers.
<<todo>>
34
Acknowledgements
First and foremost, I would like to thank my supervisors, Professor Kozo Itano, Professor
Yasushi Shinjo, Professor Akira Sato and Professor Hisashi Nakai of Graduate School of
Systems and Information Engineering at the University of Tsukuba for being exceptional
advisors. They never 89seize to amaze me with their unlimited patience, attention to detail and
helping me improve the thesis’s presentation style. I can’t imagine how I could have perfected
this thesis without their support.
I am also grateful to all my lab partners in the Software Laboratory at the University of Tsukuba
especially Mr. Daiyuu Nobori for offering they’re support and encouragement since the day I
joined the research lab.
I owe my deepest debt of gratitude to my parents, Che Abdullah and Nik Rahmah, for providing
me with the resources to succeed in life. They have constantly advised and supported me in
everything that I have done. I am especially grateful to my mother for teaching me that time
management, organization, and diligence, are keys to success.
Finally, I would like to acknowledge the financial support that made this research possible. All
the years when I was in Japan was supported by the Public Service Department of Malaysia and
I am grateful for being given the opportunity to further my studies in a foreign country to widen
my knowledge.
35
References
[1] Chikara Miyagi, Koji Ito, and Jun Shimizu: "Creating the SMART system - A Database for
Sports Movement", The Engineering of Sport 6, Vol.3, pp.179-184, 2006.
[2] Hiroki Asakawa: Design and implementation of user-mode file system library, 2008
http://decas-dev.net/en/
[3] Microsoft Developer Network (MSDN) Documentation: DirectShow, 2008
[4] MSDN Documentation: Registering Shell Extensions, 2008
[5] Galaxy File system Toolkit: Chad Yoshikawa, 2005 http://galaxy.sourceforge.net/
[6] GMail Drive shell extension: Bjarke Viksoe 2007, http://www.viksoe.dk/code/gmail.htm
[] Filesystem in Userspace: http://fuse.sourceforge.net/
[] Ivan Latunov: Inter-Process Communication in .NET Using Named Pipes
http://ivanweb.com/articles/namedpipes/
[] RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, 1999
[] GNU Wget: Hrvoje Nikšić, http://www.gnu.org/software/wget/ Accessed on 18 January
2008
[] FastCopy: SHIROUZU Hiroaki, http://www.ipmsg.org/tools/fastcopy.html.en Accessed on
18 January 2008
36
Download