FIOS: A Flexible Virtualized I/O Subsystem to Alleviate
Interference among Virtual Machines
Qi Zhang, Hai Jin, Xiaofei Liao, Dingding Li, Wei Deng
Cluster and Grid Computing Lab
Services Computing Technology and System Lab
Huazhong University of Science and Technology, Wuhan, 430074, China
hjin@hust.edu.cn
ABSTRACT
Serving as the infrastructure of cloud computing, virtualization
technologies have attracted considerable interest in recent years
for their excellent resource utilization, scalability, and high
availability. The Virtual Machine Monitor (VMM), which is a key
element in cloud computing, enables multiple guest operating
systems to run simultaneously and share the same physical
resources. This may lead to significant interference in disk I/O
performance among virtual machines (VMs). In particular, the I/O
performance of non-I/O-intensive domains can be seriously
degraded by the advent of I/O intensive ones. We address this
problem by building a block-level cache in the virtualization layer
to absorb I/O requests from different domains. This method not
only effectively alleviates the I/O performance interference
caused by I/O intensive domains, but also greatly improves the
I/O performance of the guest OS. We implement and evaluate a
Flexible I/O Subsystem (FIOS) within the Xen VMM and show an
evident reduction of I/O performance interference among virtual
machines as well as a remarkable improvement in disk
throughput.
Categories and Subject Descriptors
B.4.3 [Interconnections(Subsystems)]:
Asynchronous/synchronous operation
General Terms
Performance, Design, Experimentation
Keywords
Cloud Computing, I/O, Xen, Virtualization
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies
are not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
ICUIMC’12, February 20–22, 2012, Kuala Lumpur, Malaysia.
Copyright 2012 ACM 978-1-4503-1172-4…$10.00.
1. INTRODUCTION
Cloud computing, which makes it possible for users to access large
pools of computational and storage resources on demand, is
gaining prominence. This trend is evidenced by the increasing
development of cloud services and cloud platforms such as
Gmail, Facebook, and Amazon EC2. The Virtual Machine Monitor
(VMM), e.g., Xen, plays a key role in cloud computing.
The VMM offers an abstract and unified layer on top of the
underlying hardware resources. It also provides services that
allow multiple computer operating systems to execute on the
same computer hardware concurrently.
Although virtualization technologies offer many benefits,
including flexibility, security, ease of configuration and
management, and reduction of cost [1], there is an obvious
problem: performance interference. Applications running in
virtual machines always expect to own the physical resources
exclusively so that they can achieve their best performance. However,
when multiple virtual machines run simultaneously on the
same computer, the contention for underlying physical resources
directly leads to performance degradation of the applications.
Various studies [4, 6, 26] have focused on solving this
problem, but most of them concentrate on the allocation of CPU
and network bandwidth; little attention has been paid to
alleviating disk I/O performance interference among virtual
machines, which is a critical factor in determining the overall
performance of I/O applications running in virtual machines.
Specifically, once an I/O intensive application starts to run, the
I/O performance of co-located applications residing in other
virtual machines is seriously degraded. For example, in order
to provide a good user experience, an interactive application
such as Microsoft Word requires a short average response time. At the
same time, when extracting the Linux source code from a
compressed file, intensive I/O is necessary to minimize the
execution time. When these two applications run
simultaneously on different virtual machines and contend for the
limited disk I/O bandwidth, the I/O performance of Microsoft
Word decreases significantly.
Apart from limited disk I/O bandwidth, other factors can also
lead to this performance degradation. For instance, in virtual
machine environments, I/O operations have to trap into a VMM
and/or a privileged VM, which may turn out to be a performance
bottleneck for virtualized I/O systems [2]. Previous research has
revealed that in para-virtualization, the VMM protects well-behaved
virtual machines from all misbehaving domains except
the disk I/O intensive one [3].
This paper is the first to provide a method that not only
largely alleviates the I/O performance interference caused by I/O
intensive domains, but also improves the I/O performance of
virtual machines. We allocate a block-level cache in the VMM to
absorb I/O requests from guest VMs, and we introduce a time-sharing
cache management strategy to improve the efficiency
of FIOS. Experimental results show an evident reduction
of I/O performance interference among virtual machines as well
as a remarkable improvement in disk throughput.
The rest of this paper is organized as follows. We describe
related work in the next section. Section 3 discusses the design of
FIOS. Section 4 introduces Xen and Blktap briefly and
presents our implementation. Section 5 uses queueing theory to
analyze the performance of FIOS. Section 6 describes the
experimental methodology and discusses the results. Finally,
section 7 concludes the paper and discusses future work.
2. RELATED WORK
Based on Xen, Gupta et al. have designed and implemented a set
of primitives to enforce performance isolation across virtual
machines [6]. First, they implement XenMon, a tool that can
accurately measure per-VM resource consumption. Second,
the SEDF-DC scheduler was introduced to account for the total
resource consumption of a VM when allocating CPU. Finally, they use
ShareGuard to restrict the resource usage in the VMM on behalf of each
virtual machine. Compared with our work, most of Gupta's
research focuses on the isolation of network I/O and CPU
performance; their methods do not address the performance
of disk I/O.
Similar to our solution, Hu et al. have presented a novel
disk storage architecture called DCD for the purpose of
optimizing disk I/O performance [18]. They use a small log
disk, called a cache-disk, as a secondary disk cache to optimize
write performance. The physical properties of the cache-disk are the
same as those of a normal disk, but the different data units and the
different way in which its data is accessed enable it to achieve a
higher data access speed. Whether this architecture works well
in virtualized environments requires further investigation.
Many studies [4, 24, 25] focus on adjusting the schedulers in the
VMM to provide each virtual machine with fair performance.
Some of this research has revealed that traditional VMM schedulers have
focused on fairly sharing the processor resources among domains
while leaving the scheduling of I/O resources as a secondary
concern. Although certain extensions, such as boost optimization,
sorting the run queue based on remaining credits, and tricking the
scheduler [4], have been applied to VMM scheduling to improve the
I/O performance of virtual machines, the effect of these
approaches depends on the type of applications running in the virtual
machines. Another study [5] examined different combinations of
schedulers in both the virtual machine and the VMM, and found that
different combinations lead to different I/O performance;
when the NOOP scheduling algorithm is used in the VMM, the result can be
the best. However, it does not provide any method to alleviate the
interference of I/O performance among virtual machines,
especially that caused by an I/O intensive domain. Some other studies
concentrate on enhancing I/O performance isolation in
virtualized environments [6], but most of them pay attention to the
isolation of network I/O and CPU allocation.
There have also been other recent works concentrating on
identifying the characteristics and resource consumption of I/O
applications [8, 19, 23] and improving virtualized I/O
performance [2, 16, 22], but few of them address
alleviating disk I/O performance interference among virtual
machines.
3. DESIGN
In this section, we first discuss the goals of FIOS. Then, detailed
solutions are described to meet our goals. The solutions are
divided into four parts: the first three parts are introduced
according to the processing of I/O requests in FIOS. In the
fourth part, a time-sharing strategy for the usage of physical
disk bandwidth is introduced to avoid excessive memory
consumption.
3.1 Goals
Our primary goal is to minimize the degradation of I/O
performance brought by I/O intensive virtual machines.
Transparency and portability are also goals: various kinds of
operating systems and applications should be able to run on FIOS
without any change to their source code. Besides, the
architecture of FIOS should not be restricted to any specific
virtualized environment.
3.2 Solutions
3.2.1 Intercepting requests
Virtual machines are not permitted to manipulate the physical
disk directly. All the I/O operations emitted by virtual machines
must be handled by the VMM [7]. The processing of I/O requests
in a virtualized environment is shown in figure 1. I/O requests from
the guest OS are first put into the disk queue of the virtual machine and
processed by the virtual disk driver. These requests are then taken out
by the physical disk driver located in the VMM, which packs the
logical I/O requests into physical ones. After that, the physical disk
driver sends commands to the disk controller and starts the data
transfer. When the data transfer has completed, the VMM
informs the corresponding virtual machine, and the applications
which are waiting for the result of the I/O operations can continue
working.
Figure 1. I/O path in a virtualized environment.
The bandwidth of the physical disk is limited, so once there exists an I/O
intensive virtual machine, which is likely to consume the
bandwidth in great quantities, the I/O performance of the other virtual
machines is affected severely. Considering that memory access is
much faster than disk access, we allocate a
V-cache in the VMM for each guest domain and intercept the I/O
requests from virtual machines.
As shown in figure 2, instead of being sent directly to the physical
disk driver, logical I/O requests are intercepted by the VMM in
FIOS. The VMM informs the I/O application in the virtual machine to
continue working right after these requests have arrived in the
V-cache. A thread called "FlushCtrl" in the VMM disposes of the
requests in the V-cache and completes the data transfer some time later.
In this way, requests from I/O intensive virtual machines can be
absorbed by the V-cache and the excessive consumption of disk bandwidth
can be reduced. Therefore, the I/O performance of other virtual
machines is not degraded by the advent of an I/O intensive
domain. Besides, I/O applications in the virtual machine do not have
to wait for the completion of I/O requests, so the I/O performance
of virtual machines is also improved.
Figure 2. I/O path in FIOS.

begin;
  r = receive a write request;
  if (exists a request R in V-cache && R.addr == r.addr) {
    R.data = r.data;
    mark R as the one used most recently;
  } else {
    insert r into V-cache;
    mark r as the one used most recently;
  }
end;
Figure 3. Processing a write request in FIOS.
Through the real-time interception of I/O requests in the VMM, we
can detect an I/O intensive virtual machine quickly and
accurately.
An I/O intensive virtual machine issues very frequent I/O
operations, so its corresponding virtual disk queue is usually
filled with large numbers of requests. We use two metrics to
judge whether a virtual machine is I/O intensive: (1) its
I/O history record, i.e., the total number of bytes transferred by this
domain, and (2) its virtual I/O rate, i.e., the number of bytes
transferred per CPU second [8], both of which can be collected in the
VMM by scanning the virtual disk queue.
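To make this detection criterion concrete, the following sketch combines the two metrics; the structure fields and threshold values are illustrative assumptions rather than figures taken from the paper.

#include <stdint.h>
#include <stdbool.h>

/* Per-domain I/O statistics gathered by scanning the virtual disk queue.
 * Field names and thresholds are illustrative assumptions. */
struct vm_io_stats {
    uint64_t total_bytes;      /* history record: total bytes transferred  */
    uint64_t bytes_per_second; /* virtual I/O rate: bytes per CPU second   */
};

#define HISTORY_THRESHOLD (512ULL << 20) /* hypothetical: 512 MB transferred  */
#define RATE_THRESHOLD    (64ULL << 20)  /* hypothetical: 64 MB per CPU second */

/* A domain is treated as I/O intensive when both its history record and
 * its current virtual I/O rate exceed their thresholds. */
static bool is_io_intensive(const struct vm_io_stats *s)
{
    return s->total_bytes >= HISTORY_THRESHOLD &&
           s->bytes_per_second >= RATE_THRESHOLD;
}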
3.2.2 Arranging requests
Arranging I/O requests in a corresponding V-cache brings many
advantages. First of all, the direct connection between an I/O process in a
virtual machine and the physical disk can be cut off. From the
perspective of the virtual machine, its I/O operations can be considered
finished when the I/O requests reach the V-cache. Therefore,
even if multiple virtual machines execute at the same time, torrents of I/O
requests can be absorbed by the V-cache, so the bandwidth
of the physical disk does not become their primary contended
resource. Secondly, storing requests provides the opportunity for
batch request processing, which can largely increase the overall
I/O throughput. Because of the semantic gap between the VM and the
VMM, many I/O optimization methods no longer work well in
virtual machine environments. For example, metadata operations
in the VM are translated into small file reads/writes in the VMM,
which directly decreases I/O performance; however, this
problem can be mitigated if these requests are stored and
processed in batches.
Besides these advantages, each virtual machine in FIOS is
initialized with its own V-cache to store its own I/O requests. On one
hand, the isolation and security of I/O data coming from different
virtual machines can be better ensured; on the other hand, the
VMM can handle the requests separately according to the
different priorities of the virtual machines.
3.2.3 Disposing requests
Although the I/O requests from a given domain are all stored in one
V-cache, different kinds of requests are treated differently
according to their own characteristics.
As shown in figure 3, when the VMM receives a write request,
FIOS tries to find out whether there is a request in the V-cache
whose target physical disk address is the same as that of the
received one. If there is, we only replace the I/O data of the
matching request with that of the received one; if not, the received
write request is stored into the V-cache and marked as the
most recently used. Once the write request has been inserted into the V-cache,
the VMM notifies the corresponding virtual machine immediately,
so that the I/O applications in the virtual machine can continue
running.
As shown in figure 4, when FIOS receives a read request, it tries
to find out whether there is a request in the V-cache whose target physical
disk address is the same as that of the received one. If there is, the
corresponding I/O data is returned to the I/O application in the virtual machine
immediately and marked as the most recently used; if not, the virtual
machine has to get the data from disk. In that case the data is, on one
hand, sent directly to the I/O application in the virtual machine, and
on the other hand, stored in the V-cache so that the next time
the data is needed it can be retrieved quickly from the V-cache.
begin;
  r = receive a read request;
  if (exists a request R in V-cache && R.addr == r.addr) {
    return R.data to the process in VM;
    mark R as the one used most recently;
  } else {
    get data from disk;
    return the data to the process in VM;
    store the data in V-cache;
    mark r as the one used most recently;
  }
end;
Figure 4. Processing a read request in FIOS.
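As a concrete rendering of figures 3 and 4, the sketch below implements the same lookup-then-coalesce logic over a doubly linked list in C. The helper names and the linear lookup are simplifications of ours; the actual implementation adds an AVL index, as described in section 4.2.

#include <stdlib.h>
#include <string.h>
#include <stdint.h>

/* One cached request, as in figure 8: disk address, payload, size, links. */
struct vnode {
    uint64_t addr;
    char    *data;
    size_t   size;
    struct vnode *prev, *next;
};

struct vcache {
    struct vnode *head;   /* most recently used  */
    struct vnode *tail;   /* least recently used */
};

/* Unlink a node and reinsert it at the head (mark it most recently used). */
static void touch(struct vcache *c, struct vnode *n) {
    if (c->head == n) return;
    if (n->prev) n->prev->next = n->next;
    if (n->next) n->next->prev = n->prev;
    if (c->tail == n) c->tail = n->prev;
    n->prev = NULL;
    n->next = c->head;
    if (c->head) c->head->prev = n;
    c->head = n;
    if (!c->tail) c->tail = n;
}

static struct vnode *lookup(struct vcache *c, uint64_t addr) {
    for (struct vnode *n = c->head; n; n = n->next)
        if (n->addr == addr) return n;
    return NULL;
}

static void insert(struct vcache *c, uint64_t addr, const char *buf, size_t len) {
    struct vnode *n = calloc(1, sizeof(*n));
    n->addr = addr;
    n->data = malloc(len);
    memcpy(n->data, buf, len);
    n->size = len;
    n->next = c->head;
    if (c->head) c->head->prev = n;
    c->head = n;
    if (!c->tail) c->tail = n;
}

/* Figure 3: write request, coalesce with an existing entry or insert. */
void vcache_write(struct vcache *c, uint64_t addr, const char *buf, size_t len) {
    struct vnode *n = lookup(c, addr);
    if (n && n->size == len) {
        memcpy(n->data, buf, len);   /* replace the stale payload          */
        touch(c, n);                 /* mark as most recently used         */
    } else {
        insert(c, addr, buf, len);   /* new entry becomes most recent      */
    }
}

/* Figure 4: read request. A hit returns cached data; a miss must go to disk
 * (read_from_disk is a placeholder for the Tapdisk I/O path). */
int vcache_read(struct vcache *c, uint64_t addr, char *out, size_t len,
                int (*read_from_disk)(uint64_t, char *, size_t)) {
    struct vnode *n = lookup(c, addr);
    if (n && n->size == len) {
        memcpy(out, n->data, len);
        touch(c, n);
        return 0;                    /* cache hit */
    }
    if (read_from_disk(addr, out, len) != 0)
        return -1;
    insert(c, addr, out, len);       /* populate the cache for future reads */
    return 0;
}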
Besides dealing with new requests, refreshing the V-cache is also a
key aspect of FIOS. Because memory is a scarce resource, the
strategy for refreshing the cache is of significant importance to the
performance of FIOS. We describe some key factors
associated with refreshing as follows.
Firstly, we need two thresholds: one is the upper limit, the other
is the lower limit. They indicate when to begin the
refresh operation and when to stop: once the utilization of the
cache reaches its upper limit, the "FlushCtrl" thread in the VMM
starts to flush the data in the V-cache until the utilization of the cache
reaches its lower limit. How to determine the values of these two
thresholds is discussed later in this paper.
Secondly, a replacement algorithm is indispensable. We choose LRU
(Least Recently Used) [9] as the replacement algorithm of the
V-cache, since it exploits the locality of reference of programs,
which enhances the hit rate of the V-cache and thereby improves the
overall I/O performance of the whole system.
Thirdly, there are particular occasions on which refreshing should
start immediately. 1) If a virtual machine shuts down normally,
the I/O requests in the corresponding V-cache should be disposed of in
time and the V-cache should be released, in order to avoid
unnecessary data corruption which may lead to damage of the file
system. 2) If a virtual machine crashes abruptly, the I/O requests in
its V-cache must be processed as soon as possible to guarantee
the integrity of the file system and avoid damage to the disk
image. 3) Before a virtual machine begins to migrate to another
VMM, the remaining requests in its V-cache should be flushed
immediately for the sake of ensuring the integrity and
consistency of the data on the physical disk.
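The three situations above can be folded into a single event handler. The sketch below builds on the V-cache structure from the previous sketch; the event names and the flush/release helpers are hypothetical placeholders, not Xen interfaces.

/* Hypothetical helpers: synchronously write out and free a V-cache. */
struct vcache;
void vcache_flush_all(struct vcache *c);
void vcache_release(struct vcache *c);

enum vm_event { VM_SHUTDOWN, VM_CRASHED, VM_MIGRATING };

void on_vm_event(struct vcache *c, enum vm_event e)
{
    switch (e) {
    case VM_SHUTDOWN:  /* dispose of pending requests, then free the cache     */
        vcache_flush_all(c);
        vcache_release(c);
        break;
    case VM_CRASHED:   /* flush as soon as possible to keep the image consistent */
        vcache_flush_all(c);
        break;
    case VM_MIGRATING: /* drain before migration so on-disk state is complete  */
        vcache_flush_all(c);
        break;
    }
}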
Last but not least, we create several "FlushCtrl" threads in
FIOS which are responsible for flushing the data in the V-caches. The
number of these threads varies between 0 and 5 according to the
utilization of the V-caches. "FlushCtrl" copies all the data from the
corresponding V-cache into its own data space, so that the
V-cache can be cleaned up and made ready for further use. However,
we should not ignore a serious problem caused by this method:
how to guarantee data consistency; that is to say, what happens if
the I/O thread in a virtual machine needs to read some data which
is in neither the V-cache nor the physical disk? In this case, the VMM has
to get the data from the data space of the corresponding "FlushCtrl"
thread. Due to program locality and the LRU
algorithm we adopt, the probability of the situation
described above is rather small, so the overall I/O
performance of FIOS is not affected seriously.
3.2.4 Avoiding excessive memory consumption
Compared with the physical disk, memory is so scarce that we cannot
expand the capacity of the V-cache without limit; otherwise, the
system would collapse because of excessive memory consumption.
Therefore, we have designed a time-sharing strategy for the
usage of physical disk bandwidth.
We divide I/O intensive VMs into two categories: short-term
and long-term. The former refers to domains that produce a
large amount of I/O requests in a short time, which the V-cache is
able to accommodate, while the latter refers to VMs that
constantly produce requests for a long time, which the V-cache
cannot sustain.
As shown in figure 5, for a short-term I/O intensive VM, we split its
V-cache into several parts (e.g., part 1, part 2, and part 3 as
illustrated in figure 5) and attach each part to the other,
non-I/O-intensive virtual machines in order to increase their capacity
for accommodating I/O requests. We then allocate all the bandwidth
of the physical disk to this domain. This has several
advantages. On one hand, allocating all the I/O bandwidth of the
physical disk to the short-term I/O intensive VM guarantees that
the torrent of I/O requests from this domain can be handled in a
short period of time. On the other hand, the V-caches of the other,
non-I/O-intensive VMs are unlikely to reach their upper limits during this
short period because of their increased capacity; that is to
say, these VMs do not need any physical disk bandwidth.
Therefore, the short-term I/O intensive virtual machine will not
affect the I/O performance of the other, non-I/O-intensive VMs.

Figure 5. Dealing with a short-term I/O intensive domain.
As a short-term I/O intensive domain continues to run, it will
become a long-term one. In this situation, described by figure 6,
we first rebuild the V-cache (V-cache1) for this domain to
prevent its future consumption of disk bandwidth. Secondly, we stop
handling the I/O requests in V-cache1 and allocate all the physical
disk bandwidth to the other, non-I/O-intensive VMs. When the utilization
of V-cache1 reaches its upper limit, the state of the physical disk is
checked: if it is not busy, we stop disposing of the I/O requests in
the V-caches of the non-I/O-intensive VMs and allocate all the
bandwidth of the physical disk to the I/O intensive VM until the
non-I/O-intensive VMs need to refresh their V-caches again; if the
physical disk is busy, we suspend the I/O intensive
virtual machine until the physical disk is free. Therefore, the
priority of the I/O requests from the non-I/O-intensive VMs can be
raised, so that their I/O performance is not affected by the
long-term I/O intensive VM.

Figure 6. Dealing with a long-term I/O intensive domain.
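One way to read the policy above is as a per-domain decision routine. The sketch below illustrates only the control flow; every helper it calls (classification, V-cache redistribution, bandwidth grants, suspension) is a hypothetical placeholder for the corresponding mechanism described in the text.

struct domain;  /* opaque handle for a guest VM */

enum vm_kind { VM_NORMAL, VM_SHORT_TERM_INTENSIVE, VM_LONG_TERM_INTENSIVE };

/* Hypothetical helpers standing in for the mechanisms described above. */
enum vm_kind classify(struct domain *d);
void redistribute_vcache(struct domain *d);  /* lend V-cache slices to normal VMs */
void rebuild_vcache(struct domain *d);
int  vcache_full(struct domain *d);
int  disk_idle(void);
void grant_full_bandwidth(struct domain *d);
void suspend_domain(struct domain *d);

void schedule_disk_bandwidth(struct domain *d)
{
    switch (classify(d)) {
    case VM_SHORT_TERM_INTENSIVE:
        /* Enlarge the other domains' caches and drain this burst at full speed. */
        redistribute_vcache(d);
        grant_full_bandwidth(d);
        break;
    case VM_LONG_TERM_INTENSIVE:
        /* Rebuild its V-cache, stop flushing it, and serve it only when the
         * disk is otherwise idle; suspend it if the disk stays busy. */
        rebuild_vcache(d);
        if (vcache_full(d)) {
            if (disk_idle())
                grant_full_bandwidth(d);
            else
                suspend_domain(d);
        }
        break;
    default:
        break;  /* normal domains keep their fair share */
    }
}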
4. IMPLEMENTATION
In this section, we first introduce the Xen VMM and Blktap, on
which FIOS is based. We then describe the
implementation of the V-cache in detail.
4.1 Xen and Blktap
In order to avoid the overhead on virtual machine performance
caused by instruction translation and simulation, the Xen [11]
hypervisor offers a split driver model which allows the guest
OS to access the real device efficiently with the help of Domain
0 [7]. As illustrated in figure 7, the physical disk driver is
divided into two parts: the one residing in the guest OS is called the
front-end, and the other, located in domain 0, is called the back-end.
The two ends communicate with each other through event
channels and an I/O control ring, which is a memory page shared
by both ends.
Figure 7. Split driver model in Xen.
When the guest OS issues I/O requests, the front-end handles them by
putting them into the I/O control ring and notifying domain 0 to handle
these requests through the event channel in the VMM. Domain 0 is
responsible for transferring data between the I/O buffers and the physical
disk.
Blktap [12] corresponds to the back-end of the disk driver residing in
domain 0. Besides the kernel I/O control ring shared with the
front-end, Blktap is also equipped with an I/O ring shared with the
user-space thread Tapdisk, which provides users with I/O
operation interfaces such as open, close, read, and write.
Blktap maps I/O requests in the kernel ring to the user ring; Tapdisk
fetches the requests from the user ring, issues new file I/O
operations in the specified manner, and then submits them to the
kernel of Dom0, just as a process does during common disk
I/O.
Tapdisk has many unique advantages since it resides in the user
space of domain 0. Firstly, metadata disk formats such as
copy-on-write, encrypted disks, sparse formats, and other
compression features can be easily implemented. Secondly, it
facilitates the development of soft devices. Thirdly, it allows soft
devices to be constructed as user-space applications in a virtual
machine, and developers can work with high-level languages and
debuggers [13]. Tapdisk opens the image file of the virtual machine
with the O_DIRECT flag in order to guarantee the semantics
of I/O operations in virtual machines; that is to say, when the
virtual machine triggers a flush operation, it must wait until the
data reaches the physical disk.
Therefore, a heavy disk burden can be expected whenever an
I/O intensive domain exists.
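For reference, opening an image file with O_DIRECT, as Tapdisk does according to [13], looks like the following in C; the file path is only an example.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* O_DIRECT bypasses the page cache of domain 0, so a completed write
     * really is on the physical disk. This preserves the guest's flush
     * semantics but makes every flush pay the full disk latency. */
    int fd = open("/path/to/guest-image.img", O_RDWR | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* ... I/O with sector-aligned buffers goes here ... */
    close(fd);
    return 0;
}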
4.2 V-Cache based on Xen and Blktap
Our implementation is based on the Xen hypervisor and makes use of
Blktap. We changed the flow of processing I/O requests by
modifying Blktap.
A V-cache is established in the VMM. Instead of forwarding I/O
requests directly to the physical disk driver, FIOS puts these requests
into the corresponding V-cache and then returns as if the
handling of the I/O operation had completed. I/O requests stored
in the cache are handled by "FlushCtrl" some time later.
In order to guarantee the correctness of I/O operations, items
stored in the cache should include the content, the size, and the
physical disk address of the I/O data. Besides, since the size of the
V-cache is limited, its content must be refreshed in time to avoid
overflow of the V-cache. We choose a doubly linked list supplemented
with an AVL tree to organize the data structure of the V-cache, as
shown in figure 8. Every node in the linked list represents an I/O
request. We discuss the benefits brought by these data
structures in detail later in this paper.

Figure 8. Organization of V-cache.
Each V-cache is represented by a head node of a doubly linked list
and a root node of an AVL tree. The reason for supplementing the
V-cache with an AVL tree is as follows: if the virtual machine
issues I/O requests frequently, the doubly linked list
becomes very large, and searching for the corresponding node
in this list becomes a costly task which would seriously degrade the
performance of FIOS. So we use another data structure, the AVL
tree [14], to handle the searching operation. For an AVL tree
with N nodes, the average search complexity is O(logN),
which is far lower than the O(N) of a linked list, especially
when N is very large. However, in order to decrease the data
redundancy in the V-cache, we do not duplicate the nodes of the linked
list in the AVL tree. Instead, we only put into each AVL tree node a pointer
which indicates the memory address of the corresponding node in the doubly linked
list; each pointer is a four-byte integer on a 32-bit machine.
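Written as C declarations, the node layouts sketched in figure 8 look roughly as follows. The field names follow the figure; the left/right/height fields of the tree node are the usual AVL bookkeeping, which the figure does not show.

/* Doubly linked list node: one cached I/O request (figure 8). */
struct list_node {
    int   addr;        /* physical disk address of the request */
    char *content;     /* I/O payload                          */
    int   data_size;   /* payload size in bytes                */
    struct list_node *previous;
    struct list_node *next;
};

/* AVL index node: no payload is duplicated, only a pointer back into the
 * list (stored in the figure as ListNodeAddr, a four-byte integer on a
 * 32-bit machine) together with the disk address used as the search key. */
struct tree_node {
    int addr;                     /* search key: disk address    */
    struct list_node *list_node;  /* corresponds to ListNodeAddr */
    struct tree_node *left, *right;
    int height;                   /* AVL balance information     */
};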
Refreshing the cache is an inefficient but inevitable operation.
We take the following measures to improve the efficiency of
refreshing. Firstly, requests are combined in the V-cache. Although the
operating system does the same work before submitting requests
to its disk queue, the maximum queue length only goes up to
100 or 1000 entries [15], which is much smaller than the capacity of the
V-cache, so combination in the V-cache is more
effective. Secondly, requests in the V-cache are sorted according to
their disk addresses so that disk rotation can be reduced. Thirdly,
a new thread called "FlushCtrl" is created to perform
asynchronous refreshing.
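The second measure, sorting by disk address before a flush, can be as simple as one qsort pass over a snapshot of the cache; the sketch below assumes the list_node layout shown earlier and a snapshot array prepared by FlushCtrl.

#include <stdlib.h>

/* Order two cached requests by physical disk address so that a flush walks
 * the disk largely in one direction and reduces seek and rotation overhead. */
static int by_addr(const void *a, const void *b)
{
    const struct list_node *x = *(const struct list_node *const *)a;
    const struct list_node *y = *(const struct list_node *const *)b;
    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* snapshot: array of node pointers copied out of a V-cache by FlushCtrl. */
void sort_snapshot(struct list_node **snapshot, size_t n)
{
    qsort(snapshot, n, sizeof(*snapshot), by_addr);
}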
As mentioned before, the thresholds of the V-cache are significant
factors in deciding when to flush. However, deciding the
appropriate values of the thresholds is a hard problem and, more
importantly, their optimal values depend heavily on the
workload. In a situation of light I/O, the upper limit of the
V-cache should be larger so that the cache can store more
requests before a flush. On the other hand, in a situation of
heavy I/O, the upper limit should be smaller to avoid
overflow of the cache [16].
We take a rate-based approach to set the value of the upper limit.
Suppose that a(t) is the arrival rate of I/O requests at the V-cache at a
particular time t. If a(t) > a(t-1), which means the arrival rate of I/O
requests is increasing, the value of the upper limit should be decreased.
On the contrary, if a(t) < a(t-1), which means the arrival rate of I/O
requests is decreasing, the value of the upper limit should be increased.
So we calculate the value of the upper limit as follows:

    h(t) = h(t-1) * (a(t-1) / a(t))    (1)

The value of the lower limit is decided not only by the arrival rate
at the previous time but also by the flushing rate, in other words,
by how fast I/O requests can be disposed of by the physical driver.
Suppose f(t) is the flushing rate. When f(t) is faster than the
arrival rate, the value of the lower limit can be larger in order to
make the flushing less aggressive. When f(t) is lower than
the arrival rate, the value of the lower limit should be smaller to
prevent overflow of the V-cache. So we calculate the value of the
lower threshold as follows:

    l(t) = l(t-1) * (a(t-1) / a(t)) * (f(t) / f(t-1))    (2)

Experiments with different functions were carried out to
adjust the values of h(t) and l(t), and we found that these
two simple schemes perform well in practice and have low
computational requirements [16].
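Transcribed directly into code, formulas (1) and (2) amount to two multiplications per sampling interval; the guards against division by zero below are an assumption of ours.

/* Adapt the V-cache thresholds from the measured arrival rate a(t) and the
 * flushing rate f(t), following formulas (1) and (2). Rates are measured in
 * requests (or bytes) per sampling interval. */
void adapt_thresholds(double *h, double *l,
                      double a_prev, double a_now,
                      double f_prev, double f_now)
{
    if (a_now > 0.0)
        *h *= a_prev / a_now;                      /* formula (1) */
    if (a_now > 0.0 && f_prev > 0.0)
        *l *= (a_prev / a_now) * (f_now / f_prev); /* formula (2) */
}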
5. SIMULATION
In this section, queueing theory is used to analyze and compare
the interference of I/O performance caused by I/O intensive
domains in the Xen VMM and in FIOS. Before deducing the
performance parameters of the models, the following
assumptions are made:
1) The arrival process of I/O requests conforms to the Poisson
distribution, and we assume the arrival rate is λ. The arrival of I/O
requests in each VM obeys the following formula:

    P{N(t+s) - N(t) = k} = ((λs)^k / k!) * e^(-λs),  k = 0, 1, 2, ...    (3)

We also assume that the arrival rate of an I/O intensive VM is T
times that of a non-I/O-intensive VM; usually, T is much larger than 1.
2) The service of the virtual disk queues by the physical disk driver
conforms to the exponential distribution, and the service rate is μ.
3) The physical disk scheduler is CFQ (Complete Fair Queuing),
which is the default and most popular scheduler in domain 0.

Figure 9. I/O service model in Xen with Blktap.

As shown in figure 9, if the number of VMs is N, the I/O service
can be modeled as follows: each virtual disk queue receives an equal
share μ/N of the disk service rate and behaves as an M/M/1 queue with
arrival rate λ. According to queueing theory, the average virtual disk
queue length of each virtual machine can be calculated by formula (4):

    Ls = Σ_{k=0}^{∞} k·p_k = ρ / (1 - ρ) = Nλ / (μ - Nλ)    (4)

where p_k is the steady-state probability of k requests in the queue and
ρ = λ / (μ/N) = Nλ/μ, which represents the utilization of the I/O
bandwidth. The average response time of each I/O request is determined
by formula (5):

    Ws = Ls / λ = N / (μ - Nλ)    (5)

5.1 I/O Performance Interference in Xen
When an I/O intensive VM starts to run, both the average
length of the virtual disk queue and the average response time of
each I/O request are affected. Figure 10 describes the model
which reflects the arrival of an I/O intensive VM.

Figure 10. I/O service model in Xen when an I/O intensive VM arrives.

With N-1 non-I/O-intensive VMs of arrival rate λ and one I/O
intensive VM of arrival rate Tλ, the aggregate arrival rate at the
physical disk becomes (N+T-1)λ, and the utilization of the I/O
bandwidth rises to

    ρ1 = (N+T-1)λ / μ    (6)

Therefore, according to formula (6), the average length of the
virtual disk queue of each VM, Ls1, and the average response
time of each I/O request, Ws1, can be calculated as in formula (7)
and formula (8):

    Ls1 = ρ1 / (1 - ρ1) = (N+T-1)λ / (μ - (N+T-1)λ)    (7)

    Ws1 = Ls1 / λ = (N+T-1) / (μ - (N+T-1)λ)    (8)

Thus, given the same values of the parameters λ, μ, and N,
Ls1 > Ls and Ws1 > Ws, and the average virtual disk queue length
and the average response time of the I/O requests are affected by the
value of the parameter T: the larger T is, the larger Ls1 and Ws1
become. In other words, the I/O performance of the non-I/O-intensive
VMs is degraded by the arrival of the I/O intensive domain.

5.2 I/O Performance Interference in FIOS

Figure 11. I/O service model in FIOS when an I/O intensive VM arrives.

Figure 11 describes the model in FIOS. I/O requests can be
stored in the V-cache; that is to say, they do not have to reach
the physical disk before returning. Thus, in FIOS the service
rate is μ' = Mμ, where M represents the speed ratio of memory
access to disk access. Since each VM is equipped with an
independent V-cache, when an I/O intensive VM arrives, Ls and Ws
in FIOS can be calculated by the following formulas:

    ρ' = Nλ / μ' = Nλ / (Mμ)    (9)

    Ls' = ρ' / (1 - ρ') = Nλ / (Mμ - Nλ)    (10)

    Ws' = Ls' / λ = N / (Mμ - Nλ)    (11)

Given the same parameters λ and μ, by comparing (7) with (10) and
(8) with (11), we notice that Ls' < Ls1 and Ws' < Ws1. It is clear that in
FIOS the I/O performance interference caused by an I/O intensive VM is
decreased. Moreover, by comparing (4) with (10) and (5) with (11), we
find that Ls' < Ls and Ws' < Ws, so it is reasonable to predict that the
I/O performance of non-I/O-intensive VMs in FIOS will be better than in Xen.
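To make the comparison tangible, the short program below evaluates the formulas as reconstructed above for one arbitrary, illustrative parameter setting (λ = 5, μ = 100, N = 4, T = 8, M = 20); the numbers carry no experimental meaning.

#include <stdio.h>

int main(void)
{
    double lambda = 5.0, mu = 100.0;  /* arbitrary illustrative rates */
    double N = 4.0, T = 8.0, M = 20.0;

    /* Xen without an I/O intensive VM: formulas (4)-(5). */
    double Ls  = N * lambda / (mu - N * lambda);
    double Ws  = N / (mu - N * lambda);

    /* Xen when an I/O intensive VM of rate T*lambda arrives: formulas (7)-(8). */
    double Ls1 = (N + T - 1.0) * lambda / (mu - (N + T - 1.0) * lambda);
    double Ws1 = (N + T - 1.0) / (mu - (N + T - 1.0) * lambda);

    /* FIOS, where the V-cache raises the service rate to M*mu: formulas (10)-(11). */
    double Lsp = N * lambda / (M * mu - N * lambda);
    double Wsp = N / (M * mu - N * lambda);

    printf("Xen:                Ls  = %.3f  Ws  = %.4f\n", Ls, Ws);
    printf("Xen + intensive VM: Ls1 = %.3f  Ws1 = %.4f\n", Ls1, Ws1);
    printf("FIOS:               Ls' = %.3f  Ws' = %.4f\n", Lsp, Wsp);
    return 0;
}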
6. EVALUATION
In this section, we first describe the key metrics used to evaluate
the performance of FIOS. Then we introduce the corresponding
benchmarks used in the experiments. Finally, we describe the
evaluation steps, and the results of the experiments are presented
and analyzed.
6.1 Key Metrics
Generally speaking, we choose three primary criteria to
evaluate the performance of FIOS: throughput, latency, and
fairness.
Throughput is the amount of I/O data processed by the system in a
unit of time. In FIOS, the throughput is measured by the specific I/O
benchmarks described below.
Latency is defined simply as the time that the system spends
completing a single I/O operation. It is also an important metric
for estimating the I/O performance of a virtual machine.
Fairness is the equality of the throughput divided among the different
running virtual machines [16]. We use Jain's fairness measure to
quantify the fairness between virtual machines. Jain's fairness
measure [17] ranges between 0, which means completely unfair,
and 1, which means completely fair. Jain's fairness is defined as:

    fairness = (Σ_{i=1}^{n} X_i)^2 / (n * Σ_{i=1}^{n} X_i^2)    (12)

where X_i is the throughput of virtual machine i.
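Formula (12) is Jain's index, which is straightforward to compute from the per-VM throughputs; the sample values below are illustrative only.

#include <stdio.h>

/* Jain's fairness index over the per-VM throughputs x[0..n-1], formula (12). */
double jain_fairness(const double *x, int n)
{
    double sum = 0.0, sum_sq = 0.0;
    for (int i = 0; i < n; i++) {
        sum    += x[i];
        sum_sq += x[i] * x[i];
    }
    return (sum * sum) / (n * sum_sq);
}

int main(void)
{
    /* Example: three VMs with nearly equal throughput (MB/sec). */
    double x[] = { 125.0, 123.0, 127.0 };
    printf("fairness = %.4f\n", jain_fairness(x, 3));
    return 0;
}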
6.2 Experimental Steps
All results in this paper were collected on an Intel Xeon
platform with two 1.6 GHz processors, 4 GB of RAM, and a 160 GB
7200 RPM SATA II hard drive (ST3160815AS). The
Linux 2.6.18.8 kernel was used throughout. Each virtual
machine was allocated 40 GB of the 160 GB physical disk and
used ext3 in ordered mode as its file system. The virtual
disks were created in contiguous space on the physical disk to
minimize seek time when performing I/O operations on
different virtual disks. In order to avoid the effect of the memory
cache in the virtual machines, we allocated each virtual machine
only 256 MB of memory, which is small enough to trigger disk
operations during I/O processing. Dom0 runs a 64-bit CentOS
5.4 distribution and the hypervisor is Xen 3.4.3.
Clearly, different workloads have different disk access patterns,
and there is no single I/O system that is optimal for all
workloads [18]. Thus we selected two typical benchmarks to
evaluate the performance of FIOS: IOZone and DBench.
IOZone [20] is a file system benchmark tool. It generates and
measures a variety of file operations to test file I/O performance.
DBench [21] is a tool that generates I/O workloads against either a file
system or a networked server. The workload can be specified
by a configuration file in the DBench working directory, which
consists of a mixture of file system operations.
Table 1 shows the behavior of the I/O subsystem in the original
Xen VMM while running IOZone to read/write a 2 GB file
and while running DBench, respectively, in a domain configured with
256 MB of memory.
Table 1. Disk bandwidth consumption of different benchmarks

Benchmark | Average queue length | Average waiting time | I/O bandwidth utilization
IOzone    | 142.33               | 1167.05 ms           | 99.00%
DBench    | 2.05                 | 0.86 ms              | 24.29%
According to Table 1, when running IOZone the average
physical disk utilization can reach 99%, while when running
DBench the average physical disk utilization is about 25%.
Therefore, we use IOZone to simulate an I/O intensive VM
and DBench to simulate a non-I/O-intensive VM.
6.3 Results
We carried out the experiments by comparing the throughput,
latency, and fairness of I/O operations in virtual machines
running on the Xen VMM and on FIOS.
To begin with, three non-I/O-intensive VMs are run on the Xen
VMM and on FIOS respectively, and the average throughput
measured by DBench is collected. While they are running,
another VM executing IOZone is started.
Figure 12 shows, firstly, that when running on Xen, the I/O
throughput of all the non-I/O-intensive domains is largely degraded
by the advent of the I/O intensive VM at about the 200th second,
decreasing from about 125 MB/sec to 85 MB/sec, a nearly
32% loss. However, when running on FIOS, there is only a very
slight decrease in the I/O throughput of the non-I/O-intensive
VMs. This is largely due to the V-cache we have
created in FIOS, which intercepts and stores the I/O
requests from the VMs. Secondly, comparing the two sets of curves
in figure 12, even without the interference of the I/O intensive
virtual machine, the I/O throughput of the non-I/O-intensive virtual
machines running on FIOS is about 225 MB/sec, while on
the original Xen VMM the I/O throughput is about
125 MB/sec, which is almost 45% lower than the former.
The reason is that in FIOS, I/O operations return
immediately when they reach the corresponding V-cache and
do not have to wait until the data has been written to the
physical disk.
Figure 12. Non-I/O-intensive VMs running on Xen and FIOS, with interference from an I/O intensive domain.
We can observe the fairness of the VMs' I/O throughput in
figure 13: the value of fairness is 0.99 on both Xen and FIOS.
This is because the default scheduling algorithm in Xen is CFQ
(Completely Fair Queuing), and our modification does not alter
this property.

Figure 13. Fairness of VMs' I/O performance.
Figure 14. I/O latency of VMs on Xen.
Figure 15. I/O latency of VMs on FIOS.
Figures 14 and 15 show the latency of I/O operations in the VMs
running on Xen and on FIOS, respectively. Comparing the two
figures, it is obvious, first of all, that the lines in figure 14 are
significantly serrated, while the lines in figure 15 are much smoother
apart from a few exceptions. Each point on a line indicates the
latency of an I/O operation, and the sawtooth pattern in figure 14
indicates jitter in I/O performance. Besides, the average I/O
latencies reflected in figure 14 are far larger than those reflected in
figure 15, which demonstrates that even without an I/O intensive VM,
interference among normal domains is still significant and largely
reduces the VMs' I/O performance. Finally, when the I/O intensive VM is
started, the I/O latencies in figure 14 suffer an obvious increase,
which is sustained until the I/O intensive VM is stopped. However,
in figure 15 the latencies remain nearly the same except for a few
sharp increases. These increases occur because a certain amount of
time is needed to identify the I/O intensive VM.
Figure 16. Different numbers of VMs running on Xen and FIOS, with interference from an I/O intensive domain.
In another experiment, different numbers (1, 2, 3, 4) of non-I/O-intensive
VMs are run on the VMM to see how the
number of VMs affects the performance of FIOS. Again, an I/O
intensive VM is started at about the 60th second. As
shown in figure 16, when running on Xen, the average I/O
throughput per VM decreases by more than 50%, from about
170 MB/sec to 80 MB/sec, as the number of these machines
increases from 1 to 4. However, when running on FIOS, the case
is different: we can hardly notice any decline in the I/O throughput
of the non-I/O-intensive VMs. This is because in FIOS every VM
has its own V-cache, which acts as a cushion for I/O operations.
Figure 17. Different numbers of VMs running on FIOS and Xen.

In the third experiment, various numbers of non-I/O-intensive
VMs are run on Xen and FIOS respectively; this time there
is no interference from an I/O intensive VM. This experiment is
designed to discover the I/O performance interference among the
non-I/O-intensive VMs themselves. Figure 17 shows that when running on
FIOS, the I/O throughput of these virtual machines
decreases only about 6%, from 230 MB/sec to 216 MB/sec,
as the number of normal VMs increases from 1 to 5. On the
contrary, when running on Xen, the I/O throughput suffers
a significant decrease of about 75%, from 171 MB/sec to 43 MB/sec.
This reveals that a newly arriving VM, whether I/O intensive or
not, will affect the I/O performance of the existing domains as long
as it performs I/O operations, whereas in FIOS the virtual machines
interfere with each other only slightly.
7. CONCLUSION AND FUTURE WORK
Disk I/O is a time-consuming operation. When multiple domains
share the same disk resource, the I/O performance interference
among them is conspicuous because of the limited disk
bandwidth. We have demonstrated that when 5 VMs run
simultaneously on the same VMM, their I/O performance is only
25% of that achieved when there is only one running VM. Worse still,
when there exists an I/O intensive VM, the I/O performance of the
other domains is seriously degraded.
Our implementation effectively avoids the interference of
I/O performance among VMs, and in particular prevents the severe
degradation caused by I/O intensive domains. In our system, when 5
VMs run simultaneously on the same VMM, their I/O
performance reaches nearly 94% of that achieved when only one
domain is running. Also, the I/O performance of non-I/O-intensive
VMs is not seriously affected by the advent of an I/O
intensive domain.
Furthermore, when running on FIOS, the VMs' I/O performance
is improved significantly in comparison with running on the
Xen VMM.
Moreover, we believe that our implementation can be easily and
conveniently applied to other virtualization infrastructures.
In the future, our studies will focus on improving the flexibility
and stability of the V-cache in FIOS. When multiple I/O intensive
domains run simultaneously on the VMM, the I/O traffic they
produce may be so heavy that it exceeds the capacity of FIOS,
which can easily lead to overflow of the V-caches and cause a
significant decrease in the overall performance of FIOS. Therefore,
further cache management strategies are needed to deal with this
situation.
We also plan to employ SSDs as the physical storage in this
system. It is well known that, compared with a traditional disk,
an SSD has many advantages such as higher performance and
lower energy consumption, but whether SSDs can be well
adapted to virtualization environments is an interesting and
challenging issue.
8. ACKNOWLEDGMENTS
This work is supported by the China National Natural Science
Foundation (NSFC) (No. 60973133) and the MoE-Intel Information
Technology Special Research Foundation under grant No.
MOE-INTEL-10-05.
9. REFERENCES
[1] Che, J., He, Q., Gao, Q., and Huang, D. 2008. Performance
measuring and comparing of virtual machine monitors. In
Proceedings of Embedded and Ubiquitous Computing
(EUC). 381-386.
[2] Liu, J., Huang, W., Abali, B. and Panda, D.K. 2006. High
performance VMM-bypass I/O in virtual machines. In
Proceedings of the annual conference on USENIX. 29-42.
[3] Deshane, T., McCabe, M. and Neefe, J. Performance
isolation of a misbehaving virtual machine with Xen,
VMware and Solaris containers.
http://people.clarkson.edu/~jnm/publications/isolationOfMisbehavingVMs.pdf.
[4] Ongaro, D., Cox, A.L. and Rixner, S. 2008. Scheduling I/O
in virtual machine monitors. In Proceedings of the 4th
ACM SIGPLAN/SIGOPS international conference on
Virtual execution environments (VEE). ACM, New York,
NY, 1-10.
[5] Boutcher, D. and Chandra, A. 2010. Does virtualization
make disk scheduling passé? ACM SIGOPS Operating
Systems Review. Vol. 44, 20-24.
[6] Gupta, D., Cherkasova, L., Gardner, R. and Vahdat, A.
2006. Enforcing performance isolation across virtual
machines in Xen. In Proceedings of International
Conference on Middleware. 342-362.
[7] Fraser, K., Hand, S., Neugebauer, R., Pratt, I., Warfield, A.
and Williamson, M. 2004. Safe hardware access with the
Xen virtual machine monitor. In Proceedings of the 1st
Workshop on Operating System and Architectural Support
for the on demand IT InfraStructure (OASIS).
[8] Pasquale, B.K. and Polyzos, G.C. 1994. Dynamic I/O
characterization of I/O intensive scientific applications. In
Proceedings of the 1994 ACM/IEEE conference on
Supercomputing. ACM, New York, NY, 660-669.
[9] Chrobak, M. and Noga, J. 1998. LRU is better than FIFO. In
Proceedings of the 9th ACM-SIAM Symposium on Discrete
Algorithms (SODA). Society for Industrial and Applied
Mathematics, Philadelphia, PA, USA, 78-81.
[10] Clark, C., Fraser, K., Hand, S., Hansen, J.G., Jul, E.,
Limpach, C., Pratt, I. and Warfield, A. 2005. Live
migration of virtual machines. In Proceedings of the 2nd
conference on Symposium on Networked Systems Design &
Implementation (NSDI). Vol. 2. USENIX Association,
Berkeley, CA, USA, 273-286.
[11] Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T.,
Ho, A., Neugebauer, R., Pratt, I., and Warfield, A. 2003.
Xen and the art of virtualization. In Proceedings of the 19th
ACM symposium on Operating systems principles (SOSP).
ACM, New York, NY, 164-177.
[12] Blktap. http://wiki.xensource.com/xenwiki/blktap.
[13] Warfield, A., Hand, S., Fraser, K. and Deegan, T. 2005.
Facilitating the development of soft devices. In
Proceedings of the annual conference on USENIX Annual
Technical Conference. 379-382.
[14] Larsen, K.S. 1994. AVL trees with relaxed balance. In
Proceedings of the 8th international parallel processing
symposium. 888-893.
[15] Ruemmler, C. and Wilkes, J. 1993. UNIX disk access
patterns. In Proceedings of Winter 1993 USENIX. 405-420.
[16] Batsakis, A., Burns, R., Kanevsky, A., Lentini, J. and
Talpey, T. 2008. AWOL: an adaptive write optimizations
layer. In Proceedings of the 6th USENIX Conference on
File and Storage Technologies. 67-80.
[17] Jain, R., Chiu, D.M . and Hawe,W.R. 1984. A quantitative
measure of fairness and discrimination for resource
allocation in shared computer system. Technical Report
TR-301, DEC Research.
[18] Hu, Y. and Yang, Q. 1996. DCD—disk caching disk: A
new approach for boosting I/O performance. In
Proceedings of the 23rd annual international symposium
on Computer architecture (ISCA). ACM, New York, NY,
169-178.
[19] Cherkasova, L. and Gardner, R. 2005. Measuring CPU
overhead for I/O processing in the Xen virtual machine
monitor. In Proceedings of USENIX Annual Technical
Conference.
[20] Norcott, W.D. 2001. IOZone. http://www.iozone.org.
[21] DBench. http://dbench.samba.org.
[22] Dong, Y., Dai, J., Huang, Z., Guan, H., Tian, K. and Jiang,
Y. 2009. Towards high-quality I/O virtualization. In
Proceedings of SYSTOR 2009: The Israeli Experimental
Systems Conference (SYSTOR). ACM, New York, NY,
12-19.
[23] Chadha, V., Illiikkal, R., Iyer, R., Moses, J., Newell, D. and
Figueiredo, R.J. 2007. I/O processing in a virtualized
platform: a simulation-driven approach. In Proceedings of
the 3rd international conference on Virtual execution
environments (VEE). ACM, New York, NY, USA,
116-125.
[24] Seelam, S.R. and Teller, P.J. 2006. Fairness and
performance isolation: an analysis of disk scheduling
algorithms. In Proceedings of IEEE International
Conference on Cluster Computing. 1-10.
[25] Seelam, S.R. and Teller, P.J. 2007. Virtual I/O scheduler: a
scheduler of schedulers for performance virtualization. In
Proceedings of the 3rd international conference on Virtual
execution environments (VEE). ACM, New York, NY,
105-115.
[26] Mei, Y., Liu, L., Pu, X. and Sivathanu, S. 2010.
Performance measurements and analysis of network I/O
applications in virtualized cloud. In Proceedings of the 3rd
International Conference on Cloud Computing. 59-66.