2014 IEEE 22nd International Conference on Network Protocols
Cross-VM Covert Channel Risk Assessment for
Cloud Computing: An Automated Capacity Profiler
Rui Zhang, Wen Qi, Jianping Wang
Department of Computer Science, City University of Hong Kong, Hong Kong SAR
Email: zhangrui.ray@gmail.com, qi.wen@my.cityu.edu.hk, jianwang@cityu.edu.hk
take some counter-measures against potential cross-VM covert
channels on their platforms.
In the literature, the risk of a cross-VM covert channel is
usually measured by the transmission rate achieved at a certain error
rate¹. For example, Xu et al. [4] reported one of their experimental
results with the L2 cache covert channel as follows. When the
transmission rates are 215.11 bps and 85.86 bps, the error rates
are 5% and 1%, respectively. Such quantitative measures only
capture the average risk of a covert channel. To a cloud service
provider, an estimate of the maximum transmission capacity
of a covert channel is often more important in assessing the risk
of information leakage.
The classic Shannon equation [6] has been widely used to
estimate the transmission capacity of communication channels. It is, however, hard to apply to
cross-VM covert channels. The main
reason is that most cross-VM covert channels have no precise
timer, which makes it hard to calculate the signal-to-noise
ratio (SNR) required by the classic Shannon equation. The Shannon
entropy formulation [7] was proposed to estimate the transmission
capacity of covert channels. It does not require the calculation of
SNR and is therefore suitable for estimating the transmission capacity of
cross-VM covert channels. Several challenges remain, however,
in applying the Shannon entropy formulation to cross-VM covert
channels. In Section II, we
elaborate on the Shannon entropy formulation and the
challenges of applying it to assess the capacity of cross-VM
covert channels. Here, we conceptually present two challenges.
Firstly, in order to apply the Shannon entropy formulation to
estimate the capacity of cross-VM covert channels, we need to
obtain the corresponding conditional probabilities which reflect
their symbol classification accuracy. The conditional probabilities are statistically estimated with general assumptions made
for all cloud platforms, which may lead to loose estimation of
cross-VM covert channel capacity on a specific cloud platform.
Thus, in order to assess the capacity of cross-VM covert
channels established on any particular cloud platform, we need
a method to automate the process of obtaining the conditional
probabilities specific to that cloud platform.
Secondly, in order to estimate the capacity of cross-VM
covert channels, we have to construct a covert channel and op-
Abstract—Cross-VM covert channels leverage physical resources shared between co-resident virtual machines, like CPU
cache, memory bus, and disk bus, to leak information. The
capacity of cross-VM covert channels varies on different cloud
platforms. Thus, it is hard for cloud service providers to estimate
the risk of information leakage caused by cross-VM covert
channels on their own platforms. In this paper, we develop an
Auto Profiling Framework of Covert Channel Capacity (APFC³)
to automatically profile the maximum capacities of various cross-VM covert channels on different cloud platforms. The framework
consists of automated parameter tuning for various cross-VM
covert channels to achieve high data rate and automated capacity
estimation of those cross-VM covert channels. We evaluate the
proposed framework by constructing fine-tuned cross-VM covert
channels on different virtualization platforms and comparing the
optimized achievable data rate with the estimated maximum capacity computed using the proposed framework. The experiments
show that in most cases, the capacity estimated using APFC³ is
very close to the achieved data rate of constructed covert channels
with fine-tuned parameters.
Index Terms—Cross-VM covert channel; Capacity estimation;
Shannon entropy
I. INTRODUCTION
Virtualization is one of the key technologies used in cloud
computing. Most virtualization technologies, such as KVM,
XEN, VMWare, and Hyper-V, can provide logical isolation
between co-resident virtual machines (VMs) on the same physical machine. However, virtualization only provides isolation at
the software level. Co-resident VMs still share common pools of
physical resources, e.g., disk bus, memory bus, and CPU cache.
These shared hardware components open the door to cross-VM
information leakage.
Among various cross-VM information leakage methods, such as
side channels [1] [2] and covert channels [3] [4] [5], cross-VM
covert channels are extremely hard to detect because the
information transmission goes unnoticed. In recent years, various cross-VM
covert channels have been reported [2]–[5] on public cloud
platforms. Despite such efforts, it is hard for a cloud service
provider to estimate the threat of information leakage on its
own cloud platform, as the capacity of a covert channel is
jointly determined by hardware platforms, hypervisors, and
workloads. Therefore, it is important to provide a quantitative
measure for evaluating the risk of cross-VM covert channels
on specific cloud platforms. With such a quantitative measure,
cloud service providers can decide whether it is worthwhile to
978-1-4799-6204-4/14 $31.00 © 2014 IEEE
DOI 10.1109/ICNP.2014.24
¹ The error rate is usually measured by repeatedly sending data
strings through the covert channel and counting the number of incorrectly received ones.
A. Cross-VM covert channels
timize its transmission quality through parameter tuning which
could be complex and cumbersome. Our preliminary studies
have shown that the capacity of cross-VM covert channels
varies dramatically with their parameter settings and hardware
environments, which indicates that it may not be practically
feasible to construct a covert channel with the maximum data
rate through brute-force methods.
To solve these issues, we propose a generic framework,
named APFC³, which can automatically profile the maximum
capacity of cross-VM covert channels. The main contributions
of this paper are summarized as follows.
• We propose a compact framework to estimate the transmission capacity of cross-VM covert channels. The proposed
framework consists of two essential components, namely,
parameter tuning for optimizing the covert channel and
machine learning for evaluating its maximum capacity.
The entire process can be orchestrated using a centralized
controller.
• We propose an automatable procedure to obtain the prerequisite for applying the Shannon Entropy formulation [7]
on estimating the capacity of cross-VM covert channels
established on any given cloud platform.
• We statistically model the noise of a cross-VM covert
channel under a specific cloud platform to eliminate the
covert channel implementations which perform poorly
and, hence, narrow down the parameter space.
• Using the fine-tuned covert channel implementations, we
collect a number of sample signals with their corresponding ground truth labels. Lightweight machine learning
tools are utilized to cross-validate the samples and, hence,
estimate the capacity of the covert channel.
• We evaluate the proposed framework on three cloud platforms and four types of cross-VM covert channels. The
results show that APFC³ produces capacity estimates
close to the data rates achieved by constructed covert
channels using fine-tuned parameters.
The rest of the paper is organized as follows. In Section II, we
introduce the background and discuss our preliminary efforts
on analyzing cross-VM covert channels. Then, we give an
overview of APFC³ in Section III. In Section IV, we elaborate on
the technical details of the proposed framework. In Section V,
we conduct extensive experiments to evaluate the proposed
framework. Existing works on cross-VM covert channels and
capacity assessment methods are reviewed in Section VI. Finally, we conclude the paper in Section VII.
Cross-VM covert channels can be established between two
VMs that co-reside on the same physical machine. If both
VMs share a hardware component of the machine, the usage
by one VM can unexpectedly affect the usage of the other
VM. Existing works [2]–[4] show that covert channels can be
established by intentional contention and contention detection
on the shared hardware components between two co-resident
VMs.
Covert channels can be categorized into timing-based covert
channels and storage-based covert channels. In this paper, we
focus on timing-based covert channels since most cross-VM
covert channels discovered in the literature are timing-based
channels [2]–[4].
The procedure of constructing a timing channel can be
generalized as follows.
• The sender encodes data by varying the time required for
performing an operation whose execution time is sensitive
to the contention from other co-resident VMs.
• The receiver monitors the system status by frequently
performing the operation and measuring the execution
time.
A possible implementation of the timing channel can be as
follows. The receiver divides its time into equal sized sampling
periods. For each sampling period, the receiver repeatedly
executes the contention sensitive operation and counts the
number of times it is executed. The sender divides its time
into much longer equal sized intervals. In each interval, the
sender attempts to transmit one data bit. To transmit a bit
“1”, the sender poses contention to the operation by repeatedly
executing it. To transmit a bit “0”, the sender stays idle.
There are mainly four types of cross-VM covert channels
discovered in the literature. We now briefly introduce these four
types of cross-VM covert channels which will be evaluated in
our framework.
II. BACKGROUND AND PRELIMINARIES
Before introducing the framework for estimating cross-VM
covert channel capacity, in this section, we first introduce
the basic principles in constructing various types of covert
channels. Then, we present our preliminary results to show
that the covert channel capacity is determined by hardware
platforms and parameter settings. We introduce the Shannon
entropy equation and discuss the challenges of applying it to
estimate the capacity of cross-VM covert channels.
• L2 cache covert channel, which uses the access to CPU L2
cache as the contention sensitive operation for establishing
the covert channel.
• CPU load covert channel, which can be established between VMs sharing CPU cores. CPU hungry operations
are used as the contention sensitive operation for establishing the covert channel.
• Memory bus covert channel, which utilizes memory bus
lock as the contention sensitive operation.
• Disk bus covert channel, which utilizes reading and writing to the disk as the contention sensitive operation.
B. Same covert channel implementation on different cloud
platforms
To demonstrate that the capacity of a covert channel varies
on different hardware platforms, we conduct the following
experiment. We implement the memory bus covert channel
introduced by Wu et al. [3] in three different test environments,
shown as follows:
Fig. 1. Square wave covert channel signals captured on different virtualization platforms: (a) Xen, (b) VMWare, (c) KVM
1) Xen: Intel Xeon CPU E5-2670 2.60GHz with Hypervisor
Xen x64, 32GB memory
2) VMWare: Intel Core 2 Duo CPU E8500 3.16GHz with
Hypervisor VMWare ESXi 5.5, 6GB memory
3) KVM: Intel Xeon CPU X5650 2.67GHz with Hypervisor
KVM x64, 48GB memory
On each platform there are two VM instances, and all of the VMs
run Linux Ubuntu 12.04 LTS with 1 GB memory and 1 vCPU.
The same settings are used in the remaining experiments
of this paper.
We let one VM alternate between sending bit “1” and “0”
through the covert channel. Figure 1 shows sample periodic 0-1 square wave signals captured at a co-resident VM in these
three test environments. In all three cases, we use exactly the
same implementation and parameter settings. We can observe
that the signal sample recorded on KVM exhibits a significant
level of noise while the signal samples show clear square
waves on the other two platforms. This indicates that even if
the implementations of the covert channels are identical, the
resulting capacities on different cloud environments may be
different. Therefore, a generic profiling tool is required to assess
the capacity of covert channels on different cloud environments.
Fig. 2. Covert channel error rate against different classification thresholds
Here, C represents the channel capacity, B stands for the
bandwidth, and SNR is the signal-to-noise ratio of the communication channel. To avoid the calculation of SNR, we select
an alternative approach which is developed from the Shannon
Entropy equation [8] [9].
We now briefly introduce Shannon Entropy equation. Suppose that data is encoded into a sequence of symbols for
transmission. There are N distinct symbols in total (we refer to
the set of distinct symbols as the alphabet). Let i be one of the
N distinct symbols received by the receiver when the sender
was casting out symbol j. Therefore, when i = j, the receiver
correctly receives a symbol, otherwise, it gets an error symbol.
Let B be the average number of symbols which can be cast
out from the sender in one second. In addition, there are two
statistical features for defining a communication channel, qj and
R(i|j). Here, qj , j = 1 . . . N , is the probability of occurrence
for symbol j in the sequence of symbols cast out by the sender.
R(i|j) denotes the conditional probability that the receiver
received symbol i given the fact that the sender transmitted
symbol j. Let C be the covert channel capacity. According to
[7], C can be calculated by the following equation:
C. Different covert channel implementations on the same test
environment
In this experiment, we show that the covert channel capacity
varies with different parameter settings on the same test environment. To interpret the square wave signals, the receiver
can use a simple threshold approach, i.e., signal value greater
than the threshold t0 is interpreted as a bit “1”, otherwise,
“0”. We show that different t0 leads to different covert channel
capacity. We implement the L2 cache covert channel on the Xen
platform and conduct 12 groups of estimation with the same
settings except for the threshold t0. As anticipated, we find that the
error rate varies considerably with the classification threshold. As
shown in Figure 2, the error rates range from 1.60% to 50.26%.
The classification performs best only when the threshold is equal
to 18.5 (the threshold is expressed as the
number of executions of the contention sensitive operation in
a sampling period), where it
reaches an error rate of 1.60%. This indicates that even in the
same test environment, the resulting capacities with different
parameters may be different.
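The effect of the threshold can be reproduced on synthetic data. The following Python sketch is illustrative only: the Gaussian reading model, the means, the standard deviation, and all function names are assumptions made for the example and are not taken from the experiment above.

```python
import random

def simulate_readings(bits, mean_one=24.0, mean_zero=13.0, sigma=3.0, seed=0):
    """Return one noisy receiver reading per bit (hypothetical Gaussian model)."""
    rng = random.Random(seed)
    return [rng.gauss(mean_one if b else mean_zero, sigma) for b in bits]

def error_rate(bits, readings, t0):
    """Decode with the threshold rule: reading > t0 means '1', otherwise '0'."""
    decoded = [1 if r > t0 else 0 for r in readings]
    errors = sum(1 for b, d in zip(bits, decoded) if b != d)
    return errors / len(bits)

def sweep_thresholds(bits, readings, thresholds):
    """Map every candidate threshold to the error rate it produces."""
    return {t0: error_rate(bits, readings, t0) for t0 in thresholds}

# Alternating 0-1 pattern, as in the square wave experiment.
bits = [i % 2 for i in range(2000)]
readings = simulate_readings(bits)
rates = sweep_thresholds(bits, readings, [5.0, 12.0, 18.5, 25.0, 32.0])
best_t0 = min(rates, key=rates.get)
```

Thresholds far from the midpoint of the two signal levels decode almost every reading as the same bit, so their error rate approaches 50% on an alternating pattern, which mirrors the upper end of the range reported above.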
\[
C = B \max_{q_j,\; j = 1 \ldots N} \sum_{i=1}^{N} \sum_{j=1}^{N} q_j R(i|j) \log \frac{R(i|j)}{\sum_{k=1}^{N} q_k R(i|k)} \tag{2}
\]
The decisive variables of (2) are R(i|j), as the values of qj
which maximize C can be determined based on R(i|j) using the
iterative algorithm proposed by Arimoto [10]. We now briefly
discuss the tasks to be accomplished in order to apply (2) to
calculate the capacity of cross-VM covert channels.
Firstly, most timing channels communicate using only two
distinct symbols, which are commonly denoted as “0” and “1”.
Due to the lack of a precise timer, symbol losses and insertions
often occur in cross-VM covert channels. Symbol losses
and insertions cannot be accounted for by (2), as it only takes
symbol misclassification into account. To resolve this problem,
D. Maximum capacity estimation using Shannon Entropy
The classic Shannon equation (1) is often used to estimate
the capacity of communication channels.
\[
C = B \log_2 (1 + \mathrm{SNR}) \tag{1}
\]
covert channels, we adopt a lightweight approach which will
be discussed in section IV.
Based on the noise model, we can generate sample signals
under different parameter settings of the cross-VM covert
channel. Here, we introduce three configurable parameters for
the timing based cross-VM covert channels.
• Threshold t0 is used for distinguishing between the high
and low signal values.
• The interval for sending one data bit from the sender is
denoted as ds .
• The interval of one sampling period at the receiver is
denoted as dr .
Given a specific parameter setting for a covert channel, we
can then generate mock samples by adding noise based on
the model to the ground truth signals. Subsequently, we can
statistically analyze the sample signals with respect to their
ground truth and assess the covert channel setting.
We iterate this parameter tuning procedure for different
covert channel implementations. We collect a number of parameter settings which lead to the highest covert channel capacity.
As generating these mock signal samples takes much less time
in comparison to collecting real samples, we can perform a
large number of trials and gain an overview of the parameter
space. For discrete parameters, such as the encoding method
used by the covert channel, we are able to enumerate several
possibilities. The detailed covert channel implementation will
be introduced in Section IV.
Fig. 3. Flowchart demonstrating the procedure of APFC³
a solution is to group multiple consecutive binary symbols
as one symbol, which makes occasional symbol losses and
insertions insignificant.
Secondly, we need to design an automated procedure to
calculate R(i|j) across different cloud platforms. Normally,
R(i|j) is obtained by statically analyzing the covert channel
implementation based on a set of assumptions. Making precise
assumptions for different cloud platforms, however, is practically infeasible. In order to automate the capacity assessment
process for any given cross-VM covert channel on an arbitrary
cloud platform, we need to collect sample signals for each
symbol in the alphabet and classify them with a best-practice
classifier. Then, R(i|j) is computed from the classification
result. To fulfill this requirement, we propose a cross-VM
covert channel capacity estimation framework, APFC³, which
will be introduced in the following section.
III. OVERVIEW OF APFC³
B. Generating real samples from ground truth
After obtaining the reasonable covert channel implementation, we establish a covert channel on two co-resident VMs
under the test environment with the corresponding parameters.
According to the formulation shown in (2) and without changing the covert channel implementation, we treat the signals
exchanged between the sender and the receiver through the
cross-VM covert channel as symbols.
For each symbol in the alphabet, we need to collect a number
of signal samples from the receiver side in order to apply
statistical analysis. The main challenge of this step is that the
boundary between signal samples cannot be reliably received by
the receiver. On the other hand, we do not intend to force such
synchronization as the ability of separating consecutive samples
also affects the capacity of the covert channel. The separation
of signal samples is achieved by a preprocessing process which
will be introduced in Section IV. After performing sample
collection, we should have a set of signal samples with their
corresponding ground truth labels. We can then apply statistical
analysis to these samples.
In this section, the framework of APFC³ is briefly
introduced. The procedure of our proposed cross-VM covert
channel capacity estimation framework consists of three main
steps, namely, parameter tuning, sample generation, and sample
analysis, as demonstrated in Fig. 3. We now briefly introduce
each main step. The technical details of each main step will be
introduced in Section IV.
C. Sample analysis
According to (2), we next compute the conditional probability, R(i|j), for estimating the covert channel capacity. We
first apply a customized hierarchical clustering algorithm to
pre-process the samples. After that, we use a Neural Network
classifier to cross-validate the real samples and compute R(i|j)
from the classification result.
A. Parameter tuning for covert channel implementation
In order to narrow down the parameter space, we start with
modeling the noise of cross-VM covert channels established in
any particular cloud platform. As getting a precise model of
the noise is infeasible due to the exotic behavior of cross-VM
Figure 4 depicts the signal-to-noise ratios (SNR) of the
gettimeofday function against a range of sleep time intervals
on the different test environments introduced in Section II. We
make two important observations. Firstly, the timers on different
platforms show significantly different levels of precision. We
can see that the Xen platform has the highest timer precision.
Secondly, all of the SNR curves exhibit some degree of fluctuation, especially for the KVM environment. This indicates that
the timer function on all platforms produces some inconsistent
readings.
Here, we model the effect of system background noise on
cross-VM covert channel signals to perform an initial parameter
space reduction. The noise causes two types of deformation on
cross-VM covert channel signals: (1) the variation on the time
interval for transmitting one bit and (2) the variation of signal
values for each sampling point.
According to the two types of deformation that noise may
cause on cross-VM covert channel signals, we propose the
following noise model. Suppose that a ground truth signal has
many data bits, where each data bit consists of a sequence of
sampling points. We denote the number of sampling points for
the ith data bit of the ground truth signal as n_i and the signal
value of the jth sampling point of the ith data bit as v_i^j. For the
corresponding mock signal, we denote the number of sampling
points for the ith data bit as n'_i and the signal value of the
jth sampling point of the ith data bit as u_i^j. The relationship
between the mock signal and the ground truth signal can be
represented using the following equations. For data bit i,
Fig. 4. SNR of gettimeofday function on different platforms
Lastly, we apply the iterative method proposed by Arimoto
[10] to compute the maximum covert channel capacity. This
procedure is repeated for each covert channel implementation
collected from the parameter tuning step. We also report the
confidence interval for the capacity of the covert channel
settings.
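As a rough illustration of this last step, the sketch below implements the standard Blahut-Arimoto iteration in plain Python. The function name, the tolerance, the use of base-2 logarithms (so that the result is in bits per symbol), the example matrix, and the symbol rate B are assumptions made for illustration; the framework's actual implementation may differ.

```python
import math

def arimoto_capacity(R, tol=1e-9, max_iter=10000):
    """Maximize the mutual information of a discrete channel.

    R[i][j] is the probability of receiving symbol i given that symbol j
    was sent. Returns (bits per symbol, optimizing prior q).
    """
    n_out, n_in = len(R), len(R[0])
    q = [1.0 / n_in] * n_in          # start from a uniform prior
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current prior.
        r = [sum(q[j] * R[i][j] for j in range(n_in)) for i in range(n_out)]
        # c[j] = 2 ** (divergence between column j of R and r).
        c = []
        for j in range(n_in):
            d = 0.0
            for i in range(n_out):
                if R[i][j] > 0.0 and r[i] > 0.0:
                    d += R[i][j] * math.log2(R[i][j] / r[i])
            c.append(2.0 ** d)
        z = sum(q[j] * c[j] for j in range(n_in))
        lower = math.log2(z)         # lower bound on the capacity
        upper = math.log2(max(c))    # upper bound on the capacity
        q = [q[j] * c[j] / z for j in range(n_in)]
        if upper - lower < tol:
            break
    return lower, q

# Hypothetical two-symbol channel with a 10% misclassification rate.
R_example = [[0.9, 0.1],
             [0.1, 0.9]]
bits_per_symbol, q_opt = arimoto_capacity(R_example)
B = 100.0                            # assumed symbol rate in symbols per second
capacity_bps = B * bits_per_symbol
```

The gap between the two bounds gives a natural stopping rule, so the number of iterations does not need to be tuned per channel.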
IV. TECHNICAL DETAILS
In this section, we discuss the technical details of the
aforementioned three main steps of APFC³.
\[
n'_i = n_i + \delta_i \tag{3}
\]
A. Parameter Tuning
As introduced in the previous section, we speed up the
parameter tuning procedure by modeling the noise of cross-VM
covert channels. Here, we separate the noise into two parts,
namely, the system background noise and the front-end noise
caused by user applications. The front-end noise is unstable,
as users may run different applications at different times.
The background noise is stable and can be modeled
statistically. These two types of noise together form
the noise of cross-VM covert channels. By modeling the system
background noise, we can perform an initial parameter space
reduction for the covert channel implementation. In this section,
we first conduct an experiment to show the existence of system
background noise. Then, we present our noise model using
normally distributed random variables.
As introduced in Section II, timing channels use system timer
functions to receive the covert data bits, so the precision and
consistency of the timer readings are key to their transmission
accuracy. We show the existence of system background noise
in cross-VM covert channels by demonstrating the inconsistent
readings of timer functions when measuring the execution time
of system operations. Here, we use the Linux C function
gettimeofday to measure the execution time of the sleep
function with an input interval ranging from 0 to 10000 microseconds. We minimize the front-end noise by silencing all
user-induced workloads on the machine.
Here, δ_i is a random variable which models the variation in the
number of sampling points for transmitting one data bit due to
noise. For the jth sampling point of the ith data bit,
\[
u_i^j = v_i^j + \lambda_i^j \tag{4}
\]
Here, λ_i^j is a random variable which models the variation of
the signal value due to noise.
We model δ_i and λ_i^j as independent, identically and
normally distributed random variables, i.e., δ_i ∼ N(μ1, θ1),
λ_i^j ∼ N(μ2, θ2). Here μ1, θ1, μ2, and θ2 are sampled from the
test environment. To verify this statistical model, we conducted
the following experiment on our test environment.
The quantile-quantile plot is often used to compare two
distributions. If the pairwise quantiles of both distributions
are all close to the x = y diagonal on the plot, the two
distributions can be deemed similar; otherwise, they are different. We
sample a number of signal values on our test environment and
plot them against the corresponding normal distribution. The
result is shown in Figure 5. We can observe that the samples
closely follow their corresponding distribution. We make a
similar observation for the number of sampling points collected
for one data bit.
Using this noise model, we are able to generate mock signal
samples under any covert channel implementation and obtain a
quick feedback on its transmission quality.
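The mock signal generation described by (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: θ1 and θ2 are treated as standard deviations, δ_i is rounded to the nearest integer, a lengthened bit repeats its last ground truth value, and the function names and signal levels are hypothetical.

```python
import random

def make_ground_truth(bits, n, high=20.0, low=5.0):
    """Ideal signal: n sampling points per bit (signal levels are hypothetical)."""
    return [[high if b else low] * n for b in bits]

def make_mock_signal(ground_truth, mu1, theta1, mu2, theta2, seed=None):
    """Apply the noise model to a ground truth signal.

    For data bit i:   n'_i  = n_i   + delta_i,   delta_i   ~ N(mu1, theta1)
    For each point j: u_i^j = v_i^j + lambda_i^j, lambda_i^j ~ N(mu2, theta2)
    """
    rng = random.Random(seed)
    mock = []
    for points in ground_truth:
        n_i = len(points)
        if n_i == 0:
            mock.append([])
            continue
        delta = int(round(rng.gauss(mu1, theta1)))   # assumption: round to an integer
        n_mock = max(0, n_i + delta)
        row = []
        for j in range(n_mock):
            v = points[min(j, n_i - 1)]              # assumption: repeat the last value
            row.append(v + rng.gauss(mu2, theta2))
        mock.append(row)
    return mock

truth = make_ground_truth([1, 0, 1], n=4)
noisy = make_mock_signal(truth, mu1=0.0, theta1=1.0, mu2=0.0, theta2=2.0, seed=42)
```

Because each mock signal only requires a handful of random draws, many parameter settings can be screened before any real samples are collected.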
Fig. 6. Signal delimiter diagram
can obtain a range of capacity estimations and, hence, produce
a confidence interval on the maximum capacity estimation.
C. Sample analysis
After collecting signal samples with their corresponding
ground truth from the fine-tuned covert channel implementation, we apply statistical techniques to analyze the samples
and estimate the maximum capacity of the target covert channel. The sample analysis is accomplished in three steps.
Each collected sample consists of a sequence of signal strength
readings. The dimension of a sample refers to the number
of readings it consists of. We first pre-process the collected samples
to make sure they have the same dimension. Then,
we apply a best-practice machine learning algorithm to the pre-processed samples to produce the classification result. Lastly,
the capacity estimate is computed using the iterative algorithm
proposed by Arimoto [10].
1) Data pre-processing: During the transmission, bit
loss/insertion and bit-flip can directly affect the transmission
accuracy. As introduced previously, we define the duration for
transmitting one data bit as ds and the duration of one sampling
period as dr, with ds > dr. The receiver thus gets multiple readings
for a single bit, which empirically improves the transmission
accuracy. We introduce an adjustable parameter n so that
ds = n × dr. Ideally, we get n sampling points for a single
data bit. If n is small, especially n = 1, we anticipate that the
receiver will experience a high data loss or error rate. On the
other hand, a large n limits the maximum value of a sample.
In the extreme case, the samples are either one or zero, which
also reduces the transmission accuracy.
Before passing the samples to our machine learning tools,
we need to make sure that the samples are of the same dimension. Here, we propose a customized Hierarchical clustering
algorithm, as defined in Algorithm 1. The input parameters to
Algorithm 1 are a set of signal samples S and the number of
target clusters, k. Here, k also stands for the dimension of the
samples after being processed by Algorithm 1.
In the example shown in Figure 7, the sender transmits data
0101 and the receiver receives with n = 4. Two sampling
points in the first and third sampling groups are lost. We set
k = 4. After being processed by the customized Hierarchical
clustering algorithm, sampling points are clustered into four
groups. As shown in Figure 7, sample points surrounded by
the same dummy circle are grouped together.
2) Machine learning: After the preprocessing procedure, we
assess the signal samples using machine learning tools. The
goal here is to find the best-practice classifier for the signal
samples with respect to their corresponding ground truth labels.
Fig. 5. Quantile-quantile plot for the distribution of sample signal values
against normal distribution
B. Sample collection
For sample collection, we first generate the symbol alphabet. As introduced in Section II, we group a sequence of
bits into one compact symbol in order to make symbol
losses and insertions insignificant to the capacity estimation.
We denote the bits as “0”s and “1”s.
Let the frame size, m, be a configurable integer which is
greater than 1. Each symbol in the covert channel alphabet is
composed of m bits, so the alphabet contains 2^m distinct symbols.
For example, when m = 2, the alphabet is
{“00”, “10”, “01”, “11”}. Our empirical study shows that the
design of the alphabet does not have a significant impact on the
capacity estimation.
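Enumerating the alphabet for a given frame size is straightforward. The short Python sketch below uses a hypothetical helper name, build_alphabet, and simply lists every string of m bits.

```python
from itertools import product

def build_alphabet(m):
    """Return all 2**m symbols made of m bits, e.g. m = 2 gives four symbols."""
    if m < 2:
        raise ValueError("the frame size m must be greater than 1")
    return ["".join(bits) for bits in product("01", repeat=m)]

alphabet = build_alphabet(2)
```

The order of the symbols in the returned list is arbitrary for the purposes of capacity estimation; only the set of symbols matters.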
As introduced previously, during the transmission of data
symbols, the channel noise causes the variation of the duration
for transmitting a data bit. This variation accumulates as more
points are sampled. Eventually, it leads to a bit insertion or loss
error. Here, we use delimiters to constrain the propagation of
shifted boundaries. The delimiters are produced by the sender
frequently alternating between contention and idle to the contention sensitive operation, as shown in Figure 6. The receiver
detects the delimiter by monitoring the difference between consecutive sampling values. If the accumulated difference exceeds
a predefined threshold, the receiver infers the current signal to
be a delimiter. To find the end of the delimiter, we perform the
same procedure except that, if the accumulated difference falls
below a predefined threshold, the receiver infers the current
signal as the end of the delimiter. Using different delimiter
size leads to a performance-accuracy trade-off for the covert
channel. Increasing the delimiter size improves the transmission
accuracy while reducing the transmission speed. Decreasing delimiter size increases the transmission speed while reducing the
transmission accuracy. We include a configurable parameter to
the covert channel implementation for determining the delimiter
size used.
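One possible reading of the delimiter detection rule is sketched below in Python. The sliding window, the use of absolute differences, the two threshold values, and the function name are assumptions made for illustration; the text above only states that an accumulated difference between consecutive sampling values is compared against predefined thresholds.

```python
def find_delimiters(readings, window, start_threshold, end_threshold):
    """Return (start, end) index pairs of detected delimiters.

    A delimiter starts when the accumulated absolute difference over the
    last `window` consecutive differences exceeds start_threshold, and it
    ends when that accumulated difference falls below end_threshold.
    """
    diffs = [abs(b - a) for a, b in zip(readings, readings[1:])]
    spans = []
    start = None
    for t in range(window, len(diffs) + 1):
        acc = sum(diffs[t - window:t])
        if start is None and acc > start_threshold:
            start = t - window          # first reading covered by the window
        elif start is not None and acc < end_threshold:
            spans.append((start, t))
            start = None
    if start is not None:               # signal ended inside a delimiter
        spans.append((start, len(readings)))
    return spans

# Steady high level, an alternating delimiter, then a steady low level.
signal = [20] * 6 + [20, 5, 20, 5, 20, 5] + [5] * 6
spans = find_delimiters(signal, window=3, start_threshold=40, end_threshold=10)
```

In this sketch the detected end lags the true end of the delimiter by roughly one window, which is one way the delimiter size trades transmission speed for accuracy.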
For each symbol in the alphabet, we transmit N copies
of the symbol through the covert channel, consecutively, with
delimiters inserted in between. We generate a variety of sample
sets with different transmission frequencies, B. In this way, we
ten sets. At each iteration, we call the target set the test set.
The samples other than those in the test set form the training
set. We train the classifier using the training set and evaluate
its performance on the test set. Initially, we set the number
of neurons in the hidden layer to be the same as the number of
neurons in the input layer. We then increase the number of neurons
in the hidden layer until the cross-validation accuracy ceases to
improve.
The result of the cross-validation procedure is recorded as a
matrix, P. The number of rows and the number of columns
of the matrix are equal to the number of distinct symbols in the
alphabet. The element in the ith row and jth column of the
matrix indicates the number of samples which are intended to
be symbol j at the sender side and classified as symbol i at the
receiver side. Assuming that there are three distinct symbols, s1,
s2 and s3, in the alphabet and a hundred samples are collected
for each symbol, the matrix shown in (5) indicates that all of the
samples for s1, s2 and s3 are correctly classified except five
samples of s2, which are classified as s1.
\[
P = \begin{bmatrix} 100 & 5 & 0 \\ 0 & 95 & 0 \\ 0 & 0 & 100 \end{bmatrix} \tag{5}
\]
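The ten-fold procedure and the construction of P can be sketched as follows. To keep the example self-contained, a nearest-centroid classifier stands in for the neural network; this substitution, the function names, and the synthetic samples are assumptions made for illustration only.

```python
import random

def train_centroids(xs, ys):
    """Stand-in classifier: one mean vector per symbol label."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: [sum(col) / len(rows) for col in zip(*rows)]
            for y, rows in groups.items()}

def predict_centroid(model, x):
    """Return the label whose centroid is closest to sample x."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))

def cross_validate(samples, labels, n_symbols, train, predict, k=10, seed=0):
    """k-fold cross-validation; P[i][j] counts symbol j classified as symbol i."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]            # random partition into k sets
    P = [[0] * n_symbols for _ in range(n_symbols)]
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        model = train([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in fold:                               # evaluate on the test set only
            P[predict(model, samples[i])][labels[i]] += 1
    return P

# Synthetic, well-separated samples: 30 two-dimensional samples per symbol.
samples = [[10.0 * y + 0.5 * (t % 3 - 1), 10.0 * y] for y in range(3) for t in range(30)]
labels = [y for y in range(3) for t in range(30)]
P_demo = cross_validate(samples, labels, 3, train_centroids, predict_centroid)
```

Every sample is classified exactly once, by a model that never saw it during training, so each column of the resulting matrix sums to the number of samples collected for that symbol.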
Fig. 7. Clustering of data points
Then, we use the classifier to classify the samples and produce
the conditional probability, R(i|j), introduced in (2).
To obtain a solution that works across all possible
scenarios, we select a Neural Network [11] as the supervised
machine learning tool for the proposed framework, since it is
good at generating flexible decision boundaries. Here, we apply
a three-layer feed-forward Neural Network to assess the signal
samples. The three layers are referred to as the input, hidden,
and output layers. We note that a Neural Network is not the
only viable machine learning tool for this task. We have also
experimented with other classifiers such as LDA and QDA. However,
the performance did not differ noticeably. It turns out that the quality of
the samples collected is much more important.
Cross-validation technique is commonly used in machine
learning to avoid over-fitting, a behavior such that the classifier
performs well on trained samples but poorly on others, when the
sample size is limited. For finding the best-practice classifier,
we apply the ten-fold cross-validation procedure on the labeled
signal samples. Ten-fold cross-validation procedure can be
described as follows. The samples are randomly partitioned
into ten sets. We iterate the following process for each of the
Then, we transform P into the required conditional probability,
R(i|j), by dividing each column element by the corresponding
column sum. The example shown in (5) will be transformed
into (6), where R(i|j) equals the element on the ith row and
jth column of matrix R.
\[
R = \begin{bmatrix} 1 & 0.05 & 0 \\ 0 & 0.95 & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{6}
\]
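The bookkeeping behind (5) and (6) can be sketched in Python. This is a minimal illustration under stated assumptions, not the framework's implementation: the function names are hypothetical, symbols are indexed from 0, and a nearest-mean rule on one-dimensional samples stands in for the three-layer Neural Network purely to keep the example self-contained.

```python
import random


def nearest_mean(train_x, train_y, test_x):
    """Hypothetical stand-in classifier: pick the symbol whose training
    mean is closest to the sample (not the paper's neural network)."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    means = {y: sums[y] / counts[y] for y in sums}
    return [min(means, key=lambda y: abs(x - means[y])) for x in test_x]


def ten_fold_confusion(samples, labels, n_symbols, fit_predict, folds=10, seed=0):
    """Return P where P[i][j] counts samples sent as symbol j and classified as symbol i."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)      # random partition into folds
    P = [[0] * n_symbols for _ in range(n_symbols)]
    for f in range(folds):
        test_idx = order[f::folds]          # the current test set
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        guesses = fit_predict([samples[i] for i in train_idx],
                              [labels[i] for i in train_idx],
                              [samples[i] for i in test_idx])
        for i, guess in zip(test_idx, guesses):
            P[guess][labels[i]] += 1
    return P


def column_normalize(P):
    """Turn the counts P into R(i|j) by dividing each entry by its column sum."""
    n = len(P)
    col = [sum(P[i][j] for i in range(n)) for j in range(n)]
    return [[P[i][j] / col[j] if col[j] else 0.0 for j in range(n)] for i in range(n)]


# The example matrix (5); normalizing its columns gives the matrix in (6).
P_example = [[100, 5, 0], [0, 95, 0], [0, 0, 100]]
R_example = column_normalize(P_example)
```

Any classifier with the same `fit_predict(train_x, train_y, test_x)` shape could be passed in place of `nearest_mean`; only the accumulation into P and the column normalization follow the text directly.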
3) Iterative capacity computation: After obtaining the conditional
probability, R(i|j), we use the iterative algorithm
proposed by Arimoto [10] to obtain the optimal prior distribution
q = {q_j}_{j=1...N} for the sender symbols, which is required by (2).
To distinguish between the prior distribution
vectors q computed on different iterations, we denote the prior
distribution obtained on the tth iteration as q^t = {q_j^t}_{j=1...N}.
The algorithm is as follows.
Algorithm 1 Customized Hierarchical Clustering Algorithm
C = build_cluster_list(S)
while C.size > k and no cluster contains a single point do
    D = init_distance_list()
    for all clusters c ∈ {x ∈ C, x ≠ C.last} do
        D.add(distance(c, c.next))
    end for
    for all clusters c ∈ {x ∈ C, x ≠ C.last} do
        if distance(c, c.next) = D.min then
            merge(c, c.next)
            break for loop
        end if
    end for
end while
if C.size < k then
    split(C, k) // split clusters proportionally
end if
• Set the initial prior, q^0, to be q_j^0 = 1/N for all j = 1 ... N.
Then iterate the following two steps until the vector
difference between q^t and q^{t+1} is smaller than a threshold,
which is set to 0.0001 in our framework.
• Let φ^t be defined as follows.
\[
\phi^t(j|i) = \frac{R(i|j)\, q_j^t}{\sum_{k=1}^{N} R(i|k)\, q_k^t} \tag{7}
\]
• Update q using the following equation and normalize q by
dividing each element by the sum of all elements.
\[
q_j^{t+1} = \exp\left( \sum_{i=1}^{N} R(i|j) \log \phi^t(j|i) \right) \tag{8}
\]
Arimoto [10] has proven the optimality and convergence of
the algorithm. However, there is one issue in the algorithm. All
elements in R must be strictly positive due to the logarithm term
in step 3. We tackle this issue by adding a small number, 1e-6,
to each entry of R. For the example conditional probability
R(i|j) described in (6), q converges as follows:
q^0 = {0.3333333, 0.3333333, 0.3333333}
q^1 = {0.3407914, 0.2933110, 0.3658976}
q^2 = {0.3440178, 0.2901843, 0.3657979}
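The iteration in (7) and (8) can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the text does not fix: natural logarithms are used, the "vector difference" is taken as the largest absolute change in any element, and (2) is assumed to be the usual mutual-information form of capacity. The function names are hypothetical, and because of these choices the iterates need not reproduce the digits listed above.

```python
import math


def arimoto_prior(R, eps=1e-6, tol=1e-4, max_iter=10000):
    """Iterate (7) and (8) from a uniform prior; R[i][j] holds R(i|j)."""
    n_out, n_in = len(R), len(R[0])
    # Keep every entry strictly positive so the logarithm is defined.
    Rp = [[R[i][j] + eps for j in range(n_in)] for i in range(n_out)]
    q = [1.0 / n_in] * n_in
    for _ in range(max_iter):
        # Step (7): phi[i][j] = R(i|j) q_j / sum_k R(i|k) q_k
        phi = []
        for i in range(n_out):
            denom = sum(Rp[i][k] * q[k] for k in range(n_in))
            phi.append([Rp[i][j] * q[j] / denom for j in range(n_in)])
        # Step (8): q_j = exp(sum_i R(i|j) log phi(j|i)), then normalize.
        new_q = [math.exp(sum(Rp[i][j] * math.log(phi[i][j]) for i in range(n_out)))
                 for j in range(n_in)]
        total = sum(new_q)
        new_q = [v / total for v in new_q]
        # Assumption: the stopping test uses the largest absolute change.
        change = max(abs(a - b) for a, b in zip(q, new_q))
        q = new_q
        if change < tol:
            break
    return q


def mutual_information_bits(R, q):
    """Mutual information in bits per symbol for prior q and channel R(i|j)."""
    bits = 0.0
    for i in range(len(R)):
        p_i = sum(R[i][k] * q[k] for k in range(len(q)))
        for j in range(len(q)):
            if R[i][j] > 0 and q[j] > 0:
                bits += q[j] * R[i][j] * math.log2(R[i][j] / p_i)
    return bits


R_example = [[1.0, 0.05, 0.0], [0.0, 0.95, 0.0], [0.0, 0.0, 1.0]]  # matrix (6)
q_star = arimoto_prior(R_example)
capacity_bits = mutual_information_bits(R_example, q_star)
```

For the example channel, the prior shifts weight away from s2, the only symbol that is ever misclassified, which matches the direction of the sequence q^0, q^1, q^2 reported above.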
Algorithm 2 Profiling protocol for the sender
tcp_connect(receiver)
tcp_transmit(start_signal)
for all alphabet ∈ alphabet_sets do
    for i = 1 to N do
        for all binary_value ∈ alphabet do
            if binary_value = 0 then
                idle(d_s)
            else
                contention(d_s)
            end if
        end for
        Transmit the delimiter.
    end for
end for
tcp_transmit(ending_signal)
Lastly, we apply (2) to compute the upper bound of covert
channel capacity.
D. Implementation details
The implementation of APFC³ involves three entities,
namely, the sender, the receiver and the controller. The sender
and receiver are a pair of VMs co-residing on the same physical
machine provided by the cloud service provider. The controller
can be installed on a separate machine or on the same machine.
We require that the controller can directly communicate with
both the sender and the receiver, e.g., through SSH.
The entire process is decomposed into three steps as
introduced in Section III. Initially, we transmit the executable
files from the controller to the sender and the receiver. For the
parameter tuning step, the sender and receiver cooperate on
collecting samples for measuring the noise model parameters,
μ, θ, and γ. The samples are statistically analyzed at the
receiver side. Then, the parameterized noise model is sent back
to the controller. The controller enumerates a large number of
covert channel parameter settings and filters out those which
would perform poorly on the given noise model. The selected
parameter settings are sent to the receiver and the sender to
reconfigure the covert channel implementation. The receiver
and the sender collect samples of the alphabet symbols, as
described in the sample collection step. The samples are
transmitted back to the controller to compute the final capacity
estimation. The pseudocode for the sender, the receiver, and
the controller is shown in Algorithm 2, Algorithm 3, and
Algorithm 4, respectively.
Algorithm 3 Profiling protocol for the receiver
listen(tcp_port)
repeat
    idle()
until the start signal is received from the sender
repeat
    sampling_data = contention(period d_r)
    write_file(sampling_data)
until the ending signal is received
Algorithm 4 Profiling protocol for the controller
transmit(receiver_binaries, receiver_VM) // via SCP
transmit(sender_binaries, sender_VM) // via SCP
launch(receiver) // via SSH
launch(sender) // via SSH
repeat
    idle()
until the receiver finishes profiling
retrieve(sampling_data, receiver_VM) // via SCP
delimiters = detect_delimiters(sampling_data)
segments = split(sampling_data, delimiters)
for all segment ∈ segments do
    customized_hierarchical_clustering(segment) // see Algorithm 1
end for
Process the sampling data with the neural network classifier.
V. EVALUATION
In this section, we carry out experiments on cross-VM
covert channels and evaluate the performance of APFC³.
We first demonstrate the impact of configurable parameters
on cross-VM covert channel capacity. Then, we evaluate the
effectiveness of the parameter tuning method proposed in
Section IV. Finally, we compare the estimated capacity with
the achieved data rate for the covert channels introduced in
Section II on our test platforms.
A. Impact of configurable covert channel parameters on the
capacity of cross-VM covert channels
Before evaluating the performance of APFC³ on estimating
the capacity of cross-VM covert channels, we conduct the
following experiments to demonstrate the impact of the configurable
parameters on the covert channel. Here, we conduct three
experiments to examine three configurable parameters, namely, the
signal encoding protocol, the symbol size, and the delimiter
size.
1) Signal encoding protocol: To demonstrate the impact of
different encoding protocols on the covert channel, we conduct
simulations to find out which signal encoding method, among
Non-Return-to-Zero (NRZ) encoding, Manchester Encoding
and Differential Manchester (Diff-Manchester) Encoding, is
most efficient under a chosen scenario.
As introduced earlier, we reduce the parameter space by
performing the parameter tuning step described in Section III.
In order to evaluate the parameter tuning process, we compare
the data rates of the covert channels constructed with and
without fine-tuned parameters.
As there is a trade-off between the data rate and the error rate
for constructed covert channels, we use the following method
to compute the error rate. The sender sends 16 packets, each
400 bits in length, to the receiver. We use a longest common
subsequence algorithm to count the correctly received bits
and compute the average error rate. We only show the achieved
data rate with an error rate below 20%. Figure 11 shows the
box plot of the achieved data rates for the constructed
covert channels with and without fine-tuned parameters. We
can observe that, in most of the cases, the constructed covert
channels with fine-tuned parameters achieve higher data rates
than those without fine-tuned parameters, which indicates the
effectiveness of our parameter tuning method.
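The error-rate measurement can be sketched in Python as below. The text only states that a longest common subsequence (LCS) algorithm counts the correctly received bits; the specific definition used here, one minus the LCS length divided by the number of bits sent, and the function names are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def packet_error_rate(sent, received):
    """Assumed definition: fraction of sent bits not matched by the LCS."""
    return 1.0 - lcs_length(sent, received) / len(sent)


def average_error_rate(packets):
    """Average the per-packet error rate over (sent, received) pairs."""
    rates = [packet_error_rate(s, r) for s, r in packets]
    return sum(rates) / len(rates)
```

An LCS-based count tolerates inserted or dropped bits, which a position-by-position comparison would penalize for every subsequent bit.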
B. Impact of parameter tuning on covert channel data rate
transmission rates. We observe that the capacity estimations
using 3-bit, 4-bit and 5-bit symbols exhibit similar trends.
None of them outperforms the others significantly. In our
remaining experiments, we choose to use 4-bit symbols.
3) Different delimiter size: As introduced in Section IV,
the delimiters inserted between consecutive data signals impose
a trade-off between data transmission rate and accuracy.
A long delimiter increases the transmission accuracy while
decreasing the data rate. A short delimiter increases the data
transmission rate while decreasing the accuracy. Here, we
conduct an experiment to evaluate the impact of different
delimiter sizes on the capacity estimation result.
In this experiment, we estimate the capacity of memory
bus covert channels on the Xen test environment for
delimiter sizes of three, four, five and six times
the sender interval, d_s. Figure 10 shows the memory bus covert
channel capacity estimation with different delimiter sizes. We
can observe that when the delimiter size is four times
the sender interval, d_s, the estimated capacity is maximized. We
found similar results for the other two test environments. We
use this setting for the rest of our experiments.
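The speed side of this trade-off can be illustrated with a small Python calculation. It is a rough sketch under assumptions not taken from the measurements: each data bit occupies one sender interval d_s (as in NRZ), the delimiter occupies a whole number of intervals, and the 1 ms value of d_s is hypothetical. It deliberately ignores accuracy, which is why it cannot by itself explain the optimum observed at four intervals.

```python
def raw_bit_rate(symbol_bits, delimiter_slots, d_s):
    """Payload bits per second when every bit and delimiter slot lasts d_s seconds."""
    return symbol_bits / ((symbol_bits + delimiter_slots) * d_s)


D_S = 0.001  # hypothetical sender interval of 1 ms, not a measured value

# 4-bit symbols with delimiters of three to six sender intervals.
for slots in (3, 4, 5, 6):
    print(slots, round(raw_bit_rate(4, slots, D_S), 1))
```

Under these assumptions the raw rate falls monotonically as the delimiter grows, so the capacity peak reported for a delimiter of four intervals must come from the accuracy gained by longer delimiters.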
Fig. 10. Estimated memory bus covert channel capacity with different
delimiter sizes
In this experiment, we fix the covert channel parameters
except the data transmission rate and estimate the capacity for
different signal encoding protocols against a range of data rates.
The corresponding covert channel capacity estimated using
APFC³ is reported for the memory bus covert channel on
the Xen platform.
As we can see from Figure 8, the Non-Return-to-Zero encoding
outperforms the other encoding methods significantly. The
advantage of Manchester and differential Manchester encodings
is that the receiver only needs to distinguish four different
binary signal patterns, “0”, “00”, “1”, and “11”. However, if Manchester
and differential Manchester encodings are used, in order to
achieve the same data rate as NRZ, the sender interval, d_s,
must be halved. This means that Manchester and differential
Manchester encodings are more vulnerable to noise caused by
timer imprecision. The experiment result also indicates that the
timer precision is the major bottleneck for the capacity of
cross-VM covert channels.
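The three encodings can be compared with a short Python sketch. The conventions are assumptions, since the text does not specify them: Manchester follows the IEEE 802.3 mapping (1 as low then high, 0 as high then low), and differential Manchester encodes a 0 as an extra transition at the start of the bit. The function names are hypothetical.

```python
import random


def nrz(bits):
    """NRZ: one signal slot per bit, the level equals the bit."""
    return list(bits)


def manchester(bits):
    """Manchester, assuming IEEE 802.3: 1 -> low,high and 0 -> high,low."""
    out = []
    for b in bits:
        out.extend([0, 1] if b else [1, 0])
    return out


def diff_manchester(bits, level=1):
    """Differential Manchester, assuming a transition at the bit start encodes 0."""
    out = []
    for b in bits:
        if b == 0:
            level ^= 1  # extra transition at the bit boundary
        out.append(level)
        level ^= 1      # mandatory mid-bit transition
        out.append(level)
    return out


def longest_run(levels):
    """Length of the longest run of identical consecutive signal levels."""
    best = run = 0
    prev = None
    for v in levels:
        run = run + 1 if v == prev else 1
        best = max(best, run)
        prev = v
    return best


rng = random.Random(1)
payload = [rng.randint(0, 1) for _ in range(64)]
for name, encode in [("NRZ", nrz), ("Manchester", manchester),
                     ("Diff-Manchester", diff_manchester)]:
    signal = encode(payload)
    print(name, "slots per bit:", len(signal) // len(payload),
          "longest run:", longest_run(signal))
```

Both Manchester variants never hold a level for more than two slots, which corresponds to the four patterns “0”, “00”, “1”, and “11”, but they spend two slots per bit, which is why d_s must be halved to match the NRZ data rate.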
2) Different symbol size: To demonstrate the insignificance
of different symbol sizes on covert channel capacity estimation,
we conduct the following experiment.
Using APFC³, we estimate the capacity of the memory bus covert channel
introduced by Wu et al. [3] on the VMWare test environment.
We estimate the capacity using 3-bit, 4-bit and 5-bit symbols.
Figure 9 shows the estimation result against a range of data
Fig. 8. Estimated memory bus covert channel capacity with different encoding
methods
Fig. 9. Estimated memory bus covert channel capacity with different symbol
sizes
Fig. 11. Data rate comparison between the covert channels constructed with and without fine-tuned parameters: (a) VMWare, (b) Xen, (c) KVM
that of the achieved data rate. The distribution of estimated
capacity always has smaller variance and tends towards the
upper end of the achieved data rate. This indicates that APFC³
is capable of producing a precise estimate of the upper bound
of covert channel capacity. For each type of covert channel,
the estimated capacity and the achieved data rate are the lowest
on the KVM platform. This result agrees with our intuition
about the quality of the signals captured on the platforms, shown
in Figure 1. We notice that there is a significant gap between
the estimated capacity and achieved data rate for CPU load and
memory bus covert channels on the Xen platform. We think this
gap is due to the complexity introduced by the credit scheduler
deployed by the Xen hypervisor, which was discussed in [3].
Fig. 12. Estimated capacities on three different platforms
Finally, we conduct experiments to compare the capacity
estimation produced by APFC³ with the data rate achieved by
constructed covert channels with fine-tuned parameters. Along
with the achieved data rate, we also report the transmission
error rate. In this set of experiments, we plot the estimated
capacity together with the achieved data rate of constructed covert
channels over different time instances. As the system noise
changes over time, we expect fluctuation in both the estimated
capacity and the achieved data rate. The estimated capacity,
however, should be greater than the achieved data rate in most
cases, as APFC³ estimates the upper bound of the capacity.
C. Comparison between estimated capacity and achieved data
rate
To evaluate the performance of APFC³, we use it to assess
the capacity for four types of cross-VM covert channels on
our test platforms. Then, we construct the covert channels with
fine-tuned parameters and compare the estimated capacity with
the achieved data rate.
We first carry out the capacity estimation using APFC³ for
four types of covert channels, namely CPU load, memory bus,
CPU L2 cache, and disk bus covert channels, on three test
platforms, VMWare, Xen, and KVM, as introduced in Section I.
The results are shown in Figure 12.
We can observe that CPU load based and memory bus based
covert channels are estimated to have the highest capacity
for the VMWare and Xen platforms. The disk bus covert channel,
on the other hand, is estimated to have the lowest capacity
on all platforms. This result is consistent with the achieved
data rates reported in [3] [4] [2]. We also observe that the
estimated capacity for the KVM platform is much lower than that of the
other platforms. This is consistent with our previous experiment,
which shows that the covert channel signal on the KVM platform
is subject to the largest amount of noise.
In Figure 13, we compare the distribution of data rate
achieved by covert channels constructed with fine-tuned parameters and their estimated capacity. We can observe that,
the estimated capacity has a maximum value slightly above
The experiment results on the VMWare platform are depicted
in Figure 14. As the timer function on the VMWare platform
has high precision and consistency, we are able to construct
covert channels with low transmission error rate. We can
observe that, for CPU load, memory bus, and L2 cache covert
channels, the estimated capacity is close to the achieved data
rate with transmission error rate less than 20%. For the disk
bus covert channel, we could not construct a covert channel
with transmission error rate below 20%. However, the achieved
data rate is quite close to the estimated capacity when the
transmission error rate is above 20% and below 40%.
Fig. 13. Comparison between the distribution of achieved data rates and the distribution of estimated capacity for covert channels constructed with fine-tuned
parameters: (a) VMWare, (b) Xen, (c) KVM
Fig. 14. Estimated capacity vs. achieved data rate on VMWare
The experiment results on the Xen platform are depicted in
Figure 15. We can observe that the capacity estimation is close
to the achieved bit rate for the L2 cache covert channel and
disk bus covert channel. For the CPU load covert channel and
the memory bus covert channel, there is a gap between the
estimated capacity and achieved data rate when we limit the
error rate to be less than 20%. The gap becomes significantly
smaller when we loosen the error rate constraint to be less
than 50%. This result may indicate that there are better ways
of constructing the covert channels on the Xen platform than
those introduced in existing work. We leave this as an
extension to our work.
The experiment results on the KVM platform are depicted in
Figure 16. We can observe that the estimated capacity is close to
the achieved data rate for the CPU load covert channel and the
L2 cache covert channel. The estimated capacity for the memory
bus covert channel and the disk bus covert channel is below 10 bps.
Also, we could not construct a memory bus covert channel or
a disk bus covert channel with a data rate above 10 bps while
having an error rate smaller than 20%. Therefore, we omitted
the figures on these two types of covert channels for the KVM
platform.
Fig. 15. Estimated capacity vs. achieved data rate on Xen
Fig. 16. Estimated capacity vs. achieved data rate on KVM
VI. RELATED WORK
Since the covert channel was defined by Lampson [12] in 1973,
it has long been studied in computer systems. Network based
covert channels are a well-established research topic [13] [14]
[15].
In cloud computing, Ristenpart et al. [2] were the first to
demonstrate the possibility of conducting cross-VM attacks in
the public cloud. Wang and Lee [16] described a theoretical CPU
functional unit based cross-VM covert channel against simultaneous
multi-threaded processors. Xu et al. [4] experimentally
quantified the bit rate of the CPU L2 cache based cross-VM
covert channel on Amazon EC2 [17]. A memory bus based
cross-VM covert channel was demonstrated by Wu et al. [3].
These works focus on demonstrating the existence of various
cross-VM covert channel attacks and proving the threat of
information leakage through such covert channels.
Cross-VM covert channels pose new challenges for covert
channel capacity estimation since they often have higher bit
rates and the media utilized to establish covert information flow
are more sophisticated. The hardware and software environments
vary dramatically from case to case. Thus, more
research effort is needed to provide a deeper understanding of cross-VM
covert channel risks in public clouds. In [18], Zander et al.
applied the classic Shannon equation, which requires calculating
the signal-to-noise ratio. As we mentioned in Section I, cross-VM
covert channels do not rely on a precise clock, which
makes it practically infeasible to obtain the signal-to-noise ratio. An
alternative approach based on the Shannon entropy formulation
was introduced in [7].
If the estimated capacity of cross-VM covert channels is
concerning on a cloud platform, the provider may consider
applying countermeasures to mitigate the threat. Existing
countermeasures include: disabling or limiting access to fine grained
timer functions [19] [20] [21] [22], redesigning access policy
to shared physical components [23] [24] [25], monitoring the
accesses to resources shared between VMs [26], and jamming
[27].
VII. CONCLUSION
Assessing cross-VM covert channel capacity facilitates risk
management for cloud users and providers. In this paper,
we presented an automated cross-VM covert channel capacity
assessment framework, named APFC³, based on the Shannon
entropy equation. We proposed a simple noise model which
helps tune the parameters of various cross-VM covert
channels. We proposed a sample collection method and a
machine learning strategy for computing the conditional probability
required by the Shannon entropy equation. We evaluated
the proposed framework by comparing the estimated capacity
with the achieved data rate of covert channels constructed with
fine-tuned parameters. The evaluation result shows that the
capacity estimated using APFC³ is close to the achieved data
rate.
ACKNOWLEDGMENT
This work was supported in part by Hong Kong ITF
research project (No. UIM/250) and Hong Kong General Research
Funding under project 122913.
REFERENCES
[1] Y. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, “Cross-vm side channels and their use to extract private keys,” in Proceedings of the 2012 ACM conference on Computer and communications security, ser. CCS ’12, 2012, pp. 305–316.
[2] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, “Hey, you, get off of my cloud: Exploring information leakage in third-party compute clouds,” in Proceedings of the 16th ACM Conference on Computer and Communications Security, ser. CCS ’09. ACM, 2009, pp. 199–212.
[3] Z. Wu, Z. Xu, and H. Wang, “Whispers in the hyper-space: High-speed covert channel attacks in the cloud,” in Proceedings of the 21st USENIX Conference on Security Symposium, ser. Security’12. USENIX Association, 2012, pp. 9–9.
[4] Y. Xu, M. Bailey, F. Jahanian, K. Joshi, M. Hiltunen, and R. Schlichting, “An exploration of l2 cache covert channels in virtualized environments,” in Proceedings of the 3rd ACM Workshop on Cloud Computing Security Workshop, ser. CCSW ’11. ACM, 2011, pp. 29–40.
[5] K. Okamura and Y. Oyama, “Load-based covert channels between xen virtual machines,” in Proceedings of the 2010 ACM Symposium on Applied Computing, ser. SAC ’10. ACM, 2010, pp. 173–180.
[6] C. E. Shannon, “A mathematical theory of communication,” SIGMOBILE Mob. Comput. Commun. Rev., vol. 5, no. 1, pp. 3–55, 2001.
[7] J. W. Gray III, “On introducing noise into the bus-contention channel,” in Proceedings of the 1993 IEEE Computer Society Symposium on Research in Security and Privacy, 1993, pp. 90–98.
[8] J. Millen, “Finite-state noiseless covert channels,” in Proceedings of the Computer Security Foundations Workshop II, 1989, pp. 81–86.
[9] J. W. Gray III, “Countermeasures and tradeoffs for a class of covert timing channels,” Tech. Rep., 1994.
[10] S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,” IEEE Transactions on Information Theory, vol. 18, no. 1, pp. 14–20, 1972.
[11] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall PTR, 1998.
[12] B. W. Lampson, “A note on the confinement problem,” Commun. ACM, vol. 16, no. 10, pp. 613–615, 1973.
[13] R. Smith and G. Scott Knight, “Predictable design of network-based covert communication systems,” in IEEE Symposium on Security and Privacy, 2008. SP 2008, 2008, pp. 311–321.
[14] V. Crespi, G. Cybenko, and A. Giani, “Engineering statistical behaviors for attacking and defending covert channels,” IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 1, pp. 124–136, 2013.
[15] X. Zi, L. Yao, X. Jiang, L. Pan, and J. Li, “Evaluating the transmission rate of covert timing channels in a network,” Computer Networks, vol. 55, no. 12, pp. 2760–2771, 2011.
[16] Z. Wang and R. Lee, “Covert and side channels due to processor architecture,” in Computer Security Applications Conference, 2006. ACSAC ’06. 22nd Annual, 2006, pp. 473–482.
[17] I. Amazon Web Services, “Amazon elastic compute cloud (ec2),” http://aws.amazon.com/ec2/, 2014, [Online; accessed 23-Jan-2014].
[18] S. Zander, P. Branch, and G. Armitage, “Capacity of temperature-based covert channels,” IEEE Communications Letters, vol. 15, no. 1, pp. 82–84, 2011.
[19] B. C. Vattikonda, S. Das, and H. Shacham, “Eliminating fine grained timers in xen,” in Proceedings of the 3rd ACM Workshop on Cloud Computing Security Workshop, ser. CCSW ’11, 2011, pp. 41–46.
[20] J. Wu, L. Ding, Y. Lin, N. Min-Allah, and Y. Wang, “XenPump: a new method to mitigate timing channel in cloud computing,” in 2012 IEEE 5th International Conference on Cloud Computing (CLOUD), 2012, pp. 678–685.
[21] R. Martin, J. Demme, and S. Sethumadhavan, “TimeWarp: rethinking timekeeping and performance monitoring mechanisms to mitigate side-channel attacks,” in 2012 39th Annual International Symposium on Computer Architecture (ISCA), 2012, pp. 118–129.
[22] P. Li, D. Gao, and M. K. Reiter, “Mitigating access-driven timing channels in clouds using stopwatch,” in Dependable Systems and Networks (DSN), 2013 43rd Annual IEEE/IFIP International Conference on. IEEE, 2013, pp. 1–12.
[23] J. Shi, X. Song, H. Chen, and B. Zang, “Limiting cache-based side-channel in multi-tenant cloud using dynamic page coloring,” in 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks Workshops (DSN-W), 2011, pp. 194–199.
[24] H. Raj, R. Nathuji, A. Singh, and P. England, “Resource management for isolation enhanced cloud services,” in Proceedings of the 2009 ACM Workshop on Cloud Computing Security, ser. CCSW ’09. ACM, 2009, pp. 77–84.
[25] T. Kim, M. Peinado, and G. Mainar-Ruiz, “STEALTHMEM: system-level protection against cache-based side channel attacks in the cloud,” in Proceedings of the 21st USENIX Conference on Security Symposium, ser. Security’12. USENIX Association, 2012, pp. 11–11.
[26] B. Saltaformaggio, D. Xu, and X. Zhang, “Busmonitor: A hypervisor-based solution for memory bus covert channels,” Proceedings of EuroSec, 2013.
[27] R. Zhang, X. Su, J. Wang, C. Wang, W. Liu, and R. Lau, “On mitigating the risk of cross-vm covert channels in a public cloud,” IEEE Transactions on Parallel and Distributed Systems, vol. PP, no. 99, pp. 1–1, 2014.