Serving Content with Unknown Demand: the High-Dimensional Regime Sharayu Moharir Javad Ghaderi

Serving Content with Unknown Demand:
the High-Dimensional Regime
Sharayu Moharir, Department of ECE, University of Texas at Austin, Austin, TX 78712, sharayu.moharir@gmail.com
Javad Ghaderi, Department of ECE, University of Texas at Austin, Austin, TX 78712, jghaderi@ee.columbia.edu
Sujay Sanghavi, Department of ECE, University of Texas at Austin, Austin, TX 78712, sanghavi@mail.utexas.edu
Sanjay Shakkottai, Department of ECE, University of Texas at Austin, Austin, TX 78712, shakkott@austin.utexas.edu
ABSTRACT
Categories and Subject Descriptors
In this paper we look at content placement in the high-dimensional regime: there are n servers, and O(n) distinct
types of content. Each server can store and serve O(1) types
at any given time. Demands for these content types arrive,
and have to be served in an online fashion; over time, there
are a total of O(n) of these demands. We consider the algorithmic task of content placement: determining which types
of content should be on which server at any given time, in
the setting where the demand statistics (i.e., the relative
popularity of each type of content) are not known a priori,
but have to be inferred from the very demands we are trying
to satisfy. This is the high-dimensional regime because this
scaling (everything being O(n)) prevents consistent estimation of demand statistics; it models many modern settings
where large numbers of users, servers and videos/webpages
interact in this way.
We characterize the performance of any scheme that separates learning and placement (i.e., any scheme that uses a portion of
the demands to gain some estimate of the demand statistics, and then uses this estimate for the remaining demands),
showing it is order-wise strictly suboptimal. We then study
a simple adaptive scheme, which myopically attempts to
store the most recently requested content on idle servers, and show it outperforms schemes that separate learning and
placement. Our results also generalize to the setting where
the demand statistics change with time. Overall, our results
demonstrate that separating the estimation of demand from
the subsequent use of the estimate is strictly suboptimal.
C.2.4 [Computer-Communication Networks]:
Distributed Systems
SIGMETRICS’14, June 16–20, 2014, Austin, Texas, USA.
Copyright 2014 ACM 978-1-4503-2789-3/14/06 ...$15.00.
http://dx.doi.org/10.1145/2591971.2591978.
Keywords
Content Delivery Systems; Large Scale Systems; Content
Replication Strategies; Matchings
1. INTRODUCTION
Ever-increasing volumes of multimedia content are now requested and delivered over the Internet. Content delivery
systems (e.g., YouTube [23]), consisting of a large collection
of servers (each with limited storage/service capability), process and service these requests. Naturally, the storage and
content replication strategy (i.e., what content should be
stored on each of these servers) forms an important part of
the service and storage architecture.
Two trends have emerged in such settings of large-scale
distributed content delivery systems. First, there has been
a sharp rise in not just the volume of data, but indeed in the
number of content-types (e.g., number of distinct YouTube
videos) that are delivered to users [23]. Second, the popularity of and demand for most of this content are uneven and
ephemeral; in many cases, a particular content-type (e.g.,
a specific video clip) becomes popular for a small interval
of time after which the demand disappears; further a large
fraction of the content-types languish in the shadows with
almost no demand [1, 5].
To understand the effect of these trends, we study a stylized model for the content placement and delivery in large-scale distributed content delivery systems. The system consists of n servers, each with constant storage and service
capacities, and αn content-types (α is some constant number). We consider the scaling where the system size n tends
to infinity. The requests for the content-types arrive dynamically over time and need to be served in an online manner by the free servers storing the corresponding contents.
The requests that are “deferred” (i.e., cannot be immediately
served by a free server with requested content-type) incur a
high cost. To ensure reliability, we assume that there are
alternate server resources (e.g., a central server with large
enough backup storage and service capacity, or additional
servers that can be freed up on-demand) that can serve such
deferred requests.
The performance of any content placement strategy crucially depends on the popularity distribution of the content.
Empirical studies of many services, such as YouTube, Peer-to-Peer (P2P) VoD systems, various large video streaming
systems, and web caching [2, 5, 7, 17, 24], have shown that access to different content-types is very inhomogeneous and
typically matches well with power-law (Zipf-like) distributions, i.e., the request rate for the i-th most popular content-type is proportional to $i^{-\beta}$, for some parameter β > 0. For
the performance analysis, we assume that the content-types
have a popularity that is governed by some power-law distribution with unknown β and, further, that this distribution changes
over time.
Our objective is to provide efficient content placement
strategies that minimize the number of requests deferred.
It is natural to expect that content placement strategies in
which more popular content-types are replicated more will
have a good performance. However, there is still a lot of flexibility in designing such strategies and the extent of replication of each content-type has to be determined. Moreover,
the requests arrive dynamically over time and popularities
of different content-types might vary significantly over time;
thus the content placement strategy needs to be online and
robust.
The fact that the number of contents is very large and
their popularities are time-varying creates two new challenges that are not present in traditional queueing systems.
First, it is imperative to measure the performance of content
replication strategies over the time scale in which changes
in popularities occur. In particular, the steady-state metrics typically used in queueing systems are not the right measure of performance in this context. Second, the number of
content-types is enormous and learning the popularities of
all content-types over the time scale of interest is infeasible.
This is in contrast with traditional multi-class multi-server
systems where the number of demand classes does not scale
with the number of servers (low-dimensional setting) and
thus learning the demand rates can be done in a time duration that does not scale with the system size.
1.1 Contributions
The main contributions of our work can be summarized
as follows.
High dimensional vs. low dimensional: We consider
the high dimensional regime where the number of servers,
the number of content-types, and the number of requests to
be served over any time interval all scale as O(n); further
the demand statistics are not known a priori. This scaling
means that consistent estimation of demand statistics is not
possible. We show that a “learn-and-optimize” approach,
namely, learning the demand statistics based on requests
and then locally caching content on servers according to
these empirical statistics, is strictly suboptimal (even when
using high-dimensional estimators such as the Good-Turing
estimator [13]). This is in contrast to the conventional low-dimensional setting (finite number of content-types) where
the “learn-and-optimize” approach is asymptotically optimal.
Adaptive vs. learn-and-optimize: We study an adaptive content replication strategy which myopically attempts
to cache the most recently requested content-types on idle
servers. Our key result is that even this simple adaptive
strategy strictly outperforms any content placement strategy based on the “learn-and-optimize” approach. Our results
also generalize to the setting where the demand statistics
change with time.
Overall, our results demonstrate that separating the estimation of demands from the subsequent use of the estimates to design content placement policies is strictly suboptimal in the high-dimensional setting.
1.2 Organization and Basic Notations
The rest of this paper is organized as follows. We describe
our system model and setting in Section 2. The main results
are presented in Section 3. Our simulation results are discussed in Section 4. Section 5 contains the proofs of some
of our key results. Section 7 gives an overview of related
works. We finally end the paper with conclusions. The rest
of the proofs can be found in the Appendix.
Some of the basic notations are as follows. Given two functions f and g, we write f = O(g) if $\limsup_{n \to \infty} |f(n)/g(n)| < \infty$,
and f = Ω(g) if g = O(f). If both f = O(g) and f = Ω(g),
then f = Θ(g). Similarly, f = o(g) if $\limsup_{n \to \infty} |f(n)/g(n)| = 0$,
and f = ω(g) if g = o(f). The term w.h.p. means with
high probability as n → ∞.
2. SETTING AND MODEL
In this section, we consider a stylized model for large
scale distributed content systems that captures two emerging trends, namely, a large number of content types, and
uneven and time-varying demands.
2.1 Server and Storage Model
The system consists of n front-end servers, each with constant storage and service capacity, and a back-end server
that contains a catalog of m content-types (one copy of each
content-type, e.g., a copy of each YouTube video). The contents can be copied from the back-end server and placed on
the front-end servers. Each front-end server can store at
most d content pieces (d is a constant) and serve at most
d requests at each time, under the constraint that no two
requests can read the same content piece simultaneously on
a server (see footnote 1). The system is essentially equivalent to a system
of nd servers, each with 1 content piece storage. Since we
are interested in the scaling performance, as n, m → ∞, for
clarity we assume that there are n servers and each server
can store 1 content and can serve 1 request at any time.
2.2 Service Model
When a request for a content arrives, it is routed to an idle
(front-end) server which has the corresponding content-type
stored on it, if possible. We assume that the service time
of each request is exponentially distributed with mean 1.
The requests have to be served in an online manner; further,
service is non-preemptive, i.e., once a request is assigned to
a server, its service cannot be interrupted and also cannot
be re-routed to another server. Requests that cannot be
served (no free server with requested content-type) incur a
high cost (e.g., need to be served by the back-end server, or
content needs to be fetched from the back-end server and
loaded on to a new server). As discussed before, we refer
to such requests as deferred requests. The goal is to design
content placement policies such that the number of requests
deferred is minimized.
Footnote 1: Even without the constraint “no two requests can read the
same content piece simultaneously on a server”, the performance can be bounded from above by the performance of a
system with dn servers with a storage of 1 each, and from
below by that of another system with n servers with a storage of 1 each. Thus asymptotically in a scaling-sense, the
system is still equivalent to a system of n servers where each
server can store 1 content and can serve 1 content request
at any time.
2.3 Content Request Model
There are m content-types (e.g., m distinct YouTube videos).
We consider the setting where the number of content-types
m is very large and scales linearly with the system size n,
i.e., m = αn for some constant α > 1. We assume that requests for each content arrive according to a Poisson process
and request rates (popularities) follow a Zipf distribution.
Formally, we make the following assumptions on the arrival
process.
Assumption 1. (Arrival and Content Request Process)
- The arrival process for each content-type i is a Poisson
process with rate $\lambda_i$.
- The load on the system at any time is $\bar{\lambda} < 1$, where
$$\bar{\lambda} = \frac{\sum_{i=1}^{m} \lambda_i}{n}.$$
- Without loss of generality, content-types are indexed in
the order of popularity. The request rate for content-type i is $\lambda_i = n \bar{\lambda} p_i$, where $p_i \propto i^{-\beta}$ for some β > 0.
This is the Zipf distribution with parameter β.
We have used the Zipf distribution to model the popularity
distribution of various contents because empirical studies in
many content delivery systems have shown that the distribution of popularities matches well with such distributions,
see e.g., [5], [2], [24], [7], [17].
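To make Assumption 1 concrete, the following minimal Python sketch computes the per-content arrival rates for a given system size. The function name `zipf_rates` and the parameter values in the usage line are illustrative assumptions, not taken from the paper.

```python
def zipf_rates(n, alpha, load, beta):
    """Arrival rates lambda_i = n * load * p_i with p_i proportional to i^(-beta).

    n     : number of servers
    alpha : catalog size is m = alpha * n content-types
    load  : system load (lambda-bar in Assumption 1), should be < 1
    beta  : Zipf parameter, > 0
    """
    m = round(alpha * n)
    weights = [i ** (-beta) for i in range(1, m + 1)]
    total = sum(weights)  # normalizing constant so that the p_i sum to 1
    return [n * load * w / total for w in weights]


# Hypothetical parameter values, chosen only for illustration.
rates = zipf_rates(n=1000, alpha=2.0, load=0.8, beta=2.5)
```

By construction the rates sum to n times the load, so the normalized load in Assumption 1 is recovered exactly, and the list is already sorted from most to least popular.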
2.4 Time Scales of Change in Arrival Process
A key trend discussed earlier is the time-varying nature of
popularities in content delivery systems [1, 5]. For example,
the empirical study in [5] (based on 25 million transactions
on YouTube) shows that the daily top-100 list of videos changes
frequently. To understand the effect of this trend on the
performance of content placement strategies, we consider
the following two change models.
Block Change Model: In this model, we assume that
the popularity of various content-types remains constant for
some duration of time T (n), and then changes to some other
arbitrarily chosen distribution that satisfies Assumption 1.
Thus T(n) reflects the time-scale over which changes in popularities occur. Under this model, we characterize the performance of content placement strategies over such a time-scale T(n).
Continuous Change Model: Under this model, we assume that each content-type has a Poisson clock at some
constant rate ν > 0. Whenever the clock of content-type
i ticks, content-type i exchanges its popularity with some
other content-type j, chosen uniformly at random. Note
that if T(n) = Θ(1), the average time over which the popularity distribution “completely” changes is comparable to
that of the Block Change Model; however, here the change
occurs incrementally and continuously. Note that this model
ensures that the content-type popularity always has the Zipf
distribution. Under this model, we characterize the performance of content placement strategies over constant intervals of time.
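The Continuous Change Model can be simulated directly. The sketch below is one possible way to do so, under the assumption that the m independent rate-ν clocks are merged into a single Poisson process of rate mν whose ticks are assigned to a uniformly chosen content-type; the function name `shuffle_popularities` is hypothetical.

```python
import random


def shuffle_popularities(rates, nu, duration, rng):
    """Apply Continuous Change Model swaps over an interval of given duration.

    rates    : list of current arrival rates, one per content-type (len >= 2)
    nu       : rate of each content-type's Poisson clock
    duration : length of the simulated interval
    rng      : a random.Random instance
    Returns the permuted rates and the number of swaps performed.
    """
    rates = list(rates)  # work on a copy; the input list is left unchanged
    m = len(rates)
    swaps = 0
    t = rng.expovariate(m * nu)  # time of the first tick of the merged process
    while t < duration:
        i = rng.randrange(m)         # content-type whose clock ticked
        j = rng.randrange(m - 1)     # a different content-type, uniform
        if j >= i:
            j += 1
        rates[i], rates[j] = rates[j], rates[i]
        swaps += 1
        t += rng.expovariate(m * nu)
    return rates, swaps
```

Because every event is a swap, the multiset of rates is preserved, which is why the popularity profile remains Zipf at all times under this model.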
3. MAIN RESULTS AND DISCUSSION
In this section, we state and discuss our main results. The
proofs are provided in Section 5.
3.1 Separating Learning from Content Placement
In this section, we analyze the performance of storage policies which separate the task of learning and that of content
placement as follows. Consider time intervals of length T (n).
The operation of the policy in each time interval is divided
into two phases:
Phase 1. Learning: Over this interval of time, use the
demands from the arrivals (see Figure 1) to estimate the
content-type popularity statistics.
Phase 2. Storage: Using the estimated popularity of various content-types, determine which content-types are to be
replicated and stored on each server. The storage is fixed
for the remaining time interval. The content-types not requested even once in the learning phase are treated equally
in the storage phase. In other words, the popularity of all
unseen content-types in the learning phase is assumed to be
the same.
[Figure 1 diagram: a timeline from 0 to T(n), divided into Phase 1 (Learn) followed by Phase 2 (Static Storage); the boundary between the two phases is chosen optimally.]
Figure 1: Learning-Based Static Storage Policies –
The interval T(n) is split into the Learning and Storage
phases. The length of time spent in the Learning phase can
be chosen optimally using the knowledge of the value of T(n)
and the Zipf parameter β.
Further, we allow the interval of time for the Learning
phase potentially to be chosen optimally using knowledge of
T (n) (the interval over which statistics remain stationary)
and β (the Zipf parameter for content-types popularity).
This is a natural class of policies to consider because it
is obvious that popular content-types should be stored on
more servers than the less popular content-types. Therefore,
knowing the arrival rates can help in the design of better
storage policies. Moreover, for the content-types which are
not seen in the learning phase, the storage policy has no
information about their relative popularity. It is therefore
natural to treat them as if they are equally popular.
The replication and storage in Phase 2 (Storage) can be
performed by any static policy that relies on the knowledge
(estimate) of arrival rates, e.g., the proportional placement
policy [10] where the number of copies of each content-type
is proportional to its arrival rate, or the storage policy of [11]
which was shown to be approximately optimal in the steady
state.
We now analyze the performance of learning-based static
storage policies under the Block Change Model defined in
Section 2.4 where the statistics remain invariant over the
time intervals of length T (n). The performance metric of
interest is the number of requests deferred by any policy belonging to the class of learning-based static storage policies in
the interval of interest. We assume that at the beginning
of this interval, the storage policy has no information about
the relative popularity of various content-types. Therefore,
we start with an initial loading where each content-type is
placed on exactly one server. This loading is not changed
during Phase 1 (the learning phase), at the end of which the
content-types on idle servers are changed as per the new storage policy. As mentioned before, this storage is not changed
for the remaining duration in the interval of interest.
The following theorem provides a lower bound on the
number of requests deferred by any learning-based static
storage policy.
Theorem 1. Under Assumption 1 and the Block Change
Model defined in Section 2.4, for β > 2, if T(n) = Ω(1), the
expected number of requests deferred by any learning-based
static storage policy is $\Omega(n^{0.5})$.
We therefore conclude that even if the division of the interval of interest into Phase 1 (Learning) and Phase 2 (Storage)
is done in the optimal manner, every learning-based static storage policy defers $\Omega(n^{0.5})$ requests in expectation in the interval
of interest. Therefore, Theorem 1 provides a fundamental
lower bound on the number of requests deferred by any policy
which separates learning and storage. It is worth pointing
out that this result holds even when the time-scale of change
in statistics is quite slow. Thus, even when T(n), the time-scale over which statistics remain invariant, goes to infinity
and the time duration of the two phases (Learning, Storage)
is chosen optimally based on β and T(n), $\Omega(n^{0.5})$ requests are
still deferred.
Next, we explore adaptive storage policies which perform
the task of learning and storage simultaneously.
3.2 Myopic Joint Learning and Placement
We next study a natural adaptive storage policy called
MYOPIC. In an adaptive storage policy, depending on the
requests that arrive and depart, the content-type stored on
a server can be changed when the server is idle while other
servers of the system might be busy serving requests. Therefore, adaptive policies perform the tasks of learning and
placement jointly. Many variants of such adaptive policies
have been studied for decades in the context of cache management (e.g., LRU, LRU-MIN [21]).
Let $C_i$ refer to the ith content-type, 1 ≤ i ≤ m. The MYOPIC policy works as follows: When a request for content-type $C_i$ arrives, it is assigned to a server if possible, or deferred otherwise. Recall that a deferred request is a request
for which, on arrival, no currently idle server can serve it,
and thus its service invokes a backup mechanism such as
a back-end server which can serve it at a high cost. After
the assignment/deferral decision is made, if there are no currently idle servers with content-type $C_i$, MYOPIC replaces
the content-type of one of the idle servers with $C_i$. This idle
server is chosen as follows:
- If there is a content-type $C_j$ stored on more than one
currently idle server, the content-type of one of those
servers is replaced with $C_i$,
- Else, place $C_i$ on the currently idle server whose content-type has been requested least recently among the content-types on the currently idle servers.
For a formal definition of MYOPIC, refer to Figure 2.
On arrival (request for C_i) do:
    Allocate request to an idle server if possible.
    if no other idle server has a copy of C_i then
        if ∃ j: C_j stored on > 1 idle servers then
            replace C_j with C_i on any one of them.
        else
            find C_j: least recently requested on idle servers,
            replace C_j with C_i.
        end if
    end if
Figure 2: MYOPIC – An adaptive storage policy which
changes the content stored on idle servers in a greedy manner
to ensure that recently requested content pieces are available
on idle servers.
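The update rule of Figure 2 can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the class name `Myopic`, its method names, and the tie-breaking choices (first matching server, first duplicated server) are hypothetical, and content-types never requested so far are treated as the least recently requested.

```python
from collections import Counter


class Myopic:
    def __init__(self, initial_contents):
        self.content = list(initial_contents)  # content[s] = type stored on server s
        self.busy = [False] * len(self.content)
        self.last_request = {}                 # type -> time of its most recent request
        self.deferred = 0

    def _idle(self):
        return [s for s, b in enumerate(self.busy) if not b]

    def on_arrival(self, c, now):
        """Handle a request for type c; return the serving server, or None if deferred."""
        self.last_request[c] = now
        server = next((s for s in self._idle() if self.content[s] == c), None)
        if server is None:
            self.deferred += 1
        else:
            self.busy[server] = True
        # Storage update: only if no remaining idle server holds a copy of c.
        idle = self._idle()
        if idle and all(self.content[s] != c for s in idle):
            counts = Counter(self.content[s] for s in idle)
            duplicated = [s for s in idle if counts[self.content[s]] > 1]
            if duplicated:
                victim = duplicated[0]
            else:
                # Least recently requested type among those on idle servers.
                victim = min(
                    idle,
                    key=lambda s: self.last_request.get(self.content[s], float("-inf")),
                )
            self.content[victim] = c
        return server

    def on_departure(self, server):
        self.busy[server] = False
```

For instance, starting from `Myopic(["a", "b", "c"])`, a request for "a" is served by server 0 and a fresh copy of "a" is then placed on another idle server, so a second request for "a" arriving before the first departs can still be served.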
Remark 1. Some key properties of MYOPIC are:
1. The content-types on servers can be potentially changed
only when there is an arrival.
2. The content-type of at most one idle server is changed
after each arrival. However, for many popular contenttypes, it is likely that there is already an idle server with
the content-type, in which case there is no content-type
change.
3. To implement MYOPIC, the system needs to maintain
a list of content types ordered according to the time
at which the most recent request for each content-type was
made.
The following theorem provides an upper bound on the
number of requests deferred by MYOPIC for the Block Change
Model defined in Section 2.4.
Theorem 2. Under Assumption 1 and the Block Change
Model defined in Section 2.4, over any time interval T(n)
such that $T(n) = o(n^{\beta-1})$, the number of requests deferred
by MYOPIC is $O((nT(n))^{1/\beta})$ w.h.p.
We now compare this upper bound with the lower bound on
the number of requests deferred by any learning-based static
storage policy obtained in Theorem 1.
Corollary 1. Under Assumption 1, the Block Change
Model defined in Section 2.4, and for β > 2, over any time
interval T(n) such that T(n) = Ω(1) and $T(n) = o(n^{\beta/2 - 1})$,
the expected number of requests deferred by any learning-based static storage policy is $\Omega(n^{0.5})$ and the number of requests deferred by the MYOPIC policy is $o(n^{0.5})$ w.h.p.
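The threshold on T(n) in Corollary 1 can be recovered by combining the two bounds directly. The following short derivation is a sketch that relies only on the statements of Theorems 1 and 2:

```latex
\begin{align*}
(nT(n))^{1/\beta} = o\!\left(n^{1/2}\right)
  &\iff nT(n) = o\!\left(n^{\beta/2}\right) \\
  &\iff T(n) = o\!\left(n^{\beta/2 - 1}\right).
\end{align*}
```

Since β/2 − 1 < β − 1, any such T(n) also satisfies the condition of Theorem 2, and β > 2 makes the exponent β/2 − 1 positive, so T(n) may grow polynomially in n. As an example, with β = 3 and T(n) = Θ(1), MYOPIC defers $O(n^{1/3})$ requests w.h.p., whereas any learning-based static storage policy defers $\Omega(n^{1/2})$ requests in expectation.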
From Corollary 1, we conclude that MYOPIC outperforms
all learning-based static storage policies. Note that:
i. Corollary 1 holds even when the interval of interest
T (n) grows to infinity (scaling polynomially in n), or
correspondingly, even when the content-type popularity changes very slowly with time.
ii. Even if the partitioning of the interval T(n) into a Learning phase and a Static Storage phase is done in an
optimal manner with the help of some side information (β, T(n)), the MYOPIC algorithm outperforms
any learning-based static storage policy.
iii. Since we consider the high-dimensional setting, the
learning problem at hand is a large-alphabet learning problem. It is well known that standard estimation techniques, like using the empirical values as estimates of the true statistics, are suboptimal in this setting. Many learning algorithms, such as the classical Good-Turing estimator [13] and other linear estimators [16],
have been proposed and shown to have good performance for the problem of large-alphabet learning.
From Corollary 1, we conclude that, even if the learning-based storage policy uses the best possible large-alphabet
estimator, it cannot match the performance of the MYOPIC policy.
Therefore, in the high-dimensional setting we consider,
separating the task of estimation of the demand statistics,
and the subsequent use of the same to design a static storage policy, is strictly suboptimal. This is the key message
of this paper.
Theorem 2 characterizes the performance of MYOPIC under the Block Change Model, where the statistics of the arrival process do not change in the interval of interest. To gain
further insight into robustness of MYOPIC against changes
in the arrival process, we now analyze the performance of
MYOPIC when the arrival process can change in the interval of interest according to the Continuous Change Model
defined in Section 2.4.
Recall that under the Continuous Change Model, on average, we expect Θ(n) shuffles in the popularity of various
content-types in an interval of constant duration. For the
Block Change Model, if T (n) = Θ(1), the entire popularity distribution can change at the end of the block, which
is equivalent to n shuffles. Therefore, for both the change
models, the expected number of changes to the popularity
distribution in an interval of constant duration is of the same
order. However, these changes occur constantly but slowly
in the Continuous Change Model as opposed to a one-shot
change in the Block Change Model.
Theorem 3. Under Assumption 1 and the Continuous
Change Model defined in Section 2.4, the number of requests
deferred by the MYOPIC storage policy in any interval of
constant duration is $O(n^{1/\beta})$ w.h.p.
In view of Theorem 2, if the arrival rates do not vary in
an interval of constant duration, under the MYOPIC storage policy, the number of requests deferred in that interval
is $O(n^{1/\beta})$ w.h.p. Theorem 3 implies that the number of requests deferred in a constant duration interval is of the same
order even if the arrival rates change according to the Continuous Change Model. This shows that the performance of
the MYOPIC policy is robust to changes in the popularity
statistics.
3.3 Genie-Aided Optimal Storage Policy
In this section, our objective is to study the setting where
the content-type statistics are available “for free”. We consider
the setting where the popularity statistics are known, and
show that a simple adaptive policy is optimal in the class
of all policies which know the popularity statistics of various
content-types. We denote the class of such policies as $\mathcal{A}$
and refer to the optimal policy as the GENIE policy.
Let the content-types be indexed from i = 1 to m and let
$C_i$ be the ith content-type. Without loss of generality, we
assume that the content-types are indexed in the order of
popularity, i.e., $\lambda_i \geq \lambda_{i+1}$ for all i ≥ 1. Let k(t) denote the
number of idle servers at time t.
The key idea of the GENIE storage policy is to ensure that
at any time t, if the number of idle servers is k(t), the k(t)
most popular content-types are stored on exactly one idle
server each. The GENIE storage policy can be implemented
as follows. Recall that $C_i$ is the ith most popular content-type.
At time t,
- If there is a request for content-type $C_i$ with $i < k(t^-)$,
then allocate the request to the corresponding idle
server. Further, replace the content-type on the server
storing $C_{k(t^-)}$ with content-type $C_i$.
- If there is a request for content-type $C_i$ with $i > k(t^-)$,
defer this request. There is no storage update.
- If there is a request for content-type $C_i$ with $i = k(t^-)$,
then allocate the request to the corresponding idle
server. There is no storage update.
- If a server becomes idle (due to a departure), replace
its content-type with $C_{k(t^-)+1}$.
For a formal definition, please refer to Figure 3.
Initialize: number of idle servers := k = n.
while true do
    if new request (for C_i) routed to a server then
        if i ≠ k then
            replace content-type of idle server storing C_k with C_i
        end if
        k ← k − 1
    end if
    if departure then
        replace content-type of new idle server with C_{k+1}
        k ← k + 1
    end if
end while
Figure 3: GENIE – An adaptive storage policy which has
content popularity statistics available for “free”. At time t,
if the number of idle servers is k(t), the k(t) most popular
content-types are stored on exactly one idle server each.
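The bookkeeping in Figure 3 can be sketched as below. This is a minimal illustration, assuming content-types are identified by their 1-based popularity rank and servers by integers 0 to n − 1; the class name `Genie` and the dictionary `idle_store` are hypothetical names introduced here.

```python
class Genie:
    def __init__(self, n):
        # idle_store[i] = idle server currently holding C_i; initially C_1..C_n.
        self.idle_store = {i: i - 1 for i in range(1, n + 1)}
        self.k = n          # number of idle servers
        self.deferred = 0

    def on_arrival(self, i):
        """Handle a request for C_i; return the serving server, or None if deferred."""
        k = self.k
        if i > k:
            # C_i is not among the k most popular types, so no idle server holds it.
            self.deferred += 1
            return None
        server = self.idle_store.pop(i)
        if i != k:
            # The idle server that stored C_k now stores C_i instead.
            self.idle_store[i] = self.idle_store.pop(k)
        self.k = k - 1
        return server

    def on_departure(self, server):
        # The newly idle server is loaded with the (k+1)-th most popular type.
        self.k += 1
        self.idle_store[self.k] = server
```

The invariant maintained after every event is that the keys of `idle_store` are exactly 1, ..., k, which is the property stated in the caption of Figure 3.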
Remark 2. Some key properties of GENIE are:
1. The implementation of GENIE requires replacing the
content-type of at most one server on each arrival and
departure.
2. The GENIE storage policy only requires the knowledge
of the relative popularity of various content types.
To characterize the performance of GENIE, we assume
that the system starts from the empty state (all servers are
idle) at time t = 0. The performance metric for any policy
A is $D^{(A)}(t)$, defined as the number of requests deferred by
time t under the adaptive storage policy A. We say that an
adaptive storage policy O is optimal if
$$D^{(O)}(t) \leq_{\mathrm{st}} D^{(A)}(t), \qquad (1)$$
for any storage policy $A \in \mathcal{A}$ and any time t ≥ 0, where
Equation (1) means that
$$P(D^{(O)}(t) > x) \leq P(D^{(A)}(t) > x),$$
for all x ≥ 0 and t ≥ 0.
Theorem 4. If the arrival process to the content-type delivery system is Poisson and the service times are exponential random variables with mean 1, for the Block Change
Model defined in Section 2.4, let $D^{(A)}(t)$ be the number of
requests deferred by time t under the adaptive storage policy
$A \in \mathcal{A}$. Then, we have that
$$D^{(\mathrm{GENIE})}(t) \leq_{\mathrm{st}} D^{(A)}(t),$$
for any storage policy $A \in \mathcal{A}$ and any time t ≥ 0.
Note that this theorem holds even if the $\lambda_i$'s are not distributed according to the Zipf distribution. We thus conclude that GENIE is the optimal storage policy in the class
of all storage policies which, at time t, have no additional
knowledge of the future arrivals except the values of $\lambda_i$ for all
content-types and the arrivals and departures in [0, t). Next,
we compute a lower bound on the number of requests deferred by GENIE.
Theorem 5. Under Assumption 1 and the Block
Change Model defined in Section 2.4, for β > 2, if the interval of
interest is of constant length, the expected number of requests
deferred by GENIE is $\Omega(n^{2-\beta})$.
From Theorems 2 and 5, we see that there is a gap between
the performance of the MYOPIC policy and that of the GENIE
policy (which has additional knowledge of the content-type
popularity statistics). Since, for the GENIE policy, learning
the statistics of the arrival process comes for “free”, this gap
provides an upper bound on the cost of serving content-types with unknown demands. We compare the performance
of all the policies considered so far in the next section
via simulations.
4. SIMULATION RESULTS
We compare the performance of the MYOPIC policy with
the performance of the GENIE policy and the following two
learning-based static storage policies:
- The “Empirical + Static Storage” policy uses the empirical popularity statistics of content types in the learning phase as estimates of the true popularity statistics. At the end of the learning phase, the number of
servers on which a content is stored is proportional to
its estimated popularity.
- The “Good Turing + Static Storage” policy uses the
Good-Turing estimator [13] to compute an estimate
of the missing mass at the end of the learning phase.
The missing mass is defined as the total probability mass of
the content types that were not requested in the learning phase. Recall that we assume that learning-based
static storage policies treat all the missing content-types equally, i.e., all missing content-types are estimated to be equally popular.
Let $M_0$ be the total probability mass of the content
types that were not requested in the learning phase
and $S_1$ be the set of content types which were requested
exactly once in the learning phase. The Good-Turing
estimator of the missing mass ($\hat{M}_0$) is given by
$$\hat{M}_0 = \frac{|S_1|}{\text{number of samples}}.$$
See [13] for details.
Let $N_i$ be the number of times content i was requested
in the learning phase and $C_{\text{missing}}$ be the set of content-types not requested in the learning phase. The “Good
Turing + Static Storage” policy computes an estimate
of the content-popularity as follows:
i: If $N_i = 0$, $p_i = \frac{\hat{M}_0}{|C_{\text{missing}}|}$.
ii: If $N_i > 0$, $p_i = (1 - \hat{M}_0) \frac{N_i}{\text{number of samples}}$.
At the end of the learning phase, the number of servers
on which a content is stored is proportional to its estimated popularity.
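The popularity estimate used by the “Good Turing + Static Storage” policy can be sketched in a few lines of Python. This is an illustrative reading of rules i and ii above, assuming content-types are labeled 0 to m − 1 and the learning phase yields a non-empty list of requested labels; the function name `good_turing_popularity` is hypothetical.

```python
from collections import Counter


def good_turing_popularity(requests, m):
    """Estimate p_i for i in 0..m-1 from the requests seen in the learning phase."""
    if not requests:
        raise ValueError("the learning phase must contain at least one request")
    counts = Counter(requests)               # N_i for every requested content-type
    total = len(requests)                    # number of samples
    singletons = sum(1 for c in counts.values() if c == 1)   # |S_1|
    missing = m - len(counts)                # |C_missing|
    m0_hat = singletons / total              # Good-Turing missing-mass estimate
    estimate = []
    for i in range(m):
        n_i = counts.get(i, 0)
        if n_i == 0:
            estimate.append(m0_hat / missing)            # rule i
        else:
            estimate.append((1 - m0_hat) * n_i / total)  # rule ii
    return estimate


# Small hypothetical learning phase: 5 requests over a catalog of 5 types.
example = good_turing_popularity([0, 0, 0, 1, 2], m=5)
```

In this example two of the five samples are singletons, so the estimated missing mass is 0.4 and is split evenly between the two unseen content-types, while the remaining 0.6 is shared among the seen ones in proportion to their counts.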
We simulate the content distribution system for arrival and service processes which satisfy Assumption 1 to compare the performance of the four policies mentioned above and to understand how their performance depends on parameters such as the system size ($n$), the load ($\bar{\lambda}$) and the Zipf parameter ($\beta$). In Tables 1, 2 and 3, we report the mean and standard deviation of the fraction of jobs served by the policies over a duration of 5 s ($T(n) = 5$) for $\alpha = 1$.
For each set of system parameters, we repeat the simulations between 1000 and 10000 times for each policy in order to ensure that the standard deviation of the quantity of interest (fraction of jobs served) is small and comparable. For the two adaptive policies (GENIE and MYOPIC), the results are averaged over 1000 iterations and for the learning-based policies (“Empirical + Static Storage” and “Good-Turing + Static Storage”), the results are averaged over 10000 iterations. In addition, the results for the learning-based policies are reported for empirically optimized values for the fraction of time spent by the policy in learning the distribution.
In Table 1, we compare the performance of the policies for different values of system size ($n$). For the results reported in Table 1, the “Empirical + Static Storage” policy learns for 0.1 s and the “Good Turing + Static Storage” policy learns for 0.7 s. The performance of all four policies improves as the system size increases and the adaptive policies significantly outperform the two learning-based static storage policies. Figure 4 is a plot of the mean values reported in Table 1.
In Table 2, we compare the performance of the policies for different values of Zipf parameter $\beta$. For the results reported in Table 2, the duration of the learning phase for both learning-based policies is fixed such that the expected number of arrivals in that duration is 100. The performance of all four policies improves as the value of the Zipf parameter $\beta$ increases; moreover, the MYOPIC policy outperforms both learning-based static storage policies for all values of $\beta$ considered.

Policy                          n      Mean     σ
GENIE                           200    0.9577   0.0081
                                400    0.9698   0.0045
                                600    0.9752   0.0034
                                800    0.9788   0.0030
                                1000   0.9814   0.0025
MYOPIC                          200    0.8995   0.0258
                                400    0.9260   0.0167
                                600    0.9380   0.0132
                                800    0.9481   0.0101
                                1000   0.9532   0.0080
Empirical + Static Storage      200    0.6292   0.0662
                                400    0.6918   0.0443
                                600    0.7246   0.0353
                                800    0.7464   0.0304
                                1000   0.7622   0.0268
Good Turing + Static Storage    200    0.6875   0.0274
                                400    0.7249   0.0180
                                600    0.7443   0.0140
                                800    0.7566   0.0118
                                1000   0.7651   0.0104

Table 1: The performance of the four policies as a function of the system size ($n$) for fixed values of load $\bar{\lambda} = 0.8$ and $\beta = 1.5$. The values reported are the mean and standard deviation (σ) of the fraction of jobs served. Both adaptive policies (GENIE and MYOPIC) significantly outperform the two learning-based static storage policies.
In Table 3, we compare the performance of the policies for different values of load $\bar{\lambda}$. For the results reported in Table 3, the duration of the learning phase for both learning-based policies is fixed such that the expected number of arrivals in that duration is 100. The performance of all four policies deteriorates as the load increases; however, for all loads considered, the MYOPIC policy outperforms the two learning-based static storage policies.
Figure 4: Plot of the mean values reported in Table 1: performance of the storage policies as a function of system size ($n$) for $\bar{\lambda} = 0.8$ and $\beta = 1.5$. [Plot not reproduced; x-axis: n, y-axis: Fraction of Jobs Served; curves: GENIE, MYOPIC, Good Turing + Static Storage, Empirical + Static Storage.]

5. PROOFS OF MAIN RESULTS
In this section, we provide the proofs of our main results.
5.1 Proof of Theorem 1
We first present an outline of the proof of Theorem 1. We consider two cases. We first focus on the case when the learning-based storage policies use fewer than $n$ arrivals to learn the distribution.
1. If the learning phase lasts for the first $n^{\gamma}$ arrivals for some $0 < \gamma \le 1$, we show that under Assumption 1, w.h.p., in the learning phase, there are no arrivals for content-types $k, k+1, .. m = \alpha n$ for $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$ (Lemma 1).
2. Next, we show that w.h.p., among the first $n^{\gamma}$ arrivals, i.e., during the learning phase, $\Omega(n^{\gamma})$ requests are deferred (Lemma 3).
3. Using Lemma 1, we compute a lower bound on the number of requests deferred in Phase 2 (after the learning phase) by any learning-based static storage policy (Lemma 4).
4. Using Steps 2 and 3, we lower bound the number of requests deferred in the interval of interest.
In the case when the learning phase lasts for more than $n$ arrivals, we show that the number of requests deferred in the learning phase alone is $\Omega(n)$, thus proving the theorem for this case.
Lemma 1. Let $E_1$ be the event that in the first $n^{\gamma}$ arrivals for $0 < \gamma \le 1$, there are no arrivals of types $k, k+1, ... m = \alpha n$ where $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$. Then,
$$P(E_1) \ge \exp\left(-\frac{1}{2 \log n}\right), \qquad (2)$$
for $n$ large enough.
Proof. Recall $\lambda_i = \bar{\lambda} n p_i$ where $p_i = \frac{i^{-\beta}}{Z(\beta)}$ for $Z(\beta) = \sum_{i=1}^{m} i^{-\beta}$.
$$Z(\beta) = \sum_{i=1}^{\alpha n} i^{-\beta} \ge \int_{1}^{\alpha n+1} i^{-\beta}\, di \ge \frac{0.9}{\beta-1}$$
for $n$ large enough. Therefore, for all $i$,
$$p_i \le \frac{\beta-1}{0.9}\, i^{-\beta}.$$
The total mass of all content types $i = k, .. m = \alpha n$ is
$$\sum_{i=k}^{\alpha n} p_i \le \sum_{i=k}^{\alpha n} \frac{\beta-1}{0.9}\, i^{-\beta} \le \int_{k-1}^{\alpha n} \frac{\beta-1}{0.9}\, i^{-\beta}\, di \le \frac{1}{0.9}\, \frac{1}{(k-1)^{\beta-1}}.$$
For $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$, we have that,
$$P(E_1) \ge \left(1 - \frac{1}{0.9}\, \frac{1}{(k-1)^{\beta-1}}\right)^{n^{\gamma}} \ge \exp\left(-\frac{1}{2 \log n}\right),$$
for $n$ large enough.
Policy                          β      Mean     σ
GENIE                           2.0    0.9940   0.0025
                                3.5    0.9998   0.0013
                                5.0    0.9997   0.0017
MYOPIC                          2.0    0.9774   0.0063
                                3.5    0.9981   0.0024
                                5.0    0.9990   0.0014
Empirical + Static Storage      2.0    0.8592   0.0206
                                3.5    0.9328   0.0138
                                5.0    0.9458   0.0092
Good Turing + Static Storage    2.0    0.8448   0.0231
                                3.5    0.9314   0.0139
                                5.0    0.9457   0.0091

Table 2: The performance of the four policies as a function of the Zipf parameter ($\beta$) for fixed values of system size $n = 500$ and load $\bar{\lambda} = 0.9$. The values reported are the mean and standard deviation (σ) of the fraction of jobs served. The MYOPIC policy outperforms the two learning-based static storage policies for all values of $\beta$ considered.

Policy                          λ̄      Mean     σ
GENIE                           0.500  0.9892   0.0025
                                0.725  0.9788   0.0013
                                0.950  0.9531   0.0017
MYOPIC                          0.500  0.9605   0.0113
                                0.725  0.9484   0.0105
                                0.950  0.8973   0.0221
Empirical + Static Storage      0.500  0.7756   0.0222
                                0.725  0.7705   0.0238
                                0.950  0.7352   0.0235
Good Turing + Static Storage    0.500  0.7849   0.0230
                                0.725  0.7589   0.0249
                                0.950  0.6869   0.0348

Table 3: The performance of the four policies as a function of the load ($\bar{\lambda}$) for fixed values of system size $n = 500$ and $\beta = 1.2$. The values reported are the mean and standard deviation (σ) of the fraction of jobs served. The MYOPIC policy significantly outperforms the two learning-based static storage policies for all loads considered.
We use the following concentration result for Exponential random variables.
Lemma 2. Let $X_k$ for $1 \le k \le v$ be i.i.d. exponential random variables with mean 1. Then,
$$P\left(\sum_{k=1}^{v} X_k \le a\right) \le \exp(v - a) \left(\frac{a}{v}\right)^{v}. \qquad (3)$$
The proof of Lemma 2 follows from elementary calculations.
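As a quick numerical sanity check of (3), the sketch below compares the bound with the exact probability, using the fact that a sum of $v$ i.i.d. Exp(1) variables has an Erlang distribution. The helper names `erlang_cdf` and `lemma2_bound` and the chosen $(v, a)$ pairs are illustrative assumptions; the bound is only informative for $a \le v$.

```python
import math

def erlang_cdf(v, a):
    """Exact P(X_1 + ... + X_v <= a) for i.i.d. Exp(1) variables (Erlang CDF)."""
    term, total = math.exp(-a), 0.0
    for j in range(v):
        total += term            # term equals exp(-a) * a**j / j!
        term *= a / (j + 1)
    return 1.0 - total

def lemma2_bound(v, a):
    """Right-hand side of (3): exp(v - a) * (a / v)**v."""
    return math.exp(v - a) * (a / v) ** v

if __name__ == "__main__":
    for v, a in [(5, 1.0), (10, 4.0), (20, 10.0), (50, 25.0)]:
        print(v, a, erlang_cdf(v, a), lemma2_bound(v, a))
```

In each row the exact probability should not exceed the bound, which is the only property of Lemma 2 used in the proofs that follow.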
Lemma 3. Suppose the system starts with each content piece stored on exactly one server. Let $E_2$ be the event that in the first $n^{\gamma}$ arrivals for $\gamma$ such that $0 < \gamma \le 1$, at most $((n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1)(\log n + 1)$ are served (not deferred). Then, for $\beta > 2$,
$$P(E_2) \ge 1 - \frac{1}{2 \log n}. \qquad (4)$$
Proof. This proof is conditioned on the event $E_1$ defined in Lemma 1. Conditioned on $E_1$, in the first $n^{\gamma}$ arrivals, at most $(n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$ different content types are requested. Therefore, at most $(n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$ servers can serve requests during the first $n^{\gamma}$ arrivals.
Let $E_3$ be the event that the time taken for the first $n^{\gamma}$ arrivals is less than $\frac{2 n^{\gamma}}{\bar{\lambda} n}$. Since the expected time for the first $n^{\gamma}$ arrivals is $\frac{n^{\gamma}}{\bar{\lambda} n}$, by the Chernoff bound, $P(E_3) \ge 1 - o(1/n)$. The rest of this proof is conditioned on the event $E_3$.
If the system serves (does not defer) more than
$$((n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1)(\log n + 1)$$
requests in this interval, at least one server needs to serve more than $\log n$ requests. By substituting $a = c n^{-1+\gamma}$ and $v = \log n$ in Lemma 2, we have that,
$$P\left(\sum_{k=1}^{\log n} X_k \le c n^{-1+\gamma}\right) \le \exp(\log n - c n^{-1+\gamma}) \times \left(\frac{c n^{-1+\gamma}}{\log n}\right)^{\log n} = o\left(\frac{1}{n}\right).$$
Therefore, the probability that a server serves more than $\log n$ requests in an interval of $\frac{2 n^{\gamma}}{\bar{\lambda} n}$ time is $o\left(\frac{1}{n}\right)$. Therefore, using the union bound, the probability that none of these $(n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$ servers serve more than $\log n$ requests each in $\frac{2 n^{\gamma}}{\bar{\lambda} n}$ time is greater than $1 - ((n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1)\, o\left(\frac{1}{n}\right)$.
Therefore, we have that,
$$P(E_2^c) \le ((n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1)\, o\left(\frac{1}{n}\right) + P(E_1^c) + P(E_3^c) \le \frac{1}{2 \log n}$$
for $n$ large enough.
Lemma 4. Let the interval of interest be $T(n)$ such that $T(n) = \Omega(1)$. If the learning phase of the storage policy lasts for the first $n^{\gamma}$ arrivals, $0 < \gamma \le 1$, the expected number of requests deferred in Phase 2 is $\Omega\left(\frac{T(n)\, n^{1-\gamma}}{\log n}\right)$.
Proof. Let $N_2$ be the number of arrivals in Phase 2, then we have that, $E[N_2] = T(n)\bar{\lambda} n - n^{\gamma}$.
Let $E_4$ be the event that $N_2 > E[N_2]/2$. Using the Chernoff bound, it can be shown that $P(E_4^c) = o(1/n)$.
The rest of this proof is conditioned on $E_1$ defined in Lemma 1 and $E_4$ defined above. We consider the following two cases depending on the number of servers allocated to content types not seen in Phase 1.
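Case I: The number of servers allocated to content types not seen in Phase 1 is less than $\epsilon n$ for some $\epsilon \le 1 - \frac{\bar{\lambda}}{1000}$. For $\beta > 2$,
$$Z(\beta) = \sum_{i=1}^{\alpha n} i^{-\beta} \le \sum_{i=1}^{\alpha n} i^{-2} \le \sum_{i=1}^{\infty} i^{-2} = \frac{\pi^2}{6}.$$
Therefore, for all $i$, $p_i \ge \frac{6}{\pi^2}\, i^{-\beta}$. The total mass of all content types $k, k+1, .. \alpha n$ is
$$\sum_{i=k}^{\alpha n} p_i \ge \sum_{i=k}^{\alpha n} \frac{6}{\pi^2}\, i^{-\beta} \ge \int_{k}^{\alpha n+1} \frac{6}{\pi^2}\, i^{-\beta}\, di \ge \frac{0.54}{\pi^2 (\beta-1)}\, \frac{1}{k^{\beta-1}},$$
for $n$ large enough. Therefore, the expected number of arrivals of types $k, k+1, .. m = \alpha n$, where $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$, in Phase 2 is at least
$$\left(\frac{T(n)\bar{\lambda} n - n^{\gamma}}{2}\right) \frac{0.54}{\pi^2 (\beta-1)}\, \frac{1}{n^{\gamma} \log n}.$$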
Let $E_5$ be the event that in Phase 2, there are at least
$$\left(\frac{T(n)\bar{\lambda} n - n^{\gamma}}{2}\right) \frac{0.54}{2\pi^2 (\beta-1)}\, \frac{1}{n^{\gamma} \log n}$$
arrivals of types $k, k+1, .. m = \alpha n$, where $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$. Using the Chernoff bound, $P(E_5^c) = o(1/n)$.
Conditioned on $E_1$, all content types $k, k+1, .. m = \alpha n$, where $k = (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$, are not requested in Phase 1. Recall that all learning-based policies treat all these content types equally and that the total number of servers allocated to store the content types not seen in Phase 1 is less than $\epsilon n$. Let $\eta$ be the probability that $C_k$ for $k \ge (n^{\gamma} \log n)^{\frac{1}{\beta-1}} + 1$ is not stored by the storage policy under consideration. Then,
$$\eta \ge 1 - \frac{\epsilon n}{n - (n^{\gamma} \log n)^{\frac{1}{\beta-1}} - 1} \ge 1 - \frac{\epsilon}{2},$$
for $n$ large enough.
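Let $E_6 = E_1 \cap E_3 \cap E_4 \cap E_5$ and $D_2$ be the number of requests deferred in Phase 2.
$$E[D_2 | E_6] \ge \eta \left(\frac{T(n)\bar{\lambda} n - n^{\gamma}}{2}\right) \frac{0.54}{2\pi^2 (\beta-1)}\, \frac{1}{n^{\gamma} \log n}$$
$$\ge \left(1 - \frac{\epsilon}{2}\right) \left(\frac{T(n)\bar{\lambda} n - n^{\gamma}}{2}\right) \frac{0.54}{2\pi^2 (\beta-1)}\, \frac{1}{n^{\gamma} \log n} = \Omega\left(\frac{T(n)\, n^{1-\gamma}}{\log n}\right).$$
Therefore,
$$E[D_2] \ge E[D_2 | E_6]\, P(E_6) \ge E[D_2 | E_6] \left(1 - \frac{1}{\log n} - \frac{3}{n}\right) = \Omega\left(\frac{T(n)\, n^{1-\gamma}}{\log n}\right).$$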
Case II: The number of servers allocated to content types not seen in Phase 1 is more than $\epsilon n$ for some $\epsilon > 1 - \frac{\bar{\lambda}}{1000}$.
Let $f(n)$ be the number of servers allocated to store all content types that are requested in Phase 1. By our assumption, $f(n) \le \frac{\bar{\lambda}}{1000}\, n$.
Let $C_1$ be the set of content types requested in Phase 1. Let $p = \sum_{c \in C_1} p_c$ be the total mass of all content types $c \in C_1$. Let $\hat{p}_c$ be the fraction of requests for content-type $c$ in Phase 1. By the definition of $C_1$, the total empirical mass of all content types $c \in C_1$ is $\hat{p} = \sum_{c \in C_1} \hat{p}_c = 1$.
Recall that there are $n^{\gamma}$ arrivals in Phase 1. Let $r = n^{\gamma}$. We now use the Chernoff bound to compute a lower bound on the true mass $p$, using a technique similar to that used in [13] (Lemma 4). By the Chernoff bound, we know that,
$$P(\hat{p} > (1 + \kappa) p) \le \exp\left(-\frac{p r \kappa^2}{3}\right).$$
Let $\delta = \exp\left(-\frac{p r \kappa^2}{3}\right)$, then, we have that, with probability greater than $1 - \delta$,
$$\hat{p} - p \le \sqrt{\frac{-3 p \log \delta}{r}}.$$
Solving for $p$, we get that, with probability greater than $1 - \delta$, $p > 1 - \frac{3 \log(1/\delta)}{2r}$, for $n$ large enough. Let $\delta = 1/n$, then we have that, with probability greater than $1 - 1/n$, $p > 1 - \frac{3 \log n}{2 n^{\gamma}}$. Conditioned on the event $E_4$, there are at least $\frac{T(n)\bar{\lambda} n - n^{\gamma}}{2}$ arrivals in Phase 2. The remainder of this proof is conditioned on $E_4$. Let $A_2$ be the number of arrivals of types $c \in C_1$ in Phase 2. Let $E_7$ be the event that
$$A_2 > \frac{T(n)\bar{\lambda} n - n^{\gamma}}{2} \left(1 - \frac{3 \log n}{2 n^{\gamma}}\right).$$
Since the expected number of arrivals of content types $c \in C_1$ in Phase 2 is at least
$$(T(n)\bar{\lambda} n - n^{\gamma}) \left(1 - \frac{3 \log n}{2 n^{\gamma}}\right),$$
using the Chernoff bound, we can show that $P(E_7^c) = o(1/n)$.
The rest of this proof is conditioned on $E_7$. By our assumption, the number of servers which can serve arrivals of types $c \in C_1$ in Phase 2 is $f(n)$. Therefore, if at least $A_2/2$ requests are to be served in Phase 2, the sum of the service times of these $A_2/2$ requests should be less than $T(n) f(n)$ (since the number of servers which can serve these requests is $f(n)$). Let $E_8$ be the event that the sum of $A_2/2$ independent Exponential random variables with mean 1 is less than $T(n) f(n)$. By substituting $v = A_2/2$ and $a = T(n) f(n)$ in Lemma 2, we have that,
$$P(E_8) \le \exp\left(\frac{A_2}{2} - T(n) f(n)\right) \left(\frac{2 T(n) f(n)}{A_2}\right)^{\frac{A_2}{2}} \le \exp\left(\frac{A_2}{2}\right) \left(\frac{2 T(n) f(n)}{A_2}\right)^{\frac{A_2}{2}} = o\left(\frac{1}{n}\right)$$
for $n$ large enough. Hence,
$$P\left(D_2 \ge \frac{A_2}{2}\right) \ge 1 - P(E_1^c) - o\left(\frac{1}{n}\right) \;\Rightarrow\; E[D_2] = \Omega\left(\frac{T(n)\, n^{1-\gamma}}{\log n}\right).$$
Proof. (of Theorem 1)
We consider two cases:
Case I: The learning phase lasts for the first $n^{\gamma}$ arrivals where $0 < \gamma \le 1$.
Let $D_1$ be the number of requests deferred in Phase 1 and $D$ be the total number of requests deferred in the interval of interest. Then, we have that,
$$E[D] = E[D_1] + E[D_2].$$
By Lemmas 3 and 4 and since $T(n) = \Omega(1)$, we have that,
$$E[D] \ge n^{\gamma} - (n^{\gamma} \log n)^{\frac{1}{\beta-1}} \log n + E[D_2] = \Omega(n^{0.5}).$$
Case II: The learning phase lasts for longer than the time taken for the first $n$ arrivals.
By Lemma 3, the number of requests deferred in the first $n$ arrivals is at least $n - (n \log n)^{\frac{1}{\beta-1}} \log n$ with probability greater than $1 - 1/\log n$. Therefore, we have that,
$$E[D] \ge \left(n - (n \log n)^{\frac{1}{\beta-1}} \log n\right) \left(1 - \frac{1}{\log n}\right) = \Omega(n^{0.5}).$$
5.2 Proof of Theorem 2
We first present an outline of the proof of Theorem 2.
1. We first show that under Assumption 1, on every arrival in the interval of interest ($T(n)$), there are $\Theta(n)$ idle servers w.h.p. (Lemma 6).
2. Next, we show that w.h.p., in the interval of interest of length $T(n)$, only $O((n T(n))^{\frac{1}{\beta}})$ unique content types are requested (Lemma 7).
3. Conditioned on Steps 1 and 2, we show that the MYOPIC policy ensures that in the interval of interest, once a content type is requested for the first time, there is always at least one idle server which can serve an incoming request for that content.
4. Using Step 3, we conclude that, in the interval of interest, only the first request for a particular content type will be deferred. The proof of Theorem 2 then follows from Step 2.
Lemma 5. Let the cumulative arrival process to the content delivery system be a Poisson process with rate $\bar{\lambda} n$. At time $t$, let $\chi(t)$ be the number of occupied servers under the MYOPIC storage policy. Then, we have that, $\chi(t) \le_{st} S(t)$, where $S(t)$ is a Poisson random variable with mean $\bar{\lambda} n (1 - e^{-t})$.
Proof. Consider an M/M/∞ queue where the arrival process is Poisson($\bar{\lambda} n$). Let $S(t)$ be the number of occupied servers at time $t$ in this system. It is well known that $S(t)$ is a Poisson random variable with mean $\bar{\lambda} n (1 - e^{-t})$.
Here we provide a proof of this result for completeness. Consider a request $r^*$ which arrived into the system at time $t_0 < t$. If the request is still being served by a server, we have that, $t_0 + \mu(r^*) > t$, where $\mu(r^*)$ is the service time of request $r^*$. Since $\mu(r^*) \sim \text{Exp}(1)$, we have that, $P(\mu(r^*) > t - t_0 \mid t_0) = e^{-(t - t_0)}$. Therefore,
$$P(r^* \text{ in the system at time } t) = \int_{0}^{t} \frac{1}{t}\, e^{-(t - t_0)}\, dt_0.$$
The integral evaluates to $(1 - e^{-t})/t$, and thinning the Poisson($\bar{\lambda} n t$) arrivals in $[0, t]$ with this probability gives a Poisson number of occupied servers with mean $\bar{\lambda} n (1 - e^{-t})$.
To show $\chi(t) \le_{st} S(t)$, we use a coupled construction similar to Figure 5. The intuition behind the proof is the following: the rate of arrivals to the content delivery system and the M/M/∞ system (where each server can serve all types of requests) is the same. The content delivery system serves fewer requests than the M/M/∞ system because some requests are deferred even when the servers are idle. Hence, the number of busy servers in the content delivery system is stochastically dominated by the number of busy servers in the M/M/∞ queueing system.
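The transient M/M/∞ occupancy used in Lemma 5 can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the function name `mminf_occupancy`, the parameter values and the number of runs are our own assumptions, and `rate` plays the role of $\bar{\lambda} n$. The queue is assumed to start empty at time 0.

```python
import math
import random

def mminf_occupancy(rate, t, rng):
    """Number of busy servers at time t in an M/M/infinity queue that starts empty."""
    busy = 0
    now = rng.expovariate(rate)                 # time of the first arrival
    while now < t:
        if now + rng.expovariate(1.0) > t:      # Exp(1) service still running at t
            busy += 1
        now += rng.expovariate(rate)            # next Poisson arrival
    return busy

if __name__ == "__main__":
    rng = random.Random(0)
    rate, t, runs = 80.0, 1.5, 2000
    mean = sum(mminf_occupancy(rate, t, rng) for _ in range(runs)) / runs
    print("simulated mean occupancy:", mean)
    print("rate * (1 - exp(-t)):    ", rate * (1 - math.exp(-t)))
```

The simulated mean should be close to `rate * (1 - exp(-t))`, which never exceeds `rate`; this is the property Lemma 6 uses when it dominates $S(t)$ by a Poisson($\bar{\lambda} n$) variable.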
Lemma 6. Let the interval of interest be $[t_0, t_0 + T(n)]$ where $T(n) = o(n^{\beta-1})$ and $\varepsilon \le \frac{1 - \bar{\lambda}}{2}$. Let $F_1$ be the event that at the instant of each arrival in the interval of interest, the number of idle servers in the system is at least $(1 - \bar{\lambda} - \varepsilon) n$. Then, $P(F_1^c) = o\left(\frac{1}{n}\right)$.
Proof. Let $F_2$ be the event that the number of arrivals in $[t_0, t_0 + T(n)]$ is at most $n T(n) (\bar{\lambda} + \varepsilon)$. Using the Chernoff bound for the Poisson process, we have that,
$$P(F_2^c) = o\left(\frac{1}{n}\right).$$
Consider any $t \in [t_0, t_0 + T(n)]$. By Lemma 5, $\chi(t) \le_{st} S(t)$, where $S(t) \sim \text{Poisson}(\bar{\lambda} n (1 - e^{-t}))$. Therefore,
$$P(\chi(t) > (\bar{\lambda} + \varepsilon) n) \le P(S(t) > (\bar{\lambda} + \varepsilon) n).$$
Moreover, $S(t) \le_{st} W(t)$ where $W(t) \sim \text{Poisson}(\bar{\lambda} n)$. Therefore, using the Chernoff bound for $W(t)$, we have that,
$$P(S(t) > (\bar{\lambda} + \varepsilon) n) \le P(W(t) > (\bar{\lambda} + \varepsilon) n) \le e^{-c_1 n},$$
for some constant $c_1 > 0$. Therefore,
$$P(F_1^c) \le P(F_2^c) + (\bar{\lambda} + \varepsilon)\, n T(n)\, P(\chi(t) > (\bar{\lambda} + \varepsilon) n) = o\left(\frac{1}{n}\right).$$
Lemma 7. Let $F_3$ be the event that in the interval of interest of duration $T(n)$ such that $T(n) = o(n^{\beta-1})$, no more than $O((n T(n))^{1/\beta})$ different types of contents are requested. Then, $P(F_3^c) = o\left(\frac{1}{n}\right)$.
Proof. Recall from the proof of Lemma 1 that the total mass of all content types $k, .. m = \alpha n$ is
$$\sum_{i=k}^{\alpha n} p_i \le \frac{1}{0.9}\, \frac{1}{(k-1)^{\beta-1}}.$$
Now, for $k = (n T(n))^{1/\beta} + 1$, we have that,
$$\sum_{i=k}^{\alpha n} p_i \le \frac{1}{0.9}\, (n T(n))^{-\frac{\beta-1}{\beta}}.$$
Conditioned on the event $F_2$ defined in Lemma 6, the expected number of requests for content types $k, k+1, .. \alpha n$ is less than $\frac{1}{0.9} (\bar{\lambda} + \varepsilon)(n T(n))^{1/\beta}$. Using the Chernoff bound, the probability that there are more than $\frac{2}{0.9} (\bar{\lambda} + \varepsilon)(n T(n))^{1/\beta}$ requests for content types $k, k+1, .. \alpha n$ in the interval of interest is less than $\frac{1}{n^2}$ for $n$ large enough.
Therefore, with probability greater than $1 - 1/n^2 - P(F_2^c)$, the number of different types of contents requested in the interval of interest is less than $(n T(n))^{1/\beta} + \frac{2}{0.9} (\bar{\lambda} + \varepsilon)(n T(n))^{1/\beta}$.
Hence the result follows.
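To get a feel for Lemma 7, the sketch below samples requests from a Zipf popularity profile and counts the distinct content types seen, next to the bound $(nT(n))^{1/\beta}\,(1 + \frac{2}{0.9}(\bar{\lambda} + \varepsilon))$ from the proof. It is a simplified illustration under our own assumptions: the number of requests is fixed at its mean $\bar{\lambda} n T$ instead of being Poisson, and the function name and parameter values ($\alpha = 2$, $\beta = 2.5$) are hypothetical.

```python
import random

def distinct_requests(n, alpha, beta, load, T, rng):
    """Count distinct content types among load*n*T requests drawn from Zipf(beta)."""
    m = int(alpha * n)
    weights = [i ** (-beta) for i in range(1, m + 1)]
    num_requests = int(load * n * T)     # assumption: fixed count equal to the mean
    sample = rng.choices(range(1, m + 1), weights=weights, k=num_requests)
    return len(set(sample))

if __name__ == "__main__":
    rng = random.Random(1)
    n, alpha, beta, load, T, eps = 2000, 2.0, 2.5, 0.8, 1.0, 0.1
    seen = distinct_requests(n, alpha, beta, load, T, rng)
    bound = (n * T) ** (1 / beta) * (1 + 2 * (load + eps) / 0.9)
    print("distinct content types requested:", seen)
    print("bound from the proof of Lemma 7: ", bound)
```

Although there are $\alpha n = 4000$ content types and 1600 requests in this example, only a few dozen distinct types are typically requested, which is why deferring only the first request of each type is cheap for the MYOPIC policy.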
Proof. (of Theorem 2)
Let $F_4$ be the event that, in the interval of interest, every request for a particular content type except the first request is not deferred. The rest of this proof is conditioned on $F_1$ and $F_3$. Let $U(t)$ be the number of unique contents which have been requested in the interval of interest before time $t$ for $t \in [t_0, t_0 + T(n)]$. Conditioned on $F_3$, as defined in Lemma 7, $U(t) \le k_1 (n T(n))^{1/\beta}$ for some constant $k_1 > 0$ and $n$ large enough. Conditioned on $F_1$, there are always at least $(1 - \bar{\lambda} - \varepsilon) n$ idle servers in the interval of interest.
CLAIM: For every $i$ and $n$ large enough, once a content $C_i$ is requested for the first time in the interval of interest, the MYOPIC policy ensures that there is always at least 1 idle server which can serve a request for $C_i$.
Note that since $T(n) = o(n^{\beta-1})$, $(n T(n))^{1/\beta} = o(n)$. Let $n$ be large enough such that $k_1 (n T(n))^{1/\beta} < (1 - \bar{\lambda} - \varepsilon) n$, i.e., at any time $t \in [t_0, t_0 + T(n)]$, the number of idle servers is greater than $U(t)$. We prove the claim by induction. Let the claim hold for time $t^-$ and let there be a request at time $t$ for content $C_i$. If this is not the first request for $C_i$ in $[t_0, t_0 + T(n)]$, by the claim, at $t = t^-$, there is at least one idle server which can serve this request. In addition, if there is exactly one server which can serve $C_i$ at $t^-$, then the MYOPIC policy replaces the content of some other idle server with $C_i$. Since there are more than $k_1 (n T(n))^{1/\beta}$ idle servers and $U(t) < k_1 (n T(n))^{1/\beta}$, at $t^+$, each content type requested in the interval of interest so far is stored on at least one currently idle server. Therefore, conditioned on $F_1$ and $F_3$, every request for a particular content type except the first request is not deferred.
Hence, putting everything together,
$$P(F_4) \ge 1 - P(F_1^c) - P(F_3^c),$$
thus $P(F_4) \to 1$ as $n \to \infty$ and the result follows.
Proof. (of Corollary 1)
From Theorem 2 we have that, w.h.p., the number of requests deferred by the MYOPIC storage policy is $O((n T(n))^{1/\beta}) = o(n^{0.5})$ and by Theorem 1, we know that the expected number of requests deferred by any learning-based storage policy is $\Omega(n^{0.5})$.
5.3 Proof of Theorem 3
In this section, we first present an outline of the proof of Theorem 3 followed by the proof details.
1. Since we are studying the performance of the MYOPIC policy for the Continuous Change Model, the relative order of popularity of contents keeps changing in the interval of interest. We show that w.h.p., the number of content types which are in the $n^{1/\beta}$ most popular content types at least once in the interval of interest is $O(n^{1/\beta})$ (Lemma 8).
2. Next, we show that w.h.p., in the interval of interest of length $b$, only $O(n^{1/\beta})$ content types are requested (Lemma 9).
3. By Lemma 6 and the proof of Theorem 2, we know that, conditioned on Step 2, the MYOPIC storage policy ensures that in the interval of interest, once a content type is requested for the first time, there is always at least one idle server which can serve an incoming request for that content. Using this, we conclude that, in the interval of interest, only the first request for a particular content type will be deferred. The proof of Theorem 3 then follows from Step 2.
Lemma 8. Let $G_1$ be the event that, in the interval of interest of length $b$, the number of times that a content among the current top $n^{1/\beta}$ most popular contents changes its position in the popularity ranking is at most $\frac{4b}{\alpha}\, n^{1/\beta} \nu$. Then, $P(G_1) \ge 1 - o\left(\frac{1}{n}\right)$.
Proof. The expected number of clock ticks in $b$ time-units is $b n \nu$. The probability that a change in arrival process involves at least one of the current $n^{1/\beta}$ most popular contents is $\frac{n^{1/\beta}}{\alpha n}$. Therefore, the expected number of changes in arrival process which involve at least one of the current $n^{1/\beta}$ most popular contents is $\frac{2 b \nu}{\alpha}\, n^{1/\beta}$. By the Chernoff bound, we have that $P(G_1) \ge 1 - o\left(\frac{1}{n}\right)$.
Lemma 9. Let $G_2$ be the event that in the interval of interest, no more than $O(n^{1/\beta})$ different types of contents are requested. Then, $P(G_2^c) = o\left(\frac{1}{n}\right)$.
Proof. Conditioned on the event $G_1$ defined in Lemma 8, we have that in the interval of interest, at most $\left(\frac{2b}{\alpha} \nu + 1\right) n^{1/\beta}$ different contents are among the top $n^{1/\beta}$ most popular contents. Given this, the proof follows the same lines of arguments as in the proof of Lemma 7.
Proof. (of Theorem 3) The proof of Theorem 3 follows from Lemma 9 and uses the same line of arguments as in the proof of Theorem 2.
5.4 Proof of Theorem 4
To show that GENIE is the optimal policy, we consider the process $X(t)$ which is the number of occupied servers at time $t$ when the storage policy is GENIE. Let $Y(t)$ be the number of occupied servers at time $t$ for some other storage policy $A \in \mathcal{A}$. We construct a coupled process $(X^*(t), Y^*(t))$ such that the marginal rates of change in $X^*(t)$ and $Y^*(t)$ are the same as those of $X(t)$ and $Y(t)$ respectively.
Recall $\bar{\lambda} = \frac{\sum_{i=1}^{m} \lambda_i}{n}$. At time $t$, let $C_{GENIE}(t)$ and $C_A(t)$ be the sets of contents stored on idle servers by GENIE and $A$ respectively. The construction of the coupled process $(X^*(t), Y^*(t))$ is described in Figure 5. We assume that the system starts at time $t = 0$ and $X^*(0) = Y^*(0) = 0$. In this construction, we maintain two counters $Z_{X^*}$ and $Z_{Y^*}$ which keep track of the number of departures from the system. Let $Z_{X^*}(0) = Z_{Y^*}(0) = 0$. Let Exp($\mu$) be an Exponential random variable with mean $\frac{1}{\mu}$ and Ber($p$) be a Bernoulli random variable which is 1 with probability (w.p.) $p$.
Lemma 10. $X^*(t)$ and $Y^*(t)$ have the same marginal rates of transition as $X(t)$ and $Y(t)$ respectively.
Proof. This follows from the properties of the coupled process described in Figure 5 using standard arguments.
Lemma 11. Let $D^{(GENIE)}(t)$ be the number of jobs deferred by time $t$ by the GENIE adaptive storage policy and $D^{(A)}(t)$ be the number of jobs deferred by time $t$ by a policy $A \in \mathcal{A}$. In the coupled construction, let $W^*(t)$ be the number of arrivals by time $t$. Let $D^{X^*}(t) = W^*(t) - Z_{X^*}(t) - X^*(t)$ and $D^{Y^*}(t) = W^*(t) - Z_{Y^*}(t) - Y^*(t)$. Then, $D^{X^*}(t)$ and $D^{Y^*}(t)$ have the same marginal rates of transition as $D^{(GENIE)}(t)$ and $D^{(A)}(t)$ respectively.
Proof. This follows from Lemma 10 due to the fact that $X(t)$ has the same distribution as $X^*(t)$ and the marginal rate of increase of $D^{X^*}(t)$ given $X^*(t)$ is the same as the rate of increase of $D^{(GENIE)}(t)$ given $X(t)$. The result for $D^{Y^*}(t)$ follows by the same argument.
Lemma 12. $X^*(t) \ge Y^*(t)$ for all $t$ on every sample path.
Proof. The proof follows by induction. $X^*(0) = Y^*(0)$ by construction. Let $X^*(t_0^-) \ge Y^*(t_0^-)$ and let there be an arrival or departure at time $t_0$. There are 4 possible cases:
i: If ARR<DEP and $X^*(t_0^-) = Y^*(t_0^-)$, $Y^*(t_0) = Y^*(t_0^-) + 1$ only if $X^*(t_0) = X^*(t_0^-) + 1$. Therefore, $X^*(t_0) \ge Y^*(t_0)$.
ii: If ARR<DEP and $X^*(t_0^-) > Y^*(t_0^-)$, $Y^*(t_0) \le Y^*(t_0^-) + 1 \le X^*(t_0^-) \le X^*(t_0)$. Therefore, $X^*(t_0) \ge Y^*(t_0)$.
iii: If DEP<ARR and $X^*(t_0^-) = Y^*(t_0^-)$, $X^*(t_0) = Y^*(t_0)$.
iv: If DEP<ARR and $X^*(t_0^-) > Y^*(t_0^-)$, $X^*(t_0) = X^*(t_0^-) - 1 \ge Y^*(t_0^-) \ge Y^*(t_0)$. Therefore, $X^*(t_0) \ge Y^*(t_0)$.
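1: Generate: ARR ∼ Exp($n\bar{\lambda}$), DEP ∼ Exp($\max\{X^*, Y^*\}$)
2: $t = t + \min\{$ARR, DEP$\}$
3: if ARR<DEP, then
4:     if ($X^* = Y^*$) then
5:         Generate $u_1 \sim$ Ber$\left(\frac{\sum_{i \in C_{GENIE}(t)} \lambda_i}{n\bar{\lambda}}\right)$
6:         if ($u_1 = 1$) then
7:             $X^* \leftarrow X^* + 1$
8:             Generate $u_2 \sim$ Ber$\left(\frac{\sum_{i \in C_A(t)} \lambda_i}{\sum_{i \in C_{GENIE}(t)} \lambda_i}\right)$
9:             if ($u_2 = 1$) then $Y^* \leftarrow Y^* + 1$
10:        end if
11:    else
12:        Generate $u_1 \sim$ Ber$\left(\frac{\sum_{i \in C_{GENIE}(t)} \lambda_i}{n\bar{\lambda}}\right)$
13:        if ($u_1 = 1$) then $X^* \leftarrow X^* + 1$
14:        Generate $u_2 \sim$ Ber$\left(\frac{\sum_{i \in C_A(t)} \lambda_i}{\sum_{i \in C_{GENIE}(t)} \lambda_i}\right)$
15:        if ($u_2 = 1$) then $Y^* \leftarrow Y^* + 1$
16:    end if
17: else
18:    if ($X^* \ge Y^*$) then
19:        $X^* \leftarrow X^* - 1$, $Z_{X^*} \leftarrow Z_{X^*} + 1$
20:        Generate $u_3 \sim$ Ber$\left(\frac{Y^*}{X^*}\right)$
21:        if ($u_3 = 1$) then $Y^* \leftarrow Y^* - 1$, $Z_{Y^*} \leftarrow Z_{Y^*} + 1$
22:    else
23:        $Y^* \leftarrow Y^* - 1$, $Z_{Y^*} \leftarrow Z_{Y^*} + 1$
24:        Generate $u_4 \sim$ Ber$\left(\frac{X^*}{Y^*}\right)$
25:        if ($u_4 = 1$) then $X^* \leftarrow X^* - 1$, $Z_{X^*} \leftarrow Z_{X^*} + 1$
26:    end if
27: end if
28: Goto 1
Figure 5: Coupled Process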
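The coupling in Figure 5 can be exercised with a short simulation. The sketch below is a simplified illustration and not the construction itself: it replaces the state-dependent ratios $\sum_{i \in C_{GENIE}(t)} \lambda_i / (n\bar{\lambda})$ and $\sum_{i \in C_A(t)} \lambda_i / (n\bar{\lambda})$ by two constants `q_genie` and `q_other` (hypothetical names, with `q_other <= q_genie`), it evaluates the departure ratios using the values of $X^*$ and $Y^*$ before the decrement, and it does not cap the number of busy servers at $n$.

```python
import random

def coupled_path(n, load, q_genie, q_other, steps, rng):
    """Simulate the coupled chain with constant acceptance probabilities.

    Returns the list of states (X, Y, ZX, ZY) after each event.
    """
    x = y = zx = zy = 0
    path = []
    for _ in range(steps):
        arr = rng.expovariate(n * load)
        busy = max(x, y)
        dep = rng.expovariate(busy) if busy > 0 else float("inf")
        if arr < dep:                               # arrival event
            u1 = rng.random() < q_genie
            u2 = rng.random() < q_other / q_genie
            if x == y:
                if u1:
                    x += 1
                    if u2:                          # Y can move only when X moves
                        y += 1
            else:
                if u1:
                    x += 1
                if u2:
                    y += 1
        else:                                       # departure event
            if x >= y:
                u3 = rng.random() < y / x           # ratio taken before the decrement
                x, zx = x - 1, zx + 1
                if u3:
                    y, zy = y - 1, zy + 1
            else:
                u4 = rng.random() < x / y
                y, zy = y - 1, zy + 1
                if u4:
                    x, zx = x - 1, zx + 1
        path.append((x, y, zx, zy))
    return path

if __name__ == "__main__":
    path = coupled_path(n=50, load=0.8, q_genie=0.9, q_other=0.6,
                        steps=5000, rng=random.Random(2))
    print("X* >= Y* on the whole path:", all(x >= y for x, y, _, _ in path))
    print("ZX* >= ZY* on the whole path:", all(zx >= zy for _, _, zx, zy in path))
```

Under these assumptions both orderings hold at every event of every run, which is the sample-path dominance that the lemmas on $X^*$, $Y^*$ and the departure counters establish for the actual construction.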
Lemma 13. $Z_{X^*}(t) \ge Z_{Y^*}(t)$ for all $t$ on every sample path.
Proof. The proof follows by induction. Since the system starts at time $t = 0$, $Z_{X^*}(0) = Z_{Y^*}(0)$. Let $Z_{X^*}(t_0^-) \ge Z_{Y^*}(t_0^-)$ and let there be a departure at time $t_0$. By Lemma 12, we know that, $X^*(t_0^-) \ge Y^*(t_0^-)$. Therefore, $Z_{X^*}(t_0) \ge Z_{Y^*}(t_0)$ by the coupling construction.
Proof. (of Theorem 4)
By Lemmas 12 and 13, for any sample path,
$$X^*(t) + Z_{X^*}(t) \ge Y^*(t) + Z_{Y^*}(t).$$
Therefore, for every sample path, the number of requests already served (not deferred) or being served by the servers by a content delivery system implementing the GENIE policy is at least that of any other storage policy. This implies that for each sample path, the number of requests deferred by GENIE is at most that of any other storage policy. Sample path dominance in the coupled system implies stochastic dominance of the original process. Using this and Lemma 11, we have that,
$$D^{(GENIE)}(t) \le_{st} D^{(A)}(t).$$
6. PROOF OF THEOREM 5
Proof. (of Theorem 5) The key idea of the GENIE policy is to ensure that at any time $t$, if the number of idle servers is $k(t)$, the $k(t)$ most popular contents are stored on exactly one idle server each. Since the total number of servers is $n$, and the number of content-types is $m = \alpha n$ for some constant $\alpha > 1$, all content-types $C_i$ for $i > n$ are never stored on idle servers by the GENIE policy. This means that under the GENIE policy, all arrivals for content types $C_i$ for $i > n$ are deferred. For $\beta > 2$, for all $i$, $p_i \ge \frac{6}{\pi^2}\, i^{-\beta}$. The cumulative mass of all content types $i = n + 1, .. \alpha n$ is
$$\sum_{i=n+1}^{\alpha n} p_i \ge \sum_{i=n+1}^{\alpha n} \frac{6}{\pi^2}\, i^{-\beta} \ge \int_{n+1}^{\alpha n+1} \frac{6}{\pi^2}\, i^{-\beta}\, di \ge \frac{0.54}{\pi^2 (\beta - 1)}\, \frac{1}{(n+1)^{\beta-1}},$$
for $n$ large enough.
Let the length of the interval of interest be $b$. The expected number of arrivals of types $n + 1, n + 2, .. \alpha n$, in the interval of interest is at least $\frac{0.54\, b \bar{\lambda} n}{\pi^2 (\beta - 1)}\, \frac{1}{(n+1)^{\beta-1}}$. Therefore, the expected number of jobs deferred by the GENIE policy in an interval of length $b$ is $\Omega(n^{2-\beta})$.
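The tail-mass bound in this proof is easy to check numerically. The sketch below computes the exact Zipf mass of content types $n+1, .. \alpha n$ and compares it with $\frac{0.54}{\pi^2(\beta-1)(n+1)^{\beta-1}}$; the helper names and the values $\alpha = 2$, $\beta = 2.5$, $\bar{\lambda} = 0.8$, $b = 1$ are illustrative assumptions.

```python
import math

def zipf_tail_mass(n, alpha, beta):
    """Exact total popularity of content types n+1, ..., alpha*n under Zipf(beta)."""
    m = int(alpha * n)
    z = sum(i ** (-beta) for i in range(1, m + 1))
    return sum(i ** (-beta) for i in range(n + 1, m + 1)) / z

def tail_lower_bound(n, beta):
    """Lower bound used in the proof: 0.54 / (pi^2 (beta - 1) (n + 1)^(beta - 1))."""
    return 0.54 / (math.pi ** 2 * (beta - 1)) / (n + 1) ** (beta - 1)

if __name__ == "__main__":
    alpha, beta, load, b = 2.0, 2.5, 0.8, 1.0
    for n in (100, 1000, 10000):
        exact = b * load * n * zipf_tail_mass(n, alpha, beta)
        bound = b * load * n * tail_lower_bound(n, beta)
        # Expected arrivals for types beyond n, the bound, and the n^(2 - beta) scale.
        print(n, exact, bound, n ** (2 - beta))
```

For these parameters the expected number of such arrivals shrinks with $n$ at the rate $n^{2-\beta}$, matching the order of the lower bound in Theorem 5.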
7. RELATED WORK
Our model of content delivery systems shares several features with recent models and analyses for content placement and request scheduling in multi-server queueing systems [10, 11, 15, 18]. All these works either assume known demand statistics, or a low-dimensional regime (thus permitting “easy” learning). Our study is different in its focus on unknown, high-dimensional and time-varying demand statistics, which makes it difficult to consistently estimate statistics. Our setting also shares some aspects of estimating large alphabet distributions with only limited samples, with early contributions from Good and Turing [6], to recent variants of such estimators [13, 16].
Our work is also related to the rich body of work on the
content replication strategies in peer-to-peer networks, e.g.,
[3, 4, 8, 9, 12, 14, 22, 25, 26]. Replication is used in various
contexts: [14] utilizes it in a setting with large storage limits,
[9, 12] use it to decrease the time taken to locate specific
content, [3,25,26] use it to increase bandwidth in the setting
of video streaming, and [4] uses it to minimize the number
of hosts that need to be probed to resolve a query for a file
in unstructured peer-to-peer networks.
However, the common assumption is that the number of
content-types does not scale with the number of peers, and
that a request can be served in parallel by multiple servers
(and with increased network bandwidth as the number of
peers with a specific content-type increases) which is fundamentally different from our setting.
Finally, our work is also related to the vast literature on content replacement algorithms in server/web cache management. As discussed in [19], parameters of the content (e.g., how large is the content, when was it last requested) are used to derive a cost, which, in turn, is used to replace content. Examples of algorithms that have a cost-based interpretation include the Least Recently Used (LRU) policy, the Least Frequently Used (LFU) policy, and the Max-Size policy [20]. We refer to [19] for a survey of web caching schemes. There is a huge amount of work on the performance of replication strategies in single-cache systems; however, the analysis of adaptive caching schemes in distributed cache systems under stochastic models of arrivals and departures is very limited.
8. ACKNOWLEDGEMENTS
We acknowledge the support of ** grants ***.
9. CONCLUSIONS
In this paper, we considered the high-dimensional setting where the number of servers, the number of content-types, and the number of requests to be served over any time interval all scale as $O(n)$; further, the demand statistics are not known a-priori. This setting is motivated by the enormity of the contents and their time-varying popularity, which prevent the consistent estimation of demands.
The main message of this paper is that in such settings, separating the estimation of demands and the subsequent use of the estimations to design optimal content placement policies (“learn-and-optimize” approach) is order-wise suboptimal. This is in contrast to the low-dimensional setting, where the existence of a constant bound on the number of content-types allows asymptotic optimality of a learn-and-optimize approach.
10. REFERENCES
[1] M. Ahmed, S. Traverso, M. Garetto, P. Giaccone,
E. Leonardi, and S. Niccolini. Temporal locality in
today’s content caching: why it matters and how to
model it. ACM SIGCOMM Computer Communication
Review, 43(5):5–12, October 2013.
[2] L. Breslau, P. Cao, L. Fan, G. Phillips, and
S. Shenker. Web caching and Zipf-like distributions:
Evidence and implications. In IEEE INFOCOM’99,
pages 126–134, 1999.
[3] D. Ciullo, V. Martina, M. Garetto, E. Leonardi, and
G. Torrisi. Stochastic analysis of self-sustainability in
peer-assisted VoD systems. In IEEE INFOCOM,
pages 1539–1547, 2012.
[4] E. Cohen and S. Shenker. Replication strategies in
unstructured peer-to-peer networks. In ACM
SIGCOMM Computer Communication Review,
volume 32, pages 177–190. ACM, 2002.
[5] P. Gill, M. Arlitt, Z. Li, and A. Mahanti. YouTube
traffic characterization: A view from the edge. In 7th
ACM SIGCOMM Conference on Internet
Measurement, pages 15–28, 2007.
[6] I. J. Good. The population frequencies of species and
the estimation of population parameters. Biometrika,
40(3-4):237–264, 1953.
[7] A. Iamnitchi, M. Ripeanu, and I. Foster. Small-world
file-sharing communities. In IEEE INFOCOM, March
2004.
[8] J. Kangasharju, K. Ross, and D. Turner. Optimizing
file availability in peer-to-peer content distribution. In
INFOCOM, 2007.
[9] J. Kangasharju, J. Roberts, and K. Ross. Object
replication strategies in content distribution networks.
Computer Communications, 25:376–383, 2002.
[10] M. Leconte, M. Lelarge, and L. Massoulie. Bipartite
graph structures for efficient balancing of
heterogeneous loads. In the 12th ACM SIGMETRICS
Conference, pages 41–52, 2012.
[11] M. Leconte, M. Lelarge, and L. Massoulie. Adaptive
replication in distributed content delivery networks.
Preprint, 2013.
[12] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker.
Search and replication in unstructured peer-to-peer
networks. In 16th international conference on
Supercomputing, 2002.
[13] D. McAllester and R. Schapire. On the convergence
rate of Good-Turing estimators. In COLT Conference,
pages 1–6, 2000.
[14] B. Tan and L. Massoulie. Optimal content placement
for peer-to-peer video-on-demand systems.
IEEE/ACM Trans. Networking, 21:566–579, 2013.
[15] J. Tsitsiklis and K. Xu. Queueing system topologies
with limited flexibility. In SIGMETRICS ’13, 2013.
[16] G. Valiant and P. Valiant. Estimating the unseen: An
n/log(n)-sample estimator for entropy and support
size, shown optimal via new CLTs. In Proceedings of the
43rd annual ACM Symposium on Theory of
Computing, pages 685–694, 2011.
[17] E. Veloso, V. Almeida, W. Meira, A. Bestavros, and
S. Jin. A hierarchical characterization of a live
streaming media workload. In Proceedings of the 2nd
ACM SIGCOMM Workshop on Internet Measurement,
pages 117–130, 2002.
[18] R. B. Wallace and W. Whitt. A staffing algorithm for
call centers with skill-based routing. Manufacturing
and Service Operations Management, 7:276–294, 2007.
[19] J. Wang. A survey of web caching schemes for the
Internet. ACM SIGCOMM Computer Communication
Review, 29:36–46, 1999.
[20] S. Williams, M. Abrams, C. Standridge, G. Abdulla,
and E. Fox. Removal policies in network caches for
world-wide web documents. In SIGCOMM’96, 1996.
[21] S. Williams, M. Abrams, C. Standridge, G. Abdulla,
and E. Fox. Caching proxies: limitations and
potentials. In the 4th International WWW Conference,
December 1995.
[22] W. Wu and J. Lui. Exploring the optimal replication
strategy in P2P-VoD systems: Characterization and
evaluation. IEEE Transactions on Parallel and
Distributed Systems, 23, August 2012.
[23] www.youtube.com/yt/press/statistics.html.
[24] H. Yu, D. Zheng, B. Zhao, and W. Zheng.
Understanding user behavior in large scale
video-on-demand systems. In EuroSys, April 2006.
[25] X. Zhou and C. Xu. Optimal video replication and
placement on a cluster of video-on-demand servers. In
International Conference on Parallel Processing, pages
547–555, 2002.
[26] Y. Zhou, T. Fu, and D. Chiu. On replication
algorithm in P2P-VoD. IEEE/ACM Transactions on
Networking, pages 233–243, 2013.