Network Aware Resource
Allocation in Distributed Clouds
Contribution





Develops efficient resource allocation algorithms
The 2-approximation algorithm developed for
optimal Data Center (DC) selection is found to be
quite efficient
Develops a heuristic for partitioning the requested
resources among the chosen DCs and racks
Minimizes distance (latency) between the selected
DCs
Simulations show that this approach yields
significant gains
Introduction





Resource allocation – a key function of cloud
management and automation
Resource allocation algorithms have high impact on
performance of applications
Also affects the efficiency of DCs in accommodating
requests
User requests require allocation of Virtual
Machines (VMs)
To satisfy these requests, the resource allocator
maintains an updated list of resources available at DCs,
current allocations and future requirements.
Introduction





User requests include number of VMs and the
communication links required between the VMs
Automation software’s objective is to choose the DC
and rack such that overall resource usage is
minimized and optimal performance is achieved
These two goals are complementary
Meeting them usually involves trying to allocate all requested
resources onto a single rack, which is not always possible
Thus, for best results, resource allocation algorithms
that are capable of handling many scenarios are
required
Introduction


Fragmentation of user requests reduces
performance
Difficult to solve fragmentation
This paper focuses on resource allocation problem in
distributed cloud systems spread out geographically
over WAN
Target : latency
System Architecture
Distributed Cloud





Requests should be handled by DCs close to them –
helps improve performance
Racks consist of blade servers, each containing
many cores
Communication between multiple blade servers
within the same rack happens via the top-of-rack (ToR) switch
Two different racks communicate using an aggregator
switch
DC networks designed with assumption of locality of
communication
System Architecture
Distributed Cloud


As distance between machines increases, the
bandwidth decreases
Bandwidth depends on the physical machines that the
Virtual Machines (VMs) are assigned to
• Overall efficiency of a DC also depends on this
• Number of requests serviceable by the DC also
depends on this
System Architecture
Cloud Management and Automation S/W



Prior knowledge about communication links may not
be available
Automation S/W has to assign resources based on
worst-case conditions and then re-optimize
There are also other conditions that need to be
satisfied
• Number of VMs / DC (for fault tolerance)

Automation S/W computes mapping of user requests
to physical machines
System Architecture
Cloud Management and Automation S/W



The output of the cloud automation software is a
mapping of VMs to physical resources
The software interacts with Network Management
System (NMS) and the local Cloud Management
System (CMS)
The cloud optimization software has two
functionalities
• Track resource usage
• Optimize assignment of user requests


Assignment of user requests consists of identifying
DCs and machines
Goal: To reduce inter-DC and intra-DC traffic
System Architecture
Cloud Management and Automation S/W

Assignment of DCs is done in 4 steps
I. DC Selection
• Identify DCs based on user constraints and availability
• Identify subset of DCs that minimize latency
II. Partitioning Across DCs
• Minimize inter-DC traffic
• Adhere to given constraints and partition VMs accordingly
III. Rack, Blade, Processor selection
• Identify physical computational resources in the DCs
• Goal : Identify machines with low inter-DC traffic
IV. VM Placement
• Assign individual VMs to physical resources
• Minimize inter-rack traffic
System Architecture
Data Center Selection

Select DCs that
• Meet all specifications and constraints
• Optimize network resources
• Maximize application performance


Use an algorithm that selects a subset of DCs with
least hops
Handle other constraints such as maximum or
minimum VMs / DC
System Architecture
Data Center Selection


DC selection problem – sub-graph selection problem
Given G = (V,E,w,l)





V – Data Centers
E – Path between DCs
w – number of available VMs at DC
l – distance of these paths
Note :
• If there are constraints on maximum number of VMs / DC, w
takes this value instead
• If there is a constraint on the minimum number of VMs / DC,
DCs with fewer VMs are omitted
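
A minimal sketch of how the two notes above could adjust the node weights before DC selection. The function name `effective_weights` is hypothetical, and capping w at the smaller of the free capacity and the per-DC maximum is an assumption about how "w takes this value" is applied.

```python
def effective_weights(available, max_per_dc=None, min_per_dc=None):
    """Return the usable VM count per DC after applying per-DC limits."""
    weights = {}
    for dc, free in available.items():
        if min_per_dc is not None and free < min_per_dc:
            continue  # DC cannot host the minimum share, so it is omitted
        # Assumption: a DC never contributes more than it actually has free.
        weights[dc] = free if max_per_dc is None else min(free, max_per_dc)
    return weights
```

The returned dictionary can then serve as w in G = (V,E,w,l).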
System Architecture
Data Center Selection




Let ‘s’ be the number of VMs requested
Problem : Find a sub-graph of G whose total weight is at least
‘s’ with minimum diameter
Goal : Find the sub-graph with minimum length of its
longest edge
This is an NP-hard problem
System Architecture
Data Center Selection


FindMinStar finds a star topology
centered at a given DC v
Diameter of the output sub-graph is at most
twice the diameter of the optimal sub-graph
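
The star construction and the outer search over all centers can be sketched as follows. This is an illustration of the idea on this slide, not the authors' code: it assumes distances are given as a nested dictionary `dist[u][v]` (with `dist[u][u] = 0`) and weights as `weight[u]`, and the Python function names are hypothetical.

```python
from itertools import combinations

def find_min_star(dist, weight, center, s):
    """Grow a star around `center`: add DCs nearest-first until they hold s VMs."""
    order = sorted(weight, key=lambda u: dist[center][u])
    chosen, total = [], 0
    for u in order:
        chosen.append(u)
        total += weight[u]
        if total >= s:
            return chosen
    return None  # the whole graph does not have s VMs

def diameter(dist, nodes):
    """Largest pairwise distance among the chosen DCs (0 for a single DC)."""
    return max((dist[a][b] for a, b in combinations(nodes, 2)), default=0)

def find_min_graph(dist, weight, s):
    """Try every DC as the center and keep the star with the smallest diameter."""
    best = None
    for v in weight:
        star = find_min_star(dist, weight, v, s)
        if star is None:
            return None
        d = diameter(dist, star)
        if best is None or d < best[0]:
            best = (d, star)
    return best  # (diameter, list of DCs)
```

By the triangle inequality, any two DCs in a star are at most twice the star radius apart, which is where the factor of 2 comes from.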
System Architecture
Data Center Selection

Running Time
FindMinStar: nodes have to be sorted, O(n log n)
• n = number of DCs
Computing diameter: O(n^2)
O(FindMinGraph) = n * O(FindMinStar) = O(n^3)
System Architecture
Machine Selection within DC


Goal : Find machines that reduce inter-rack traffic
DC topology is a tree topology
• Root – core switch
• Children – top-level switches
• Leaf – racks
Given the tree representation of the DC (T) and total
number of VMs (s) to be placed
Find sub-tree with minimum height that has weight
at least equal to ‘s’
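
One way to picture the sub-tree search is a post-order walk that records the height and total weight under every switch. The sketch below assumes the tree is given as a `children` dictionary and the free VM slots per rack as `leaf_weight`; both names and the function name are hypothetical.

```python
def min_height_subtree(children, leaf_weight, root, s):
    """Return (height, node) of a lowest sub-tree whose racks hold at least s VMs."""
    best = None

    def visit(node):
        nonlocal best
        kids = children.get(node, [])
        if not kids:  # a rack
            height, weight = 0, leaf_weight.get(node, 0)
        else:         # a switch: combine the results of its children
            results = [visit(k) for k in kids]
            height = 1 + max(h for h, _ in results)
            weight = sum(w for _, w in results)
        if weight >= s and (best is None or height < best[0]):
            best = (height, node)
        return height, weight

    visit(root)
    return best  # None if the whole DC cannot hold s VMs
```

A height of 0 means the request fits in a single rack; each extra level adds one more switch tier to the traffic path.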
System Architecture
Virtual Machine Placement



Heuristic algorithms are required for assigning individual
VMs to DCs and CPUs within DCs
The problem is a variant of the graph partitioning and k-cut
problems
User request represented as graph G = (V,E)
• Nodes represent VMs to be placed
• Edges represent connections between them
Goal : Partition G into disjoint sets c1, c2, …, cm such
that communication between the sets is minimized
If traffic is asymmetric, take the average of the two directions
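
The objective can be made concrete with a short sketch. It assumes traffic is given per directed VM pair, treats a missing direction as zero, and uses hypothetical function names; it only evaluates a given partition and does not search for one.

```python
def symmetric_traffic(traffic):
    """Average the two directions of each VM pair into one undirected weight."""
    pairs = {}
    for (a, b), rate in traffic.items():
        key = tuple(sorted((a, b)))
        pairs[key] = pairs.get(key, 0.0) + rate / 2.0
    return pairs

def cut_cost(pairs, assignment):
    """Total traffic on edges whose endpoints fall in different sets."""
    return sum(w for (a, b), w in pairs.items() if assignment[a] != assignment[b])
```

Here `assignment` maps each VM to the set (DC or rack) it is placed in, so a lower `cut_cost` means less traffic crossing between sets.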
System Architecture
Virtual Machine Placement



Algorithms 4 and 5 give a heuristic solution to the partition
problem
Optimized using Kernighan–Lin heuristics
Runtime :
• O(n^2 log n)
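
The refinement idea can be illustrated with a simplified swap pass in the spirit of Kernighan–Lin. This is not Algorithms 4 and 5 from the paper and not the full Kernighan–Lin procedure: it only applies the single best improving swap repeatedly, and all names are hypothetical. Swapping keeps the number of VMs in each set unchanged, so capacities stay respected.

```python
from itertools import combinations

def cut_cost(weights, assignment):
    """Traffic on edges whose endpoints are in different sets."""
    return sum(w for (a, b), w in weights.items() if assignment[a] != assignment[b])

def refine_by_swaps(weights, assignment):
    """Repeatedly apply the best cost-reducing swap of two VMs in different sets."""
    assignment = dict(assignment)
    improved = True
    while improved:
        improved = False
        base = cut_cost(weights, assignment)
        best_gain, best_pair = 0, None
        for a, b in combinations(assignment, 2):
            if assignment[a] == assignment[b]:
                continue
            # Tentatively swap, measure the gain, then undo.
            assignment[a], assignment[b] = assignment[b], assignment[a]
            gain = base - cut_cost(weights, assignment)
            assignment[a], assignment[b] = assignment[b], assignment[a]
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is not None:
            a, b = best_pair
            assignment[a], assignment[b] = assignment[b], assignment[a]
            improved = True
    return assignment
```

This naive version recomputes the full cut cost for every candidate swap, so it is slower than the runtime quoted above; it is meant only to show the swap-and-keep-if-better step.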
Simulation Results




Results compared to random approach and greedy
algorithm
Random approach selects random DC and places as
many VMs as possible in the DC
Greedy selects DC with maximum VMs
To measure performance
• Random topology created
• Random user requests generated
• Maximum distance between any two VMs measured
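
The two baselines and the measured metric could look roughly like this. The slide only says that random picks a random DC and fills it, and that greedy picks the DC with the most VMs; repeating that choice until the request is covered is an assumption, and all function names are hypothetical.

```python
import random
from itertools import combinations

def select_in_order(order, weight, s):
    """Take DCs in the given order, filling each, until s VMs are placed."""
    chosen, remaining = [], s
    for dc in order:
        if remaining <= 0:
            break
        chosen.append(dc)
        remaining -= weight[dc]
    return chosen if remaining <= 0 else None

def random_select(weight, s, rng):
    """Baseline: visit DCs in random order."""
    order = list(weight)
    rng.shuffle(order)
    return select_in_order(order, weight, s)

def greedy_select(weight, s):
    """Baseline: visit DCs in decreasing order of available VMs."""
    order = sorted(weight, key=weight.get, reverse=True)
    return select_in_order(order, weight, s)

def max_distance(dist, chosen):
    """Metric: largest distance between any two selected DCs."""
    return max((dist[a][b] for a, b in combinations(chosen, 2)), default=0)
```

Passing a seeded `random.Random` instance as `rng` makes the random baseline repeatable across runs.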
Simulation Results



Location of DCs randomly selected within a
1000x1000 grid
Distance between DCs is the Euclidean distance
between points
Five different distributed cloud scenarios: 100, 75, 50, 25 and 10 DCs
However, the average number of machines in each cloud is the
same
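
A random topology of this kind could be generated as below. The sketch assumes uniformly random coordinates and hypothetical names (`random_topology`, `dc0`, `dc1`, ...); the slides do not state how machines are distributed over the DCs, so that part is left out.

```python
import math
import random

def random_topology(num_dcs, rng, side=1000.0):
    """Place DCs uniformly at random in a side x side grid and
    return their coordinates plus all pairwise Euclidean distances."""
    points = {
        f"dc{i}": (rng.uniform(0, side), rng.uniform(0, side))
        for i in range(num_dcs)
    }
    dist = {
        a: {b: math.dist(pa, pb) for b, pb in points.items()}
        for a, pa in points.items()
    }
    return points, dist
```

Calling it with 100, 75, 50, 25 and 10 DCs gives the five scenarios listed above.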
Simulation Results
I Experiment



Measuring diameter of placement for a single
request of 1000 VMs
Approximation algorithm performs 79% better
Note : Diameter decreases as number of DCs
decreases
Simulation Results
II Experiment


Study cloud systems with series of user requests
Two experiments
I. 100 requests for 50 – 100 VMs
• Requests are uniformly distributed
• Large requests
II. 500 requests for 10 – 20 VMs
• Small requests
Note : In both experiments, average VMs requested
is the same
Simulation Results
II Experiment



Greedy performs better than random by 32.6% and
66.5%
Approximation algorithm performs better than
greedy by 83.4% and 86.4%
Why do larger requests require higher diameter?
Simulation Results
III Experiment




Studies performance of cloud system when
additional constraints are given
Same requests as previous experiment
Resilience is defined as ratio of total VMs to
maximum VMs at any DC
Requests need to be placed in at least resilience
number of DCs
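
The arithmetic behind the resilience constraint can be worked through with a small helper. It assumes an integer resilience value and rounds the per-DC cap down so the ratio is never below the requested resilience; the function name is hypothetical.

```python
def resilience_limits(total_vms, resilience):
    """Return (max VMs allowed in one DC, minimum number of DCs needed)."""
    cap = total_vms // resilience      # largest cap with total_vms / cap >= resilience
    if cap < 1:
        raise ValueError("resilience exceeds the number of VMs")
    min_dcs = -(-total_vms // cap)     # ceiling division
    return cap, min_dcs
```

For a request of 100 VMs with resilience 4, at most 25 VMs may share a DC, so at least 4 DCs are needed.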
Simulation Results
III Experiment



Larger requests have longer diameter
As resilience increases, diameter increases
What is different about these results?
Simulation Results
III Experiment






Performance of heuristic algorithm
Given communication requirements and available
capacity of DCs, algorithm computes optimal
placement of VMs that minimizes inter-DC traffic
Comparison of heuristic algorithm with greedy and
random algorithms
Random assigns random DC to each VM
Greedy selects DCs in decreasing order of
availability
While selecting VMs, it chooses VMs with maximum
total traffic first
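
The greedy baseline described above might be sketched as follows. It assumes each VM's total traffic is already summed in `vm_traffic` and that each DC's free slots are given in `capacity`; both names and the tie-breaking order are assumptions.

```python
def greedy_assign(vm_traffic, capacity):
    """Fill the DCs with the most free capacity first, heaviest-traffic VMs first."""
    if sum(capacity.values()) < len(vm_traffic):
        raise ValueError("not enough capacity for all VMs")
    vms = sorted(vm_traffic, key=vm_traffic.get, reverse=True)
    assignment, i = {}, 0
    for dc in sorted(capacity, key=capacity.get, reverse=True):
        for vm in vms[i:i + capacity[dc]]:
            assignment[vm] = dc
        i += capacity[dc]
    return assignment
```

Grouping the heaviest talkers in the largest DC tends to keep their traffic local, which is the intuition behind this baseline.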
Simulation Results
III Experiment





Experiment assigns a request of 100 VMs to DCs
Bandwidth fixed randomly between 0 and 1 Mbps
Inter-DC traffic for assignment of these VMs to k
DCs (k = 2,…,8) was studied
Available resources at each DC were between 100/k
and 200/k
Hence 100 VMs were being assigned to DCs
consisting of 100 – 200 VMs
Simulation Results
III Experiment



For all algorithms, inter-DC traffic increases as the
number of DCs increases. Why?
Greedy algorithm performs better than random by
10.2%
Heuristic algorithm performs better than greedy by
4.6%
Simulation Results
III Experiment



When the DCs did not have excess capacity, inter-DC
traffic was higher for the heuristic algorithm by
28.2%
Heuristic algorithm performed better than the other
two algorithms by 4.8%
Greedy and Random had similar performances
Simulation Results
IV Experiment




In this experiment, effect of VM traffic on inter-DC
traffic is studied
The percentage of links with traffic is varied between
20% and 100% and inter-DC traffic is measured
The DCs have no excess capacity in these
experiments
Result: inter-DC traffic grows linearly with
percentage of links with traffic for all algorithms
Conclusions





Main contribution is development of algorithms for
network-aware resource allocation of VMs in
distributed cloud systems
Need for these efficient algorithms :
Inter-DC traffic may be very expensive
2-approximation algorithm provided for selection of
DCs
This algorithm can also be used for rack selection
within DC but using prior knowledge about network
topology within DC gives better results
Heuristic algorithm for mapping VMs to resources
within DC
Related Work




Graph partitioning problems
K-cut problem
Maximum sub-graph problem
Assigning VMs inside DCs is studied in
"Improving the scalability of data center networks
with traffic-aware virtual machine placement"