Distributed Control Law for Load Balancing with Delay Adjustment and Primary Back-up Approach in Content Delivery Network

Mr. Nitish M. Shinde
Department of CSE
G.H. Raisoni College of Engg & Mgmt, Pune
snitish1234@gmail.com

Mrs. S. R. Khiani
Department of CSE
G.H. Raisoni College of Engg & Mgmt, Pune
simran.khiani@raisoni.net
ABSTRACT
A content delivery network (CDN), or content distribution network, is a large distributed system of servers deployed in multiple data centers across the Internet. The goal of a CDN is to serve content with high performance and availability. A critical component of a CDN is its request routing mechanism, i.e., directing requests for content to the appropriate server based on a specific set of parameters. The proposed system implements a model based on global balancing that equally balances requests across the system queues, and additionally considers a delay adjustment scheme and a backup mechanism for random node crashes or failures.
Keywords
Content Delivery Network (CDN), Control Theory, Delay Adjustment, Primary Backup Approach.
1. INTRODUCTION
Distributed systems are characterized by resource multiplicity and system transparency. Every distributed system consists of a number of resources interconnected by a network. Besides providing communication facilities, the network facilitates resource sharing by migrating a local process and executing it at a remote node. A process may be migrated because the local node does not have the required resources or has been shut down, or because its expected turnaround time will be better elsewhere. When a user submits a process for execution, it becomes the responsibility of the resource manager of the distributed system to control the assignment of resources to processes and to route the processes to suitable nodes of the system.
A server program installed on a target system has limited resources, including system memory, hard disk space, and processor speed. Since server capacity is limited, a server can handle only a certain number of clients; with more clients the server becomes overloaded, which may lead to performance degradation, hangs, or crashes. It is therefore crucial to balance the load on the server, which can be achieved by keeping copies of servers and distributing the load among them. In the load balancing approach, all processes submitted by users are distributed among the nodes of the system so as to equalize the workload across the nodes. A variety of techniques and methodologies for scheduling processes in a distributed system have been proposed; they can be broadly classified into the task assignment approach, the load balancing approach, and the load sharing approach.
The general purpose of load balancing is to increase availability, improve throughput and reliability, maintain stability, optimize resource utilization, and provide fault tolerance. As the number of servers grows, the risk of a failure somewhere increases, and such failures must be handled carefully. The ability to maintain unaffected service during any number of simultaneous failures is termed high availability. Load balancing means an even distribution of the total load amongst all serving entities. It is essential in distributed computing systems for improving the quality of service by managing customer loads that change over time. The demands of incoming requests are optimally distributed among the available system resources to avoid resource bottlenecks as well as to fully utilize the available resources. Load balancing also provides horizontal scaling, e.g. adding computing resources in order to address increased loads. Load balancing is basically done to achieve the following:
• It improves the performance of each node and hence the overall system performance.
• It reduces job idle time.
• Small jobs do not suffer from long starvation.
• Resources are utilized to the maximum.
• Response time becomes shorter.
• Higher throughput.
• Higher reliability.
• Low cost but high gain.
• Extensibility and incremental growth.
2. LOAD BALANCING APPROACH
Scheduling algorithms using this approach are known as load balancing algorithms or load-leveling algorithms. Load balancing algorithms try to balance the total system load by transparently transferring workload from heavily loaded nodes to lightly loaded nodes, in an attempt to ensure good overall performance relative to some specific metric of system performance. When performance is considered from the resource point of view, the metric involved is the total system throughput. In contrast to response time, throughput is concerned with seeing that all users are treated fairly and that all are making progress.
3. TAXONOMY OF LOAD BALANCING ALGORITHMS:
3.1 Static vs. Dynamic:
At the highest level, we distinguish between static and dynamic load balancing algorithms. Static algorithms use only information about the average behavior of the system, ignoring its current state. Dynamic algorithms, on the other hand, react to the system state as it changes.
Static load balancing algorithms are simpler because there is no need to maintain and process system state information. The attraction of dynamic algorithms is that they respond to the system state and so are better able to avoid states with unnecessarily poor performance. However, since dynamic algorithms must collect and react to system state information, they are more complex than static algorithms.
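The contrast above can be illustrated with a minimal sketch: a static policy (round-robin) that ignores the current state, against a dynamic policy (least-loaded) that reacts to it. The policy and function names here are illustrative, not from the paper.

```python
def static_round_robin(num_nodes):
    """Static policy: the assignment depends only on arrival order."""
    i = 0
    def pick(_loads):
        nonlocal i
        node = i % num_nodes
        i += 1
        return node
    return pick

def dynamic_least_loaded(_num_nodes):
    """Dynamic policy: the assignment reacts to the current node loads."""
    def pick(loads):
        return min(range(len(loads)), key=lambda n: loads[n])
    return pick

def simulate(pick, jobs, num_nodes=3):
    """Assign each job (with a given cost) to a node and return final loads."""
    loads = [0] * num_nodes
    for cost in jobs:
        loads[pick(loads)] += cost
    return loads

jobs = [5, 1, 1, 5, 1, 1]
print(simulate(static_round_robin(3), jobs))   # static can pile heavy jobs onto one node
print(simulate(dynamic_least_loaded(3), jobs)) # dynamic evens out the totals
```

On this workload the static policy sends both heavy jobs to the same node, while the dynamic policy keeps the maximum node load noticeably lower, at the price of inspecting the state on every decision.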
3.2 Deterministic vs. Probabilistic:
Static load balancing algorithms may be either deterministic or probabilistic. Deterministic algorithms use information about the properties of the nodes and the characteristics of the processes to be scheduled to deterministically allocate processes to nodes. Probabilistic algorithms use information about static attributes of the system, such as the number of nodes, the processing capacity of each node, and the network topology.
3.3 Centralized vs. Distributed:
Dynamic scheduling algorithms may be centralized or distributed. In a centralized dynamic scheduling algorithm, the responsibility for scheduling physically resides on a single node, called the centralized server node, which decides the placement of new processes using the state information stored in it. The centralized approach can make process assignment decisions efficiently because the control node knows the load at each node; the other nodes periodically send status update messages to the central server. A problem associated with this approach is reliability: if the central server fails, all scheduling in the system ceases.
In a distributed dynamic algorithm, the work involved in making process assignment decisions is physically distributed among the nodes of the system. A distributed scheme does not limit the scheduling intelligence to one node; it avoids the bottleneck of collecting state information at a single node and allows the scheduler to react quickly to changes in the system state.
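The centralized scheme above can be sketched as a server that stores the last load reported by each node and places every new process on the node it currently believes is least loaded. The class and method names are illustrative.

```python
class CentralScheduler:
    """Central server node: collects status updates, decides placement."""

    def __init__(self, node_ids):
        # Last load reported by each node's periodic status-update message.
        self.reported = {n: 0 for n in node_ids}

    def status_update(self, node, load):
        """A node periodically reports its current load."""
        self.reported[node] = load

    def place(self):
        """Place a new process on the least-loaded node per stored state.
        Note the decision uses only stored state, which may be stale."""
        return min(self.reported, key=self.reported.get)

sched = CentralScheduler(["n1", "n2", "n3"])
sched.status_update("n1", 7)
sched.status_update("n2", 2)
sched.status_update("n3", 5)
print(sched.place())  # the least-loaded node according to the last reports
```

The sketch also makes the reliability drawback visible: all placement logic lives in this one object, so losing it stops all scheduling, exactly as the text notes.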
4. ISSUES IN DESIGNING LOAD BALANCING ALGORITHMS:
4.1 Load Estimation Policy:
The main goal of load balancing algorithms is to balance the workload over all the nodes of the system. However, before an algorithm can attempt to balance the workload, it must decide how to measure the workload of a particular node. The first step of any load balancing scheme is therefore to decide the method used to estimate the workload of a particular node. A node's workload is estimated based on some measurable parameters, which include:
• The total number of processes on the node at the time of load estimation.
• The resource demands of these processes.
• The instruction mix of these processes.
• The architecture and speed of the processor.
4.2 Process Transfer Policy:
The strategy of load balancing algorithms is based on the idea of transferring some processes from heavily loaded nodes to lightly loaded nodes for processing. To decide whether a node is lightly or heavily loaded, the load balancing approach uses a threshold policy. The threshold value of a node is the limiting value of its workload and is used to decide whether the node is heavily or lightly loaded. A new process at a node is accepted locally for processing if the workload of the node is below its threshold value at that time; a job is transferred if the queue length at the local node exceeds the threshold, and is otherwise executed locally.
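The threshold policy above reduces to a single comparison; this sketch makes it explicit (the threshold value is an illustrative choice, not one given in the paper).

```python
THRESHOLD = 10  # illustrative limiting queue length for a "heavily loaded" node

def transfer_decision(local_queue_len, threshold=THRESHOLD):
    """Transfer the job if the local queue length exceeds the threshold;
    otherwise accept it for local execution."""
    return "transfer" if local_queue_len > threshold else "execute locally"

print(transfer_decision(4))   # below threshold: the job runs locally
print(transfer_decision(12))  # threshold exceeded: the job is transferred
```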
4.3 Location Policy:
Once the transfer policy has decided to transfer a process from a node, the next step is to select the destination node on which that process will execute.
4.4 State Information Exchange Policy:
Transmitting state information improves the ability of the algorithm to balance the load. On the other hand, it raises the expected queuing time of messages because of the increased utilization of the communication channel. Because of transmission delays, the load information collected for a load balancing decision may differ from the current load.
5. REQUEST ROUTING MECHANISMS:
The critical component of a CDN architecture is its request routing mechanism. It directs user requests to the appropriate servers based on a specific set of parameters, such as traffic load, bandwidth, and server computational capabilities. This project aims to provide a global balancing mechanism among distributed servers to improve system throughput and availability, by implementing a model that uses a global balancing method to equally balance requests across the system queues.
A CDN can be considered a set of servers, each with its own queue. We assume a fluid model approximation for the dynamic behavior of each queue. A CDN is designed with adequate resources in order to satisfy the large volume of requests generated by end users; it must ensure that the input rate is always less than the service rate. Otherwise we have a local instability condition, in which the input rate is greater than the service rate. A balancing algorithm helps prevent local instability by redistributing the excess load to less loaded servers.
5.1 Queue Occupancy:
• Let qi(t) be the queue occupancy of server i at time t. We consider the instantaneous arrival rate αi(t) and the instantaneous service rate δi(t).
• The model uses a global balancing method that equally balances requests across the system queues.
• If the arrival rate is lower than the service rate, the queue length decreases. We can operate on the fraction of traffic exceeding a server's capacity; such excess traffic can be accommodated by redistributing it to the server's neighbors based on an appropriate control law.
The fluid model of a CDN server's queue is given by

dqi(t)/dt = αi(t) − δi(t)    (1)
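Eq. (1) can be explored with a simple forward-Euler simulation: at each step the queue grows by the arrival rate and drains by the service rate, clamped at zero. The step size and rate values below are illustrative.

```python
def simulate_queue(q0, arrival, service, dt=0.1, steps=100):
    """Integrate dq/dt = alpha(t) - delta(t) with forward Euler, keeping q >= 0."""
    q = q0
    for k in range(steps):
        t = k * dt
        q = max(0.0, q + dt * (arrival(t) - service(t)))
    return q

# Stable case: the service rate exceeds the arrival rate, so the queue drains.
stable = simulate_queue(5.0, arrival=lambda t: 2.0, service=lambda t: 3.0)
# Locally unstable case: arrivals exceed service, so the queue grows without bound.
unstable = simulate_queue(5.0, arrival=lambda t: 3.0, service=lambda t: 2.0)
print(stable, unstable)
```

The second run is exactly the local instability condition described in the text, and the balancing law's job is to shave the excess αi(t) − δi(t) off before it accumulates.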
5.2 Cooperation among Nodes:
The main idea of the control law we propose is to properly redistribute a server's excess traffic to one or more neighboring servers whose queues are less loaded than the local queue. The main feature of the proposed balancing law is that it allows system equilibrium through proper balancing of the server loads. The algorithm consists of two independent parts: a procedure in charge of updating the neighbors' status, i.e., their computational capabilities, and a mechanism representing the core of the algorithm, which is in charge of distributing requests to neighbor nodes.
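The request-distribution part described above can be sketched in runnable form: a server either serves a request locally, or forwards it to a randomly chosen less-loaded peer, where the choice is weighted by the load difference (mirroring the probability-space construction in the algorithm of Section 6.1). All names here are illustrative, and the weighting rule is an assumption about the intent of that pseudocode, not the paper's exact control law.

```python
import random

def balance_request(local_load, peers, rng=random.random):
    """Serve locally, or forward to a random less-loaded peer chosen with
    probability proportional to the load difference.
    peers: list of (address, load) pairs."""
    diffs = [(addr, local_load - load) for addr, load in peers
             if load < local_load]
    if not diffs:
        return "local"  # no neighbor with lower load: serve the request locally
    total = sum(d for _, d in diffs)
    # Draw a point in the normalized cumulative probability space.
    x, cum = rng() * total, 0.0
    for addr, d in diffs:
        cum += d
        if x < cum:
            return addr  # send the request to the chosen peer
    return diffs[-1][0]  # guard against floating-point edge cases

peers = [("peer1", 2.0), ("peer2", 8.0), ("peer3", 12.0)]
print(balance_request(10.0, peers, rng=lambda: 0.1))  # low draw favors the lightest peer
print(balance_request(1.0, peers))                    # every peer is busier: serve locally
```

Weighting by load difference means a much lighter neighbor absorbs proportionally more of the excess, which is what pushes the queues toward the equal-load equilibrium the text describes.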
6. LOAD BALANCING ALGORITHM:
To address these limitations, dynamic load balancing schemes are introduced, which try to balance loads among servers dynamically at runtime. Existing dynamic load balancing methods can be classified into centralized and decentralized methods. In centralized methods, a central server makes the load migration decisions based on information collected from all local servers, which manage individual partitions, and then passes the decisions to the local servers for them to carry out the load migrations. In decentralized methods, each local server makes load migration decisions with information collected from its neighbor servers, which manage the adjacent partitions. In either case the network latency may no longer be neglected: there is a delay from the moment each server sends out its load status to the moment the collected load status is used to make load migration decisions. As distributed virtual environment (DVE) systems are highly interactive and the load on each server may change dramatically over time, by the time a server makes its load migration decision, the load status collected from other servers may no longer be valid.
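The staleness problem above can be made concrete with a small illustration: with a transmission delay of d steps, the balancer sees each server's load as it was d steps ago, and can migrate load toward a server that is no longer the lightest. The load values are invented for the example.

```python
def decide_with_delay(load_history, delay):
    """Pick the migration target using load readings `delay` steps old.
    load_history[t][s] is the true load of server s at step t."""
    now = len(load_history) - 1
    seen = load_history[max(0, now - delay)]
    return min(range(len(seen)), key=lambda s: seen[s])

# True loads over time for two servers: server 1 was light, then spiked.
history = [[9, 1], [9, 3], [9, 8], [4, 9]]
print(decide_with_delay(history, delay=0))  # a fresh view picks server 0
print(decide_with_delay(history, delay=3))  # a stale view still picks server 1
```

With the stale view, load would be migrated onto the server that is now the most loaded, which is exactly the failure mode the delay adjustment schemes of the next section are meant to correct.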
7. MODIFIED APPROACH FOR DYNAMIC LOAD BALANCING:
We have three main contributions in this work:
• We formally model the load migration process across servers by considering the latency among them. Although our work is based on the centralized approach, it can also be applied to the decentralized approach.
• We propose and compare two efficient delay adjustment schemes to address the latency problem.
• We propose a primary backup approach for load balancing.
7.1 Stability:
A scheduling algorithm is said to be unstable if it can enter a state in which all the nodes of the system spend all their time migrating processes, without accomplishing any useful work, in an attempt to properly schedule the processes for better performance. This form of fruitless migration is called processor thrashing. For example, nodes n1 and n2 may both observe that node n3 is idle and both offload a portion of their work to n3, each unaware of the other's offloading decision. If n3 becomes overloaded due to the processes received from both n1 and n2, it may in turn start transferring its processes to other nodes. This entire cycle may be repeated again and again, resulting in an unstable state. A goal of load balancing algorithms is to overcome this problem.
7.2 Scalability:
Scheduling algorithms should be capable of handling small as well as large networks. An algorithm that makes scheduling decisions by first inquiring about the workload of all nodes and then selecting the most lightly loaded node as the candidate for receiving processes has a poor scalability factor: it may work fine for a small network but gets crippled when applied to a large one.
6.1 Algorithm:
// peer status update
prob_space[0] = 0; load_diff = 0; load_diff_sum = 0;
for (j = 1; j <= n; j++) {
    if (load_i > peer[j].load) {
        load_diff = load_i - peer[j].load;
        // insert the new difference
        build_prob_space(load_diff, prob_space);
        load_diff_sum = load_diff_sum + load_diff;
    }
    // normalize the vector elements
    update_prob_space(load_diff, prob_space);
}
// balancing process
if (prob_space is empty) {
    // no neighbor with lower load: the server serves the request locally
    serve_request();
} else {
    float x = rand(); // random number in [0, 1)
    int req_sent = 0; int i = 0;
    while (prob_space[i] != -1 && req_sent == 0) {
        if (prob_space[i-1] <= x && x < prob_space[i]) {
            // send the request to the chosen peer
            send_to(peer[i-1].addr);
            req_sent = 1;
        }
        i++;
    }
}

7.3 Fault Tolerance:
A good scheduling algorithm should not be disabled by the crash of one or more nodes of the system. An algorithm that has decentralized decision making capability and considers only the available nodes in its decisions has better fault tolerance.
In distributed computing systems (DCSs) where server nodes can fail permanently with nonzero probability, the system performance can be assessed by means of the service reliability, defined as the probability of serving all the tasks queued in the DCS before all the nodes fail. The framework also permits arbitrarily specified, distributed load-balancing actions to be taken by the individual nodes in order to improve the service reliability. The dynamics of a DCS become further complicated in volatile or harsh environments in which nodes are prone to fail permanently; in such cases, messages have to be broadcast among working nodes in order to detect and isolate faulty nodes.
The objective is to maximize the service reliability while simultaneously minimizing the response time of the workload. At t = 0 all the nodes are functioning, and tasks are allocated on the nodes so that the jth node has mj tasks in its queue. Task redundancy is provided by means of a backup system attached to each node. It must be noted that the backup system does not service any tasks. More specifically, in the event of a node failure, the backup system broadcasts an FN packet to alert the nodes about the change in the number of functioning nodes, reallocates all the unfinished tasks among the nodes perceived to be functioning, and handles the reception of tasks that were in transit to the failed jth node, reallocating those received tasks among the functioning nodes as well.
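The backup system's reallocation step described above can be sketched as follows: on a node failure, the failed node's unfinished tasks are redistributed among the survivors. The round-robin reallocation rule here is an illustrative choice; the paper does not fix a particular rule at this point.

```python
def handle_failure(queues, failed):
    """Redistribute queues[failed] among the remaining functioning nodes.
    queues maps node id -> list of unfinished tasks; modified in place."""
    unfinished = queues.pop(failed)   # the backup holds the failed node's tasks
    survivors = sorted(queues)
    for i, task in enumerate(unfinished):
        # Round-robin reallocation over the nodes perceived to be functioning.
        queues[survivors[i % len(survivors)]].append(task)
    return queues

queues = {"n1": ["t1", "t2"], "n2": ["t3"], "n3": ["t4", "t5", "t6"]}
print(handle_failure(queues, "n3"))
# n3's tasks are spread over n1 and n2; no task is lost
```

The key property, matching the text, is that the backup system never services tasks itself: it only detects the failure (via the FN packet) and hands the orphaned work to the survivors.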
7.3.1 Definitions and Assumptions:
Let the random variable Wki be the service time of the ith task at the kth node, and let XjkQ be the transfer time of the QI packet sent from the jth node to the kth node, j ≠ k. The failure time of the kth node is represented by the random variable Yk, and the transfer time of the failure-notice (FN) packet sent from the jth node to the kth node is represented by the random variable XjkF (j ≠ k). Finally, let the random variable Zik be the transfer time of the ith group of tasks sent to the kth node.
In order to maximize the service reliability, load balancing is performed at time tb ≥ 0 so that each functional node, say the jth node, transfers a positive amount Ljk of tasks to the kth node (j ≠ k), which is functioning according to the knowledge of the jth node. Naturally, these task exchanges over the network take random transfer times. Additionally, we assume that, at t = 0, each node broadcasts a QI packet that takes a random amount of time to reach the destination nodes. The dynamics of the DCS are governed by the random times associated with the service of tasks, the failure of nodes, and the transfer times of both information and tasks in the network.
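A Monte Carlo sketch can make the service-reliability notion above concrete for a single node: estimate the probability that all m queued tasks are served before the node fails. The exponential service and failure time distributions and all parameter values are illustrative assumptions, not the paper's exact model.

```python
import random

def service_reliability(m, service_rate, failure_rate, trials=20000, seed=1):
    """Estimate P(a node serves all m tasks before its failure time Y),
    with i.i.d. exponential service times W and exponential failure time Y."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        # Total time to serve m tasks: sum of m exponential service times.
        work = sum(rng.expovariate(service_rate) for _ in range(m))
        fail = rng.expovariate(failure_rate)
        ok += work < fail
    return ok / trials

# A node that is fast relative to its failure rate is far more reliable.
r_fast = service_reliability(m=5, service_rate=10.0, failure_rate=1.0)
r_slow = service_reliability(m=5, service_rate=2.0, failure_rate=1.0)
print(r_fast, r_slow)
```

Under these assumptions the estimate can be checked analytically: the probability is (λs/(λs + λf))^m, and shifting Ljk tasks from slow, failure-prone nodes toward fast ones is precisely what raises this quantity, which is the optimization the section describes.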
8. Conclusion:
Scheduling in distributed operating systems plays a significant role in overall system performance and throughput. Process scheduling decisions are based on factors such as the resource requirements of the processes and the availability of various resources on the different nodes of the system. A good scheduling algorithm should possess features such as requiring no prior knowledge about processes, being dynamic in nature, having quick decision making capability, and providing balanced system performance, stability, scalability, fault tolerance, and fairness of service.
Thus, in this paper we propose a modified load balancing approach that overcomes the delay problems of dynamic load balancing decisions. We propose a primary backup approach that improves the fault tolerance of distributed systems. The basic load balancing algorithm works with a queue occupancy formulation that determines the number of processes at each node. A new process at a node is accepted locally for processing if the workload of the node is below its threshold value at that time; a job is transferred if the queue length at the local node exceeds the threshold, and is otherwise executed locally. The load balancing algorithm considers multiple objectives in its solution evaluation and solves the scheduling problem in a way that simultaneously minimizes execution time and communication cost, and maximizes average processor utilization and system throughput.