A Labeling Algorithm for Just-in-Time Scheduling

A Labeling
Algorithm
for Just-in-Time
Charles
G. Boncelet
Department
Jr.*
This
paper
is to
describes
find
a burst
from
source
of
and
determine
information
i.e.,
that
an algorithm
that
within
a simplifying
assumption,
a route
if one exists.
Simulations
runs
a variation
for
in practice
the
we show
may be achieved.
this
problem
scheduling
so
NP-
are
is guaranteed
indicate
of the paper
is organized
this
scheduling
We also describe
is to find
to adapt
a repeating
the basic
3
algorithm
Overview
of
Highball
variation.
The University
of Delaware
Highball
search effort geared toward developing,
2
Summary
We address
high
the
speed
accurate
issue
of scheduling
networks.
time
protocols),
The
keeping
crossbar
connect
to many
between
others
bursts
networks
in
nodes
of
data
question
(allowing
on
have
TDMA
in-
do so on a bursty
users by providing
which
with
can
connect
any
Users can transmit
a megabit
1 ms.
gateways.)
The
authors
Permission
granted
direct
to
commercial
of the
that
copying
and/or
copy
that
specific
reached
without
at
all or part
otherwise,
of this
the ACM
of the Association
notice
is
for
for Computing
or to republish,
requires
/92/0008
/0170
consists of directed
The scheduling
problem
switches and assign starting
a fee
permission.
COMM'92-8/92/MD, USA
© 1992 ACM 0-89791-526-7
“bursts”
of data, say
rates, a megabit
links,
burst
takes
probably
fiber
is done at the source nodes.
materiel
and notice
large
these
periods.
and the
and
or distributed
copyright
appear,
relatively
accommodates
access for brief
is given
boncelet@udel.edu
ara not made
and its date
is by permission
To copy
fee
the copiee
advantage,
publication
Machinery.
be
respectively.
provided
titla
can
Highball
dedicated
optic, connecting
nodes.
Distances involved
are relatively large.
Individual
links have delays from 3-18
ms, while the network diameter is approximately
30 ms.
Links are connected together by crossbar switches that
can connect any input to any one or several outputs.
The switches are controlled
locally. Each switch has an
accurate clock (to within about a microsecond).
Thus,
the switches can perform
“just in time” switching just
before a burst arrives. Nowhere within the network are
the bursts stored or otherwise manipulated.
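As a rough illustration of the "just in time" idea, the sketch below computes when the head of a burst reaches the end of each link on a route, which is when the switch at that point must already be set. The function name, the route representation, and the example delays are assumptions for illustration only; they are not taken from the Highball design.

```python
from itertools import accumulate

def switch_times(start_ms, link_delays_ms):
    """Return the time (ms) at which the head of a burst reaches the end of each link.

    Assumption: bursts are never stored inside the network, so the switch at
    the end of link k must be set by start_ms plus the delays of links 0..k.
    """
    return [start_ms + d for d in accumulate(link_delays_ms)]

# Hypothetical three-link route with delays inside the 3-18 ms range.
print(switch_times(2.0, [3.0, 7.5, 18.0]))  # [5.0, 12.5, 30.5]
```

Because each switch clock is accurate to about a microsecond, these arrival times are the only coordination the switches need in this simplified picture.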
All queuing
by Defense Advanced Research Projects Agency
Ames Research Center contract number NAG 2–
mills@udel.edu,
basis.
or so. At gigabit
The network
In this paper, we propose two scheduling
problems
for these networks.
We show that simple variations
of
this problem are NP-complete.
We present algorithms
*Supported
under NASA
project
is a research effort geared toward developing, analyzing, and prototyping a novel high speed computer network based on TDMA protocols [2, 4]. The network is designed for “big” users, such as supercomputers, visualization workstations, and immediate disk to disk transfers, who can individually demand gigabit or so rates, but usually
directed links.
The assumed
so that typical link delays are
number of nodes is moderate,
(These nodes may, of course,
switches
put to any output,
and
geographical
size is large
on the order of 10ms. The
say between 10 and 100.
638.
as follows:
Section 4 specifies the scheduling problem and Section 5
presents the NP-completeness
results.
Algorithms
are
presented in Sections 6 and 8. Finally,
the paper concludes with comments and suggestions in Section 9.
to
that
tasks on machines.
The remainder
schedules,
and overall
problem
how
80-90%
high
network
finds
can be achieved.
in which
and
this
quickly
of 80-90%
schedule
19716
the
variations
and
efficiencies
DE
problem
schedules
that
simple
find
algorithm
*
Engineering
The work has its immediate
application
to the UD
Highball
Network, discussed briefly in Section 3. Other
applications
may lie in scheduling
traffic on roads or
traverse
We show
for
The
switch
can
several
We present
algorithm
networks.
to destination.
is difficult,
complete,
a scheduling
communication
a route
that
L. Mills
Networks
for creating schedules and present the results of simulations
which indicate
that scheduling
efficiencies
of
TDMA
speed,
David
in TDMA
of Delaware
Newark,
Abstract
and
of Electrical
University
1
Scheduling
The scheduling problem is how to schedule the switches and assign starting times for the bursts so that as many bursts as possible can successfully reach their destinations. The scheduling problem is perhaps best understood with an analogy: that of scheduling trains on a directed network. The trains cannot be stopped prior to reaching their destinations and trains must not collide. Maximum throughput demands that many trains are traversing the network at once. Furthermore, unlike real train networks, Highball cannot signal ahead of the trains since the trains go as fast as the signals. This analogy gives the Highball network its name. In railroad parlance, “highball” means the train has priority and can go as fast as the tracks allow.

Note that the burst size is crucial to the nature of the network. Bursts that are long compared to link delays can be conveniently switched with virtual circuits. Conversely, bursts that are very short (e.g., ATM packets) are probably best handled with source routing techniques. The 1 ms burst is an interesting middle position and is convenient for many “big” users, such as supercomputers or visualization workstations.

We use the word “bursts”, rather than “packets”, because the bursts do not contain routing information. The network does not require any sort of header on the burst (except possibly for synchronization). On a separate channel, the nodes transmit requests to transmit. (These requests are assumed to be much shorter than the actual bursts.) All nodes receive these requests and must schedule them.

Presently, we are studying two scheduling paradigms. The first is termed Reservation-TDMA (R-TDMA). In R-TDMA, the nodes transmit requests to transmit bursts on a separate reservations channel (which may be physically inband, even though logically out of band). All nodes receive these requests, schedule them, and arrive at the same schedule. At the appropriate time, the source transmits its burst and intermediate switches are set “just in time.” The second paradigm is a repeating schedule determined once at network turn on time, and modified in an attempt to meet instantaneous demand. We refer to this paradigm as Adaptive Circular-TDMA (AC-TDMA). In AC-TDMA, nodes transmit requests to modify the repeating schedule for an extended number of bursts. The principal advantage of AC-TDMA versus R-TDMA is less computational overhead, while the disadvantage is possibly slow adaptation to instantaneous demand.

4 The Scheduling Problem

4.1 Problem Statement

We make the following assumptions about the network:

● Size: The network is “large”, about 5000 km in diameter. At approximately 2/3 the speed of light, this size corresponds to a delay of about 30 ms. Typical links are 5–15 ms long.
● Number of Nodes: The number of nodes, n, is small to moderate, say 10-100.
● Links: Links are directed. In practice, these will likely be fiber optic links.
● Bursts: Bursts are approximately 1 ms long. At gigabit rates, each burst has 1 megabit of data. Thus typical links can have many bursts in transit at one time.
● Switches: Links are connected by cross-bar switches. These switches can connect any input to any output or to several outputs. The switches can reconfigure quickly, although not instantaneously. For the purposes of this paper, we will ignore this reconfiguration time and assume the switches reconfigure instantaneously.

In abstract, the scheduling problem is as follows:

    Given a directed graph G = (V, A) with V being a set of vertices and A a set of directed arcs, a distance metric associated with each arc, and a list of reservation requests, find a schedule of starting times and switch configurations to satisfy the requests.

Note that it may be impossible to satisfy all the requests if there are lifetimes associated with them or if the requests are arriving over time with a rate that exceeds the scheduling capacity. Furthermore, it is not obvious what criteria should be optimized in selecting a schedule. The following are some of the criteria one might wish to optimize:

● Satisfy as many requests as possible.
● Minimize the total time for the satisfied requests to be received. The total time is the sum of the transit and waiting times.
● Minimize the arrival time of the final burst.
● Minimize the total waiting time of all scheduled bursts.
● Minimize the total transit time for transmission of bursts.

5 NP-Completeness

The general scheduling problem posed above is quite difficult. In this section, we show that several specific variations are NP-Complete.

5.1 Transformations of the Partition Problem

The partition problem is the following: Given a set of integers, $x_1, x_2, \ldots, x_n$, and a sum, $S = \sum_{i=1}^{n} x_i$, is there a subset of the integers, $A$, such that $\sum_{i \in A} x_i = S/2$? It is well known that this problem is NP-complete (See, for instance, [1, 3].)
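A small worked instance may help fix the definition of the partition problem. The six integers below are an illustrative assumption, not data from the paper.

```latex
\[
x = (3, 1, 1, 2, 2, 1), \qquad S = \sum_{i=1}^{6} x_i = 10, \qquad S/2 = 5
\]
\[
A = \{1, 4\}: \qquad \sum_{i \in A} x_i = x_1 + x_4 = 3 + 2 = 5 = S/2
\]
```

For this instance the answer is yes. An instance with an odd sum, such as (2, 2, 3), can have no such subset.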
We show two different
problem
ways of solving
S/2
The
the partition
with
arrival
single
problem
problem:
path.
that
S/2?
In this
case, a schedule
the NP-completeness.
Firstly,
lem
construction
It shows
expect
mation
any exact algorithm
to run in exponential
In particular,
termines
ble
time.
all
the
algorithm
times
destinations.
However,
that
presented
partitioning
it
Pack-
Given
and a bin
a colsize B,
can
channel
is a
transform
bin
assume
that
This
problem
which
transmits
resources
is
lets
the bin
a scheduling
prob-
these
consider
is slotted,
is NP-complete.
Treat
n bursts
the
of sizes
Again
bin
if the
ample,
if all the
to
other
all
nodes.
A
allocate
Normal
traffic
then
network
a Sourcej
them,
the scheduling
available
of bins
with
connecting
period
necessary
If
problem
as a bin,
to transmit
Z1, Z2, . . .,zn.
packing
difficult
link
each
number
is a reservations
reservations.
a trivial
a single
For
there
is to periodically
the
access to the link
siotted,
is
reservations.
around
and
minimize
access
requests
in-band
for
be scheduled
For instance,
to
network
we assume
to do this
a Destination,
and
is minimal.
packing
that
in Highball
way
network
must
the
of subsets
integers
xi =
is pseudo-polynomial
involved
1, then
can
bin
get
packing
and
large.
is only
For
ex-
is trivial.
at a destination
at
Therefore,
we can
which provides
time (assuming
a burst
Therefore
is as follows:
XZ, . . .. x~,
as is the generalization
if we
simple
is particularly
interesting
to this
that
the problem
of determining
whether
a single burst can arrive
a particular
time is NP-complete.
number
instance,
scheduling problem is in NP. This is trivial. An instance can easily be checked in polynomial time by adding the n delays (either $x_i$ or 0). Secondly, we need to show that this construction correctly solves the partition problem. If there is a path, take A to be the nonzero links along the path. This set will satisfy the partition problem. Conversely, if such a set A exists, the path that takes the arc with delay $x_i$ for each $i \in A$ and the zero-delay arc otherwise arrives at time S/2.
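The first construction can be exercised with a short sketch. It labels, node by node, the set of times at which a burst leaving the Source at time 0 can reach each node when every hop offers a zero-delay arc and an arc of delay x_i. The partition instance is solvable exactly when S/2 is among the arrival times at the Destination. The function names and the set-based representation are assumptions for illustration, not the authors' code.

```python
def arrival_times(xs):
    """Arrival times at the Destination of the chain network built from xs.

    Node i is joined to node i+1 by two arcs, one of delay 0 and one of
    delay xs[i]; the burst leaves the Source at time 0.
    """
    times = {0}
    for x in xs:
        # Either take the zero-delay arc (time unchanged) or the x-delay arc.
        times = times | {t + x for t in times}
    return times

def has_partition(xs):
    """True when some subset of xs sums to exactly half of the total."""
    total = sum(xs)
    return total % 2 == 0 and total // 2 in arrival_times(xs)

print(has_partition([3, 1, 1, 2, 2, 1]))  # True  (3 + 2 = 5 = S/2)
print(has_partition([2, 2, 3]))           # False (S = 7 is odd)
```

The set of arrival times never holds more than S + 1 values, so the work grows with the magnitude of the integers rather than with the length of their binary encoding.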
This
paper.
Bin
vary.
We
to establish
we need to show that
xl,
the
sizes
can be solved by the following
We need to show two things
problem
NP-complete,
Is there a schedule for a single burst
time
packing
of integers,
of the
partition
the integers into subsets such that the sum
of the integers in each subset does not exceed B and
problem with a scheduling algorithm. For the first solution, consider a simple network of n + 1 nodes. Order the nodes linearly from left to right and identify the first (the leftmost) as the Source and the last (the rightmost) as the Destination. Connect each node to the next in line by two arcs: one with zero delay, and one with delay equal to $x_i$. All arcs go from left to right. The network is illustrated in Figure 1.
The partition
bin
lection
scheduling
A Transformation
ing Problem
5.2
=
is NP-
can
must
is an
in Section
arrive
run
6
this inforP # NP).
at
all
of
Algorithm
for
Scheduling
6 depossi-
Figure 1: Network configuration for the partition problem.

The second construction uses a trivial network consisting of a Source and Destination and two links between them, from source to destination. Each link has zero delay. Assume there are n bursts to be scheduled, each with size $x_i$. Then the partition problem is equivalent to the following scheduling problem: Is there a schedule of these n bursts such that all bursts are finished by time S/2? Such a schedule must, by necessity, partition the bursts into two subsets each with total size equal to S/2.

Partitioning is an example of a class of NP-complete problems known as pseudo-polynomial because (loosely speaking) they can be solved in polynomial time if the integers involved are bounded in size. In other words, if the integers can be represented in unary, rather than binary, then the problem can be solved in polynomial time [3].

In the scheduling problems considered here, the integers are bounded by the size of the network (about 30 ms) divided by the quantization size (about 0.25-1.0 ms). Thus we have good reason to believe that in practice the labeling algorithm will run in reasonably short times.

6 Labeling Algorithm for Scheduling

To reiterate the general scheduling problem, given a directed network and a set of burst requests, find a schedule for transmitting the bursts and setting the various crossbar switches along the way. In practice these requests are made over time and we would like to schedule each as soon as possible. In order to facilitate this task, we make a simplifying assumption: We freeze the current schedule and try to schedule the next burst without changing any of the previous bursts. In other words, we assume that we have scheduled the first L bursts and will not change any of that schedule in scheduling the (L + 1)st. This assumption is a greedy one and leads to one at a time processing. Greedy algorithms are widely used heuristics in solving combinatorial problems.

The current schedule creates holes or slots into which we can try to fit the next burst. This burst cannot overlap any of the previous bursts at any link. Thus we conclude that the schedule is link based. Each link has a timeline that stores the status of the beginning of that link at each time. A link can be in one of four states:

BUSY: The beginning of the link is occupied at this time by a previously scheduled burst.
MAYBE: The beginning of the link is unoccupied at this time and we do not know whether the burst can reach this link at this time.
YES: The beginning of the link is unoccupied and the burst can reach this link at this time.
NO: The beginning of the link is unoccupied at this time, but the burst in question cannot arrive now because it would overlap with an already scheduled burst. In other words, the tail of this burst will overlap with the head of another burst.

Initially, before any bursts have been scheduled, all timelines are in the MAYBE state. As bursts are scheduled, the timelines are marked BUSY for the appropriate durations. In the effort to schedule the next burst, we must do two things: Firstly, find and mark the NO's where the burst to be scheduled cannot fit because it would overlap another burst. Note, whether or not a link state is NO is a function of the burst size. Secondly, try to turn MAYBE states into YES's. If the destination link can be marked YES at some time, then we have succeeded in finding a path.

Denote a link that can be connected to a given link, l, by s = SUCCESSOR(l), i.e., s is a link that can be connected directly to l by a crossbar switch. The crucial idea is that, if link l is in state YES at time t and s is in state MAYBE at time t + delay(l), then s can be marked as YES at time t + delay(l). In this basic fashion, we propagate YES labels until possibly the destination link is marked YES. The basic scheduling algorithm is given in Figure 2.

Let b = burstsize, T = maximum time, and TL(l,t) denote link l's state at time t.

● First step, mark NO's:
    for each link, l,
      for (t = 0; t < T; t++)
        if (TL(l,θ) == BUSY for any t < θ ≤ t + b)
          TL(l,t) = NO.
● Second step, propagate YES's forward:
    Mark Source link as YES and put on FIFO.
    While FIFO not empty,
      pop link l off fifo
      for each s = successor(l),
        if (TL(l,t) == YES and t + delay(l) < T
            and TL(s, t + delay(l)) == MAYBE)
        {
          TL(s, t + delay(l)) = YES
          if (s is not in fifo)
            add s to FIFO.
        }
● Third step, did we succeed?
    Check destination for YES's.
    If so, backtrack to SOURCE marking appropriate YES's as BUSY's.
● Fourth step, return NO's and YES's to MAYBE's:
    for each link l,
      for (t = 0; t < T; t++)
        if (TL(l,t) == NO or TL(l,t) == YES)
          TL(l,t) = MAYBE

Figure 2: Labeling algorithm for scheduling in TDMA networks.

One restriction that must be considered is that the burstsize must be smaller than the duration of the smallest loop. If not, the tail of a burst can overlap its own head (much like the need to mark the NO state). Since the loop time is at least twice the shortest link, we can say that as long as the burstsize is less than twice the shortest link, then this problem cannot occur.

Furthermore, for each node connected directly to a crossbar switch, we augment the network with two zero delay, fictitious links between that node and the crossbar switch. One link goes to the switch and the other to the node. These fictitious links carry the node state, both in sending and receiving. Thus, the entire schedule is now link based. Since these links are not part of any loops, they do not affect the minimal burstsize.
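The four steps of the labeling algorithm can be sketched in runnable form as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Time is quantized into integer slots, and a burst of b slots starting at t occupies slots t to t + b - 1. Step 1 relabels only MAYBE entries, a guard added here so that BUSY entries are never overwritten. Step 2 loops over t explicitly. Step 3 picks the earliest arrival. The names `schedule_burst`, `TL`, `delay`, and `succ`, and the four-link example network, are hypothetical.

```python
from collections import deque

MAYBE, BUSY, NO, YES = "MAYBE", "BUSY", "NO", "YES"

def schedule_burst(TL, delay, succ, source, dest, b):
    """Try to schedule one burst of b slots; return [(link, start_slot), ...] or None.

    TL[l][t] is the state of the beginning of link l at slot t (updated in place).
    """
    T = len(TL[source])
    # Step 1: mark NO where a burst starting at t would overlap a BUSY slot
    # (or, by assumption, run past the horizon T).
    for l in TL:
        for t in range(T):
            if TL[l][t] == MAYBE and (t + b > T or BUSY in TL[l][t:t + b]):
                TL[l][t] = NO
    # Step 2: mark the source link YES and propagate YES labels forward.
    for t in range(T):
        if TL[source][t] == MAYBE:
            TL[source][t] = YES
    parent = {}                      # (link, slot) -> predecessor (link, slot)
    fifo, queued = deque([source]), {source}
    while fifo:
        l = fifo.popleft()
        queued.discard(l)
        for s in succ.get(l, ()):
            for t in range(T - delay[l]):
                u = t + delay[l]
                if TL[l][t] == YES and TL[s][u] == MAYBE:
                    TL[s][u] = YES
                    parent[(s, u)] = (l, t)
                    if s not in queued:
                        fifo.append(s)
                        queued.add(s)
    # Step 3: if the destination was reached, backtrack and mark the path BUSY.
    path = None
    hits = [t for t in range(T) if TL[dest][t] == YES]
    if hits:
        node, path = (dest, hits[0]), []
        while node is not None:
            path.append(node)
            node = parent.get(node)
        path.reverse()
        for l, t in path:
            TL[l][t:t + b] = [BUSY] * b
    # Step 4: return the remaining NO's and YES's to MAYBE.
    for l in TL:
        TL[l] = [BUSY if x == BUSY else MAYBE for x in TL[l]]
    return path

# Hypothetical network: link "a" feeds "b" and "c", which both feed "d".
TL = {l: [MAYBE] * 10 for l in "abcd"}
delay = {"a": 1, "b": 2, "c": 1, "d": 0}
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
print(schedule_burst(TL, delay, succ, "a", "d", b=2))  # [('a', 0), ('c', 1), ('d', 2)]
print(schedule_burst(TL, delay, succ, "a", "d", b=2))  # [('a', 2), ('c', 3), ('d', 4)]
```

The second call finds the source link occupied for slots 0 and 1, so the same request is fitted into the next hole. This is the one-at-a-time greedy behaviour described above.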
The algorithm begins by determining and setting the NO's. Then all available times on the source link are marked YES and the link is placed on a fifo. The fifo contains links which were recently marked YES. While the fifo is not empty, a link is popped off and its successors checked. When this link is YES and its successor, offset by the link delay, is MAYBE, then the successor can be reached. It is marked YES and placed on the fifo.

This algorithm finds all links at all times that can be reached from the source. It is a simple matter to check the destination link and see if and when it can be reached. It is also a simple matter, if desired, to keep track of auxiliary quantities besides reachability. For instance, hop count, transit time, maximum or minimum starting time, etc., can be computed. These auxiliary quantities can be used to select among the possibly many times that the destination can be reached and to select from the possibly many different paths.

The complexity can be derived as follows: We assume that the maximum number of successors of a link is limited to W by the nature of the crossbar switch. Typical values are W = 4 to 32. There are N links, each of which can be placed on the fifo at most T times, and the inner loop over successors can be executed at most W times. Combining these, we obtain O(NTW) operations. The exponential complexity comes from the fact that T is exponential in the size of the problem, since the integers involved (the link delays) are represented in binary.

There are various tricks to decrease the computational time somewhat. First of all, one can break out of the loops in Step 2 as soon as any path to the destination is found. Secondly, one can compute the minimal time required to travel from any intermediate node to the destination. Processing at any intermediate node can cease whenever the current time plus the minimal transit time to the destination exceeds the target time.

One interesting question is how does the labeling algorithm scale as the network grows? We expect that W would remain bounded. It is a function of the crossbar switch technology and network topologies. We believe these might well remain constant, or at worst, grow slowly. We expect that T will be constant as N grows. It is determined by how long users are willing to wait for their message to be transmitted and is not likely to increase just because the number of users increases. Thus, complexity on a per burst basis scales linearly with N. One might expect the number of bursts to also increase linearly with N. Therefore, the overall scheduling complexity appears to scale as $N^2$.

7 Simulation Results

We simulated the performance of the labeling algorithm on a model of the NSF backbone. There are 13 nodes and 19 connections. The network is well connected with many paths between pairs of nodes, so that it provides a good test of the path finding ability of the algorithm. We model the connections as pairs of fiber optic cables, one in each direction. Combining with the 26 fictitious links (two for each node), we get 26+38=64 links. Thus the number of links is reasonable. Link delays were estimated by 2/3 the speed of light and were quantized to the half millisecond. The shortest link is 3 ms and the longest 18 ms.

All bursts were 1 ms long. Since the smallest loop is 6 ms, the labeling algorithm will work correctly. Each 1 ms interval, a random number of requests, following a Poisson distribution with parameter, λ, were generated. The sources and destinations were randomly chosen from a uniform distribution. With 13 nodes, one cannot expect to satisfy all requests if λ > 13. In all cases, the simulations were run over 2000 time periods, generating at least 20,000 requests. We define efficiency as the fraction in per cent of all requests that are satisfied.

In the simulation, we assumed a horizon of 60 ms. If a request could not be satisfied so that it arrived within 60 ms of when that request was generated, we dropped that request and moved on to the next. The requests were taken in the order received. No effort was made to sort the requests to enhance scheduling efficiency. Among possibly many arrival times at the destination, we always selected the first and we chose arbitrarily among routes that arrived at the same time.

The results are presented in Table 1. Even for λ = 13, the scheduling efficiency is approximately 90%. It takes, on average, 25.6 ms for the burst to be started, and 48.0 ms before it arrives at the destination. The average transit time is about 21 ms independent of the load.

Lambda             10     11     12     13     14
Efficiency (%)     99.9   99.0   94.6   90.1   84.3
Start delay (ms)   11.4   17.8   22.3   25.6   27.7
End time (ms)      32.1   39.8   44.8   48.0   50.0

Table 1: Scheduling efficiencies versus λ for the NSF backbone.

Note the times indicated above do not include any initial delays caused by the reservations channel. One can expect that this delay would be about 30 ms for each reservation. However, one reservation may request communications access for multiple bursts. The reservations delay for many bursts may be zero.
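The request workload and the efficiency measure used in the simulations can be sketched as below. This is an illustrative assumption of how such a workload might be generated, not the authors' simulator. The function names are hypothetical, the Poisson sampler uses Knuth's multiplication method, and drawing distinct source and destination nodes is an assumption (the text says only that they were chosen uniformly).

```python
import math
import random

def poisson(lam, rng):
    """Sample a Poisson(lam) count with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def generate_requests(lam, periods, nodes, rng):
    """Yield (period, source, destination) requests, one batch per 1 ms period."""
    for period in range(periods):
        for _ in range(poisson(lam, rng)):
            src, dst = rng.sample(range(nodes), 2)  # assumed distinct endpoints
            yield period, src, dst

def efficiency(satisfied, total):
    """Efficiency as defined in the text: per cent of all requests satisfied."""
    return 100.0 * satisfied / total

rng = random.Random(0)
requests = list(generate_requests(13, 2000, 13, rng))
print(len(requests))      # on average about 13 * 2000 requests
print(efficiency(9, 10))  # 90.0
```

Feeding each generated request to a scheduler and counting those that arrive within the 60 ms horizon would give the `satisfied` count used by `efficiency`.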
8 Extension to Circular Schedules

One deficiency of the scheduling scheme considered so far is the computational burden of scheduling each burst one at a time. In the simulation discussed above, it took about 5 ms to schedule each 1 ms burst on a Sun SPARC workstation. We need to schedule 10-13 bursts each millisecond. Even with coding improvements and a faster processor, it may be impossible to get the necessary speed.

We have proposed an alternative scheduling strategy: Adaptive Circular TDMA (AC-TDMA). In AC-TDMA, we implement a repeating schedule that allows each node to communicate with each other once during each frame. The frame is the repeating period. For instance, with n nodes, the frame cannot be shorter than n - 1 burst periods.

We can adapt the labeling algorithm given above to compute repeating schedules as follows: Guess the frame time and compute all times modulo the frame time. Try to schedule the n(n - 1) bursts by picking an order of bursts and scheduling each one at a time. If successful, stop; else, try a different ordering or increase the frame time. Since the repeating schedule is computed rarely, perhaps only once, the computing time is almost irrelevant. Therefore, the “guessing” above is tolerable.

9 Conclusions

We believe these results are very encouraging. In the simulations presented above we achieved scheduling efficiencies of approximately 90% and better. The Highball network promises high utilizations.

There are several improvements one can propose to the scheduling algorithms presented here. We have done limited experiments on some of these enhancements, but there is little room for improvement, owing to the high efficiencies already achieved. One enhancement that does seem to improve the results by a few percent is to choose the minimum hop path when choosing between alternative paths.

The simulations presented here do not take into account inefficiencies caused by in-band reservations scheduling. Before in-band reservations can be recommended with confidence, this effect must be quantified.

References

[1] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading, Mass., 1974.
[2] D. L. Mills, C. G. Boncelet Jr., J. G. Elias, P. A. Schragger, and A. W. Jackson. Highball: a high speed, reserved-access wide area network. Technical Report 90-9-3, University of Delaware, Department of Electrical Engineering, September 1990.
[3] C. Papadimitriou and K. Steiglitz. Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, Englewood Cliffs, NJ, 1982.
[4] P. Schragger. Scheduling algorithms for burst reservations on wide area high speed networks. In Proceedings of the IEEE INFOCOM '91, pages 6B.2.1–6B.2.8, April 1991.