The Turn Model for

advertisement
The
Turn
Model
for
Christopher
Advanced
Adaptive
J. Glass
and
Computer
Department
Systems
State
Lansing,
{glass
MI
that
a model
are deadlock
free, livelock
free,
maximally
adaptive.
A unique
is not based on adding
physical
topologies
(though
nels).
Instead,
which
packets
wormhole
minimaf
the model
or nonminimal,
to networks
is based
with
on analyzing
in a network
can form.
Prohibiting
cles produces
routing
algorithms
and
Laboratory
48824-1027
.msu.
edu
extra
that
just enough
turns to break
algorithms
that are deadlock
network,
local
input
devices,
and
network
nique
in
the
output
and
which
by first
into
flow
dividing
coniroi
routing
the
digits
message
the tits in the packet
to become
available.
of the turns
three
in routing.
Partially
for 2D meshes,
cubes.
algorithms
adaptive
to prevent
for 2D meshes
permit
routing
k-ary
adaptive
are described
n-cubes,
show
and highest
The
adaptiveness
that
which
sustainable
each router
a few
for
tilts
to store
routing
algo-
on the pattern
of message tratlic.
For nonuniform
trfic,
adaptive
routing
algorithms
perform
better
than nonones.
ture
for
become
constructing
computers
and
net works
scalable
offer
than
Systems
nodes,
have
interconnection
multiprocessors,
shared-memory
massive
based
where
large-scale
scalable
other
a popular
parallelism
approaches
on dh-ect
each node
such
[1, 2, 3, 4, 5] and
to multiprocessor
networks
are organized
has its own processor,
and broadcast
Direct
The
majority
of
as ensembles
locaf
work
are
memory,
of
meshes
in part
by the
Touchstone
advantage.
Its date
the
ACM
appear,
and
Association
requires
copy
for
ACM
fee
are not
copyright
notice
Computing
a fee and/or
@ 1992
without
the copies
specific
all
or
made
notice
IS gven
part
of
th]s
or dmtrlbuted
and
that
Machinery.
To
the
copying
copy
title
material
for
of
direct
the
is by
all nodes
otherwise,
IS granted
publication
on their
have
n-cube
the
or to republish,
permission.
0=89791 -509.7/92/0005/0278
$1.50
=
(Zj
bors if k
which is
n-cubes.
for O < i
and
of
routing
also permits
them
over multiple
communications
topologies
and
k-ary
[2], the
for
wormhole
n-cabcs,
[9, 1].
routing
particularly
A 2D mesh
Intel
channels,
possible
Paragon,
is used
and
the
lowin the
Symult
MIT J-machine
[11], and
is used in the nCUBE2
the
[1].
location
in the
the same number
differs
from
that
mesh.
In a, k-ary
of neighbors.
The
of an n-dimensional
to k and two nodes
n-cube
definition
mesh
[8],
of a
in that
X and Y are neighbors
if z, = vi for all i, O < i < n— 1, except
one, j, where
1) mod k. The change to modular
arithmetic
in the
definition
adds wraparound
channels
to the k- ary n-cube,
giving
it symmetry,
Every node has n neighbors
if k = 2 and 2n neigh-
commercial
permission
In
bors if and only if z, = y, for all i, O < i < :n – 1, except one,
j, where y, = ZJ & 1. Thus, nodes have from n to 2n neighbors,
if aud only
to
of
are low.
X is identified
by n coordinates,
(c., c1, . . . . Zn-2, Zn–l ), where
1. Two nodes X. and Y are neighO<xl<k,–lforO~i<n-
YJ
that
latencies
Formally,
an n-dimensional
mesh has k. x kl x . . . x kn-z
x Icn-1
nodes, k, nodes along each dimension
i, where k, ~ 2. Each node
k-ary
provided
space
attraction
Wormhole
hypercubes.
DELTA
all of the k,’s are equaf
Permission
Another
reservation
but are sep-
2010 [10]; a 3D mesh is used in the
Caltech
MOSAIC;
and a hypercube
grant EC S.88.14027.
NSF
network
and
virtual
buffer
Channel
routing,
and output
meshes
n-dimensional
Intel
and
its neighbor
by sending
a message through
one
To handle
the complexities
of routing
messages
was supported
switching.
multicast
dimensionaf
to store
and
enough
circuit
switching.
part of wormhole
making
are far more
space
routing,
virtua~ cut-through,
and
hand, are proportional
to the sum
to travel.
Wommhole
routing
ahro
as multi-
depending
*This
for wormhole
on the other
and distance
in circuit
the
fashion.
a packet
the latencies
for store-and-forward
are
of packet length ancl distance
to travel
e flits
other supportive
devices.
The nodes communicate
by sending
messages. The network
connects
each node directly
to only a few
other nodes, its n eighbora.
Which
nodes are neighbors
is defined
by the topology
of the network.
A node communicates
with a
node that is not
of its neighbors.
its communication
phases
and
routes
the header flits
available,
all of
buffer
[7] require
is that
rout ers to repficat
intercomection.
enough
routing
of contention,
to the product
When
channel
store-and-forward
for each channel.
architec-
multiprocessors.
just
The
techniques
has some advantages
over
and release are an integral
arate
networks
requires
channel.
packet
[s]. The latencies
cir-ctiit switching,
of packet length
Introduction
Direct
each
switching
the absence
proportional
packets
then
where they are for a suitable
channel
of the attractions
nf wormhole
rout-
an entire
wormhole
throughput
wait
One
ing is that
cut- through
and hyper-
and nonadaptive
and hypercubes
latencies
deadlock.
partial
algorithms
meshes,
of partially
has the lowest
depends
partially
adaptive
1
be prohibited
of the turns
n-dimensional
Simulations
rithm
must
quarters
into
It
tech-
a message
through
the network
in a, pipeline
all of the routing
information
for
topologies
for wormhole
routing,
n-cubes, without
extra channels.
quarter
connect
switching
wwitches
or flits.
and lead each packet through
the network.
reach a router that has no suitable
output
remaining
router
it to local
which
a popular
Wormhole
free, minimal
or nonminimal,
and maximally
adaptive
for the network. In this paper, we focus on the two most common
network
n-dimensional
meshes and Ic-ary
In an n-dimensional
mesh, just a
The
connect
channels,
[6] is becoming
tiits in each packet
Header jlits
contain
all of the cyfree, livelock
has a router.
input
networks.
a network
packets
often
channels,
routing
in direct
along
the turns
each node
and output
routers.
Worrahole
chan-
the dkections
and the cycles
direct
controls
it to neighboring
feature
of this model
is that it
or virtual
channels
to network
it can be applied
can turn
routing
Ni
Science
in the
for designing
M.
*
University
,nl}@cps
Abstract
We present
Lionel
of Computer
Michigan
East
Routing
278
+
>2.
Another
topology
with symmetry
is the hypercube,
a special case of both n-dimensional
meshes and k-ary
A hypercube
is an n-dimensional
mesh in which k, = 2
< n – 1 or a 2-ary n-cube.
Meshes
and
k-ary
n-cubes
simplify
are pop&r
In general,
because
topologies
meshes
cubes.
are preferred
over high-dimensional
meshes and
Low-dimensionaf
meshes have low, fixed node
which makes them
and k-ary
n-cubes.
channel
bandwidth
routing.
in part
regular
their
idea for nonadaptive
low-dimensional
have
extended
to a physical
k-ary ndegrees,
channel
cal channel,
more scalable
than high-dimensionaf
meshes
They
also have fewer channels
and higher
per bisection
density
and have lower con-
but
tention
and blocking
latencies,
which result in lower comunicm
tion latencies
and higher hot-spot
throughputs
[8]. On the other
hand, high-dimensional
vantages
that partialf
routing
algorithms
al,
fully adaptive
some admeshes.
They
They
more
tend to have lower diameters,
which shortens
path lengths.
also have more paths between
pairs of nodes, which permits
fault tolerance.
Finally,
the symetry
of k-ary n-cubes can
make
it easier
to spread
Algorithms
topology
for
should
have
latency,
high
in VLSI.
Features
packet
routing
three
network
traflic
more
message
characteristics:
throughput,
contributing
low
and
a network
communication
ease of implementation
to ease of implementation
are lit-
tual
livelock,
spreading
Many
fault
packet
of these
tolerance,
tratlic
terms
evenly,
require
have been used in different
occurs when a packet waits
example,
a packet
by a packet
lease some
may
that
routing
packets
and
along
routing
packets
some definition
for a network
is, in turn,
waiting
because
toward
only
when
and
the destination.
never
leads
routing
n on minimal
nation
to re-
(possibly
away
Minimal
paths.
(not
occurs
along
from
routing,
Although
when
or equidistant
routing
may
Deadlock
destinations
can keep
and
occurs
many
readily
cludes preventive
measures.
are less likely to occur and
for packets
that
encounter
or hot
spots
unless
from
a routing
feature,
[12].
It
because
provides
continuously
in trat%c
of adding
algorithm
r.tittng
routes
the
a packet
first
higher
and higher
way ensures that
but
it
avoid
virtual
also
ensures
deadlock,
channels
along
dimensions.
the zu and
and
possibly
to networks.
Another
to provide
Dally
and
a path
prohibit
Seitz
more
are deadlock
and
maximally
partially
free and
respecadap-
hypercubes
in Section
6.
waiting
on
packets
wait.
would
deadlock
at least
same
turns
pair
than
would
turns
one of the
been
avoided.
in a network,
a
altogether.
The routing
one turn
in each of the
time,
every
If only
have
the
algorithm
of nodes.
necessary;
In
would
addition,
otherwise,
the
be reduced.
wormhole
maximally
routing
adaptive
algorithms
for a network,
we
The model involves
analyzing
the dican turn in the network
and the cycles
It prohibits
just enough turns to break
adaptive
adaptive
mesh-connected,
connected
faulty
routing
current
cycle
for
routing
k-ary
the
network.
algorithms
n-cube,
networks.
hexagonal,
Adding
allows
The
model
for such basic
extra
the model
produces
topologies
octagonal,
and
physical
or virtual
to produce
fulfy
algorithms,
the topic of a forthcoming
paper focuses on improving
performance
as
cubechan-
adaptive
paper
[18].
The
without
the ex-
pense of extra channels,
as might be done by changing
the routing
algorithms
in existing
routers.
The steps of the turn model
are given in more detail below.
Unless specified
otherwise,
the term “channel”
will be used to
indicate
either a virtual
or physical
channel.
along
1. Partition
the channels
in the network
the directions
in which they route
v channels
in a physical
dhection,
to
being
is to add
[14] introduced
of
all of tbe cycles.
Routing
algorithms
that employ
the remaining turns are deadlock
free, livelock
free, minimal
or nonminimal,
in-
way
by
certain
of designing
their
popular
is caused
the
of the algorithm
dead-
adaptiveness,
and
are described
deadlock
between
from
then
traffic
many
of partially
2D meshes
up in a circular
At
reaching
and
in
to prohibit
propose
the turn model.
rections
in which packets
that the turns can form.
Ordering
the dimensions
in thk
e-cube algorithms
avoid deadlock,
nonadaptiveness.
have
cycles.
nels to the topologies
dimension
to
for 2D meshes,
and hypercubes
performance
prevent
that
channels,
or vir-
5 describe
derived
n-cubes,
by prohibiting
toler-
algorithm
feature
physical
it can be applied
3, 4, and
the
this
To solve this problem
sound
then along the y dimension
algorithm
for hypercubes
[13]
lowest
not
adaptiveness
patterns.
O) and
to leave
wind
might
would
it should
phyiicaf
or virtual
channels.
The popular
zy
for 2D meshes [10, 2] routes a packet first along
the z dimension
(dimension
(dimension
1). The e-cube
k-ary
routing
turned,
that,
possible
have
The drawback
to previous
routing
algorithms
for meshes and
k-ary n-cubes
is that they either sacrifice
adaptiveness
for deadlock freedom or achieve adaptiveness
and deadlock
freedom at the
expense
routing
and
not
algorithm
algorithm
it contributes
to
alternative
paths
blocked
left
suggests
many
packets
fault
or non-
A unique
Model
had
routing
its
Indefinite
postponement
and livelock
are generally
easier to prevent.
Adap-
tiveness
is also an important
severaf of the other features
hardware,
or all packets
This
to the desti-
more promising,
nonminimal
routing
provides
better
ance.
Of these features,
the most important
is freedom
lock.
and
studying
in wormhole
to turn
packets
path)
initially
Turn
try
is possible
restricts
routing
each other in a cycle. Figure
1 shows one way that deadlock
can
occur in a 2D mesh. Four packets travellhg
in different
directions
injected
In con-
a predetermined
and
2
the routing
Llvelock
in contrast,
minimaf
meshes
Simulations
Deadlock
are always competBoth deadlock
and
rather
and Harden
the model is based on
can turn in a network
algorithms
of message
the paper.
but
(though
Sections
routing
for different
patterns
Section 7 concludes
it to its destination.
is adaptive
at times).
to shortest
Livelock
channels.
adaptive
packet
movement,
A minimany of the
minimal
on adding
topologies
the
stop
of a packet
based
algorithms
trast,
physi-
space and
wormhole
free,
for networks.
is not
nonadaptive
can stop a packet
from being
moving
once it is in the network.
progress
extra
tively.
resources.
Indefinite
postponement
is similar
to deadlock,
but
occurs when a packet
waits for an event that can happen
but
never does.
For example,
a packet
may wait forever
to acquire
a packet’s
it
designing
livelock
adaptive
tive
indefinite
postponement
into a network
or from
for
free,
to be released
a network
resource
for which other packets
ing successfully.
The issue here is fairness.
buffer
[17] and Lhder
with extra channels).
Instead,
the directions
in which packets
n-dimensional
In wormhole
routing,
such a circular
wait
condition
will cause deadlock,
because a packet holds resources
while waiting
and excludes
other packets from acquiring
the held
does not
a model
to network
the partially
resource.
livelock
Dally
are deadlock
is that
channels
without
they
authors).
Deadlock
cannot happen.
For
first
adding
channel
a new
at the ends of the physical
channel
can share the physicaf
channel
and
bandwidths
of the virtual
channels
channel.
An advantage
of adding
however,
is that they can support
resource
for
adding
and the cycles that the turns can form.
Section
2 presents
the
model
in general
terms and applies
it to n-dimensional
meshes
paths,
adaptively.
(partly
ways by d,fferent
for an event that
wait
short
presents
model
networks
analyzing
than
[15, 12, 16]
a virtual
with a high degree of adaptiveness.
algorithm
can route packets along
and maximally
of this
researchers
Adding
It involves
in the topology.
that
minimal,
several
adapsuch an algorithm
for 2D meshes.
A partially
cannot route packets along every shortest
path.
paper
algorithms
tle hardware
for channels,
buRem,
and control
logic.
Features
cent ributing
to low latency
and high throughput
are freedom
from deadlock,
freedom
from indefinite
postponement,
freedom
from
paths
[16] describe
tive algorithm
This
through
free.
logic to the two routers
the virtual
channels
It also reduces the
sharing
the physical
or physical
channels,
shortest
evenly.
packets
and
routing.
is less expensive
it is not
control
so that
routers.
already
virtual
meshes and k-ary n-cubes have
y offset those of low-dimensional
routing,
it to adaptive
in v dktinct
virtual
directions
into
sets according
to
packets.
If each node has
treat these channels
as
and divide
them
into
v
dktinct
sets accordingly.
Put any wraparound
channels
(for
tori) in a separate
set to be incorporated
during
Step 5.
the
279
1
by modifying
them to use channels
only when they lead toward
the destination.
Nomninimal
routing
is more adaptive
and fault
❑ lmil
m
tolerant,
II
❑
though.
To make
this section
the steps of the turn model clearer, the remainder
of
applies them to n-dimensional
meshes, starting
with
2D meshes.
For 2D meshes,
in the definition
a“’n
we first
of n-dimensional
simplify
meshes.
south,
packet 4
rmd
turns
north.
From
can be formed:
these
left
rmd
four
directions,
right
turns
four turns cannot form a cycle; they
tiveness.
Deadlock
can be prevented
four
ited,
Prohibiting
however,
packet 1
Figmre
as F@re
and
1.
2.
A wormhole
Identify
the possible
turns
other, ignoring
180-degree
is only
possible
when
direction.
It represents
to another
when
icaf direction,
Identify
four
routers
and
will
The
three
to the prohibited
is possible
of turns
the partially
four
turns
not
(Figure
can be prohibited
adaptive
routing
4c).
prevent
left
turns
right
3 describes
deadlock
algorithms
in Figure
in
4b,
41b are equivalent
to
bc,th cycles still exist
Section
to prevent
turn
deadlock,
allowed
based
which
and describes
on these
turns.
one virtual
direction
to anO-degree turns.
A O-degree
there
are multiple
a transition
different
identifying
topology
from
and
from
the two sets route
but
the cycles
erally,
4.
involving
two
in a 2D mesh.
turn
3.
deadlock
west,
abstract
cycles
prevents
deadThe remaining
also do not allow any adapby prohibiting
fewer than
right turns allowed
in Figure
left turn in Figure 4a. Thus,
deadlock
pairs
any
4 illustrates.
4a are equivalent
and the three
the prohibited
+
90-degree
traveling
turns,
however.
In fact, only two turns
need to be prohibone from each abstract
cycle, allowing
for some adaptiveness
in routing.
+
Figure
eight
when
east, south, and north.
The eight turns form two
as shown in Figure
2. The ny routing
algorithm
lock by prohibiting
four of the turns (Figure
3).
packet 2
used
O and 1 be-
kO
and kl, become
become west, east,
come z and V. The lengths
of the dimensions,
m and n. The directions
—z, +a, -g, and +y
cm
1
packets
the terminology
Dimensions
virtual
that
these
the
simplest
channels
in one
one set of channels
packets
in the same phys-
directions.
abstract
turns
cycles
can form.
in each
Gen-
plane
F&gure
2.
The
possible
Figure
3.
Only
abstract
cycles
and
turns
in a 2D
of the
is adequate.
Prohibit
one turn in each abstract
cycle so as to prevent
deadlock.
The turns must be chosen carefully
in order to
break
every possible
cycle,
including
complex
cycles not
identified
in Step 3. A useful approach
is first to break the
cycles
more
in each plane
complex
5. Incorporate
as
wraparound
one turn
and
then
to check
whether
this
allows
from
the
set
cycles.
many
channels,
turns
without
for each wraparound
as possible
reintroducing
channel
cycles.
can always
of
zy routing
four
turns
(solid
lines)
are alllowed
in the
algorithm.
At least
be incor-
porated.
6.
Incorporate
as many 180-degree
and
sible, without
reintroducing
cycles.
Routing
identified
other
algorithms
in Step
allowed
that
1 and
by Steps
route packets
use only the
along
turns
4, 5, and 6 are deadlock
minimal
or nonminimal,
and maximal!y
They are deadlock
free because breaking
circular
routing
O-degree
turns
as pos-
the sets of channels
from one set to anfree, livelock
free,
adaptive
for the network.
all of the cycles prevents
waits.
(We demonstrate
deadlock
freedom
for individual
algorithms
later. ) Preventing
circular
waits in thk way
decreasing
(or increasing)
that a network
contains
order
a finite
Figure 4.
deadlock.
[14]. This, together
with the fact
number
of channels,
means that
a packet will reach its destination
after
Thus, routing
is Iivelock
free. Routing
the network
because the model prohibits
turns from one dkection
to another.
produces
will also be nomninimal,
but
(c)
(a)
means that it is possible to number
the channels in the network
so
that each algorithm
routes every packet along channels in strictly
a limited
nmuber
of hops.
is maximally
adaptive
for
the minimum
number
of
Six turns
that
complete
the abstract
cycles
and
allow
In an n-dimensional
mesh, each node has up to 2n input chanFor each of these 2n directions,
nels and 2n output
channels.
there are 2n – 2 possible
90-degree
turns to a different
direction,
The algorithms
the model
can easily be made minimal
making
280
4n(n
— 1 ) total
turns.
These
turns
form
two abstract
cy -
cles in each of the n(n – 1)/2
of four turns.
planes,
Theorem
number
1
The
minimum
making
n(n – 1) total
O! turns
that
hibited to prevent
deadlock in an n-dimensional
OT a quarter
of the possible turns.
Proofi
The
partitioned
4n(n
into
of the n(n
one of the
– 1) turns
n(n
cycles
of four
mesh
a quarter
3
of the
•1
Adaptive
Routing
in
(a) The
a 2D mesh,
and
two
must
there
abstract
are four
cycles
be prohibited
dkections,
of four
turns.
to prevent
eight
One
deadlock.
from
these
two turns,
12 prevent
deadlock
a deadlock
account.
case)
and
are unique
if symmetry
3.1
three
West-First
Routing
each cycle
a packet
must
west-jimt
t-outing
sary, and
then
for
the
route
south,
direction.
a packet
east,
and
are shown
represent
nodes,
requiring
paths.
in that
is taken
that
Note
packets
that
both
This
fist
into
suggests
west,
north.
5b.
wait
(dashed
lines)
algorithm.
■
■
F
Ii
■
LA
■
m
■
■
■
■
A
■
paths
fig-
blocked
or take
and nonminimal
■
the
In this
and gray bars indicate
minimal
■
if neces-
Example
in Figure
blllil
4 for
Algorithm
algorithm
squares
channels
native
out
adaptively
west-first
ure, black
start
algorithm:
by the west-first
■
WI
ways
(see Figure
Figure
5a shows one way to prohibit
two turns in a 2D mesh.
The prohibited
turns are the two to the west. Therefore,
to travel
west,
lines)
—
+“
turns,
Of the 16 different
to prohibit
(solid
U:m::
2D
90-degree
turn
allowed
A
Meshes
In
six turns
L.-i
adapmesh,
mesh, k-ary n-cube,
and hypercube
topologies.
for the 3D mesh can be found in [19].
Partially
r~
1-d
Each
At least
in order
In the following
sections,
we introduce
specific partially
routing
algorithms
based on the turn model
for 2D
n-dimensional
detailed
study
1),
can be
each.
– 1)/2 planes contains
two of these cycles.
four turns in each cycle must be prohibited
-~
r
be pro-
is n(n–
turns
to prevent
these cycles.
Therefore,
prohibiting
turns is necessary
to prevent
deadlock.
tive
must
mesh
in an n-dimensional
– 1) abstract
cycles
alter-
paths
are
(b)
Examples
of the west-tirst
algorithm
in an 8 x 8 mesh.
shown.
Proof that the west-first
partially
adaptive
algorithm
is deadlock free is based on the work of Dally and Seitz [14], who show
that
a routing
algorithm
interconnection
routes
every
creasing)
ets first
numbers
sign
is deadlock
network
packet
numbers.
along
channels
Because
still
Theorem
lower
numbers
the farther
2
with
if the
they
The west-first
decreasing
northward,
Figure
in the
5. The
west-tirst
routing
algorithm
for 2D meshes.
the algorithm
algorithm
other directions,
the farther
west
to eastward,
east
channels
so that
strictly
the west-first
west and then in the
to westward
channels
channels
free
can be numbered
routes
(or inpack-
we assign lower
they are, and asand
southward
are.
routing
algorithm
is deadlock
free.
2m-2-2x@-y
Proofi
Assign
each channel
in the m x n mesh a two-digit
number
a, b in base r, where T is the greater of 3m — 2 and n — 1.
Figure 6 shows the particular
numbers
to assign to the channels
leaving node (z, V). Figure 7 illustrates
the numbering
of a 4 x 4
mesh. To show that west-first
routes every packet along channels
with strictly
decreasing
numbers,
it is sutiicient
to show that, for
each channel into an arbltrru-y
node, the algorithm
can only route
the packet
shows the
out
four
case, the channels
numbers
than
t
along a channel
with a lower number.
Figure
8
possible
cases. Examination
shows that,
in each
the
used to route
input
can make a 180-degree
imal routing.
a packet
channel.
turn.
This
Note
turn
out of a node
that
is only
in part
useful
2m-2-2x,y-l
have lower
(c) a packet
Figure
for nonmin-
the west-fist
❑
281
6. Numbering
routing
of the channels
algorithm.
leaving
each node
(z, V) for
7,0
9,0
5,0
1,0
3.2
North-Last
F@re
9a shows
The
prohibited
turns
fore,
a packet
should
rection
it
algorithm:
and
~1
6,1
4,1
4,1
2,1
then
shown
2J
0,1
Figure
7.
&o
9,0
5,0
3,0
1,0
&o
9,0
3,0
1,0
Numbering
of a 4 x 4 mesh
for
the
west-first
north.
Algorithm
way to prohibit
are the
only
two
when
travel
needs to travel.
route a packet
north
two turns
in a 2D mesh.
traveling
north.
when
This suggests
tlrst adaptively
Exrunple
in Figure
paths
for
that
There-
is the last
di-
the north-last
routing
west,, south, and east,
the north-last
algorithm
are
9b.
0,1
(a) The
7,0
Routing
another
six turns
routing
allowed
(solid
lines)
L T
■
by the north-l=t
algorithm.
algorithm.
2m-2-2x,n-2-y
2m-2-2x,y
Jl-
2m-1-2x,0
X,y
T
from
west
(b) input
from
J-t
T
2m-l+x,O
in an 8 x 8 mesh.
routing
algorithm
for 2D meshes.
north
Proofi
proof
6 rmd
2m-3-2x,0
routing
is similar
bers.
algorithm
to that
7 counterclockwise
rections
of the channels.
routes every packet along
2m-2-2x,n-l-y
east
The
Figures
X,y
2m-2-2x,y-l
3 The north-last
for
is deadlock
Theorem
90 degrees,
and
2.
reverse
jree.
Rotate
the di-
The figures now show that north-last
channels
with strictly
increasing
num•1
b
2m-3-2x,0
from
north-laxt
9. The
Theorem
2m-2-2x,n-2-y
X,y
(c) input
Figure
algorithm
+2m-2-2x,y-l
2m-2-2x?n-2-y
2m-2+x,0
of the north-last
X,y
2m-2-2x,y-l
(a) input
(b) Examples
2m-3-2x,0
2m-3-2x,0
(d) input
from
south
3.3
Negative-First
Figure
10a shows
The
prohibited
negative
Figure 8. For each input channel,
the output
channels
the west-first
routing
algorithm
have lower numbers.
allowed
packet
by
negative-jirat
west
turns
direction.
must
and
Routing
a third
start
Touting
south,
and Jesshope
al
routing.
or nonminimal,
and
two
from
to travel
in a negative
algorithm:
then
paths
two turns
route
in a 2D mesh.
a positive
direction
in a negative
direction.
adaptively
propose a similar
The negative-tirst
the nonminimal
fault tolerant.
Example
shown in Figure 10b.
282
are the
Therefore,
out
Algorithm
way to prohibit
a packet
east
and
to a
direction,
This
suggests
a
the
th-st adaptively
north.
Yrmtchev
algorithm
[15], lbut only for minimalgorithm
can lbe either minimal
version being more adaptive
and
for the negative-first
algorithm
are
Theorem
jree.
4
The
is a special
proof
meshes
The
negative-
in Section
fir8t
routing
case of the
algorithm
one
given
for
is
more
deadlock
n-dimensional
adaptive
the
tive
routing
the
source-destimtion
algorithm
algorithms,
is.
For
the
three
partially
adap
however,
SP = 1 for at least half of
Another
measure
of the degree
pairs.
to which a minimal
routing
algorithm
is adaptive
is the ratio
Salgm,thm/Sf.
Sp/Sf
ranges from O to 1, but averaged
across
4.
all source-destination
pairs, Sp/Sf
> 1/2, which indicates
a significant
degree of adaptiveness.
Note that Sp = 1 for a sourcedestination
pair does not imply
that
for that pair.
The partially
adaptive
algorithm
algorithms
p is nonadaptive
all permit
non-
rninimed
routing.
In the case of the negative-first
algorithm,
this
means that rout ing can be adaptive
even when d= < s= and
dv ~ Su or when d= > s= and du < Sg.
this, see the bottom
path in Figure
10b.
(a) The
six turns
allowed
(solid
lines)
by the negative-tirst
4
■ !9.
■
■
an illustration
of
algo-
rithm.
99
For
Partially
Adaptive
n-Dimensional
■
4
This
from
Routing
in
Networks
section describes
the adaptive
routing
algorithms
the application
of the turn model to n-dimensional
and k-ary n-cubes.
In general,
these
unless n is small or the k’s equal 2.
4.1
n-Dimensional
Of the
many
per
cycle,
analogs
algorithms
are
gorithm
is the
all-but-
first
adaptively
for
north-last,
for 2D meshes.
a packet
formed
noteworthy
of the west-first,
gorithms
are not
practical
Meshes
routing
three
topologies
resnlting
meshes
The
by prohibiting
their
one turn
simplicity.
They
and negative-first
analog
of the
one-negative-jirst
west-first
routing
in the negative
are
routing
al-
routing
al-
algorithm:
directions
route
of all but
one
dimension
(n – 1) and then adaptively
in the other directions.
The analog of the north-last
routing
algorithm
is the all-but-onepositive-last
the negative
(b) Examples
Figure
3.4
How
of the negative-tirst
10. The
negative-fist
Degree
adaptive
of
are
sion (0) and then
the negative-tlmt
u.
mmm
■
algorithm
routing
muting
Theorem
5
n-dimensional
for 2D meshes.
Proofi
partially
adaptive
routing
adaptive
adaptive
algorithms,
algorithm,
Ax
(Ax
‘f
=
Ay
partially
a node
= IdY – Sg [. Then,
channel
=
{
AZ!AY!
&J!#
if d% ~ s%
1
otherwise
(.4a&2
~ ~z!AY!
Snorth_la.t =
when
channels
{
For minimal
routing
if dv ~ Sv
< SY)
or (d~ z so and
dv ~ SU)
1
otherwise
the
the
traveling
but
the only
sum
of
the
k,
in a negative
larger
proof
All
for
an
of
we present
algorithm
direction,
K – n – X – 1, which
–1,
channels
which
leaving
the
the negative-first
with
strictly
dimensional
mesh.
following
theorem,
if (dz < s= anddg
algorithms,
in the negative
directions.
foT
n-dimensional
along
a
K – n – X
channels
leaving
the node
If a packet ente~ a node
it enters along a channel
islessthan
node
K-n+X,
thenurn-
in the
positive
routes
every
algorithm
increasing
it enters
is less than
numbers
directions.
packet
and is deadlock
along
free.
❑
The negative-fit
algorithm
is deadlock
free as a result of prohibiting
just one turn per abstract
cycle, the turn from a positive
direction
to a negative
direction.
Therefore,
prohibiting
some
quarter
of the turns
is sntiicient
to prevent
deadlock
in an n-
w
=
be
K-n+X
of the
produces
s negattve-fsrst
K
nmnbered
Therefore,
otherwise
{
adaptively
positive
and K – n + X, the numbers
of the
in the negative
and positive
directions.
when traveling
in a positive
direction,
+ Ay)!
ber
-f:rst
first
in the
The
negative- fivst
routing
meshes is deadlock free.
Let
numbered
s west
a packet
adaptively
X
be the
snm
of the
z,
for
any
node
Zn-I ). Nnmber
each channel
leaving
a node
in a positive
direction
K — n + X and each channel
leaving
a
node in a negative
direction
K — n — X. Then, if a packet enters
algorithms?
p be one of the three
= Idz — s= 1, and
route
then
mesh,
and
let
(ZO , Z* , . . . . z.-2,
Let Salgor,thm
be the number
of shortest
paths the algorithm
allows from source node (s=, SY) to destination
node (d=, dv). Also,
let j be a fully
and
these algorithms
are deadlock
free,
is for the negative-fist
algorithm.
in an 8 x 8 mesh.
algorithm
adaptively
in the other directions.
The analog of
routing
algorithm
is also called the n egative-jirst
alg orithrn:
directions
Adaptiveness
these
routing
algorithm:
route a packet tirst adaptively
in
directions
and the positive
direction
of one dimen-
Theorem
n(n — 1))
to prevent
Salgor,t~~
is, the
283
maximally
6
Combining
this with Theorem
which supports
our claim that
adaptive
Prohibiting
routing
some
in an n-dimensional
deadlock.
algorithms.
quarter
mesh
1, we have the
the tnrn model
of
the
is necessary
turns
and
(that
suficient
is,
As
the
tially
number
adaptive
messages
k,
of
dimensions
algorithms
adaptively.
are large.
are
SP
But
=
averaged
increases,
more
the
likely
to
1 less often,
across
all
minimal
be able
especially
Algorithm:
Minimal
p-cube
routing
for hypercubes.
Input:
Current
address,
C, and destination
address, D.
Procedure:
par-
to
route
when
source-destination
the
pairs,
1. If C = D,
exit.
Sp/Sj
>1 /2n–1,
indicating
that the degree of adaptiveness
relative to fully adaptive
algorithms
decreases as n increases.
Again,
adaptiveness
4.2
can be increased
k-ary
The
adaptive
to use the
way is to allow
only
on its
lock free,
wraparound
than
routing
wraparound
a packet
first
of the
are routed
mesh
along
k-ary n-cubes
according
to
prove
of k-ary
along
that
channels,
channels
The
the
n-cubes.
routing
in strictly
decreasing
for meshes.
Note that
freedom
all of these
For k-ary
routing
n-cubes
deadlock-free
routing
is a simple
with
that
Linder
and
is deadlock
ever,
Harden
free,
enough
minimal,
channels
subnetworks
[16] construct
with
and
fully
to
n + 1 levels
per
k-ary
Overall,
the
paths,
Routing
for
hypercubes
n-dimensional
algorithms
meshes
in
h = 6,
hO
free
They
add,
how-
into
2n -1
and
cases of the
n-cubes.
Proofs
are corollaries
routing
routing
especially
channel
algorithm
3, and
algorithm
when
The
offers
compared
in a di-
for hypercubes.
a choice
of many
to tlhe nonadaptive
following
table
illustrates
of the 36 possible
transmitting
for
sends a
example,
shortest
the message,
e-
this
the source node (1011010100)
node (0010111001).
For this
h] = 3. One
For each node
kn
address
the last
paths
the number
of
hop
C.
destination
was m a posltwe
address
dmectlon,
D;
and
p.
nr”uti’gf”r
channels
1. If C = D,
route
meshes
routing
2.
Ifp=l,
and
al-
3.
Elseif
for
4.
Else R = C.
algorithms
that
of the
special
address
case of the negative-first
of the node
the header
address
has two phases.
the
packet
to the
local
processor
(CA
D).
and
increased
of the
proofs
algorithm
flits
destination
adaptiveness
the packet
for
5. Route
the
currently
node.
occupy,
The
and
fault
tolerance,
any dimension
flits
is D.
p depends
on which
C is a unique
input
buffer
constant
the
Figure
distance
of adaptiveness
bet ween S and D.
for
the p-cube
The
routing
the
any
packet
i for which
the
first
i for which
Wits occupy
measure
available
channel
in a di-
r, = 1.
p-cube
routing
algorithm
for hypercubes.
dimension
choices
comment
taken
‘-
g:
OO1OO1OOOL , .
nnlm
lnnnn
I 7
./
phase
c, = 1
-.-..-””
“
1
.
0010110001
I 1
0010111001
I
,
n
1
I
‘w
and
in the
6
Simulation
Experiments
To compare the partially
adaptive
routing
algorithms
all-but-one
negative-fist
(ABONF),
all-but-one-positive-last
(ABOPL),
and
negative-tirst
(NF) with the nonadaptive
routing
algorithms
Zy
and e-cube, we simulate
a 16 x 16 mesh and a binary
8-cube for
three different
tratlic
patterns.
Each of these topologies
contains
256 nodes.
of the degree
Is Sp—ctibe /.$f
12. The minimal
address
a fully adaptive
routing
=
l(S@D)[
is the Ham-
other
along
CV
the
router.
Konstantinidou
proposes
an algorithm
similar
to p-cube
[20], but only for minimal
routing.
The number
of shortest
paths from S to D, SP_cti~e, is hl !ho !,
where IX I represents
the number
of 1‘s in the binary
number
X,
hl=l(S
A ~)1,
and ho=l(s
A D)l.
For
algorithm,
~, Sf = h!, where h = hl +ho
R=
rout-
routing,
for each router,
header
O,then
and D
p-cube
In the case of minimal
along
D=
D.
has a particu-
and d, = 1. Then,
the steps can be computed
as shown in Figure 12. In both of these algorithms,
the only input transmitted
in
the header
CA
R=CA
the routing
fist phase routes the packet along a dimension
i for which c, = 1
and d, = O. When there is no such dimension,
the second phase
routes the packet along a dimension
i for which c, = O and d, = 1.
These steps are easily computed
using bitwise logic operations
as
shown in Figure
11. If nomninimal
routing
is desired,
because
can also route
then
mension
ing algorithm
ming
p-cube
algorithm.
=
whether
cases.
be the binary
of its
p-cube
Current
that
larly compact
expression,
the p-cube routing
algorithm.
Let S be
the binary
address of the source node for a packet,
C be the binary
available
choices based on the p-cube routing
is also shown.
The number
of choices in rmrentheses
indicates
the additional
choices available
with nonmi~mal
routing.
Hypercubes
are special
and k-ary
are deadlock
general
The
any
~i = 1.
addhg
n-cube
Hypercubes
are a special case of both n-dimensional
k-ary n-cubes.
Consequently,
the partially
adaptive
more
along
exit.
p-cube
gorithms
packet
10-cube
where
to the destination
is shown.
per level.
5
CAD.
i for which
routing
a binary
message
nonmini-
algorithm
subnetwork
and
to construct
a routing
the
the
shortest
cube
Again,
without
adaptive.
to partition
R=
11. The minimal
extra channels.
This is a result of the many cycles that do not
involve
turns in the topology.
By adding
channels
to a k-ary ncube,
processor
or-
of the proof
are strictly
are minimal
O,then
Route
Figure
channel
and then
channels.
k > 4, it is impossible
algorithms
local
packets
or increasing
modification
algorithms
to the
CAD.
mension
algorithm.
Thus, a node at the east edge
will have two channels
to the west: a mesh
immediately
to its west and a wraparound
of deadlock
If R=
packet
dead-
can be extended
at the west edge of the mesh
3.
4.
One
is still
in another
way: classify each wraparound
the direction
in which it routes packets
to a node
R=
channel
on whether
algorithm
2.
the
can be ex-
a wraparound
depending
negative-tirst
apPIY the negative-tit
of the mesh channels
channel
to the node
maL
To
for meshes
channels
to be routed
hop.
der in the proof.
the proof
algorithms
number
the mesh channels
as before
and assign the
channels
a number
that is either greater than or less
those
channel
routing.
n-cubes
partially
tended
by nouminimal
route
of neighboring
= 1/(:,).
284
A pair
of unidirectional
routers
and
channels
each router
connects
to its local
each pair
processor.
All
of the
channels
input
channel
The
routers
have
into
the
same
a router
operate
bandwidth,
has a bufTer
asynchronously
20 tlits/flsec.
the
and
synchronize
rithms
about
Each
size of a single
flit.
(Figure
13). At low throughputs,
the algorithms
perform
the same. For the nonuniform
tratiic patterns,
the partially
adaptive
to simult~
routing
algorithms
throughputs
(Figures
have the lower
high
are blocked
the source
mum sustainable
throughput
of the partially
is four times that of the nonadaptive
e-cube
contain
header
an input
flits
selection
consumed.
waiting
policy
must
local first-come-first-served
that
arrived
indefinite
channel
has multiple
selection
policy
decides in favor
multiple
arbitrate.
first.
This
postponement.
output
in favor
policy
When
channels
channels
channel,
of the header
a header
policy
along
and
queued
at their
the
source
network
processors
fit
an
output
used is called ZY and
the lowest dimension.
Average
communication
latency
and Teverse-jiip.
any of the other
the
The uniform
processors
with
matrix-transpose
cessor at row
umn i. In the
by
mapping
bors
in the
then
sent
mesh.
nodes
resulting
one d
determined
message
the
by
from
hypercube
so that
hypercube.
the
q ). The
0
100
200
Average
300
matrix
transpose
are
in the
reverse-tlip
Figure
14. Comparison
traffic
in a 16 x 16 mesh.
pattern
Overall
one at
of routing
in the hypercube,
are for the partially
These throughputs
sends each
able throughput
algorithm
and
is not
algorithms
the highest
I
I
I
Average
commun-
25
ication
Z.
due to shorter
(
longer
-
path
lengths
for matrix-transpose
than
tratlic
15
tion
-
xv
10
west-first
1
0
0
100
Average
1
I
200
300
network
negative-first
I
I
400
500
I
600
throughput
for reverse-flip
800
packets.
(flits/Psec)
The
maintained
Despite
The
and
simulations
hypercube,
latencies
at high
of routing
tratfic
for uniform
happen
tratlic
From
pattern
that,
nonadaptive
throughputs
for uniform
routing
than
the
traffic
algorithms
traffic
probably
partially
in the mesh
have
adaptive
hops)
than
algorithms
routing
to embody
pattern.
trafhc
(11.34
routing
adaptive
result
as well
the
is that
as when
superior
for uniform
provide
better
the
The
av-
for uniform
perform
algorithms
global,
with
for
long-term
a global,
starts
informa-
long-term
message
bet-
uniform
point
traffic
of
spread
and the zv and e-cube algoadaptive
algorithms,
on the
lower
A traffic
processes
algo-
applications,
performance
traiiic,
285
the
will
of uniform
tratiic
information
is used.
of the
nonadaptive
partially
perforrmmce
pattern
is determined
are mapped
to the
each node
evenness
global
tratiic
has been used in many
we know of no real applications
ind]cate
the
algorithms
tratlic.
tratlic
is 4.27 hops, versus 4.01
in the mesh, the highest sustti’n-
across the mesh or hypercube,
maintain
that evenness.
The
algorithms
Figure
13. Comparison
in a 16 x 16 mesh.
for the e-cube
in throughput
other hand,
select channels
based on local, short-term
information. These selections
tend to benefit
just the routed
packet and
ordy for the immediate
future
and tend to interfere
with other
-AI
700
they
this
the uniform
evenly
rithms
-Q--
partially
is that
about
view,
-Q-
north-last
5;
*
the
tratiic.
sustain-
throughput
in the mesh, which occurs for the
uniform
tratRc.
Again,
average path length
is
traflic
(10.61 hops).
The reason the nonadaptive
ter
throughputs
and reverse-tlip
the next highest
is for the negative-&t
algorithm
and matrixThis throughput
is 30~o higher than the second
highest
sustainable
xy algorithm
and
latency
(flsec)
800
for matrix-transpose
in the hypercube,
which occurs
uniform
tratiic.
This improvement
able throughput
transpose
trallic.
i
35
30
700
(flits/flsec)
sustainable
adaptive
algorithms
are 50% higher than
erage path length for reverse-flip
hops for uniform
tratlic.
Overall
I
600
sends each message
at (ZO, ZI, zz, Z3, ZA, Z5, ~Ij, $7) to the
I
500
throughput
neigh-
Messages
(Z-7, Z-f, $–~, ~–~,Z-3, @, Z–I, Z-o).
40
400
network
at row j and colpatterm
is derived
in the hypercube
the processor
by
the pru-
at (ZO, ZI, $2, Z3, X4, Z5, Z6, X7 ) to the
(3Y4, W, Z6, X7, ZZO,r~, w,
from
each
in the
dictated
pattern
the processor
message
to
are neighbors
F
pattern
sends each message to
equal probability.
In the mesh,
sends
a 16 x 16 mesh
mesh
zo
dependent.
Three
matrix-transpose,
i and column
j to the one
hypercube,
a matrix-transpose
to the
The
from
pattern
25
and bounded.
is largely
the message trafFic pattern,
which is application
network
workloads
are considered:
uniform,
algorithms
in an input
to it,
is small
performance
adaptive
algorithm.
flits
routing
is minimal.
For each simulation,
two characteristics
of network
performance are measured:
average communication
latency
(in Usec)
and average
sustainable
net work throughput
(in tlits delivered
per psec).
The throughput
is sustainable
when the number
of
Obviously,
at
therefore
All
packets
especially
For matrix-transpose
used is called
is fair
available
must arbitrate.
The
of the output
channel
input
output
The policy
and decides
in the router
prevents
When
for the same available
16).
traflic
in both the mesh and hypercube,
the maximum
sustainable throughput
of the partially
adaptive
algorithms
is twice that
of the nonadaptive
algorithms.
For reverse-fllp
traftic,
the maxi-
from immediately
entering
the net work are queued at
processor.
Messages that arrive at a destination
pro-
cessor are immediately
14, 15, and
latencies,
neously
transmit
the tlits in a packet.
The processors
generate messages at time intervals
chosen from
a negative
exponential
distribution.
Each message has an equal
probability
of being one packet of 10 or 200 tlits.
Messages that
previous
that
adaptive
in real
systems.
is not
routing
algorithms
Uniform
simulation
studies,
but
generate
uniform
traffic.
by the application
and how its
nodes of the network.
For most
communicate
with
some nodes
much
more
than
algorithms
30
Average25
communication
Z.
latency
.
ABONF
Q
ABOPL
p-cube
=
L
presents
illustrate,
is often
7
Conclusions
Our
goal
tions
because
has
been
to
poor
performance.
and
Future
make
the
interconnection
in which
packets
number
they
a problem
for the
are nonadaptive.
Just
maintain
the evenness
of uniform
tratlic,
the unevenness
of nonuniform
trafiic.
The
minimum
use
of
net works.
can turn
of turns
best
break
the
channels
Analyzing
in a network
that
they blindly
result, ss the
Work
cycles
faulty
ment
spot
hardware
and decreases
and livelock.
Nonminimal
avoidance
and fault
the
produces
free, minimal
freedom
and
livelock
freedom
are essentiaf
for routing
algorithms.
ness increases
the chances that packets can avoid hot
15 -
in
the direc-
and prohibiting
all of the
routing
algorithms
that are deadlock
free, livelock
or nonminimal,
and maximally
adaptive.
Deadlock
10
Adaptivespots and
the chances of indefinite
postponerouting
allows, even greater hot-
tolerance.
The
turn
model,
urdike
other
54
apprOmhes
to designing
~aptive
routing
algorithms,
is applicable to networks
with only the channels
required
by the network
o~
topologies
channels).
o
100
200
Average
300
400
network
500
throughput
600
700
without
800
tially
(flits/psec)
(as well
Applied
extra
15. Comparison
in an 8-cube.
ofroutingalgorittis
formatrix-trampose
While
the
disadvantages.
as to networks
with extra
to n-dimensional
meshes
chanuels,
the turn
routing
algorithms.
adaptive
adaptive
routing
than nonadaptive
tratiic.
Figure
traffic
tratlic
as they
maintain
wormhole-routed
35
(psec)
Nonuniform
e-cube
figures
40
others.
ZY and
algorithms
algorithms
turn model
Adaptive
trol
logic
for route
this
may
increase
the
need
for
produces
Simulations
tion
on more
selection
than
delay.
of the
between
information.
bases
one of the dimensions.
a selection
partially
does nonadaptive
Part
to decide
header
typically
new, par-
indicate
that they can perform
better
for nonuniform
pat terns of message
For
a selection
on the distance
remaining
and
is due
output
to
chan-
Another
part of the
to base the route selecdimension-order
on the
For adaptive
routing,
complexity
multiple
nels, all of which lead to the destination.
complexity
is due to the need for a router
a router
severaf
of these
has many advantages,
it also has some
routing
can require
more complex
con-
node
a router
model
physicaf
or virtual
and k-ary n-cubes
distance
routing,
remaining
routing,
a, router
in more
than
in
must
base
one, or all, di-
mensions.
Every extra bit of header information
that is required
for the router to select an output
channel increases router storage
requirements
of store
40
I
35
(flsec)
e-cube
*
on network
~
the
Q
-,&
cal channels.
I
tigate
the
turn
effects
model
to apply
octagonal,
15 10
o~
for future
input
In [18],
to networks
Other
that
models
for
networks.
and
more
like those
work.
In [19],
we inves-
output
selection
we illustrate
include
designing
Another
mit adaptive
routing
topologies,
the turns
without
are not
stract
necessarily
cycles
are not
task
butions,
the
extra
virtuaf
adaptive
policies
application
of
or physi-
routing
algo-
obvious
extension
do
for
of our work
is
is the
so that
the
the addition
of channels.
necessarily
90-degrees
and
formed
identification
results
by
four
of realktic
of future
simulations
turns.
workload
can
In such
the abA final
distribe more
meaningful.
500
network
1000
1500
throughput
2000
2500
(flits/psec)
Acknowledgments
The
Figure 16. Comparison
fic in an 6-cube.
latencies
the turn model to other topologies,
such as hexagonal,
and cube-connected
cycle net works, all of which per-
important
u]
Average
directions
of different
performance.
the enhanced
o
communication
rithms
are based on adding
extra channels
to networks,
but
not produce
routing
algorithms
that are maximally
adaptive
.20
54
are mauy
I
ABOPL
p-cube (NF)
25
There
I
ABONF
30
Average
communication
latency
and makes
and f&ward.
of routing
algorithms
for reverse-flip
authors
develop
traf-
wish
to thank
Dr.
Philip
K. McKinley
for helping
us
the simulator.
References
1. NCUBE
2. Intel
iion,
286
Company,
Corporation,
1991.
NCUBE
A
6400
Touchdone
P70ce88c,r
DELTA
Manual,
System
1990.
De8crip-
3.
S. B. Borkar,
Kung,
R. Cohn,
M. Lam,
P. S. Tseng,
J. Sutton,
An integrated
solution
Proceedings
4. D.
Lenoski,
J.
May
A.
and J. Webb,
parallel
Gharachorloo,
multiprocessorfl
on
Agarwal,
B.-H.
Lim,
A processor
tiprocessor,”
on
Computer
W.
J. Dally
D.
Kranz,
tmchitecture
in Proc.
of the
Architecture,
Networks,
9.
vol.
vol. 3, no. 4, pp.
nection
“Performance
IEEE
net works,”
39, pp.
775–785,
J.
J.
Kubiatowicz,
International
torus
May
1990.
routing
1, no. 3, pp.
X. Lh
June
267–286,
chipfl
Journal
187–196,
and L. M. Ni,
“Deadlock-fi-ee
networks,”
International
116–125,
C.
L.
Seitz,
A new
Computer
1979.
multicast
wormhole
in Proceedings
Symposium
May
1986.
1990.
ing in in multicomputer
pp.
mul-
Symposium
analysis of Ic-ary n-cube interconTransactions
on Compute..,
vol. C-
Dally,
Annual
10.
and
protocol
7. P. Kermani
and L. Kleinrock,
“Virtual
cut-through:
computer
communication
switching
technique,”
W.
20.
1988.
multiprocessing
104–114,
“The
Computing,
and
for
17th
pp.
and C. L. Seitz,
of Distributed
8.
Conference
the 17th Internapp. 148–159,
of
Architecture,
on
of
Computer
routthe 18th
Architecture,
1991.
W.
Athas,
C.
C.
M.
Flaig,
A.
J.
Martin,
J. Seizovic,
C. S. Steele, and W.-K.
Su, “The arch] tecturc
and programming
of the Ametek
Series 2010 multicomputer~
in Proceedings
of the Third
Conference
current
Computers
and Applications,
CA), pp.
1988.
11.
12.
13.
W.
J.
33–36,
Association
“The
Dally,
on Hypercube
ConVolume
I, (Pasadena,
for Computing
J-machine:
System
in Actors:
Knowledge-Based
Concument
and
eds.),
1989.
Agha,
MIT
Press,
Mach] nery,
support
for
Jan.
Actors)
Computing
(Hewitt
W. J. DallY and H. Aoki,
“Adaptive
routing
using virtual
channels,”
tech. rep., Massachusetts
Institute
of Technology,
Laboratory
for Computer
Science, Sept. 1990.
H. Sullivan
fully
Annu.
and T. R. Bashkow,
distributed
Symp.
parallel
“A large
machine?
Architecture,
Comput.
scale, homogeneous,
in Proceedings
oj the lth
vol. 5, pp. 105–124,
Mar.
1977.
14.
W. J. DallY and
in multiprocessor
tions
15.
on
Computers,
J. T. Yantchev
deadlock-free
IEE
16.
18.
and
vol.
C-36,
Pt.
routing
for
E, vol.
W. J. Dally,
“Fine-grain
message
ers,”
in PTOC. of the Third
rent
Computers,
vol.
vol.
pp.
1987.
low
latency,
of processorsfl
178-186,
May
in
1988.
“An adaptive
and fault tolfor k-ary
n-cubes ,“ IEEE
40, pp.
passing
Conference
1, (Pasadena,
May
“Adaptive,
networks
136(3),
D. H. Linder
and
J. C. Harden,
erant wormhole
routing
strategy
on Computers,
message routing
IEEE
Tmnsac-
pp. 547–553,
C. R. Jesshope,
packet
Proceedings,
Transactions
17.
C. L. Seitz, “Deadlock-free
interconnection
networks,”
on
CA.),
2-12,
Jan.
concurrent
Hypercube
pp.
2–12,
1991.
computConcurJan.
C. J. Glass and L. M. N],
“Adaptive
routing
in meshconnected
networks,”
in Proceedings of the 12th International
on
Distributed
Computing
System.,
June
1992.
in
Gupta,
coherence
in Proc.
Computer
Nov.
A.
cache
“iWarp:
computing:
’88, pp. 330-339,
K.
19.
H. T.
L. Rankh,
1990.
“APRIL:
6.
J. Urbanski,
directory-baaed
Symposium
T. Gross,
J. Pieper,
to high-speed
Laudon,
“The
for the DASH
tional
S. Gleason,
C. Peterson,
oj Svpercompnting
J. Hennessy,
5.
G. Cox,
B. Moore,
1988.
C. J. GhSS and L. M. Ni, “Maximally
fully adaptive
routing in 2d meshes;
Tech. Rep. MSU-CPS-ACS-51,
Dept. of
Computer
Science, Michigan
State University,
East Lansing,
Michigan,
Jan. 1992.
287
S. Konstantinidou,
“Adaptive,
minimal
routing
in hypercubes, “ in Proc. of the 6th MIT
Conference: Advanced Re1990.
search in VLSI, pp. 139–153,
Download