IEEE TRANSACTIONS ON COMPUTERS, VOL. C-34, NO. 10, OCTOBER 1985
Bandwidth Availability of Multiple-Bus
Multiprocessors
CHITA R. DAS, STUDENT MEMBER, IEEE, AND LAXMI N. BHUYAN, MEMBER, IEEE
Abstract: Multiprocessor systems should be designed considering both performance and reliability issues. They should support graceful degradation by isolating the failed components and by reconfiguring to a new state with decreased performance. We present in this paper the effect of failures on the performance of multiple-bus multiprocessors. Bandwidth expressions for this architecture are derived for uniform and nonuniform memory references. Mathematical models are developed to compute the reliability and the performance related bandwidth availability (BA). The results obtained for the multiple-bus interconnection are compared with those of a crossbar. The models are also extended to analyze the partial-bus structure where the memories are divided into groups and each group is connected to a subset of buses. The reliability and the BA of the multiple-bus and partial-bus architectures are compared.
Index Terms: Bandwidth availability, crossbar, graceful degradation, multiple bus, multiprocessor, partial bus, performance analysis, reliability.

Manuscript received February 1, 1985; revised May 30, 1985. A portion of this work was presented at the Fourteenth International Conference on Parallel Processing, St. Charles, IL, Aug. 1985.
The authors are with the Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette, LA 70504.
0018-9340/85/1000-0918$01.00 © 1985 IEEE

I. INTRODUCTION

The ever increasing need for higher computing power, coupled with the advent of VLSI technology, has resulted in the evolution of multiprocessors. The performance evaluations of various multiprocessors have been reported extensively using analytic and simulation models. Most of the models use bandwidth (BW) as a performance metric where BW is defined as the average number of memory modules (MM's) remaining busy in a cycle. These models implicitly assume that all the components of the system are fault free. However, in a real situation the components of a multiprocessor fail at random. The failure of the processors, memory modules, and interconnection links degrades the performance of the system. So the simultaneous consideration of both the performance and the fault tolerance issues is very important to properly evaluate a multiprocessor. It is highly unreasonable to assume that a multiprocessor should fail due to the failure of a single component. Rather, the system should be able to detect any faulty module and should have the ability to reconfigure and operate in a degraded mode with fewer available resources. The graceful degradation ability implies that the system remains operational as long as the minimum number of resources needed to execute a task are available. In order to obtain a valid multiprocessor there should be at least two processors, two memories, and a perfect interconnection between them. However, these minimum numbers can be increased depending on the particular task to be executed.

This paper addresses two important issues of fault tolerance, namely, reliability and bandwidth availability (BA). Reliability [1] of a system at time t is defined as the probability that the system is operational during (0, t). We define the BA of a gracefully degrading multiprocessor as the expected value of available BW in the system at time t. This is a performance related reliability attribute similar to computation availability (CA), defined by Beaudry [2]. Performability [3] is another criterion that captures the dependability of degradable computer systems quite well.

We consider two types of multiprocessors: one with a crossbar [4] and the other with a multiple-bus [5]-[11] interconnection. These two interconnections possess good performance characteristics [6]-[13] and are suitable for easy expansion of the system to support increasing load. The multiple-bus architecture has an added advantage of fault tolerance in the sense that there exist alternate paths between a processor and a memory for use in case of faults. Fig. 1 shows an M * N * B multiple-bus architecture having M processors, N memory modules, and B buses where $B \le \min(M, N)$. A bus is connected to all the processors and to all the memory modules. The arbiter cyclically allocates a bus to a memory that has an outstanding request. Thus, B processors can be connected to B memories at a time. The cost of such an interconnection is $O(B(M + N))$. Fig. 2 shows an M * N crossbar system with M processors and N MM's. There are N buses; a bus is connected to all the processors, but to only one memory. The cost of the crossbar is $O(M * N)$. The cost of an N * N * B multiple bus with N/2 buses is the same as that of an N * N crossbar. A modification of the multiple-bus multiprocessor, which has been proposed by Lang [7] to provide better cost effectiveness, is known as the partial-bus architecture. Fig. 3 depicts a partial-bus system having M processors, N memories, and B buses. The memories are divided into g groups. All the processors are connected to all the buses, whereas each group of (N/g) MM's is connected to a set of (B/g) buses. There are g arbiters to allocate buses for communication. It is assumed that g is a factor of both B and N. The partial-bus connection has a lower cost compared to an M * N * B multiple-bus system, as the cost of the former is $O(B(M + (N/g)))$. The effects of processor, memory, and bus failures on the reliability and the BA of these architectures are analyzed.

Fig. 1. An M * N * B multiple-bus multiprocessor.
Fig. 2. An M * N crossbar multiprocessor.
Fig. 3. An M * N * B partial-bus multiprocessor with g groups.

Ingle and Siewiorek [14] have presented the reliability models for C.mmp and Cm* structures. They have considered processor and memory failures assuming that the interconnection network (IN) is not degradable. Hwang and Chang [15] have analyzed the reliability of multiprocessors using graph models. These papers are based on classical reliability theory and do not consider the performance aspect of the multiprocessors. Gay and Ketelsen [16] have presented capacity and workload models for gracefully degrading multiprocessor systems for processors handling multiple transactions. Chou and Abraham [17] have analyzed the performance and availability measures of shared resource multiprocessors using resource guardian. However, the above two models do not represent a tightly coupled multiprocessor system, where the processors are connected to the MM's through an IN and the failure of the IN also degrades the system performance.

In this paper we give more realistic models for the reliability and combine both performance and reliability to analyze the BA of the multiple-bus and crossbar architectures. In the next section we consider the performance analysis of the multiple-bus system for both uniform and nonuniform memory references. The results obtained in this section are used to derive expressions for the bandwidth availability. In Section III we develop the reliability models for multiple-bus and crossbar architectures. Section IV deals with the BA analysis of the two systems. In Section V we extend the reliability and the BA models, developed for the multiple-bus structure, to evaluate the fault tolerant characteristics of the partial-bus architecture. The last section summarizes the results.

II. PERFORMANCE ANALYSIS
Performance analyses of the multiple-bus multiprocessor have been reported recently in several papers [5]-[11]. All the above analyses are based on the assumption that a processor addresses any one of the common memories with the same probability. However, in a practical situation a processor is likely to address a particular memory more frequently except when an interprocessor communication is necessary. So we introduce a parameter m, which is defined as the probability that a processor $P_i$ sends a request to memory $MM_i$ provided $P_i$ generates a request. We will assume that we have prior knowledge of the parameter m. When $m = 1/N$ the model reduces to the equally likely case, analyzed before [5]-[11]. For $m > 1/N$, a processor $P_i$ communicates more often with memory $MM_i$, and $MM_i$ is called a favorite memory of $P_i$. So m is a general parameter that includes both equally likely and favorite memory cases. The memory interference analysis of the crossbar architecture using this general modeling has been reported in [18]. We apply the same idea to derive the BW expressions for an M * N * B multiple-bus architecture.
The analysis is based on the following assumptions.
1) The operation is synchronous; i.e., the requests issued by the processors begin and end simultaneously.
2) The requests generated in a cycle are random and are independent of one another.
3) The requests issued in successive cycles are independent of the requests issued in the previous cycle.
4) Requests which are not accepted are rejected.
The third assumption is unrealistic because a rejected request will indeed be resubmitted in the next cycle. However, this assumption leads to simpler analysis, and it does not result in a substantial difference in the actual results [13].
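The request model described by assumptions 1) to 4) can be illustrated with a small Monte Carlo sketch. This is not part of the original analysis: the function name `simulate_bandwidth`, the cycle count, and the seed are illustrative assumptions, and the sketch covers only a square system with n >= 2 processors and memories. Each cycle is drawn independently (assumption 3), every memory accepts one request, and at most b buses carry accepted requests.

```python
import random

def simulate_bandwidth(n, b, p, m, cycles=20000, seed=1):
    """Estimate the BW of an n * n * b system from independent simulated cycles."""
    rng = random.Random(seed)
    served_total = 0
    for _ in range(cycles):
        requested = set()
        for proc in range(n):
            if rng.random() >= p:
                continue  # this processor issues no request in this cycle
            if rng.random() < m:
                target = proc  # favorite memory MM_i of processor P_i
            else:
                # one of the other n - 1 memories, chosen with equal probability
                target = rng.randrange(n - 1)
                if target >= proc:
                    target += 1
            requested.add(target)
        # one request per memory is accepted; at most b buses are available
        served_total += min(len(requested), b)
    return served_total / cycles
```

With b = n and m = 1/n the expected number of busy memories is n(1 - (1 - p/n)^n) by linearity of expectation, so `simulate_bandwidth(4, 4, 1.0, 0.25)` should come out close to 2.73.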
Let p be the probability with which a processor generates a request in every cycle and m be the probability with which a processor $P_i$ addresses memory $MM_i$ given that $P_i$ generates a request. Thus, pm is the rate of request of a processor $P_i$ to $MM_i$. For an M * N * B system, as shown in Fig. 1, with $B \le \min(M, N)$ there will be contention for memories as well as for buses. So the model involves a two-stage interference resolution as compared to only the memory interference analysis for a crossbar [18]. Moreover, since m is used to relate a processor-memory pair, we consider three different situations: $M = N$, $M > N$, and $M < N$.

A. M * N * B Multiple Bus with M = N

With p and m defined as above, the probability of $P_i$ requesting $MM_i$, for $1 \le i \le N$, is given by $P_i \to MM_i = pm$. Hence, the probability that $P_i$ does not request $MM_i$ is $P_i \not\to MM_i = 1 - pm$. For $i \ne j$ the probability that $P_j$ has a request for $MM_i$ is $P_j \to MM_i = p((1 - m)/(N - 1))$. So $P_j \not\to MM_i = 1 - p((1 - m)/(N - 1))$. Now the probability that none of the N processors has a request for $MM_i$ is $(1 - pm)(1 - p(1 - m)/(N - 1))^{N-1}$. From this we can write the probability that there is at least one request for $MM_i$ as

$$X = 1 - (1 - pm)\left(1 - p\,\frac{1 - m}{N - 1}\right)^{N-1}. \tag{1}$$

For $m = 1/N$, (1) reduces to $1 - (1 - (p/N))^N$. Given the probability of a processor requesting a memory by (1), the probability that exactly i memory services are requested in a cycle is given by [19]:

$$p(i) = \binom{N}{i} X^i (1 - X)^{N-i}. \tag{2}$$

The multiple-bus system with B buses can allow at best B processor-memory connections per cycle. The system gets saturated when more than B requests are generated and allows only B connections simultaneously. Thus, the BW of the system is expressed as

$$BW = \sum_{i=1}^{B} i\,p(i) + \sum_{i=B+1}^{N} B\,p(i) = \sum_{i=1}^{N} i\,p(i) - \sum_{i=B+1}^{N} (i - B)\,p(i) \tag{3}$$

$$\phantom{BW} = NX - \sum_{i=B+1}^{N} (i - B)\,p(i). \tag{4}$$

The first term in (4) is the BW of an N * N crossbar taking p and m into consideration, and the second term gives the degradation in performance due to bus insufficiency. The first term is the same as that derived for the crossbar in [18].

B. M * N * B Multiple Bus with M > N

This system is modeled as N processor-memory pairs related with probability of reference m and the remaining (M - N) processors having equal addressing capability to any one of the memories. So there are two groups of processors: one group having probability of reference m and the other group with equal access probability of 1/N.
Now the probability that none of the processors from the first group requests $MM_i$ is $(1 - pm)(1 - p(1 - m)/(N - 1))^{N-1}$. The probability that none of the processors from the second group requests $MM_i$ is $(1 - (p/N))^{M-N}$. Hence, the probability that none of the M processors has a request for $MM_i$ is the product of these two terms. The probability that there is at least one request for $MM_i$ is

$$X = 1 - (1 - pm)\left(1 - p\,\frac{1 - m}{N - 1}\right)^{N-1}\left(1 - \frac{p}{N}\right)^{M-N}. \tag{5}$$

For $M = N$ and $m = 1/N$, (5) reduces to $X = 1 - (1 - (p/N))^N$. The BW is simply obtained by using (2), (4), and (5).

C. M * N * B Multiple Bus with M < N

This is a reverse situation of case II-B, as the memories are now divided into two groups. The first group consists of M memories related to the processors with a request rate m, and the second group consists of (N - M) memories, which can be equally addressed by any of the M processors. Correspondingly, there will be two distinct probabilities of requests for the two memory groups. Let $X_1$ represent the probability that a memory belonging to the first group is requested and $X_2$ be the probability that a memory belonging to the second group is requested. From Section II-A we can write

$$X_1 = 1 - (1 - pm)\left(1 - p\,\frac{1 - m}{N - 1}\right)^{M-1}. \tag{6}$$

The probability with which a processor requests a memory in the second group is $(1 - m)/(N - 1)$. Hence, $X_2$ is given by

$$X_2 = 1 - \left(1 - p\,\frac{1 - m}{N - 1}\right)^{M}. \tag{7}$$

To find the probability that exactly i memory requests are satisfied in a cycle we have to consider all the possible distributions of i requests between the two memory groups. Therefore, p(i) is expressed as

$$p(i) = \sum_{j=0}^{\min(N-M,\,i)} \binom{M}{i-j} X_1^{\,i-j} (1 - X_1)^{M-i+j} \binom{N-M}{j} X_2^{\,j} (1 - X_2)^{N-M-j} \tag{8}$$

where j is the number of memory references satisfied from the second group of (N - M), and (i - j) are memory requests satisfied from the first group. For $M = N$, (8) reduces to (2). For $m = 1/N$ we find that $X_1 = X_2 = X$, and (8) reduces to

$$p(i) = \binom{N}{i} X^i (1 - X)^{N-i}.$$

The BW then becomes

$$BW = \sum_{i=1}^{B} i\,p(i) + \sum_{i=B+1}^{N} B\,p(i) = M X_1 + (N - M) X_2 - \sum_{i=B+1}^{N} (i - B)\,p(i). \tag{9}$$

The first two terms in (9) represent the BW of an M * N crossbar for $N \ge M$, taking p and m into consideration.

TABLE I
BW OF AN N * N SYSTEM WITH CROSSBAR AND MULTIPLE BUS, p = 1.0

Number of               N = 4           N = 8          N = 12          N = 16
buses B            m=1/N   m=0.8   m=1/N   m=0.8   m=1/N   m=0.8   m=1/N   m=0.8
1                   0.99    1.00    1.00    1.00    1.00    1.00    1.00    1.00
2                   1.89    1.98    2.00    2.00    2.00    2.00    2.00    2.00
3                   2.51    2.86    2.97    3.00    3.00    3.00    3.00    3.00
4                   2.73    3.35    3.87    4.00    3.99    4.00    4.00    4.00
5                      -       -    4.59    4.97    4.97    5.00    5.00    5.00
6                      -       -    5.04    5.84    5.88    6.00    5.99    6.00
7                      -       -    5.22    6.45    6.66    6.99    6.97    7.00
8                      -       -    5.25    6.69    7.24    7.96    7.89    8.00
9                      -       -       -       -    7.58    8.84    8.72    9.00
10                     -       -       -       -    7.73    9.53    9.39    9.99
11                     -       -       -       -    7.77    9.92    9.86   10.95
12                     -       -       -       -    7.78   10.03   10.13   11.85
13                     -       -       -       -       -       -   10.25   12.59
14                     -       -       -       -       -       -   10.29   13.09
15                     -       -       -       -       -       -   10.30   13.33
16                     -       -       -       -       -       -   10.30   13.38
N * N crossbar      2.73    3.35    5.25    6.69    7.78   10.03   10.30   13.38
Tables I and II compare the BW of a crossbar and multiple-bus interconnection for an N * N system for various values of N and B with m assigned the values 1/N and 0.8 for p = 1 and 0.5, respectively. The results show that for the equally likely case the multiple bus performs close to a crossbar when $B \approx N/2$. For p = 1 and m = 0.8 a processor requests a particular memory most of the time, thereby reducing the memory access conflicts. The BW of the system is then very much bus dependent. However, for p = 0.5 and m = 0.8 it can be observed that the BW's of the two architectures differ by only 7 percent when B = N/2. The results indicate that the number of buses for a multiple-bus system should be determined by taking both p and m into consideration. If a processor generates a request in every cycle, then the multiple bus should have at least N/2 buses to provide comparable performance with that of a crossbar. When p is less than 1, the multiple bus is more flexible because of having only the required number of buses. The crossbar in this case is clearly underutilized.
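The M = N expressions of Section II-A, namely (1), (2), and (4), are straightforward to evaluate numerically. The following is a minimal sketch under the assumption of a square N * N * B system; the function names `request_prob` and `bandwidth` are illustrative and do not come from the paper.

```python
from math import comb

def request_prob(n, p, m):
    """Eq. (1): probability that a given memory gets at least one request (M = N = n)."""
    return 1.0 - (1.0 - p * m) * (1.0 - p * (1.0 - m) / (n - 1)) ** (n - 1)

def bandwidth(n, b, p, m):
    """Eqs. (2) and (4): BW of an n * n * b multiple bus; b = n gives the crossbar term n * X."""
    x = request_prob(n, p, m)
    # Eq. (2): binomial probability that exactly i memories are requested
    pmf = [comb(n, i) * x ** i * (1.0 - x) ** (n - i) for i in range(n + 1)]
    # second term of Eq. (4): requests lost because only b buses exist
    loss = sum((i - b) * pmf[i] for i in range(b + 1, n + 1))
    return n * x - loss
```

For instance, `bandwidth(8, 4, 1.0, 1/8)` evaluates to about 3.87 and `bandwidth(4, 4, 1.0, 0.8)` to about 3.35, matching the corresponding p = 1.0 entries reported for N = 8, B = 4 and for the 4 * 4 crossbar.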
TABLE II
BW OF AN N * N SYSTEM WITH CROSSBAR AND MULTIPLE BUS, p = 0.5

Number of               N = 4           N = 8          N = 12          N = 16
buses B            m=1/N   m=0.8   m=1/N   m=0.8   m=1/N   m=0.8   m=1/N   m=0.8
1                   0.88    0.91    0.98    0.99    1.00    1.00    1.00    1.00
2                   1.43    1.54    1.88    1.93    1.98    1.99    2.00    2.00
3                   1.63    1.79    2.57    2.73    2.89    2.95    2.98    2.99
4                   1.66    1.83    2.99    3.27    3.67    3.83    3.91    3.97
5                      -       -    3.16    3.54    4.23    4.54    4.74    4.89
6                      -       -    3.22    3.64    4.57    5.03    5.41    5.71
7                      -       -    3.23    3.66    4.72    5.32    5.87    6.37
8                      -       -    3.23    3.66    4.78    5.44    6.15    6.83
9                      -       -       -       -    4.80    5.48    6.29    7.10
10                     -       -       -       -    4.80    5.49    6.35    7.24
11                     -       -       -       -    4.80    5.49    6.37    7.29
12                     -       -       -       -    4.80    5.49    6.37    7.31
13                     -       -       -       -       -       -    6.37    7.32
14                     -       -       -       -       -       -    6.37    7.32
15                     -       -       -       -       -       -    6.37    7.32
16                     -       -       -       -       -       -    6.37    7.32
N * N crossbar      1.66    1.83    3.23    3.66    4.80    5.49    6.37    7.32

III. RELIABILITY MODELING

The multiple-bus structure of Fig. 1 is divided into three independent submodules: processors, buses, and memories. The reliabilities of these submodules are determined independently. The system reliability is then obtained by considering the series reliability of these submodules. We assume that the elements of a submodule are all identical and have the same failure rate. The failures are assumed to be exponentially distributed for simplicity. Thus, we define $\lambda_p$, $\lambda_m$, and $\lambda_b$ as the failure rate of a processor, a memory, and a bus, respectively. Then $R_p(t) = e^{-\lambda_p t}$, $R_m(t) = e^{-\lambda_m t}$, and $R_b(t) = e^{-\lambda_b t}$ give the corresponding reliabilities. If a task needs at least I processors, J memories for execution, and a bus for communication, the reliability of the multiple-bus system $R_{SM}(t)$ is given by

$$R_{SM}(t) = R_a(t) \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} (R_p(t))^i (1 - R_p(t))^{M-i} \cdot \sum_{j=J}^{N} C_m^{N-j} \binom{N}{j} (R_m(t))^j (1 - R_m(t))^{N-j} \cdot \sum_{k=1}^{B} C_b^{B-k} \binom{B}{k} (R_b(t))^k (1 - R_b(t))^{B-k} \tag{10}$$

where $C_p$, $C_m$, and $C_b$ are the coverage factors for the processor, memory, and bus, respectively, and $R_a(t)$ is the reliability of the arbiter. Coverage [20] is defined as the probability that the system recovers successfully given that there was a failure.

A crossbar is modeled in this paper as shown in Fig. 2. As a bus is connected to only one memory, the failure of a bus or a memory reduces the size of the crossbar to M * (N - 1). Hence, the reliability of a memory module is expressed as $R'_m(t) = R_m(t) R_b(t) = e^{-\lambda_e t}$, with an equivalent failure rate $\lambda_e = (\lambda_m + \lambda_b)$. The reliability of the crossbar system $R_{SC}(t)$ with minimum I processors and J memories active is then

$$R_{SC}(t) = R_a(t) \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} (R_p(t))^i (1 - R_p(t))^{M-i} \cdot \sum_{j=J}^{N} C_e^{N-j} \binom{N}{j} (R'_m(t))^j (1 - R'_m(t))^{N-j} \tag{11}$$

where $C_e$ is the coverage of a memory-bus combination.
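The series structure of the two reliability expressions, (10) for the multiple bus and (11) for the crossbar, can be sketched in a few lines. The helper `at_least` and the function signatures below are illustrative assumptions, not the authors' code; arbiter reliability and all coverage factors default to 1, as in the paper's Fig. 4 setting.

```python
from math import comb, exp

def at_least(n, need, r, cov=1.0):
    """Sum over i = need..n of cov^(n-i) * C(n, i) * r^i * (1 - r)^(n - i)."""
    return sum(cov ** (n - i) * comb(n, i) * r ** i * (1.0 - r) ** (n - i)
               for i in range(need, n + 1))

def r_multiple_bus(t, M, N, B, I, J, lam_p, lam_m, lam_b,
                   c_p=1.0, c_m=1.0, c_b=1.0, r_a=1.0):
    """Eq. (10): at least I processors, J memories, and one bus must survive."""
    return (r_a
            * at_least(M, I, exp(-lam_p * t), c_p)
            * at_least(N, J, exp(-lam_m * t), c_m)
            * at_least(B, 1, exp(-lam_b * t), c_b))

def r_crossbar(t, M, N, I, J, lam_p, lam_m, lam_b, c_p=1.0, c_e=1.0, r_a=1.0):
    """Eq. (11): each memory fails together with its own bus (rate lam_m + lam_b)."""
    r_mem_bus = exp(-(lam_m + lam_b) * t)
    return (r_a
            * at_least(M, I, exp(-lam_p * t), c_p)
            * at_least(N, J, r_mem_bus, c_e))
```

Calling `r_multiple_bus(1000.0, 16, 16, 8, 12, 12, 1e-4, 1e-4, 5e-5)` and `r_crossbar(1000.0, 16, 16, 12, 12, 1e-4, 1e-4, 5e-5)` compares a 16 * 16 * 8 multiple bus with a 16 * 16 crossbar for a task needing 12 processors and 12 memories; the task size of 12 is chosen here only for illustration.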
Fig. 4 shows the comparison of reliability between a multiple bus and a crossbar system for the same processor and memory requirements. The reliability of the arbiter and the coverage parameters is assumed to be unity in this figure. The original configurations are a 16 * 16 crossbar and a 16 * 16 * 8 multiple-bus system, both having approximately the same cost. The multiple bus has a better reliability than the crossbar because in the former the buses are independent, and only one bus may be sufficient for keeping the system operational. However, in a crossbar the stipulation of the memory-bus configuration is quite rigid. Also, it may be observed in Fig. 4 that the reliability of the multiprocessors increases dramatically if a task can be executed with fewer processors and memories than the original configuration.

Fig. 4. Reliability of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours) from 0 to 2000. Curves are shown for the multiple bus and the crossbar; $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, with the arbiter reliability and the coverage factors equal to 1.
IV. BANDWIDTH AVAILABILITY (BA) ANALYSIS

The reliability measures discussed in Section III are not sufficient to evaluate a multiprocessor system. They only give the probability that the system is operational at a time t, but do not reveal the performance degradation due to failures in (0, t). Since the basic objective of the multiprocessors is to provide high performance, the variation of available BW with time should be used as a criterion to evaluate various multiprocessors. BA gives the expected amount of bandwidth available on the system at time t. To calculate the BA we use the Markov model representation of the multiple-bus architecture, as shown in Fig. 5, for a task requiring at least I processors, J memories, and a bus. A state (i, j, k) means that there are i processors, j memories, and k buses operational in the system starting with the initial configuration (M, N, B). A state is defined as operational as long as the minimum processor, memory, and bus requirements are satisfied. Transition from a state (i, j, k) to any one of the states (i - 1, j, k), (i, j - 1, k), or (i, j, k - 1) can occur if a processor, a memory, or a bus fails and the failure is covered. If a failure is not covered the system goes to the failed state (F). The transition diagram of the system in state (i, j, k) is shown in Fig. 6. Let $P_{ijk}(t)$ be the probability that the system is in state (i, j, k) at time t. The differential equation for the state (i, j, k) is given by

$$\dot{P}_{ijk}(t) = (i + 1) C_p \lambda_p P_{i+1,j,k}(t) + (j + 1) C_m \lambda_m P_{i,j+1,k}(t) + (k + 1) C_b \lambda_b P_{i,j,k+1}(t) - (i\lambda_p + j\lambda_m + k\lambda_b) P_{ijk}(t)$$

for $I \le i < M$, $J \le j < N$, and $1 \le k < B$. The equations for the boundary states can also be easily determined. Now the probability of state (i, j, k) can be obtained [21]. Hence, we write

$$P_{ijk}(t) = \binom{M}{i} C_p^{M-i} (R_p(t))^i (1 - R_p(t))^{M-i} \cdot \binom{N}{j} C_m^{N-j} (R_m(t))^j (1 - R_m(t))^{N-j} \cdot \binom{B}{k} C_b^{B-k} (R_b(t))^k (1 - R_b(t))^{B-k}. \tag{12}$$

Fig. 5. Markov model of the M * N * B multiple-bus multiprocessor.
Fig. 6. Transition probabilities of the M * N * B system in state (i, j, k).

The system reliability, obtained in (10), can be derived again by summing the probabilities of all the nonfailed states. So,

$$R_{SM}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \sum_{k=1}^{B} P_{ijk}(t). \tag{13}$$

To determine the BA, we sum the expected BW of all the nonfailed states. Hence, the BA of the multiple-bus system $BA_{SM}(t)$ at time t is expressed as

$$BA_{SM}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \sum_{k=1}^{B} P_{ijk}(t) \cdot BW_{ijk} \tag{14}$$

where $BW_{ijk}$ is the BW of the multiple-bus architecture having i processors, j memories, and k buses. It can be obtained from (3) using appropriate p(i) for the three possible i, j combinations.

The Markov model for the crossbar system is shown in Fig. 7. Here the transition rate from a state (i, j) to state (i, j - 1) is given by $j\lambda_e$ since either a bus or a memory failure can result in this transition. Extending the same idea for the crossbar structure, the $BA_{SC}(t)$ is expressed as

$$BA_{SC}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} P_{ij}(t)\, BW_{ij} \tag{15}$$

where $P_{ij}(t)$ is the probability that the system is in state (i, j), and $BW_{ij}$ is the BW of an i * j crossbar. $P_{ij}(t)$ is given by

$$P_{ij}(t) = \binom{M}{i} C_p^{M-i} (R_p(t))^i (1 - R_p(t))^{M-i} \cdot \binom{N}{j} C_e^{N-j} (R'_m(t))^j (1 - R'_m(t))^{N-j} \tag{16}$$

and the BW is given by

$$BW_{ij} = \begin{cases} jX & \text{if } i \ge j \\ iX_1 + (j - i)X_2 & \text{if } i < j. \end{cases}$$

The proper $X$ or $X_1$ and $X_2$ values are obtained from (1), (5)-(7) for the various i, j combinations.
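The sums (14) and (15) can be evaluated directly from the product-form state probabilities (12) and (16). The sketch below is restricted to the equally likely case, where the request probability for i processors and j memories is taken as X = 1 - (1 - p/j)^i and the BW of a state follows (3) as the expected value of min(requests, k). Coverage and arbiter reliability are assumed to be 1, and all function names are illustrative rather than taken from the paper.

```python
from math import comb, exp

def state_prob(n, i, r):
    """One factor of (12) or (16) with coverage 1: exactly i of n units are working."""
    return comb(n, i) * r ** i * (1.0 - r) ** (n - i)

def bw(i, j, k, p):
    """Eq. (3), equally likely case: expected min(requested memories, k buses)."""
    x = 1.0 - (1.0 - p / j) ** i
    return sum(min(s, k) * comb(j, s) * x ** s * (1.0 - x) ** (j - s)
               for s in range(j + 1))

def ba_multiple_bus(t, M, N, B, I, J, p, lam_p, lam_m, lam_b):
    """Eq. (14): expected BW over all operational states (i, j, k)."""
    rp, rm, rb = exp(-lam_p * t), exp(-lam_m * t), exp(-lam_b * t)
    total = 0.0
    for i in range(I, M + 1):
        for j in range(J, N + 1):
            for k in range(1, B + 1):
                prob = state_prob(M, i, rp) * state_prob(N, j, rm) * state_prob(B, k, rb)
                total += prob * bw(i, j, k, p)
    return total

def ba_crossbar(t, M, N, I, J, p, lam_p, lam_m, lam_b):
    """Eq. (15): a memory and its bus fail as one unit with rate lam_m + lam_b."""
    rp, re = exp(-lam_p * t), exp(-(lam_m + lam_b) * t)
    total = 0.0
    for i in range(I, M + 1):
        for j in range(J, N + 1):
            x = 1.0 - (1.0 - p / j) ** i
            total += state_prob(M, i, rp) * state_prob(N, j, re) * j * x
    return total
```

At t = 0 only the full configuration has nonzero probability, so `ba_multiple_bus(0.0, 16, 16, 8, 8, 8, 1.0, 1e-4, 1e-4, 5e-5)` returns the BW of the intact 16 * 16 * 8 system (about 7.89) and `ba_crossbar(0.0, 16, 16, 8, 8, 1.0, 1e-4, 1e-4, 5e-5)` returns about 10.30.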
Fig. 7. Markov model of an M * N multiprocessor employing crossbar.

In Fig. 8 we present the variation of the BA with time for different processor and memory requirements of a task. We use B = N/2 for the multiple bus because the performance of the multiple bus is then close to that of a crossbar. The BW is calculated for a probability of request equal to 1 and m = 1/N. It is seen that even though the crossbar has a better BA at the beginning, mainly due to its better performance, the BA of the multiple bus exceeds that of the crossbar after a certain time. This time can be reduced if we use more buses. Fig. 9 shows the comparison of the BA of the two architectures for p = 0.5 and m = 0.8. The curves indicate that the initial BA advantage of the crossbar in this case is less when compared to the previous case. Moreover, the BA of the multiple bus becomes better than the crossbar in relatively less time, compared to the p = 1 case. The effect of imperfect coverage on BA of the multiple-bus architecture is illustrated in Fig. 10. It is evident that if faults are not detected and handled properly, the system performance decreases very fast. In Fig. 11 we plot the reliability and the BA of a 16 * 16 multiple-bus multiprocessor against the number of buses at t = 1000 h. It is observed that the variation in the number of buses has little effect on reliability, but the BA increases linearly up to ten buses. So it is reasonable to have at least N/2 buses to match the processing power of the multiple bus with that of the crossbar when p = 1. Similarly, we can conclude from the data of Table II that the BA will saturate after N/2 buses, and so N/2 buses are adequate to handle communication.

Fig. 8. Bandwidth availability of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours). Curves are shown for the multiple bus and the crossbar; $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $R_a(t) = C_p = C_m = C_b = 1$, p = 1.0, m = 1/N.
Fig. 9. BA of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours). Curves are shown for the multiple bus and the crossbar; $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $R_a(t) = C_p = C_m = C_b = 1$, p = 0.5, m = 0.8.
Fig. 10. Effect of coverage on the BA of a 16 * 16 * 8 system for a task requiring eight processors and eight memories, plotted against time (hours) for C = 1, 0.9, and 0.8, with $C_p = C_m = C_b = C$; p = 1.0, m = 1/N.
Fig. 11. Effect of number of buses on the reliability and BA of a 16 * 16 * B multiprocessor at t = 1000 h. p = 1.0, m = 1/N.

V. PARTIAL-BUS ARCHITECTURE

In this section we generalize the reliability and the BA expressions, derived for the multiple bus, to cover the partial-bus architecture.
It can be seen from Fig. 3 that each group of N/g memories along with the corresponding B/g buses forms an independent submodule. Hence, the reliability expression for the partial bus will involve g terms to represent the g groups of memory-bus combinations. If a task needs at least J memories all the possible distributions of the J memories among g groups must be considered in the reliability evaluation. Thus, the reliability of the partial-bus system $R_{SP}(t)$ with minimum I processors and J MM's active is given by

$$\begin{aligned}
R_{SP}(t) = {} & \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} (R_p(t))^i (1 - R_p(t))^{M-i} \\
& \cdot \sum_{j=J}^{N} \Biggl[ \sum_{g_1=0}^{\min(NG,\,j)} C_m^{NG-g_1} \binom{NG}{g_1} (R_m(t))^{g_1} (1 - R_m(t))^{NG-g_1} \Bigl( \sum_{k_1=1}^{BG} C_b^{BG-k_1} \binom{BG}{k_1} (R_b(t))^{k_1} (1 - R_b(t))^{BG-k_1} \Bigr)_{g_1 > 0} \\
& \quad \cdot \Biggl[ \sum_{g_2=0}^{\min(NG,\,j-g_1)} C_m^{NG-g_2} \binom{NG}{g_2} (R_m(t))^{g_2} (1 - R_m(t))^{NG-g_2} \Bigl( \sum_{k_2=1}^{BG} C_b^{BG-k_2} \binom{BG}{k_2} (R_b(t))^{k_2} (1 - R_b(t))^{BG-k_2} \Bigr)_{g_2 > 0} \\
& \quad \cdots \Biggl[ C_m^{NG-g_g} \binom{NG}{g_g} (R_m(t))^{g_g} (1 - R_m(t))^{NG-g_g} \Bigl( \sum_{k_g=1}^{BG} C_b^{BG-k_g} \binom{BG}{k_g} (R_b(t))^{k_g} (1 - R_b(t))^{BG-k_g} \Bigr)_{g_g > 0} \Biggr] \cdots \Biggr] \Biggr]
\end{aligned} \tag{17}$$

where NG = N/g is the number of memories and BG = B/g is the number of buses in each group.
The distribution of the j memories among g groups is controlled by the last term $g_g$ where $g_g = j - (g_1 + g_2 + \cdots + g_{g-1})$. As $\binom{NG}{g_y} = 0$ for $NG < g_y$, all possible valid distributions are generated to get j active memories from g groups. The second term $\sum_{k_y=1}^{BG} C_b^{BG-k_y} \binom{BG}{k_y} (R_b(t))^{k_y} (1 - R_b(t))^{BG-k_y}$, for $1 \le y \le g$, in each group expression gives the probability that at least one bus in that group is working. This is a conditional term and is only taken into account if $g_y > 0$. If none of the memories from a particular group is either selected or active for execution of a task at any instant of time, then the bus availability of that group is irrelevant, and the second term in that expression becomes unity. The reliability terms for the g arbiters are not shown in (17), for clarity. We assume that the arbiters are fault free.
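The nested sums of (17) enumerate every way of spreading j working memories over the g groups. Because the groups are identical and independent, the same total can be obtained by convolving one per-group weight vector g times; that rearrangement is an assumption of this sketch rather than the paper's formulation, and the names `binom_term` and `r_partial_bus` are illustrative. Arbiters are taken as fault free, as stated above.

```python
from math import comb, exp

def binom_term(n, i, r, cov=1.0):
    """cov^(n-i) * C(n, i) * r^i * (1 - r)^(n - i)."""
    return cov ** (n - i) * comb(n, i) * r ** i * (1.0 - r) ** (n - i)

def r_partial_bus(t, M, N, B, g, I, J, lam_p, lam_m, lam_b,
                  c_p=1.0, c_m=1.0, c_b=1.0):
    """Eq. (17) evaluated by convolving the g identical group terms."""
    ng, bg = N // g, B // g
    rp, rm, rb = exp(-lam_p * t), exp(-lam_m * t), exp(-lam_b * t)
    # probability-like weight that at least one of the bg group buses works
    bus_ok = sum(binom_term(bg, k, rb, c_b) for k in range(1, bg + 1))
    # weight of a group with s working memories; bus term applies only if s > 0
    group = [binom_term(ng, s, rm, c_m) * (bus_ok if s > 0 else 1.0)
             for s in range(ng + 1)]
    # total[j] accumulates the weight of having j working memories overall
    total = [1.0]
    for _ in range(g):
        nxt = [0.0] * (len(total) + ng)
        for a, wa in enumerate(total):
            for s, ws in enumerate(group):
                nxt[a + s] += wa * ws
        total = nxt
    procs = sum(binom_term(M, i, rp, c_p) for i in range(I, M + 1))
    return procs * sum(total[J:])
```

With g = 1 the expression collapses to the multiple-bus reliability (10) with a fault-free arbiter, so comparing `r_partial_bus(1000.0, 16, 16, 8, 1, 12, 12, 1e-4, 1e-4, 5e-5)` with the g = 2 and g = 4 values shows how grouping lowers reliability; the task size of 12 is an illustrative choice.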
Fig. 12 shows the comparison of reliability between a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial-bus architecture with four groups. The results indicate that the multiple bus has a better reliability than the partial bus because for the same processor and memory requirements a single bus can support all processor-memory communications for the former. But in the case of a partial bus, if the memories are selected from more than one group, a bus from each of these groups is necessary to provide communication. It can also be concluded that the reliability of the partial bus will increase by reducing the number of groups, and for low bus failure rate, the reliability of both the systems will be very close.

Fig. 12. Reliability of a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial bus with g = 4 for a task requiring I processors and J memories, plotted against time (hours) from 0 to 2000. Curves are shown for the multiple bus and the partial bus; $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $C_p = C_m = C_b = 1$.

The BA of the partial bus, $BA_{SP}(t)$, at time t is given by

$$BA_{SP}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \Bigl[ \sum_{g_1} \cdots \sum_{g_g} P_{i,g_1,\ldots,g_g}(t)\,(BWG_1 + \cdots + BWG_g) \Bigr] \tag{18}$$

where $P_{i,g_1,\ldots,g_g}(t)$ is the probability that the partial bus has i processors and j MM's active, and the j modules are distributed over g groups with $g_y$ memories in the yth group. $BWG_y$ is the BW of the yth group, for $1 \le y \le g$, with i processors, $g_y$ memories, and the number of buses $k_y$ varying from 1 to BG. The $BWG_y$ can be obtained using the equations from Section II. Since the memory request distribution of i processors among j memories is independent of the bus connection pattern, the probability that there is at least one request for a memory, for any i and j combination, can be obtained directly from (1), (5)-(7). Using the $X$ or $X_1$ and $X_2$ values the $p_y(i_y)$ is obtained from (2) or (8) where $p_y(i_y)$ is the probability that $i_y$ memory services are requested in a cycle from group y. The M and (N - M) terms in (8) should be replaced by the appropriate number of favorite and equally addressable memories available in group y for $i < j$. Then the $BWG_y$ can be obtained from (3) after substituting B and N by $k_y$ and $g_y$, respectively.

However, for any general reference m and $i < j$, keeping the count of the two types of memories in a group, the first type representing the favorite ones and the second type representing the equally likely ones, makes the BW computation cumbersome. Here we compute the BW of the partial-bus connection only for the equally likely case. Thus, $BWG_y$ is written similarly to (4) as

$$BWG_y = g_y X - \sum_{i_y=k_y+1}^{g_y} (i_y - k_y) \cdot p_y(i_y) \tag{19}$$

where $X = 1 - (1 - (p/j))^i$ and $p_y(i_y) = \binom{g_y}{i_y} X^{i_y} (1 - X)^{g_y - i_y}$. The $g_y$ and $k_y$ represent the number of nonfailed memories and buses available in a group y at any instant of time t. If the number of active memories and buses in each group is the same, then the BW of the partial bus with i processors and j memories becomes $g \cdot BWG_y$.

Fig. 13. BA of a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial bus with g = 4 for a task requiring I processors and J memories, plotted against time (hours). Curves are shown for the multiple bus and the partial bus; $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $C_p = C_m = C_b = 1$, p = 1.0, m = 1/N.

Fig. 13 shows the BA variation of the partial-bus and multiple-bus architectures with time. The BA's of the two systems differ by 6.5 percent initially (t = 0), and the difference gradually decreases with time. The degradation in the BA of the partial bus is due to its lower BW and reliability compared to the multiple bus. The BW reduction is primarily because of more bus contention, as only B/g memories can be served from a group at any time. The BA of the partial bus is 7.71 for g = 2 at t = 0. As the reliability of the partial bus and the multiple bus will match better when g = 2, it is evident that the difference in the BA of the two systems will be less compared to the results of Fig. 13. The results will match even better when p < 1.
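The t = 0 figures quoted for the partial bus can be checked from (19), since with no failures every group holds N/g memories and B/g buses and the BA equals g times the group BW. The sketch below covers only that equally likely, fault-free case; the function names are illustrative assumptions.

```python
from math import comb

def group_bw(i, j, g_y, k_y, p):
    """Eq. (19): BW of one group with g_y memories and k_y buses, equally likely case."""
    x = 1.0 - (1.0 - p / j) ** i
    pmf = [comb(g_y, s) * x ** s * (1.0 - x) ** (g_y - s) for s in range(g_y + 1)]
    # requests beyond the k_y buses of the group cannot be served
    loss = sum((s - k_y) * pmf[s] for s in range(k_y + 1, g_y + 1))
    return g_y * x - loss

def partial_bus_bw(M, N, B, g, p):
    """BW of a fault-free M * N * B partial bus with g identical groups."""
    return g * group_bw(M, N, N // g, B // g, p)
```

Here `partial_bus_bw(16, 16, 8, 2, 1.0)` gives about 7.71, and the g = 1 value (the plain multiple bus, about 7.89) exceeds the g = 4 value by roughly 6.5 percent of the latter.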
VI.
m
b
CONCLUSIONS
We have discussed the performance and reliability issues
of the multiple-bus and crossbar architectures. Closed form
expressions for the BW of the multiple bus have been derived
using more general models. Expressions for reliability and
bandwidth availability are presented considering graceful
degradation. The models are also extended to compute the
reliability and the BA of the partial-bus configuration. The
results indicate that the reliability of the multiple bus is better
than that of the crossbar. The BA of the multiple bus also
exceeds that of the crossbar after some time, depending on B,
ρ, and m. The reliability and the BA of the partial bus will be
close to that of the multiple bus when the number of groups
is small.
The results of this paper give an overall view of the
fault-tolerant properties of three possible multiprocessor
configurations. However, the selection of a proper architecture
depends on the specific application. For example, if the
requirements are such that we need high communication BW
for a short duration, a crossbar seems to be a viable solution.
On the other hand, if the BA as well as the duration are
important, then the multiple bus provides better characteristics
than the crossbar. It is also interesting to note that when
ρ < 1, the multiple bus looks attractive in view of the
flexibility to choose the number of buses depending on ρ and m.
If the BA requirements are not stringent, the partial-bus
connection can be a cost-effective alternative to the multiple bus.
ACKNOWLEDGMENT
The authors would like to thank Prof. V. V. S. Sarma for
his useful comments.
REFERENCES
[1] K. Trivedi, Probability and Statistics with Reliability, Queuing, and
Computer Science Applications. Englewood Cliffs, NJ: Prentice-Hall,
1982.
[2] M. D. Beaudry, "Performance related reliability measures for computing
systems," IEEE Trans. Comput., vol. C-27, pp. 540-547, Jan. 1978.
[3] J. F. Meyer, "On evaluating the performability of degradable computing
systems," IEEE Trans. Comput., vol. C-29, pp. 720-731, Aug. 1980.
[4] W. A. Wulf and C. G. Bell, "C.mmp: A multi-mini-processor," in
Proc. AFIPS Fall Joint Comput. Conf., Dec. 1972, pp. 765-777.
[5] A. Goyal and T. Agerwala, "Performance analysis of future shared
storage systems," IBM J. Res. Develop., vol. 28, pp. 95-108, Jan. 1984.
[6] M. Ajmone Marsan and M. Gerla, "Markov models for multiple-bus
multiprocessors," IEEE Trans. Comput., vol. C-31, pp. 239-248, Mar.
1982.
[7] T. Lang et al., "Bandwidth of crossbar and multibus connections for
multiprocessors," IEEE Trans. Comput., vol. C-31, pp. 1227-1234,
Dec. 1982.
[8] M. Valero et al., "A performance evaluation of the multiple-bus network
for multiprocessor systems," in Proc. ACM SIGMETRICS Conf., 1983,
pp. 200-206.
[9] T. Mudge et al., "Analysis of multiple-bus interconnection networks," in
Proc. Int. Conf. Parallel Processing, Aug. 1984, pp. 228-232.
[10] L. N. Bhuyan, "A combinatorial analysis of multibus multiprocessors,"
in Proc. Int. Conf. Parallel Processing, Aug. 1984, pp. 225-227.
[11] K. B. Irani and I. H. Onyuksel, "A closed form solution for the
performance analysis of multiple-bus multiprocessor systems," IEEE Trans.
Comput., vol. C-33, pp. 1004-1012, Nov. 1984.
[12] T.-Y. Feng, "A survey of interconnection networks," IEEE Comput.,
vol. 14, pp. 12-27, Dec. 1981.
[13] D. P. Bhandarkar, "Analysis of memory interference in multiprocessors,"
IEEE Trans. Comput., vol. C-24, pp. 897-908, Sept. 1975.
[14] A. D. Ingle and D. P. Siewiorek, "Reliability models for multiprocessor
systems with and without periodic maintenance," in Proc. 7th Annu. Int.
Conf. FTC, Los Angeles, CA, June 1977, pp. 3-9.
[15] K. Hwang and T. P. Chang, "Combinatorial reliability analysis of
multiprocessor computers," IEEE Trans. Reliability, vol. R-31, pp. 469-473,
Dec. 1982.
[16] F. A. Gay and M. L. Ketelsen, "Performance evaluation of gracefully
degrading systems," in Proc. 9th Annu. Int. Conf. FTC, Madison, WI,
June 1979, pp. 51-57.
[17] T. C. K. Chou and J. A. Abraham, "Performance/availability model of
shared resource multiprocessors," IEEE Trans. Reliability, vol. R-29,
pp. 70-74, Apr. 1980.
[18] L. N. Bhuyan, "An analysis of processor-memory interconnection
networks," IEEE Trans. Comput., vol. C-34, pp. 279-283, Mar. 1985.
[19] W. Feller, An Introduction to Probability Theory and Its Applications,
Vol. 1. New York: Wiley, 1957.
[20] T. F. Arnold, "The concept of coverage and its effect on the reliability
model of a repairable system," IEEE Trans. Comput., vol. C-22,
pp. 251-254, Mar. 1973.
[21] A. Pedar and V. V. S. Sarma, "Architecture optimization of aerospace
computing systems," IEEE Trans. Comput., vol. C-32, pp. 911-922,
Oct. 1983.
Chita R. Das (S'84) received the M.Sc. degree in
electrical engineering from Regional Engineering
College, Rourkela, Sambalpur University, India, in
1981 and is currently working towards the Ph.D.
degree in computer science at The Center for
Advanced Computer Studies, University of Southwestern
Louisiana, Lafayette.
His research interests include parallel/distributed
computing, performance evaluation, and fault
tolerance.
Laxmi N. Bhuyan (S'81-M'83) received the M.Sc.
degree in electrical engineering from Regional Engineering
College, Rourkela, Sambalpur University,
India, in 1979 and the Ph.D. degree in computer
engineering from Wayne State University, Detroit,
MI, in 1982.
During 1982-1983 he was with the Department of
Electrical Engineering, University of Manitoba,
Winnipeg, Canada. At present he is an Associate
Professor at The Center for Advanced Computer
Studies, University of Southwestern Louisiana,
Lafayette. His research interests include parallel and distributed computer
architecture, VLSI layout, multiprocessor applications, and performance
evaluation.
Dr. Bhuyan is a member of the Association for Computing Machinery.