918 IEEE TRANSACTIONS ON COMPUTERS, VOL. C-34, NO. 10, OCTOBER 1985

Bandwidth Availability of Multiple-Bus Multiprocessors

CHITA R. DAS, STUDENT MEMBER, IEEE, AND LAXMI N. BHUYAN, MEMBER, IEEE

Abstract: Multiprocessor systems should be designed considering both performance and reliability issues. They should support graceful degradation by isolating the failed components and by reconfiguring to a new state with decreased performance. We present in this paper the effect of failures on the performance of multiple-bus multiprocessors. Bandwidth expressions for this architecture are derived for uniform and nonuniform memory references. Mathematical models are developed to compute the reliability and the performance related bandwidth availability (BA). The results obtained for the multiple-bus interconnection are compared with those of a crossbar. The models are also extended to analyze the partial bus structure where the memories are divided into groups and each group is connected to a subset of buses. The reliability and the BA of the multiple-bus and partial-bus architectures are compared.

Index Terms: Bandwidth availability, crossbar, graceful degradation, multiple bus, multiprocessor, partial bus, performance analysis, reliability.

Manuscript received February 1, 1985; revised May 30, 1985. A portion of this work was presented at the Fourteenth International Conference on Parallel Processing, St. Charles, IL, Aug. 1985.
The authors are with the Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette, LA 70504.
0018-9340/85/1000-0918$01.00 © 1985 IEEE

I. INTRODUCTION

The ever increasing need for higher computing power, coupled with the advent of VLSI technology, has resulted in the evolution of multiprocessors. The performance evaluations of various multiprocessors have been reported extensively using analytic and simulation models. Most of the models use bandwidth (BW) as a performance metric where BW is defined as the average number of memory modules (MM's) remaining busy in a cycle. These models implicitly assume that all the components of the system are fault free. However, in a real situation the components of a multiprocessor fail at random. The failure of the processors, memory modules, and interconnection links degrades the performance of the system. So the simultaneous consideration of both the performance and the fault tolerance issues is very important to properly evaluate a multiprocessor. It is highly unreasonable to assume that a multiprocessor should fail due to the failure of a single component. Rather, the system should be able to detect any faulty module and should have the ability to reconfigure and operate in a degraded mode with fewer available resources. The graceful degradation ability implies that the system remains operational as long as the minimum number of resources needed to execute a task are available. In order to obtain a valid multiprocessor there should be at least two processors, two memories, and a perfect interconnection between them. However, these minimum numbers can be increased depending on the particular task to be executed.

This paper addresses two important issues of fault tolerance, namely, reliability and bandwidth availability (BA). Reliability [1] of a system at time t is defined as the probability that the system is operational during (0, t). We define the BA of a gracefully degrading multiprocessor as the expected value of available BW in the system at time t. This is a performance related reliability attribute similar to computation availability (CA), defined by Beaudry [2]. Performability [3] is another criterion that captures the dependability of degradable computer systems quite well.

We consider two types of multiprocessors: one with a crossbar [4] and the other with a multiple-bus [5]-[11] interconnection. These two interconnections possess good performance characteristics [6]-[13] and are suitable for easy expansion of the system to support increasing load. The multiple-bus architecture has an added advantage of fault tolerance in the sense that there exist alternate paths between a processor and a memory for use in case of faults. Fig. 1 shows an M * N * B multiple-bus architecture having M processors, N memory modules, and B buses where B ≤ min(M, N). A bus is connected to all the processors and to all the memory modules. The arbiter cyclically allocates a bus to a memory that has an outstanding request. Thus, B processors can be connected to B memories at a time. The cost of such an interconnection is O(B(M + N)). Fig. 2 shows an M * N crossbar system with M processors and N MM's. There are N buses; a bus is connected to all the processors, but to only one memory. The cost of the crossbar is O(M * N). The cost of an N * N * B multiple bus with N/2 buses is the same as that of an N * N crossbar. A modification of the multiple-bus multiprocessor, that has been proposed by Lang [7] to provide better cost effectiveness, is known as the partial-bus architecture. Fig. 3 depicts a partial-bus system having M processors, N memories, and B buses. The memories are divided into g groups. All the processors are connected to all the buses, whereas each group of (N/g) MM's is connected to a set of (B/g) buses. There are g arbiters to allocate buses for communication. It is assumed that g is a factor of both B and N. The partial-bus connection has a lower cost compared to an M * N * B multiple-bus system, as the cost of the former is O(B(M + (N/g))). The effects of processor, memory, and bus failures on the reliability and the BA of these architectures are analyzed.

Fig. 1. An M * N * B multiple-bus multiprocessor.

Fig. 2. An M * N crossbar multiprocessor.

Fig. 3. An M * N * B partial-bus multiprocessor with g groups.

Ingle and Siewiorek [14] have presented the reliability models for C.mmp and Cm* structures. They have considered processor and memory failures assuming that the interconnection network (IN) is not degradable. Hwang and Chang [15] have analyzed the reliability of multiprocessors using graph models. These papers are based on classical reliability theory and do not consider the performance aspect of the multiprocessors. Gay and Ketelsen [16] have presented capacity and workload models for gracefully degrading multiprocessor systems for processors handling multiple transactions. Chou and Abraham [17] have analyzed the performance and availability measures of shared resource multiprocessors using resource guardian. However, the above two models do not represent a tightly coupled multiprocessor system, where the processors are connected to the MM's through an IN and the failure of the IN also degrades the system performance.

In this paper we give more realistic models for the reliability and combine both performance and reliability to analyze the BA of the multiple-bus and crossbar architectures. In the next section we consider the performance analysis of the multiple-bus system for both uniform and nonuniform memory references. The results obtained in this section are used to derive expressions for the bandwidth availability. In Section III we develop the reliability models for multiple-bus and crossbar architectures. Section IV deals with the BA analysis of the two systems. In Section V we extend the reliability and the BA models, developed for the multiple-bus structure, to evaluate the fault tolerant characteristics of the partial-bus architecture. The last section summarizes the results.

II. PERFORMANCE ANALYSIS

Performance analyses of the multiple-bus multiprocessor have been reported recently in several papers [5]-[11]. All the above analyses are based on the assumption that a processor addresses any one of the common memories with the same probability. However, in a practical situation a processor is likely to address a particular memory more frequently except when an interprocessor communication is necessary. So we introduce a parameter m, which is defined as the probability that a processor $P_i$ sends a request to memory $MM_i$ provided $P_i$ generates a request. We will assume that we have prior knowledge of the parameter m. When m = 1/N the model reduces to the equally likely case, analyzed before [5]-[11]. For m > 1/N, a processor $P_i$ communicates more often with memory $MM_i$, and $MM_i$ is called a favorite memory of $P_i$. So m is a general parameter that includes both equally likely and favorite memory cases. The memory interference analysis of the crossbar architecture using this general modeling has been reported in [18]. We apply the same idea to derive the BW expressions for an M * N * B multiple-bus architecture. The analysis is based on the following assumptions.

1) The operation is synchronous; i.e., the requests issued by the processors begin and end simultaneously.
2) The requests generated in a cycle are random and are independent of one another.
3) The requests issued in successive cycles are independent of the requests issued in the previous cycle.
4) Requests which are not accepted are rejected.

The third assumption is unrealistic because a rejected request will indeed be resubmitted in the next cycle. However, this assumption leads to simpler analysis, and it does not result in a substantial difference in the actual results [13].
Let p be the probability with which a processor generates a request in every cycle and m be the probability with which a processor $P_i$ addresses memory $MM_i$ given that $P_i$ generates a request. Thus, pm is the rate of request of a processor $P_i$ to $MM_i$. For an M * N * B system, as shown in Fig. 1, with B < min(M, N) there will be contention for memories as well as for buses. So the model involves a two-stage interference resolution as compared to only the memory interference analysis for a crossbar [18]. Moreover, since m is used to relate a processor-memory pair, we consider three different situations: M = N, M > N, and M < N.

A. M * N * B Multiple Bus with M = N

With p and m defined as above, the probability of $P_i$ requesting $MM_i$, for $1 \le i \le N$, is given by $P_i \to MM_i = pm$. Hence, the probability that $P_i$ does not request $MM_i$ is $P_i \nrightarrow MM_i = 1 - pm$. For $i \ne j$ the probability that $P_j$ has a request for $MM_i$ is $P_j \to MM_i = p((1 - m)/(N - 1))$. So $P_j \nrightarrow MM_i = 1 - p((1 - m)/(N - 1))$. Now the probability that none of the N processors has a request for $MM_i$ is $(1 - pm)(1 - p(1 - m)/(N - 1))^{N-1}$. From this we can write the probability that there is at least one request for $MM_i$ as

$$X = 1 - (1 - pm)\left[1 - p\,\frac{1 - m}{N - 1}\right]^{N-1}. \tag{1}$$

For m = 1/N, (1) reduces to $1 - (1 - (p/N))^{N}$. Given the probability of a processor requesting a memory by (1), the probability that exactly i memory services are requested in a cycle is given by [19]:

$$p(i) = \binom{N}{i} X^{i} (1 - X)^{N-i}. \tag{2}$$

The multiple-bus system with B buses can allow at best B processor-memory connections per cycle. The system gets saturated when more than B requests are generated and allows only B connections simultaneously. Thus, the BW of the system is expressed as

$$BW = \sum_{i=1}^{B} i \cdot p(i) + \sum_{i=B+1}^{N} B \cdot p(i) = \sum_{i=1}^{N} i \cdot p(i) - \sum_{i=B+1}^{N} (i - B) \cdot p(i) \tag{3}$$

$$= NX - \sum_{i=B+1}^{N} (i - B) \cdot p(i). \tag{4}$$

The first term in (4) is the BW of an N * N crossbar taking p and m into consideration, and the second term gives the degradation in performance due to bus insufficiency. The first term is the same as that derived for the crossbar in [18].

B. M * N * B Multiple Bus with M > N

This system is modeled as N processor-memory pairs related with probability of reference m and the remaining (M - N) processors having equal addressing capability to any one of the memories. So there are two groups of processors: one group having probability of reference m and the other group with equal access probability of 1/N. Now the probability that none of the processors from the first group requests $MM_i$ is $(1 - pm)(1 - p(1 - m)/(N - 1))^{N-1}$. The probability that none of the processors from the second group requests $MM_i$ is $(1 - (p/N))^{M-N}$. Hence, the probability that none of the M processors has a request for $MM_i$ is $(1 - pm)(1 - p(1 - m)/(N - 1))^{N-1}(1 - (p/N))^{M-N}$. The probability that there is at least one request for $MM_i$ is

$$X = 1 - (1 - pm)\left[1 - p\,\frac{1 - m}{N - 1}\right]^{N-1}\left(1 - \frac{p}{N}\right)^{M-N}. \tag{5}$$

For M = N and m = 1/N, (5) reduces to $X = 1 - (1 - (p/N))^{N}$. The BW is simply obtained by using (2), (4), and (5).

C. M * N * B Multiple Bus with M < N

This is a reverse situation of case II-B, as the memories are now divided into two groups. The first group consists of M memories related to the processors with a request rate m, and the second group consists of (N - M) memories, which can be equally addressed by any of the M processors. Correspondingly, there will be two distinct probabilities of requests for the two memory groups. Let $X_1$ represent the probability that a memory belonging to the first group is requested and $X_2$ be the probability that a memory belonging to the second group is requested. From Section II-A we can write

$$X_1 = 1 - (1 - pm)\left[1 - p\,\frac{1 - m}{N - 1}\right]^{M-1}. \tag{6}$$

The probability with which a processor requests a memory in the second group is (1 - m)/(N - 1). Hence, $X_2$ is given by

$$X_2 = 1 - \left[1 - p\,\frac{1 - m}{N - 1}\right]^{M}. \tag{7}$$

To find the probability that exactly i memory requests are satisfied in a cycle we have to consider all the possible distributions of i requests between the two memory groups. Therefore, p(i) is expressed as

$$p(i) = \sum_{j=0}^{\min(N-M,\,i)} \binom{N-M}{j} X_2^{\,j} (1 - X_2)^{N-M-j} \binom{M}{i-j} X_1^{\,i-j} (1 - X_1)^{M-i+j} \tag{8}$$

where j is the number of memory references satisfied from the second group of (N - M), and (i - j) are memory requests satisfied from the first group. For M = N (8) reduces to (2). For m = 1/N we find that $X_1 = X_2 = X$, and (8) reduces to

$$p(i) = \binom{N}{i} X^{i} (1 - X)^{N-i}.$$

The BW then becomes

$$BW = \sum_{i=1}^{B} i \cdot p(i) + \sum_{i=B+1}^{N} B \cdot p(i) = M X_1 + (N - M) X_2 - \sum_{i=B+1}^{N} (i - B) \cdot p(i). \tag{9}$$

The first two terms in (9) represent the BW of an M * N crossbar for N ≥ M, taking p and m into consideration.

TABLE I
BW OF AN N * N SYSTEM WITH CROSSBAR AND MULTIPLE BUS, p = 1.0

  B (number          N = 4             N = 8             N = 12            N = 16
  of buses)      m=1/N   m=0.8     m=1/N   m=0.8     m=1/N   m=0.8     m=1/N   m=0.8
      1           0.99    1.00      1.00    1.00      1.00    1.00      1.00    1.00
      2           1.89    1.98      2.00    2.00      2.00    2.00      2.00    2.00
      3           2.51    2.86      2.97    3.00      3.00    3.00      3.00    3.00
      4           2.73    3.35      3.87    4.00      3.99    4.00      4.00    4.00
      5                             4.59    4.97      4.97    5.00      5.00    5.00
      6                             5.04    5.84      5.88    6.00      5.99    6.00
      7                             5.22    6.45      6.66    6.99      6.97    7.00
      8                             5.25    6.69      7.24    7.96      7.89    8.00
      9                                               7.58    8.84      8.72    9.00
     10                                               7.73    9.53      9.39    9.99
     11                                               7.77    9.92      9.86   10.95
     12                                               7.78   10.03     10.13   11.85
     13                                                                10.25   12.59
     14                                                                10.29   13.09
     15                                                                10.30   13.33
     16                                                                10.30   13.38
  N * N crossbar  2.73    3.35      5.25    6.69      7.78   10.03     10.30   13.38

Tables I and II compare the BW of a crossbar and multiple-bus interconnection for an N * N system for various values of N and B with m assigned the values 1/N and 0.8 for p = 1 and 0.5, respectively. The results show that for the equally likely case the multiple bus performs close to a crossbar when B ≈ N/2.
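The bandwidth expressions of Section II can be evaluated numerically. The following is a minimal sketch, not the authors' program: the function names are hypothetical, memories are treated as independently requested (as in (2) and (8)), and the first form of (3) and (9), the sum of min(i, B) p(i), is used instead of the subtracted form. N must be at least 2.

```python
from math import comb


def request_probabilities(M, N, p, m):
    """Return [(count, X), ...]: memory groups and the probability X that a
    memory of that group receives at least one request in a cycle.
    Hypothetical helper; assumes N >= 2."""
    q = p * (1 - m) / (N - 1)  # request rate of a processor to a non-favorite memory
    if M >= N:
        # Eq. (5); for M == N the last factor equals 1, which gives Eq. (1).
        X = 1 - (1 - p * m) * (1 - q) ** (N - 1) * (1 - p / N) ** (M - N)
        return [(N, X)]
    X1 = 1 - (1 - p * m) * (1 - q) ** (M - 1)  # Eq. (6): the M favorite memories
    X2 = 1 - (1 - q) ** M                      # Eq. (7): the other N - M memories
    return [(M, X1), (N - M, X2)]


def request_count_pmf(groups):
    """pmf[i] = probability that exactly i memories are requested.
    One group gives the binomial of Eq. (2); two groups give Eq. (8)."""
    pmf = [1.0]
    for count, X in groups:
        binom = [comb(count, k) * X**k * (1 - X) ** (count - k)
                 for k in range(count + 1)]
        combined = [0.0] * (len(pmf) + count)
        for a, pa in enumerate(pmf):
            for b, pb in enumerate(binom):
                combined[a + b] += pa * pb  # convolution over the two groups
        pmf = combined
    return pmf


def bandwidth(M, N, B, p, m):
    """BW of an M * N * B multiple bus: at most B requests are served per cycle."""
    pmf = request_count_pmf(request_probabilities(M, N, p, m))
    return sum(min(i, B) * pi for i, pi in enumerate(pmf))


if __name__ == "__main__":
    # 4 * 4 system, p = 1, equally likely references (m = 1/N), B = 1..4
    for B in range(1, 5):
        print(B, round(bandwidth(4, 4, B, p=1.0, m=0.25), 3))
```

With B = N the bus term vanishes and the function returns NX; for N = 4, p = 1, m = 1/N this is 2.734, in line with the 2.73 crossbar entry of Table I.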
For p = 1 and m = 0.8 a processor requests a particular memory most of the time, thereby reducing the memory access conflicts. The BW of the system is then very much bus dependent. However, for p = 0.5 and m = 0.8 it can be observed that the BW's of the two architectures differ by only 7 percent when B = N/2. The results indicate that the number of buses for a multiple-bus system should be determined by taking both p and m into consideration. If a processor generates a request in every cycle, then the multiple bus should have at least N/2 buses to provide comparable performance with that of a crossbar. When p is less than 1, the multiple bus is more flexible because of having only the required number of buses. The crossbar in this case is clearly underutilized.

TABLE II
BW OF AN N * N SYSTEM WITH CROSSBAR AND MULTIPLE BUS, p = 0.5

  B (number          N = 4             N = 8             N = 12            N = 16
  of buses)      m=1/N   m=0.8     m=1/N   m=0.8     m=1/N   m=0.8     m=1/N   m=0.8
      1           0.88    0.91      0.98    0.99      1.00    1.00      1.00    1.00
      2           1.43    1.54      1.88    1.93      1.98    1.99      2.00    2.00
      3           1.63    1.79      2.57    2.73      2.89    2.95      2.98    2.99
      4           1.66    1.83      2.99    3.27      3.67    3.83      3.91    3.97
      5                             3.16    3.54      4.23    4.54      4.74    4.89
      6                             3.22    3.64      4.57    5.03      5.41    5.71
      7                             3.23    3.66      4.72    5.32      5.87    6.37
      8                             3.23    3.66      4.78    5.44      6.15    6.83
      9                                               4.80    5.48      6.29    7.10
     10                                               4.80    5.49      6.35    7.24
     11                                               4.80    5.49      6.37    7.29
     12                                               4.80    5.49      6.37    7.31
     13                                                                 6.37    7.32
     14                                                                 6.37    7.32
     15                                                                 6.37    7.32
     16                                                                 6.37    7.32
  N * N crossbar  1.66    1.83      3.23    3.66      4.80    5.49      6.37    7.32

III. RELIABILITY MODELING

The multiple-bus structure of Fig. 1 is divided into three independent submodules: processors, buses, and memories. The reliabilities of these submodules are determined independently. The system reliability is then obtained by considering the series reliability of these submodules. We assume that the elements of a submodule are all identical and have the same failure rate. The failures are assumed to be exponentially distributed for simplicity.
Thus, we define $\lambda_p$, $\lambda_m$, and $\lambda_b$ as the failure rate of a processor, a memory, and a bus, respectively. Then $R_p(t) = e^{-\lambda_p t}$, $R_m(t) = e^{-\lambda_m t}$, and $R_b(t) = e^{-\lambda_b t}$ give the corresponding reliabilities. If a task needs at least I processors, J memories for execution, and a bus for communication, the reliability of the multiple-bus system $R_{SM}(t)$ is given by

$$R_{SM}(t) = R_a(t) \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} R_p(t)^{i} (1 - R_p(t))^{M-i} \cdot \sum_{j=J}^{N} C_m^{N-j} \binom{N}{j} R_m(t)^{j} (1 - R_m(t))^{N-j} \cdot \sum_{k=1}^{B} C_b^{B-k} \binom{B}{k} R_b(t)^{k} (1 - R_b(t))^{B-k} \tag{10}$$

where $C_p$, $C_m$, and $C_b$ are the coverage factors for the processor, memory, and bus, respectively, and $R_a(t)$ is the reliability of the arbiter. Coverage [20] is defined as the probability that the system recovers successfully given that there was a failure.

A crossbar is modeled in this paper as shown in Fig. 2. As a bus is connected to only one memory, the failure of a bus or a memory reduces the size of the crossbar to M * (N - 1). Hence, the reliability of a memory module is expressed as $R'_m(t) = R_m(t) R_b(t) = e^{-\lambda_e t}$, with an equivalent failure rate $\lambda_e = (\lambda_m + \lambda_b)$. The reliability of the crossbar system $R_{SC}(t)$ with minimum I processors and J memories active is then

$$R_{SC}(t) = R_a(t) \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} R_p(t)^{i} (1 - R_p(t))^{M-i} \cdot \sum_{j=J}^{N} C_e^{N-j} \binom{N}{j} R'_m(t)^{j} (1 - R'_m(t))^{N-j} \tag{11}$$

where $C_e$ is the coverage of a memory-bus combination.

Fig. 4 shows the comparison of reliability between a multiple bus and a crossbar system for the same processor and memory requirements. The reliability of the arbiter and the coverage parameters is assumed to be unity in this figure. The original configurations are a 16 * 16 crossbar and a 16 * 16 * 8 multiple-bus system, both having approximately the same cost. The multiple bus has a better reliability than the crossbar because in the former the buses are independent, and only one bus may be sufficient for keeping the system operational. However, in a crossbar the stipulation of the memory-bus configuration is quite rigid. Also, it may be observed in Fig. 4 that the reliability of the multiprocessors increases dramatically if a task can be executed with fewer processors and memories than the original configuration.

Fig. 4. Reliability of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours); the crossbar curves are dashed. $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $R_a(t) = C_p = C_m = C_b = 1$.

IV. BANDWIDTH AVAILABILITY (BA) ANALYSIS

The reliability measures discussed in Section III are not sufficient to evaluate a multiprocessor system. They only give the probability that the system is operational at a time t, but do not reveal the performance degradation due to failures in (0, t). Since the basic objective of the multiprocessors is to provide high performance, the variation of available BW with time should be used as a criterion to evaluate various multiprocessors. BA gives the expected amount of bandwidth available on the system at time t. To calculate the BA we use the Markov model representation of the multiple-bus architecture, as shown in Fig. 5, for a task requiring at least I processors, J memories, and a bus. A state (i, j, k) means that there are i processors, j memories, and k buses operational in the system starting with the initial configuration (M, N, B). A state is defined as operational as long as the minimum processor, memory, and bus requirements are satisfied. Transition from a state (i, j, k) to any one of the states (i - 1, j, k), (i, j - 1, k), or (i, j, k - 1) can occur if a processor, a memory, or a bus fails and the failure is covered. If a failure is not covered the system goes to the failed state (F). The transition diagram of the system in state (i, j, k) is shown in Fig. 6.

Fig. 5. Markov model of the M * N * B multiple-bus multiprocessor.

Fig. 6. Transition probabilities of the M * N * B system in state (i, j, k).

Let $P_{ijk}(t)$ be the probability that the system is in state (i, j, k) at time t. The differential equation for the state (i, j, k) is given by

$$P'_{ijk}(t) = (i + 1) C_p \lambda_p P_{i+1,j,k}(t) + (j + 1) C_m \lambda_m P_{i,j+1,k}(t) + (k + 1) C_b \lambda_b P_{i,j,k+1}(t) - (i\lambda_p + j\lambda_m + k\lambda_b) P_{ijk}(t)$$

for $I \le i < M$, $J \le j < N$, and $1 \le k < B$. The equations for the boundary states can also be easily determined. Now the probability of state (i, j, k) can be obtained [21]. Hence, we write

$$P_{ijk}(t) = \binom{M}{i} C_p^{M-i} R_p(t)^{i} (1 - R_p(t))^{M-i} \cdot \binom{N}{j} C_m^{N-j} R_m(t)^{j} (1 - R_m(t))^{N-j} \cdot \binom{B}{k} C_b^{B-k} R_b(t)^{k} (1 - R_b(t))^{B-k}. \tag{12}$$

The system reliability, obtained in (10), can be derived again by summing the probabilities of all the nonfailed states. So,

$$R_{SM}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \sum_{k=1}^{B} P_{ijk}(t). \tag{13}$$

To determine the BA, we sum the expected BW of all the nonfailed states. Hence, the BA of the multiple-bus system $BA_{SM}(t)$ at time t is expressed as

$$BA_{SM}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \sum_{k=1}^{B} P_{ijk}(t) \cdot BW_{ijk} \tag{14}$$

where $BW_{ijk}$ is the BW of the multiple-bus architecture having i processors, j memories, and k buses. It can be obtained from (3) using appropriate p(i) for the three possible i, j combinations.

The Markov model for the crossbar system is shown in Fig. 7. Here the transition rate from a state (i, j) to state (i, j - 1) is given by $j\lambda_e$ since either a bus or a memory failure can result in this transition. Extending the same idea for the crossbar structure, the $BA_{SC}(t)$ is expressed as

$$BA_{SC}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} P_{ij}(t) \, BW_{ij} \tag{15}$$

where $P_{ij}(t)$ is the probability that the system is in state (i, j), and $BW_{ij}$ is the BW of an i * j crossbar. $P_{ij}(t)$ is given by

$$P_{ij}(t) = \binom{M}{i} C_p^{M-i} R_p(t)^{i} (1 - R_p(t))^{M-i} \cdot \binom{N}{j} C_e^{N-j} R'_m(t)^{j} (1 - R'_m(t))^{N-j} \tag{16}$$

and the BW is given by

$$BW_{ij} = \begin{cases} jX & \text{if } i \ge j \\ iX_1 + (j - i)X_2 & \text{if } i < j. \end{cases}$$
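The product form (12) and the sums (13) and (14) lend themselves to direct evaluation. The sketch below is illustrative only: the function names are hypothetical, the arbiter is assumed fault free ($R_a(t) = 1$), and the BW of a degraded state (i, j, k) is computed for equally likely references among the j surviving memories (m = 1/j), which is an assumption made here for brevity. J must be at least 1.

```python
from math import comb, exp


def survivors_pmf(n, rate, coverage, t):
    """One factor of Eq. (12): probability that exactly k of n identical units
    are working at time t, every failure having been covered."""
    r = exp(-rate * t)  # unit reliability under exponential failures
    return [coverage ** (n - k) * comb(n, k) * r**k * (1 - r) ** (n - k)
            for k in range(n + 1)]


def bw_equally_likely(i, j, k, p):
    """BW of an i * j * k multiple bus with equally likely references (m = 1/j)."""
    X = 1 - (1 - p / j) ** i  # a given memory receives at least one request
    return sum(min(n, k) * comb(j, n) * X**n * (1 - X) ** (j - n)
               for n in range(j + 1))


def reliability_and_ba(M, N, B, I, J, t, lam_p, lam_m, lam_b,
                       p=1.0, Cp=1.0, Cm=1.0, Cb=1.0):
    """Return (R_SM(t), BA_SM(t)) for a task needing at least I processors,
    J memories, and one bus; arbiter assumed fault free."""
    Pp = survivors_pmf(M, lam_p, Cp, t)
    Pm = survivors_pmf(N, lam_m, Cm, t)
    Pb = survivors_pmf(B, lam_b, Cb, t)
    rel = ba = 0.0
    for i in range(I, M + 1):
        for j in range(J, N + 1):
            for k in range(1, B + 1):
                prob = Pp[i] * Pm[j] * Pb[k]                 # Eq. (12)
                rel += prob                                  # Eq. (13)
                ba += prob * bw_equally_likely(i, j, k, p)   # Eq. (14)
    return rel, ba


if __name__ == "__main__":
    # 16 * 16 * 8 system, task needs 8 processors and 8 memories,
    # failure rates taken from the caption of Fig. 4
    for t in (0.0, 1000.0, 2000.0):
        rel, ba = reliability_and_ba(16, 16, 8, 8, 8, t, 1e-4, 1e-4, 5e-5)
        print(t, round(rel, 4), round(ba, 3))
```

At t = 0 every component is working, so the reliability is 1 and the BA equals the BW of the full 16 * 16 * 8 configuration. A general m would require the three-case BW of Section II in place of `bw_equally_likely`.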
The proper X or $X_1$ and $X_2$ values are obtained from (1), (5)-(7) for the various i, j combinations.

Fig. 7. Markov model of an M * N multiprocessor employing crossbar.

In Fig. 8 we present the variation of the BA with time for different processor and memory requirements of a task. We use B = N/2 for the multiple bus because the performance of the multiple bus is then close to that of a crossbar. The BW is calculated for a probability of request equal to 1 and m = 1/N. It is seen that even though the crossbar has a better BA at the beginning, mainly due to its better performance, the BA of the multiple bus exceeds that of the crossbar after a certain time. This time can be reduced if we use more buses. Fig. 9 shows the comparison of the BA of the two architectures for p = 0.5 and m = 0.8. The curves indicate that the initial BA advantage of the crossbar in this case is less when compared to the previous case. Moreover, the BA of the multiple bus becomes better than the crossbar in relatively less time, compared to the p = 1 case. The effect of imperfect coverage on BA of the multiple-bus architecture is illustrated in Fig. 10. It is evident that if faults are not detected and handled properly, the system performance decreases very fast. In Fig. 11 we plot the reliability and the BA of a 16 * 16 multiple-bus multiprocessor against the number of buses at t = 1000 h. It is observed that the variation in the number of buses has little effect on reliability, but the BA increases linearly up to ten buses. So it is reasonable to have at least N/2 buses to match the processing power of the multiple bus with that of the crossbar when p = 1. Similarly, we can conclude from the data of Table II that the BA will saturate after N/2 buses, and so N/2 buses are adequate to handle communication.

Fig. 8. Bandwidth availability of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours). $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $R_a(t) = C_p = C_m = C_b = 1$, p = 1.0, m = 1/N.

Fig. 9. BA of a 16 * 16 * 8 multiple bus and a 16 * 16 crossbar for a task requiring I processors and J memories, plotted against time (hours). $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $R_a(t) = C_p = C_m = C_b = 1$, p = 0.5, m = 0.8.

Fig. 10. Effect of coverage on the BA of a 16 * 16 * 8 system for a task requiring eight processors and eight memories, plotted against time (hours) for C = 1, 0.9, and 0.8 with $C_p = C_m = C_b = C$. p = 1.0, m = 1/N.

V. PARTIAL-BUS ARCHITECTURE

In this section we generalize the reliability and the BA expressions, derived for the multiple bus, to cover the partial-bus architecture.

It can be seen from Fig. 3 that each group of N/g memories along with the corresponding B/g buses forms an independent submodule. Hence, the reliability expression for the partial bus will involve g terms to represent the g groups of memory-bus combinations. If a task needs at least J memories all the possible distributions of the J memories among g groups must be considered in the reliability evaluation. Thus, the reliability of the partial-bus system $R_{SP}(t)$ with minimum I processors and J MM's active is given by

$$R_{SP}(t) = \sum_{i=I}^{M} C_p^{M-i} \binom{M}{i} R_p(t)^{i} (1 - R_p(t))^{M-i} \cdot \sum_{j=J}^{N} \Biggl\{ \sum_{g_1=0}^{\min(NG,\,j)} \Biggl[ C_m^{NG-g_1} \binom{NG}{g_1} R_m(t)^{g_1} (1 - R_m(t))^{NG-g_1} \Bigl( \sum_{k_1=1}^{BG} C_b^{BG-k_1} \binom{BG}{k_1} R_b(t)^{k_1} (1 - R_b(t))^{BG-k_1} \Bigr)_{g_1 > 0}$$

$$\cdot \sum_{g_2=0}^{\min(NG,\,j-g_1)} \Biggl[ C_m^{NG-g_2} \binom{NG}{g_2} R_m(t)^{g_2} (1 - R_m(t))^{NG-g_2} \Bigl( \sum_{k_2=1}^{BG} C_b^{BG-k_2} \binom{BG}{k_2} R_b(t)^{k_2} (1 - R_b(t))^{BG-k_2} \Bigr)_{g_2 > 0} \cdots$$

$$\cdot \Biggl[ C_m^{NG-g_g} \binom{NG}{g_g} R_m(t)^{g_g} (1 - R_m(t))^{NG-g_g} \Bigl( \sum_{k_g=1}^{BG} C_b^{BG-k_g} \binom{BG}{k_g} R_b(t)^{k_g} (1 - R_b(t))^{BG-k_g} \Bigr)_{g_g > 0} \Biggr] \cdots \Biggr] \Biggr] \Biggr\} \tag{17}$$

where NG = N/g is the number of memories and BG = B/g is the number of buses in each group.
The distribution of the j memories among g groups is controlled by the last term $g_g$ where $g_g = j - (g_1 + g_2 + \cdots + g_{g-1})$. As $\binom{NG}{g_g} = 0$ for $NG < g_g$, all possible valid distributions are generated to get j active memories from g groups. The second term $\sum_{k_y=1}^{BG} C_b^{BG-k_y} \binom{BG}{k_y} R_b(t)^{k_y} (1 - R_b(t))^{BG-k_y}$, for $1 \le y \le g$, in each group expression gives the probability that at least one bus in that group is working. This is a conditional term and is only taken into account if $g_y > 0$. If none of the memories from a particular group is either selected or active for execution of a task at any instant of time, then the bus availability of that group is irrelevant, and the second term in that expression becomes unity. The reliability terms for the g arbiters are not shown in (17), for clarity. We assume that the arbiters are fault free.

Fig. 11. Effect of number of buses on the reliability and BA of a 16 * 16 * B multiprocessor at t = 1000 h. p = 1.0, m = 1/N.

Fig. 12 shows the comparison of reliability between a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial-bus architecture with four groups. The results indicate that the multiple bus has a better reliability than the partial bus because for the same processor and memory requirements a single bus can support all processor-memory communications for the former. But in the case of a partial bus, if the memories are selected from more than one group, a bus from each of these groups is necessary to provide communication. It can also be concluded that the reliability of the partial bus will increase by reducing the number of groups, and for low bus failure rate, the reliability of both the systems will be very close.

Fig. 12. Reliability of a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial bus with g = 4 for a task requiring I processors and J memories, plotted against time (hours). $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $C_p = C_m = C_b = 1$.

The BA of the partial bus, $BA_{SP}(t)$, at time t is given by

$$BA_{SP}(t) = \sum_{i=I}^{M} \sum_{j=J}^{N} \sum_{g_1} \cdots \sum_{g_g} P_{i,g_1,\ldots,g_g}(t) \, (BWG_1 + \cdots + BWG_g) \tag{18}$$

where $P_{i,g_1,\ldots,g_g}(t)$ is the probability that the partial bus has i processors and j MM's active, and the j modules are distributed over g groups with $g_y$ memories in the yth group. $BWG_y$ is the BW of the yth group, for $1 \le y \le g$, with i processors, $g_y$ memories, and the number of buses $k_y$ varying from 1 to BG. The $BWG_y$ can be obtained using the equations from Section II. Since the memory request distribution of i processors among j memories is independent of the bus connection pattern, the probability that there is at least one request for a memory, for any i and j combination, can be obtained directly from (1), (5)-(7). Using the X or $X_1$ and $X_2$ values the $p_y(i_y)$ is obtained from (2) or (8) where $p_y(i_y)$ is the probability that $i_y$ memory services are requested in a cycle from group y. The M and (N - M) terms in (8) should be replaced by the appropriate number of favorite and equally addressable memories available in group y for i < j. Then the $BWG_y$ can be obtained from (3) after substituting B and N by $k_y$ and $g_y$, respectively.

However, for any general reference m and i < j, keeping the count of the two types of memories in a group, the first type representing the favorite ones and the second type representing the equally likely ones, makes the BW computation cumbersome. Here we compute the BW of the partial-bus connection only for the equally likely case. Thus, $BWG_y$ is written similarly to (4) as

$$BWG_y = g_y X - \sum_{i_y = k_y + 1}^{g_y} (i_y - k_y) \cdot p_y(i_y) \tag{19}$$

where $X = 1 - (1 - (p/j))^{i}$ and $p_y(i_y) = \binom{g_y}{i_y} X^{i_y} (1 - X)^{g_y - i_y}$. The $g_y$ and $k_y$ represent the number of nonfailed memories and buses available in a group y at any instant of time t. If the number of active memories and buses in each group is the same, then the BW of the partial bus with i processors and j memories becomes $g \cdot BWG_y$.

Fig. 13 shows the BA variation of the partial-bus and multiple-bus architectures with time. The BA's of the two systems differ by 6.5 percent initially (t = 0), and the difference gradually decreases with time. The degradation in the BA of the partial bus is due to its lower BW and reliability compared to the multiple bus. The BW reduction is primarily because of more bus contention, as only B/g memories can be served from a group at any time. The BA of the partial bus is 7.71 for g = 2 at t = 0. As the reliability of the partial bus and the multiple bus will match better when g = 2, it is evident that the difference in the BA of the two systems will be less compared to the results of Fig. 13. The results will match even better when p < 1.

Fig. 13. BA of a 16 * 16 * 8 multiple bus and a 16 * 16 * 8 partial bus with g = 4 for a task requiring I processors and J memories, plotted against time (hours). $\lambda_p = \lambda_m = 0.0001$, $\lambda_b = 0.00005$, $C_p = C_m = C_b = 1$, p = 1.0, m = 1/N.

VI. CONCLUSIONS

We have discussed the performance and reliability issues of the multiple-bus and crossbar architectures. Closed form expressions for the BW of the multiple bus have been derived using more general models. Expressions for reliability and bandwidth availability are presented considering graceful degradation. The models are also extended to compute the reliability and the BA of the partial-bus configuration. The results indicate that the reliability of the multiple bus is better than that of the crossbar. The BA of the multiple bus also exceeds that of the crossbar after some time, depending on B, p, and m. The reliability and the BA of the partial bus will be close to that of the multiple bus when the number of groups is small.
The results of this paper give an overall view of the fault-tolerant properties of three possible multiprocessor configurations. However, the selection of a proper architecture depends on the specific application. For example, if the requirements are such that we need high communication BW for a short duration, a crossbar seems to be a viable solution. On the other hand, if the BA as well as the duration are important, then the multiple bus provides better characteristics than the crossbar. It is also interesting to note that when p < 1, the multiple bus looks attractive in view of the flexibility to choose the number of buses depending on p and m. If the BA requirements are not stringent, the partial-bus connection can be a cost-effective alternative to the multiple bus.

ACKNOWLEDGMENT

The authors would like to thank Prof. V. V. S. Sarma for his useful comments.

REFERENCES

[1] K. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications. Englewood Cliffs, NJ: Prentice-Hall, 1982.
[2] M. D. Beaudry, "Performance related reliability measures for computing systems," IEEE Trans. Comput., vol. C-27, pp. 540-547, Jan. 1978.
[3] J. F. Meyer, "On evaluating the performability of degradable computing systems," IEEE Trans. Comput., vol. C-29, pp. 720-731, Aug. 1980.
[4] W. A. Wulf and C. G. Bell, "C.mmp - A multiminiprocessor," in Proc. AFIPS Fall Joint Comput. Conf., Dec. 1972, pp. 765-777.
[5] A. Goyal and T. Agerwala, "Performance analysis of future shared storage systems," IBM J. Res. Develop., vol. 28, pp. 95-108, Jan. 1984.
[6] M. Ajmone Marsan and M. Gerla, "Markov models for multiple bus multiprocessors," IEEE Trans. Comput., vol. C-31, pp. 239-248, Mar. 1982.
[7] T. Lang et al., "Bandwidth of crossbar and multibus connections for multiprocessors," IEEE Trans. Comput., vol. C-31, pp. 1227-1234, Dec. 1982.
[8] M. Valero et al., "A performance evaluation of the multiple bus network for multiprocessor systems," in Proc. ACM SIGMETRICS Conf., 1983, pp. 200-206.
[9] T. Mudge et al., "Analysis of multiple-bus interconnection networks," in Proc. Int. Conf. Parallel Processing, Aug. 1984, pp. 228-232.
[10] L. N. Bhuyan, "A combinatorial analysis of multibus multiprocessors," in Proc. Int. Conf. Parallel Processing, Aug. 1984, pp. 225-227.
[11] K. B. Irani and I. H. Onyuksel, "A closed form solution for the performance analysis of multiple-bus multiprocessor systems," IEEE Trans. Comput., vol. C-33, pp. 1004-1012, Nov. 1984.
[12] T.-Y. Feng, "A survey of interconnection networks," IEEE Comput., vol. 14, pp. 12-27, Dec. 1981.
[13] D. P. Bhandarkar, "Analysis of memory interference in multiprocessors," IEEE Trans. Comput., vol. C-24, pp. 897-908, Sept. 1975.
[14] A. D. Ingle and D. P. Siewiorek, "Reliability models for multiprocessor systems with and without periodic maintenance," in Proc. 7th Annu. Int. Conf. FTC, Los Angeles, CA, June 1977, pp. 3-9.
[15] K. Hwang and T. P. Chang, "Combinatorial reliability analysis of multiprocessor computers," IEEE Trans. Reliability, vol. R-31, pp. 469-473, Dec. 1982.
[16] F. A. Gay and M. L. Ketelsen, "Performance evaluation of gracefully degrading systems," in Proc. 9th Annu. Int. Conf. FTC, Madison, WI, June 1979, pp. 51-57.
[17] T. C. K. Chou and J. A. Abraham, "Performance/availability model of shared resource multiprocessors," IEEE Trans. Reliability, vol. R-29, pp. 70-74, Apr. 1980.
[18] L. N. Bhuyan, "An analysis of processor-memory interconnection networks," IEEE Trans. Comput., vol. C-34, pp. 279-283, Mar. 1985.
[19] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1. New York: Wiley, 1957.
[20] T. F. Arnold, "The concept of coverage and its effect on the reliability model of a repairable system," IEEE Trans. Comput., vol. C-22, pp. 251-254, Mar. 1973.
[21] A. Pedar and V. V. S. Sarma, "Architecture optimization of aerospace computing systems," IEEE Trans. Comput., vol. C-32, pp. 911-922, Oct. 1983.

Chita R. Das (S'84) received the M.Sc. degree in electrical engineering from Regional Engineering College, Rourkela, Sambalpur University, India, in 1981 and is currently working towards the Ph.D. degree in computer science at The Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette. His research interests include parallel/distributed computing, performance evaluation, and fault tolerance.

Laxmi N. Bhuyan (S'81-M'83) received the M.Sc. degree in electrical engineering from Regional Engineering College, Rourkela, Sambalpur University, India, in 1979 and the Ph.D. degree in computer engineering from Wayne State University, Detroit, MI, in 1982. During 1982-1983 he was with the Department of Electrical Engineering, University of Manitoba, Winnipeg, Canada. At present he is an Associate Professor at The Center for Advanced Computer Studies, University of Southwestern Louisiana, Lafayette. His research interests include parallel and distributed computer architecture, VLSI layout, multiprocessor applications, and performance evaluation. Dr. Bhuyan is a member of the Association for Computing Machinery.