An optical centralized shared-bus architecture

advertisement
512
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 9, NO. 2, MARCH/APRIL 2003
An Optical Centralized Shared-Bus Architecture
Demonstrator for Microprocessor-to-Memory
Interconnects
Xuliang Han, Gicherl Kim, G. Jack Lipovski, Fellow, IEEE, and Ray T. Chen, Senior Member, IEEE
Abstract—An architecture demonstrator of an innovative interconnect scheme called the optical centralized shared-bus is presented in this paper. This architecture retains the advantages of
shared-bus topology while at the same time specifying a uniform
interface between the electrical and the optical backplane layers
in contrast to other proposed architectures. For the first time, a
fanout equalized optical backplane bus is demonstrated. In this architecture demonstrator, the data paths required for the microprocessor-to-memory interconnects are provided by the optical centralized shared-bus. The optoelectronic interface modules are optimized to support data rates up to 1.25 Gb/s. The objective of this
microprocessor-to-memory interconnects demonstration is to ensure the feasibility of applying this innovative architecture in real
systems.
Index Terms—Optical backplane bus, optical interconnect,
optoelectronic interface, vertical cavity surface emitting laser
(VCSEL).
I. INTRODUCTION
I
NTERCONNECT is becoming an even more dominant
factor in modern computation systems [1]. As the underlying implementation technology, however, electrical
interconnection faces numerous challenges, such as power
consumption, signal integrity, and electromagnetic interference.
The employment of optical interconnects will be one of the
major alternatives for upgrading the interconnect performance
[2]. Machine-to-machine interconnection has already been
significantly improved by utilizing optical means. The major
research thrusts in optical interconnects are in the backplane
and board levels where the physical limitations of electrical
interconnects are imposing a prominent bottleneck.
Shared-bus architecture is a preferred interconnect scheme
because its broadcast nature can be effectively utilized to reduce
latency, to lessen the complexity of interconnection network
management, and to support cache coherence in a multiprocessor. However, the physical length, the number of fanouts, and
the speed of the shared-bus are significantly limited by the underlying electrical interconnects. Optics has been widely agreed
as a better alternative as an interconnect technology. Great research efforts have been dedicated to the design of innovative
Manuscript received October 31, 2002; revised February 6, 2003. This work
was supported in part by BMDO, in part by DARPA, in part by ONR, in part by
AFOSR, and in part by the ATP program of the State of Texas.
The authors are with the Microelectronic Research Center, Department of
Electrical and Computer Engineering, University of Texas, Austin, TX 78758
USA (e-mail: raychen@uts.cc.utexas.edu).
Digital Object Identifier 10.1109/JSTQE.2003.813311
optical shared-bus architectures [3], [4]. These previously reported optical shared-bus architectures, however, are difficult to
implement in practice due to the large variation among the optical signal fanouts that dramatically increases the complexity
of the optoelectronic interface modules. The difficulty in equalizing the fanouts is mainly due to the requirement to support
the bidirectionality of signal flows on the backplane bus. In our
previous report [5], a new method for rebroadcast signals in
an optical backplane bus system was described. Uniform optical signal fanouts become possible by using this method. Upon
this method, we present in this paper an innovative architecture
called the optical centralized shared-bus, which is, to the best
our knowledge, the first fanout equalized optical backplane bus.
As shown in Fig. 1, the electrical bus provides interconnects
for the noncritical paths, whereas the optical bus for the critical
paths. To implement optical interconnects, volume holographic
gratings are used for coupling optical signals into and out of the
optical wave-guiding plate within which optical signals propagate as substrate-guided waves [6]. The details of this innovative architecture, especially the feature of uniform optical signal
fanouts, are described in Section II. In Section III, an optical
centralized shared-bus architecture demonstrator for microprocessor-to-memory interconnects is presented to ensure the feasibility of this proposed idea. Finally, a summary is given in
Section IV.
II. OPTICAL CENTERALIZED SHARED-BUS ARCHITECTURE
Fig. 1 illustrates the architectural concept of the optical centralized shared-bus. The electrical backplane layer provides interconnects for the noncritical paths. The daughter board that is
inserted into the central backplane connector plays a pivotal role
in this architecture and is referred to as distributor in this paper.
The optoelectronic interface modules, including vertical cavity
surface emitting laser (VCSELs) and photodetectors, are integrated on the backside of the electrical backplane and aligned
with the underlying optical backplane layer. Different from the
other modules, the positions of the VCSEL and the photodetector in the central module are swapped. The optical backplane
layer consists of an optical wave-guiding plate with the properly
designed volume holographic gratings integrated on its top surface. Underlying the distributor is a double-grating hologram,
and the others are single-grating holograms. In this way, this
architecture provides the bidirectionality of signal flows on the
backplane bus.
1077-260X/03$17.00 © 2003 IEEE
HAN et al.: OPTICAL CENTRALIZED SHARED-BUS ARCHITECTURE DEMONSTRATOR
Fig. 1.
513
Optical centralized shared-bus architecture.
In this architecture, there are two optical paths for each
signal line. One is for the source daughter board to deliver
the signal to the distributor, and the other one for the distributor to broadcast the signal to all the daughter boards on
the backplane bus. A complete data transfer from a daughter
board to: 1) another daughter board (point-to-point); 2) more
than one daughter board (multicast); and 3) all the daughter
boards on the backplane bus (broadcast), generally involves
two processes, which are single-hop delivery from the source
daughter board to the distributor and then broadcast from the
distributor. This explains the reason we name this architecture
optical centralized shared-bus. First, the VCSEL of the source
daughter board projects the light surface normally on its
underlying holographic grating. This signal is coupled into
the optical wave-guiding plate and propagates within the
plate under the total internal reflection (TIR) condition [6].
Then, this signal is surface normally coupled out of the plate
by the central double-grating hologram and detected by the
receiver of the distributor. Second, the distributor regenerates
the same optical signal and projects it surface normally on its
underlying double-grating hologram. This signal is diffracted
into two beams and coupled into the optical wave-guiding plate,
propagating along the two opposite directions within the plate
under the TIR condition. During the propagation, a portion
of the light is surface normally coupled out of the plate by a
daughter board’s underlying holographic grating and detected
by its receiver. This daughter board takes appropriate actions
on the received data. If the distributor is the data source, the
first process will not happen. If the distributor is the only data
destination, the second process is not necessary. The protocol
governing the bus should be able to tolerate the transmission
delay due to the difference in the distance. Furthermore, in
conformity with a specific global topology, a hierarchical
interconnection network [7] can be constructed by using the
optical centralized shared-bus as the building block and the
distributor as the socket.
The most attractive feature of this architecture is to achieve
uniform optical signal fanouts. Assuming that the VCSELs of
all the optoelectronic interface modules emit the same optical
power, uniform fanouts mean that: 1) the power of signals delivered from any daughter board to the distributor is same; 2)
the power of signals broadcast from the distributor to all the
daughter boards on the backplane bus is same; and 3) the power
of signals broadcast from the distributor equals that delivered
from any daughter board to the distributor. Thus, this architecture specifies a uniform interface between the electrical and the
optical backplane layers in contrast to other proposed architectures, e.g., [3] and [4]. On the optical centralized shared-bus, the
fanouts are equalized by specifying the diffraction efficiency of
the volume holographic gratings [5]. Because of the symmetric
configuration, it is obvious that the two multiplexed gratings inside the central hologram should have the same diffraction efficiency. The analysis in [5] shows that the fanouts are equalized
if the following iterative equation is satisfied:
(1)
where represents the diffraction efficiency of the th singlegrating hologram counted from the central double-grating hologram. In the case of uniform fanouts, the fanout coefficient,
which is defined as the ratio of the fanout power to the effective
VCSEL fanin power, is obviously equal to the reciprocal of the
total number of the daughter boards (not including the distributor) on the backplane bus. Thus, the fanout capacity, i.e., the
514
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 9, NO. 2, MARCH/APRIL 2003
Fig. 3.
Fig. 2.
Microprocessor-to-memory interconnects demonstration.
maximum number of daughter boards that one optical centralized shared-bus can accommodate, can be calculated if the bit
error rate requirement, marginal power penalty, and the parameters of the optoelectronic interface modules, such as VCSEL
emission power and photodetector sensitivity, are specified.
III. MICROPROCESSOR-TO-MEMORY INTERCONNECTS
DEMONSTATION
As shown in Fig. 2(a) and (b), this architecture demonstrator
consists of a microprocessor board (CPU12, Motorola Inc.),
four external memory (8KB SRAM) boards, an electrical backplane bus, and an optical centralized shared-bus. The microprocessor board is inserted into the central backplane connector as
the distributor, and the memory boards, L1, L2, R1, and R2, are
inserted into their corresponding slots. Fig. 3 is the conceptual
connectivity block diagram of this system. For simplicity, only
one memory board is illustrated in this diagram. As shown in
Fig. 3, the optical centralized shared-bus provides the data paths
for the microprocessor-to-memory interconnects. Although the
system clock is only 16 MHz, which is limited by the capacity
of the microprocessor (CPU12), this architecture demonstrator
is sufficient for our proof-of-concept objective.
A. Volume Holographic Gratings
The volume holographic gratings specified by the optical centralized shared-bus architecture were recorded in
Connectivity block diagram of the architectural demonstrator.
dry photopolymer films (HRF-600X014-20, DuPont). The
photopolymer-based volume hologram is an attractive candidate for making high-efficiency gratings. The advantages of
this material over other types of emulsion, such as dichromated gelatin and silver halides, include dry-processing capability, long shelf life, and good photo-speed [8]. The
maximum permissible dose of the incident light is beyond
100 JW/cm .
The two-beam interference method was used to form the
gratings in the dry photopolymer films. This material consists
of monomers, polymeric binders, and photoinitiators. The
monomers are polymerized when exposed to the light of
specific wavelength, and the refractive index of the film is
determined by the polymer concentration. While being exposed
to an interference pattern, there are more monomers being
polymerized in the bright regions than in the dark regions. This
nonuniform illumination sets up monomer concentration gradients, driving the monomers to diffuse from the dark regions to
the neighborhood bright regions. A final uniform illumination
is required to polymerize the remaining monomers and stabilize
the spatial distribution of the polymer concentration, which
conforms to the original illumination pattern. Thus, a grating
structure is formed inside the film. To obtain the double-grating
holograms, two sequential exposure steps are required to form
the two multiplexed gratings inside the films.
Our objectives are to obtain single-grating holograms with
accurate diffraction efficiency that satisfies iterative (1), and
high-efficiency equal-strength double-grating holograms. The
quality of these volume holographic gratings directly affects the
fanout variation and capacity, therefore, are pivotal to the implementation of the optical centralized shared-bus architecture.
The recording schedules were developed by using the method
described in [9]. Following these recording schedules, we were
able to control the accuracy of the diffraction efficiency within
2% and obtained 47%/47% double-grating holograms.
Fig. 4 demonstrates the uniform optical signal fanouts in
the architecture demonstrator we built. In conformity with the
specification of the optical centralized shared-bus architecture,
a 47%/47% double-grating hologram is integrated in the
middle of the optical wave-guiding plate. The other two are
single-grating holograms with 50% diffraction efficiency. The
HAN et al.: OPTICAL CENTRALIZED SHARED-BUS ARCHITECTURE DEMONSTRATOR
515
Fig. 5. Onboard high-speed performance test and the eye pattern at a data rate
of 1.25 Gb/s.
Fig. 4. Demonstration of uniform optical signal fanouts in the architecture
demonstrator.
C. Transmitter/Receiver Protocol
diffraction angles within the plate are 45 , which satisfies the
TIR condition. The two 22.5 bevels at both ends are coated
with aluminum, providing nearly 100% reflection efficiency.
An 850-nm VCSEL source was used to obtain the result as
shown in Fig. 4. The input optical power from the VCSEL
was 2 mW, and the fanouts were, from left to right, 0.404,
0.4.06, 0.400, and 0.396 mW, respectively. A two-dimensional
(2-D) multibus line configuration [10] can be implemented by
replacing the optical source with a 2-D VCSEL array in this
architecture demonstrator.
B. Optoelectronic Interface Modules
The optoelectronic interface modules, consisting of transmitters and receivers, implement electrical-to-optical and
optical-to-electrical conversions. As discussed earlier, the
optical centralized shared-bus architecture specifies a uniform
interface between the electrical and the optical backplane layer.
A transmitter module consists of an 850–nm VCSEL and a
laser driver that accepts differential PECL inputs and provides complementary modulation currents. A receiver module
consists of a photodetector/transimpedance amplifier and a
postamplifier that accepts a wide range of voltages (10–1200
mV) while providing a constant-level (PECL) output voltage.
The PCB layout design was optimized to support data rates
up to 1.25 Gb/s. Eye patterns were measured to characterize
the high-speed performance of the optoelectronic interface
modules in this architecture demonstrator, as shown in Fig. 5.
The pseudorandom bit sequence (PRBS) from a pulse generator
(HP8183A) was used as the input to a laser driver in an interface
module. The output optical signal was transferred through the
optical centralized shared-bus and then detected by a receiver
in another interface module. Along with the trigger signal
from the pulse generator, the output signal from the receiver
was fed into a digital communication analyzer (HP83480A)
to display eye patterns. The eye pattern at a data rate of 1.25
Gpbs is shown in Fig. 5. This result verifies that these interface
modules are capable of supporting data rates up to 1.25 Gb/s,
thus providing a sufficient headroom for implementing more
powerful computation systems.
To efficiently utilize the large capacity of the optical links
in this architecture demonstrator, serial data transfer was performed between the transmitter and the receiver ends. Clock
forwarding method, i.e., using the same clock signal at both
the transmitter and the receiver ends, was utilized to implement
this serial data transfer. Without involving complicated clock
recovery circuits, this method reduces the complexity of the
system. In order to synchronize the serial data with the clock
signal, a special transmitter/receiver protocol was designed. At
the idle state, the data links keep logic level low [11]. For a serial data transfer, the serializer at the transmitter end attaches
a logic-high bit in front of the actual data bits, signifying the
beginning timing of the data. Thus, the data pattern presented
on the data links is a logic-high bit followed by the actual data
payload. When received at the receiver end, this logic-high bit
synchronizes the trailing data bits with the clock signal. Therefore, the correct deserializaion can be performed at the receiver
end. In this architecture demonstrator, this protocol was carried
by hardware, thus, transparent to the upper programming level.
D. Optical Connectivity Verification
To verify the optical connectivity required by the microprocessor-to-memory interconnects, we ran this architecture
demonstrator under an infinite loop operation. Inside this loop,
a memory address was issued to select one memory board,
and then a data transfer (write and read back) was performed
between the microprocessor and the selected memory board.
To visualize this data transfer, the data patterns on the optical
links were displayed on the screen of an oscilloscope, as
shown in Fig. 6(a) and (b). In these tests, the data patterns
were measured through the monitor ports, as shown in Fig. 2,
which indicates the modulation current of their corresponding
VCSELs. Fig. 6(a) shows the result when data “0 4C” was
transferred between the microprocessor and memory board
L1. Along with the starting logic-high bit, the serial data
pattern on the optical link should be “100 110 010” in the
time-increasing order. The correct patterns were observed as
shown in Fig. 6(a), where channels 1 and 2 show, respectively,
the data pattern measured at monitor port L1 and C. Fig. 6(b)
shows the result when data “0 44” was transferred between
516
IEEE JOURNAL OF SELECTED TOPICS IN QUANTUM ELECTRONICS, VOL. 9, NO. 2, MARCH/APRIL 2003
ACKNOWLEDGMENT
The authors thank BMDO, DARPA, ONR, AFOSR, and the
ATP program of the State of Texas for supporting this paper.
REFERENCES
Fig. 6. Optical connectivity verification between the microprocessor and
memory boards (a) L1 and (b) R2.
the microprocessor and memory board R2. The correct patterns
were also observed as shown in Fig. 6(b), where channels 1 and
2 show, respectively, the data pattern measured at monitor port
R2 and C. Similar tests were conducted on the optical links
between the microprocessor and the other memory boards on
the backplane bus, and the correct patterns were also observed.
These results verify the optical connectivity required for the
microprocessor- to-memory interconnects.
IV. CONCLUSION
For the first time, we designed and implemented a fanout
equalized optical backplane bus. This innovative architecture,
called the optical centralized shared-bus, retains the advantages
of shared-bus topology, while at the same time, specifying a uniform interface between the electrical and the optical backplane
layers in contrast to other proposed architectures. The objective
of this microprocessor-to-memory interconnects demonstration
is to ensure the feasibility of applying this innovative architecture
in real systems. The overall performance of this architecture
demonstrator is limited by the microprocessor (CPU12) we used.
On the other hand, the optoelectronic interface modules in this
architecture demonstrator are able to support data rates up to 1.25
Gb/s, thus providing a sufficient headroom for implementing
more powerful computing systems. With the rapid developments
in active optoelectronic devices, much higher data rate, e.g., 10
Gb/s, may be considered in the system design. Furthermore,
the optical centralized shared-bus architecture is of general
applicability and can be applied in scenarios other than the
microprocessor-to-memory interconnects demonstrated herein,
e.g., centralized shared-memory multiprocessors.
[1] W. J. Dally, “Computer architecture is all about interconnect,” in Proc.
8th Int. Symp. High-Performance Comput. Architecture, Cambridge,
MA, Feb. 2002.
[2] M. R. Feldman, S. C. Esener, C. C. Guest, and S. H. Lee, “Comparison
between optical and electrical interconnects based on power and speed
characteristics,” Appl. Opt., vol. 27, pp. 1742–1751, 1988.
[3] J. Yeh, R. K. Kostuk, and K. Tu, “Hybrid free-space optical bus
system for board-to-board interconnections,” Appl. Opt., vol. 35, pp.
6354–6364, 1996.
[4] S. Natarajan, C. Zhao, and R. T. Chen, “Bi-directional optical backplane
bus for general purpose multi-processor board-to-board optoelectronic
interconnects,” IEEE J. Lightwave Technol., vol. 13, pp. 1031–1040,
June 1995.
[5] G. Kim, X. Han, and R. T. Chen, “A method for rebroadcasting signals
in an optical backplane bus system,” IEEE J. Lightwave Technol., vol.
19, pp. 959–965, July 2001.
[6] K. Brenner and F. Sauer, “Diffractive-reflective optical interconnects,”
Appl. Opt., vol. 27, pp. 4251–4254, 1988.
[7] T. M. Pinkston, “Design considerations for optical interconnects in
parallel computers,” in Proc. 1st Int. Workshop Massively Parallel
Processing Using Optical Interconnections, Cancun, Mexico, 1994, pp.
306–322.
[8] W. J. Gambogi, A. M. Weber, and T. J. Trout, “Advances and applications of DuPont holographic photopolymers,” in Proc. SPIE, vol. 2043,
1994, pp. 2–13.
[9] X. Han, G. Kim, and R. T. Chen, “Accurate diffraction efficiency control
for multiplexed volume holographic gratings,” Opt. Eng., vol. 41, pp.
2799–2802, 2002.
[10] G. Kim, X. Han, and R. T. Chen, “Crosstalk and interconnection distance considerations for board-to-board optical interconnects using 2-D
VCSEL and microlens array,” IEEE Photon. Technol. Lett., vol. 12, pp.
743–745, June 2000.
[11] R. T. Chen, “VME optical backplane bus for high performance computer,” J. Optoelectron. Devices Technol., vol. 9, pp. 81–94, 1994.
Xuliang Han received the B.S. degree in electronic engineering from Tsinghua
University, Beijing, China, in 1999 and the M.S.E degree in electrical and computer engineering, in 2001, from the University of Texas, Austin, where he is
currently pursuing the Ph.D. degree in electrical and computer engineering .
His research interests include massively parallel processing using optical interconnects, optical centralized shared-bus, high-speed modular optoelectronic
transceivers, guided-wave optics, and holography.
Gicherl Kim received the B.S. and M.S. degrees in physics from Inha University, Incheon, Korea, in 1989 and 1991, respectively, and the Ph.D. degree in
electrical and computer engineering from the University of Texas, Austin, in
2000.
He was a Researcher in the Agency for Defense Development, Korea from
1991 to 1997. He is currently a Research Engineer and Project Manager with
Omega Optics, Inc., Austin, TX. His current research topics are mainly devoted to design and fabrication of optical board and backplane, and the advanced architectural design of optical interconnects and networks. Recently,
his research works have also covered the design, fabrication, and packaging of
modular optical motherboard bus, based on waveguides and substrate-mode interconnects using passive and active photonic devices. His interest in the optical
data transmission topologies includes point-to-point, multidrop, multipoint, and
switch fabric optical communication from the intercomputer to the intracomputer levels.
HAN et al.: OPTICAL CENTRALIZED SHARED-BUS ARCHITECTURE DEMONSTRATOR
G. Jack Lipovski (S’63–M’66–SM’82–F’94) is a Professor in the Department
of Electrical and Computer Engineering, University of Texas, Austin, where he
has taught since 1976. Previously, he taught electrical engineering and computer
science at the University of Florida from 1969 to 1976. In 1988–1989, he held
the Grace Hopper Chair of Computer Science, Naval Postgraduate School, Monterey, CA. He designed and built the pioneering database computer, CASSM,
and parallel computer, TRAC. He has published more than 70 papers and has
authored or coauthored six books and edited three books. He has consulted for
several companies, including the Microelectronics and Computer Corporation.
He currently studies parallel, database, and artificial intelligence computer architectures and microcomputers.
Prof. Lipovski chaired the IEEE Computer Society Technical Committee on
Computer Architecture and the Computer Architecture Committee of the Association for Computer Machinery. He is also a Computer Society Governing
Board Member, the Euromicro Director, IEEE MICRO Editor, and the Area Editor of the Journal of Parallel and Distributed Computing.
517
Ray T. Chen (M’91–SM’98) is the Temple Foundation Endowed Professor in
the Department of Electrical and Computer Engineering, University of Texas,
Austin. His research group has been awarded more than 60 research grants and
contracts from such sponsors as the Department of Defense, the National Science Foundation, the Department of Energy, NASA, the State of Texas, and
private industry. The optical interconnects research group at UT Austin has reported its research in more than 250 published papers. Currently, there are 20
Ph.D. degree students and seven postdoctors working in his group. He has served
as a Consultant for various federal agencies and private companies and delivered numerous invited talks to professional societies. His research topics include
guided-wave and free-space optical interconnects, polymer-based integrated optics, a polymer waveguide amplifier, graded index polymer waveguide lenses,
active optical backplane, traveling-wave electrooptic polymer waveguide modulator, optical control of phased-array antenna, GaAs all-optical crossbar switch,
holographic lithography, and holographic optical elements.
Prof. Chen has chaired or been a Program Committee Member for more than
40 domestic and international conferences organized by SPIE, OSA, IEE, and
PSC. He is a Fellow of SPIE and the Optical Society of America and a Member
of PSC.
Download