GALS Design Style Taxonomy (Cont'd)

Globally Asynchronous
Locally Synchronous
(GALS) Systems
Asynchronous Circuits Design Course
86-87-2
1
Agenda
1. Motivation
2. Asynchronous Design
3. GALS Definition
4. GALS Advantages
5. GALS Problem
6. Coping With Metastability
7. SoC Communication Infrastructure Architectures
8. GALS Design Style Taxonomy
   1. Pausible Clock GALS Design Style
   2. Asynchronous Interface GALS Design Style
   3. Loosely Synchronous GALS Design Style
9. GALS SoC Communication Infrastructure Architectures
10. Designed and Fabricated GALS Systems and Chips
2
Motivation

SoC design methodology
“Divide” complex chips into several independent
functional blocks
And
 “Conquer” each of them using standard
synchronous methodologies and existing CAD
tools.


An on-chip infrastructure connects them to
form a functional system.
3
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Motivation (Cont’d)

Dividing a chip into smaller blocks,
Keeps technology scaling problems,
such as clock skew, manageable.


Only for individual blocks,
Not for the interconnects.


The elements are connected by relatively long
wires,
Which do not scale well in deep-submicron
technologies.
4
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Motivation (Cont’d)

Major problems in having various
synchronous on-chip communications





Modularity and Design Reuse
Electromagnetic Interference (EMI)
Worst Case Performance
Clock Power Consumption
Clock Skew
5
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Asynchronous Design

Smoother handling of both fabrication time and run-time
variability
 Delay matching, completion detection

Eliminating clock power consumption, clock skew, and EMI

Modular design
 Timing assumptions are explicit in the handshaking protocols

The circuit works faster
 Exploiting average case rather than worst case performance
6
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Asynchronous Design (Cont’d)

Asynchronous design is a more difficult task
compared to synchronous design


Absence of industrial tool support


Glitch-free circuits have to be generated.
Lack of a mature tool flow.
High overhead in terms of area, delay, and possibly
even power consumption

Dual rail data path, and collecting Ack signals from every
gate output
7
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS Definition

Globally Asynchronous Locally Synchronous
system design

Aims at filling the gap
between the purely
synchronous and
asynchronous domains.
8
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS Definition (Cont’d)

Consists of synchronous modules on a chip
communicating asynchronously.
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
9
T. Meincke, et al., “Evaluating benefits of Globally Asynchronous Locally Synchronous VLSI Architecture”
GALS Definition (Cont’d)

Although most digital
circuits remain
synchronous, many
designs feature multiple
clock domains, often
running at different
frequencies.

Using an asynchronous
interconnect decouples
the timing issues for the
separate blocks.

Systems employing such
schemes are called
Globally Asynchronous,
Locally Synchronous
(GALS).
10
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”
GALS Definition (Cont’d)
The first use of the term GALS
was by Chapiro
in his 1984 doctoral dissertation.
11
GALS Definition (Cont’d)

Each Synchronous module:






SB (Synchronous Block)
SI (Synchronous Island)
IP Block (Intellectual Property Block)
PU (Processing Unit)
Functional Block (FB)
…
12
GALS Definition (Cont’d)

GALS systems have diverse definitions, titles, and antitypes

Multiple asynchronous clock domains

Special data, control, and verification handling
C. E. Cummings, “Synthesis and Scripting Techniques for Designing Multi-Asynchronous Clock Designs”, 2001

Clock domain crossing issues
“CLOCK DOMAIN CROSSING, CLOSING THE LOOP ON CLOCK DOMAIN FUNCTIONAL IMPLEMENTATION
PROBLEMS”, TECHNICAL PAPER, Cadence

Cross domain communication
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007

System timing Issues
Robert Mullins, http://www.cl.cam.ac.uk/users/rdm34

Delay-insensitive communication

Skew-insensitive communication

...
13
GALS Definition (Cont’d)

The most rigorous definition:

Only systems whose blocks are clocked independently by
local clock generators and interconnected asynchronously.
R. Mullins, and S. Moore, “Demystifying Data-Driven and Pausible Clocking Schemes”, 2007

The most universal definition:

A system whose global clock network has been removed!

Not a single-clocked digital system!
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007

Not a pure, one-clock synchronous system!
C. E. Cummings, “Synthesis and Scripting Techniques for Designing Multi-Asynchronous
Clock Designs”, 2001
14
GALS Definition (Cont’d)
[Figure: Synchronous to Delay-Insensitive Approaches to System Timing. Timing assumptions range from Global (Synchronous) to None (Delay Insensitive); other labels: Less Detection; Local Clocks / Interaction with data (becoming aperiodic)]
Robert Mullins Presentation, “Asynchronous vs. Synchronous Design Techniques for NoCs”,
“The Status of the Network-on-Chip Revolution: Design Methods, Architectures and Silicon Implementation”,
(Tutorial) International Symposium on System-on-Chip, Tampere, Finland. November 14th, 2005.
15
GALS Advantages
1. Increased ease of functional-block reuse
 Can facilitate fast block reuse by providing wrapper circuits to
handle interblock communication across clock domain
boundaries.
2. Simplified timing closure
3. Power advantages due to heterogeneous clocking
 By clocking different blocks at their minimum speeds.
 By allowing fine tuning of the supply voltages and clock speeds
for different functional blocks
 By dynamic voltage and frequency scaling
 By eliminating the need for a global, low-skew clock.
16
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Advantages (Cont’d)

Allows synchronous design of components at their
own optimum clock frequency

But facilitates asynchronous communication
between modules

Leads to design flow fairly similar to the
synchronous flow

But with a few additional components which enable
asynchronous communication.
17
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS Advantages (Cont’d)

Eliminates the global clock leading to huge
reduction of power consumption and alleviating the
clock skew problem.

Facilitates modular system design which is scalable.

Because of close resemblance to synchronous
design, can attract the attention of synchronous
designers who are not willing to experiment with
asynchronous design.
18
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS Problem

GALS: communication framework in which
local clocks are either unsynchronized or
paused

There is a risk of metastability at the interfaces
which is not present in traditional speed
independent or delay insensitive asynchronous
circuits.
Metastability in
Data Synchronizing and Communicating
19
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS Problem (Cont’d)

Metastability

is a condition where the voltage level of a
signal is at an intermediate level, neither 0 nor 1,
and which may persist for an indeterminate
amount of time.
20
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Coping With Metastability

Timing-Safe Methods


Allocate a fixed period of time for metastability to resolve,
e.g. two flip-flop synchronizer
Please refer to
C. E. Cummings, “Synthesis and Scripting Techniques for Designing Multi-Asynchronous Clock Designs”,
2001
“CLOCK DOMAIN CROSSING, CLOSING THE LOOP ON CLOCK DOMAIN FUNCTIONAL IMPLEMENTATION
PROBLEMS”, TECHNICAL PAPER, Cadence

Value-Safe Methods



Wait for metastability to resolve, e.g. clock stretching or
pausing techniques
Clock is generated locally
Value-safe ideas are less well understood, avoided by
industry
21
Robert Mullins Presentation, “Demystifying Data-Driven and Pausible Clocking Schemes”
Coping With Metastability (Cont’d)
“For the synchronous designer the problem is that
metastability may persist beyond the time interval that has
been allocated to recover from potential metastability. It is
simply not possible to obtain a decision within a bounded
length of time. The asynchronous designer, on the other
hand, will eventually obtain a decision, but there is no
upper limit on the time he will have to wait for the answer.
In [22] the terms “time safe” and “value safe” are introduced
to denote and classify these two situations.”

[22] D.M. Chapiro. Globally-Asynchronous Locally-Synchronous Systems. PhD thesis, Stanford University,
October 1984.
J. SPARSØ, S. FURBER (Editors), “PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN –
A Systems Perspective”, pp. 78
22
Coping With Metastability (Cont’d)

A traditional Timing-Safe method


Using a two flip-flop synchronizer
MTBF…
23
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
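A rough sketch of why the time allowed for resolution matters, assuming the commonly quoted textbook model MTBF = exp(t_r / tau) / (T_w * f_clk * f_data). The function name and every numeric value below are illustrative assumptions, not figures from the cited survey.

```python
import math

def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """Textbook MTBF estimate, in seconds, for a flip-flop synchronizer.

    t_resolve: time allowed for metastability to resolve (s)
    tau:       regeneration time constant of the flip-flop (s)
    t_window:  width of the metastability window (s)
    f_clk:     sampling clock frequency (Hz)
    f_data:    rate of asynchronous data transitions (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)

# Hypothetical device and system parameters, chosen only for illustration.
TAU = 50e-12
T_WINDOW = 100e-12
F_CLK = 200e6
F_DATA = 20e6

period = 1.0 / F_CLK
# Half a clock period of resolution time versus a full period, which is
# roughly what a two flip-flop synchronizer provides if setup time and
# clock-to-Q delay are ignored.
mtbf_half = synchronizer_mtbf(0.5 * period, TAU, T_WINDOW, F_CLK, F_DATA)
mtbf_full = synchronizer_mtbf(1.0 * period, TAU, T_WINDOW, F_CLK, F_DATA)
```

Under this model the MTBF grows exponentially with the resolution time, so each extra half period here multiplies it by exp(0.5 * period / TAU).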
SoC Communication
Infrastructure Architectures


Point-to-Point,
Bus, Ring, Crossbar, Network
24
T. Bjerregaard, S. Mahadevan, “A Survey of Research and Practices of Network-on-Chip”, 2006
GALS Design Style Taxonomy
GALS in its universal definition
classifying GALS design styles according to the methods they use
to transfer data between timing domains
(Data Synchronizing and Communicating Methods )
25
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The pausible-clock design style
 Relies on locally generated clocks



That can be stretched or paused
Either to prevent metastability
Or to let a transmitter or receiver stall because of a full or empty
channel.

A ring oscillator typically generates the clocks.

The Integrated Systems Laboratory at ETHZ (Swiss Federal
Institute of Technology Zurich)


Has implemented several chips featuring pausible clocks, including a
cryptography chip.
Special wrapper circuits interface between synchronous blocks, such
that each wrapper includes a pausible-clock generator.
26
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The asynchronous design style

Involves the general case in which no timing relationship
between the synchronous clocks is assumed.

Is maximally flexible with respect to timing.

Fulcrum Microsystems’ Nexus architecture includes an
asynchronous crossbar switch that handles communication
between blocks operating at arbitrary clock frequencies.
27
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The loosely synchronous design style

Is for cases in which there is a well-defined,
dependable relationship between clocks.

It’s possible to


Exploit the stability of these clocks to achieve high
efficiency
While simultaneously providing tolerance for the large
amounts of skew inherent in global interconnects.
28
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The loosely synchronous design style

Mesochronous

The sender and receiver operate at exactly the same
frequency with an unknown yet stable phase
difference.

Intel’s 80-core processor employs a mesochronous
design. It uses synchronous tiles and a skew-tolerant
network-on-chip (NoC) interconnect scheme driven by
one stable global clock.
29
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The loosely synchronous design style

Plesiochronous

The sender and receiver operate at the same nominal
frequency but may have a slight frequency mismatch,
such as a few parts per million, which leads to drifting
phase.

Gigabit Ethernet is a common example.
30
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)

The loosely synchronous design style

Heterochronous


The sender and receiver operate at nominally different
clock frequencies.
Ratiochronous




rationally related clock frequencies in which the receiver’s
clock frequency is an exact rational multiple of the sender’s,
and both are derived from the same source clock
such that there is a predictable periodic phase relationship.
Nonratiochronous
31
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
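The loosely synchronous categories above can be summarized as a small decision procedure. This is only a sketch: the Clock record, its source label, and the rule that "same source" separates mesochronous from plesiochronous and ratiochronous from nonratiochronous are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class Clock:
    nominal_hz: Fraction  # nominal frequency
    source: str           # hypothetical label of the reference oscillator

def classify(tx: Clock, rx: Clock) -> str:
    """Name the timing relationship between a sender and a receiver clock."""
    same_source = tx.source == rx.source
    if tx.nominal_hz == rx.nominal_hz:
        # Same source: exactly equal frequency, stable unknown phase.
        # Different sources: small frequency mismatch, drifting phase.
        return "mesochronous" if same_source else "plesiochronous"
    # Nominally different frequencies are heterochronous.
    if same_source:
        return "heterochronous (ratiochronous)"
    return "heterochronous (nonratiochronous)"

core = Clock(Fraction(200_000_000), "pll0")
bus = Clock(Fraction(100_000_000), "pll0")
relation = classify(core, bus)
```

Here the receiver frequency is an exact rational multiple (1/2) of the sender's and both share a source, so the pair falls in the ratiochronous case.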
GALS Design Style Taxonomy (Cont’d)

In the next slides,

We describe a simplified example that provides one-way
communication between transmitter and receiver blocks.

The blocks
 operate synchronously using two different clocks
 and are connected together using a FIFO buffer that is
robust and free of metastability.

The FIFO buffer can have almost any capacity, including
just one data item,
 but this may affect throughput.
32
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)
In the next slides,


To send a data item,
1. the transmitter asserts put and drives data_in.
2. The FIFO buffer accepts the data on the rising edge
of put and lowers ok_to_put.
3. If this operation fills the FIFO buffer, ok_to_put
remains low until some data is removed.
33
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS Design Style Taxonomy (Cont’d)
In the next slides,


On the receiver side,
1. the FIFO buffer asserts ok_to_take when data is
available.
2. To remove a data item, the receiver latches data_out
and asserts take.
3. The FIFO buffer lowers ok_to_take until new data is
available.
4. If the FIFO buffer is empty, ok_to_take remains low
until new data is inserted.
34
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
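The put/take protocol described above can be sketched as an untimed behavioral model. The class name and error handling are assumptions; the momentary lowering of ok_to_put and ok_to_take after each transfer is not modeled, only the full and empty conditions.

```python
from collections import deque

class HandshakeFifo:
    """Untimed model of the FIFO buffer between transmitter and receiver."""

    def __init__(self, capacity=1):
        if capacity < 1:
            raise ValueError("capacity must be at least one data item")
        self.capacity = capacity
        self._items = deque()

    @property
    def ok_to_put(self):
        # Low while the buffer is full.
        return len(self._items) < self.capacity

    @property
    def ok_to_take(self):
        # Low while the buffer is empty.
        return len(self._items) > 0

    def put(self, data_in):
        """Transmitter side: assert put while driving data_in."""
        if not self.ok_to_put:
            raise RuntimeError("put asserted while ok_to_put is low")
        self._items.append(data_in)

    def take(self):
        """Receiver side: latch data_out and assert take."""
        if not self.ok_to_take:
            raise RuntimeError("take asserted while ok_to_take is low")
        return self._items.popleft()

fifo = HandshakeFifo(capacity=2)
fifo.put("a")
fifo.put("b")
full_after_two_puts = not fifo.ok_to_put
first_out = fifo.take()
```

A capacity of one still works, as the slides note, but the transmitter then stalls after every put until the receiver takes the item.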
Pausible Clock GALS Design
Style

Chapiro’s GALS approach

using pausible clocks to enable separate clock
domains to communicate without metastability.

each locally synchronous block generates its own
clock with a ring oscillator.

Each ring oscillator’s period is set according to
the speed requirements of the block it drives.
35
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)


Pausible or pausable
Has various modified and customized
versions




Stoppable
Stretchable
Data-driven
…
36
Pausible Clock GALS Design
Style (Cont’d)

Two potential advantages of pausible clocking
 Robustness
 Pausing delays a clock’s sampling edge until after the
arrival of data from the other domains, thus avoiding
metastability altogether.
 Power
 Pausing the clock of a block awaiting communication
prevents that block from dissipating dynamic power.
 Presumably, VDD can be lowered during prolonged stalls to
reduce static power as well.
 Hence, this style may be useful in power-critical designs.
37
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
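A minimal sketch of the pausing idea: a rising edge that would fall inside an interval where the interface is still busy is held off until that interval ends, and the clock then continues from the delayed edge. The function name and the representation of pauses as (start, end) intervals are assumptions; restart jitter of a real ring oscillator is not modeled.

```python
def pausible_clock_edges(period, n_edges, pauses):
    """Return rising-edge times of a clock that can be held off.

    period:  nominal clock period
    n_edges: number of rising edges to generate
    pauses:  (start, end) intervals during which a rising edge is blocked,
             for example while incoming data is still being captured
    """
    edges = []
    t = period
    for _ in range(n_edges):
        for start, end in sorted(pauses):
            if start <= t < end:
                t = end  # stretch: the edge waits until the pause ends
        edges.append(t)
        t += period      # the oscillator resumes from the delayed edge
    return edges

free_running = pausible_clock_edges(10, 4, [])
stretched = pausible_clock_edges(10, 4, [(18, 23)])
```

With these example numbers the second edge moves from time 20 to time 23, and all later edges shift by the same amount.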
Pausible Clock GALS Design
Style (Cont’d)

From the designer’s viewpoint
 Simplifying design reuse
 By encapsulating crucial timing constraints in the
clock generator wrappers.
 By controlling the receiver’s clock, these interfaces
ensure that data arriving at the receiver satisfies the
receiver’s timing requirements, thus completely
avoiding metastability.
 Once this interface wrapper IP has been verified, it
can be reused for many different local blocks without
the need for further timing analysis.
38
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)

From the designer’s viewpoint
 Clock tree latency
 Must be considered in GALS designs.
 If the latency to distribute a clock is larger than
a single clock cycle, invalid operations may
occur after the clock was supposed to have
stopped.
39
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)

From the designer’s viewpoint
 Designing ring oscillators for robustness and
good performance
 A major difficulty in some GALS research
 The clock period can have high jitter, varying
significantly from cycle to cycle as it restarts from a
pause.
 This jitter can be amplified by the clock distribution
network, further cutting into the timing margin.
40
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)

From the designer’s viewpoint
 A potential advantage of ring oscillator clocks is
 That variations in the clock period should track
variations in logic-gate delays across a range of
operating conditions.
 Unfortunately, standard CAD tools do not account for
this behavior during analysis, and they might force
conservative, worst-case designs.
41
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)
Pausible-clock GALS design style: circuit (a) and timing diagram (b)
42
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)
43
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Pausible Clock GALS Design
Style (Cont’d)
44
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style

Uses circuits known as synchronizers

to transfer signals arriving from an outside timing
domain to the local timing domain.
 Although simple asynchronous interfaces
suffer from low throughput,
 This limitation can be overcome with careful
designs.
45
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)
Asynchronous GALS design style employing synchronizers:
circuit (a) and timing diagram (b)
46
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)
47
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)
48
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)
 This simplistic design can transfer at most
 One datum for every three cycles of transmitter
clock φT or receiver clock φR
 whichever is slower
49
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
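Read as a bound, the statement above gives a throughput of at most min(f_T, f_R) / 3 data items per second. A minimal sketch; the function name and the example frequencies are assumptions.

```python
def max_throughput(f_tx_hz, f_rx_hz, cycles_per_datum=3):
    """Upper bound on data items per second for the simple synchronizer
    interface: one datum per `cycles_per_datum` cycles of the slower clock."""
    return min(f_tx_hz, f_rx_hz) / cycles_per_datum

# Hypothetical 300 MHz transmitter and 150 MHz receiver.
bound = max_throughput(300_000_000, 150_000_000)
```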
Asynchronous Interface GALS
Design Style (Cont’d)

From the designer’s viewpoint
 Offering the most flexibility and probably the
easiest integration into existing CAD flows
50
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)

From the designer’s viewpoint
 The modeling and validation of the synchronizer
circuits and the impact of their delay
 Real synchronizers have more complicated behavior
than predicted by simple textbook models, and circuit
simulators such as Spice do not have the numerical
accuracy to verify acceptable reliabilities.
 Recently developed simulation methods address this
problem.
51
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Asynchronous Interface GALS
Design Style (Cont’d)
 From the designer’s viewpoint
 A rule of thumb for synchronizer design is that at
least 40 gate delays should be budgeted for
metastability to resolve to a stable, logical value.
 For a 0.13-micron process with a 60 ps gate
delay, synchronization adds about 2.5 ns of delay
when crossing timing domains.
52
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
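The arithmetic behind this rule of thumb: 40 gate delays at 60 ps each is 2400 ps, that is 2.4 ns, which the slide rounds to about 2.5 ns. The sketch below also converts that delay into whole clock cycles; the two clock periods are hypothetical examples.

```python
def synchronizer_delay_ps(gate_delay_ps, gate_delays=40):
    """Time budgeted for metastability to resolve, in picoseconds."""
    return gate_delay_ps * gate_delays

def latency_in_cycles(delay_ps, clock_period_ps):
    """Whole clock cycles needed to cover the delay (integer ceiling)."""
    return -(-delay_ps // clock_period_ps)

delay = synchronizer_delay_ps(60)               # 0.13-micron example from the slide
slow_clock = latency_in_cycles(delay, 10_000)   # hypothetical 100 MHz clock
fast_clock = latency_in_cycles(delay, 500)      # hypothetical 2 GHz clock
```

At the slower clock the budget fits inside a single cycle, while at the faster clock it costs several cycles, which is why the latency matters more in high-frequency designs.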
Asynchronous Interface GALS
Design Style (Cont’d)
 From the designer’s viewpoint
 Thus, the asynchronous GALS style is expected
to find widespread use in SoC designs
 That can tolerate the extra latency of synchronization
 or that have low clock frequencies (that is, few cycles
of synchronization latency).
 Higher-performance designs will require the
loosely synchronous styles described next.
53
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style

Arises when some bounds on the frequencies
of communicating blocks are known.

In this style, the designer exploits these
bounds to ensure that timing requirements
are met.
54
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)
 Requires timing analysis on the paths
between the sender and receiver
 Is less amenable to dynamic changes in
the clock frequency.
55
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)
The analysis makes handshaking
unnecessary during data transfer.
Compared with those of the other methods,
the resulting circuits
 Can achieve higher performance
 Have more deterministic latencies
56
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)

A loosely synchronous design exploits one of the
known timing relationships described earlier.

The simplest case is a mesochronous relationship,
in which the frequencies are exactly matched and
there is a stable but unknown phase difference.

This commonly occurs when the clocks are derived
from the same source but the latency of delivery to
each block differs.
57
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)

The mesochronous example shown in the
next slides



is based on the Stari (Self-Timed At Receiver’s
Input) scheme
in which clocks φT and φR are derived from the
same source.
The receiver uses a self-timed FIFO buffer to
compensate for the phase difference.
58
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)

The key to high-performance operation is to
initialize the FIFO buffer to be half full.

To get the FIFO buffer half full, special
initialization is needed.
59
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)

During operation, the transmitter puts one datum
into the FIFO buffer every cycle, and the receiver
takes one datum.

Neither needs to check the FIFO buffer status
signals (the FIFO buffer is assumed to be fast
enough).

The FIFO buffer will remain within +/-1 data item of
half full because the frequencies are matched
60
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
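The occupancy claim can be checked with a small event-ordered simulation. It is a sketch under idealized assumptions: jitter-free clocks of identical period, a fixed phase offset strictly between zero and one period, and a FIFO buffer with no internal latency. All names and numbers are illustrative.

```python
def occupancy_trace(period, phase, capacity, cycles):
    """Occupancy after every put and take of a mesochronous FIFO buffer.

    The transmitter puts at k * period, the receiver takes at
    k * period + phase, and the buffer starts half full.
    """
    if not 0 < phase < period:
        raise ValueError("phase must lie strictly between 0 and period")
    events = []
    for k in range(cycles):
        events.append((k * period, +1))          # put on the transmitter clock
        events.append((k * period + phase, -1))  # take on the receiver clock
    events.sort()
    occupancy = capacity // 2                    # special half-full initialization
    trace = []
    for _, delta in events:
        occupancy += delta
        if not 0 <= occupancy <= capacity:
            raise RuntimeError("FIFO overflow or underflow")
        trace.append(occupancy)
    return trace

trace = occupancy_trace(period=10, phase=3, capacity=4, cycles=100)
half_full = 4 // 2
stays_near_half = all(abs(level - half_full) <= 1 for level in trace)
```

Because every put is matched by exactly one take per cycle, the occupancy only alternates between two adjacent values, so neither side needs to check the status signals.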
Loosely synchronous Interface
GALS design style (Cont’d)
Loosely synchronous (mesochronous) GALS design style:
circuit (a) and timing diagram (b)
61
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)
62
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)
63
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)

From the designer’s viewpoint
 The need for “high clock frequencies” and
“low latency” in high-performance designs
will make them candidates for loosely
synchronous techniques.
64
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
Loosely synchronous Interface
GALS design style (Cont’d)
• From the designer’s viewpoint
 To determine the optimal size of the FIFO buffers,
 Timing analysis is necessary to bound how far the relative
phase difference between the sender and receiver may
drift.
 This type of timing analysis is not yet common for
on-chip timing,
 Although it is standard when using interchip, source-synchronous communication (for example, synchronous
DRAMs).
65
Paul Teehan, et al., “A Survey and Taxonomy of GALS Design Styles”, IEEE 2007
GALS SoC Communication
Infrastructure Architectures (Cont’d)
A locally synchronous block with its self-timed wrapper
(Pausible clocking scheme)
66
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS SoC Communication
Infrastructure Architectures (Cont’d)
Block diagram of an asynchronous wrapper
(Pausible clocking scheme)
67
X. Jia, and R. Vemuri, “CAD Tools for a GALS FPGA Architecture”
GALS SoC Communication
Infrastructure Architectures (Cont’d)
Block diagram of two synchronous blocks communicating in a
GALS system
(Pausible clocking scheme)
68
R. Dobkin, et al., “High Rate Data Synchronization in GALS SoCs”, IEEE 2006
GALS SoC Communication
Infrastructure Architectures (Cont’d)
Two synchronous blocks channel communications circuit in a
GALS system
(Pausible clocking scheme)
69
Simon Moore, et al., “Channel Communication Between Independent Clock Domains”
GALS SoC Communication
Infrastructure Architectures (Cont’d)
GALS bus architecture
(Pausible clocking scheme)
70
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
GALS SoC Communication
Infrastructure Architectures (Cont’d)
GALS Ring architecture
(Pausible clocking scheme)
71
M. Amde et al., “Asynchronous On-Chip Networks”, IEE 2005
Designed and Fabricated GALS
Chips

FPGA

SoC
 Marble
 Nexus (Fulcrum)
 …

NoC
 Chain (CHip Area INterconnect)
 Faust (Flexible Architecture of Unified System for Telecom)
 Mango (Message-passing Asynchronous Network-on-chip
providing Guaranteed services over Open core protocol (OCP)
interfaces)
 …
72
Designed and Fabricated GALS
Chips (Cont’d)
T. S. T. Mak, et al., “On-FPGA Communication: An Opportunity for GALS?”,
Proceedings of the Eighteenth UK Asynchronous Forum, 2006
73
Designed and Fabricated
GALS Chips (Cont’d)

MARBLE

As a step in developing such an interconnection standard an Amulet3i contains
the first implementation of MARBLE [5], a 32-bit, multimaster, on-chip bus which
communicates by using handshakes rather than a clock. Apart from this the
signal definitions, with 32-bit address and data, look very similar to a
conventional bus. MARBLE separates address and data communications,
allowing pipelining and interleaving of operations in order to increase the
available bandwidth when several devices require global access.

MARBLE is supported by ‘initiator’ and ‘target’ interfaces which can be attached
to any asynchronous component. These, their address, and the bus wiring
provide all that is needed for communication between the various components.
In Amulet3i there are four initiators and seven targets. For example the
processor’s two local buses each terminate in a MARBLE initiator and the local
data bus is also a MARBLE target which allows DMA and test data in and out of
the RAM from other initiators.
J. SPARSØ, S. FURBER (Editors), “PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN –
A Systems Perspective”, pp. 311
74
Designed and Fabricated
GALS Chips (Cont’d)
J. SPARSØ, S. FURBER (Editors), “PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN –
A Systems Perspective”, pp. 311
75
Designed and Fabricated
GALS Chips (Cont’d)
76
S. Furber, “Future Trends in SoC Interconnect”
Designed and Fabricated
GALS Chips (Cont’d)
J. SPARSØ, S. FURBER (Editors), “PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN –
A Systems Perspective”, pp. 311
77
Designed and Fabricated
GALS Chips (Cont’d)
Nexus System-on-Chip Interconnect
 Non-blocking crossbar
 16 full-duplex ports
 Flow control extends through the crossbar
 Full speed arbitration
 Arbitrary-length “bursts”
 Bridges clock domains
 Scales in bit width and ports
 Process portable
78
A. Lines, “Nexus: Asynchronous Interconnect For Synchronous SoC Designs”, Fulcrum microsystems
Designed and Fabricated
GALS Chips (Cont’d)
A specific Nexus example
Multiprocessor SoC
79
A. Lines, “Nexus: Asynchronous Interconnect For Synchronous SoC Designs”, Fulcrum microsystems
Designed and Fabricated
GALS Chips (Cont’d)




CHAIN (‘Chip area interconnect’)
is currently under development as a possible replacement for
a conventional bus for on-chip communications.
Chain is based around narrow, high-speed, point-to-point links
forming a network rather than a bus. The idea is to exploit the
potential for fast symbol transmission within an asynchronous
system while reducing the number of long distance wires.
By using a delay-insensitive coding scheme Chain relieves
the chip designer of the need to ensure timing closure across
the whole chip; it also provides tolerance of potential
problems such as induced crosstalk on the long
interconnection wires. Again the user need only communicate
with ‘conventional’ parallel interfaces.
J. SPARSØ, S. FURBER (Editors), “PRINCIPLES OF ASYNCHRONOUS CIRCUIT DESIGN –
A Systems Perspective”, pp. 311
80
Designed and Fabricated
GALS Chips (Cont’d)
81
Steve Furber, “Future Trends in SoC Interconnect”
Designed and Fabricated
GALS Chips (Cont’d)



FAUST (Flexible Architecture of Unified
System for Telecom)
A complete System-on-Chip (SoC) framework
based on an asynchronous Network-on-Chip
(NoC)
Supports complex Multi-Carrier OFDM-based
telecom applications.
82
D. Lattard, et al, “A Telecom Baseband Circuit based on an Asynchronous Network-on-Chip”, ISSCC 2007
Designed and Fabricated
GALS Chips (Cont’d)
Faust Block diagram
83
D. Lattard, et al, “A Telecom Baseband Circuit based on an Asynchronous Network-on-Chip”, ISSCC 2007
Designed and Fabricated
GALS Chips (Cont’d)
IP integrated in the NoC
84
D. Lattard, et al, “A Telecom Baseband Circuit based on an Asynchronous Network-on-Chip”, ISSCC 2007
Designed and Fabricated
GALS Chips (Cont’d)
Asynchronous implementation of the NoC
85
D. Lattard, et al, “A Telecom Baseband Circuit based on an Asynchronous Network-on-Chip”, ISSCC 2007
Designed and Fabricated
GALS Chips (Cont’d)
MANGO
(Message-passing Asynchronous
Network-on-chip providing Guaranteed services
over Open core protocol (OCP) interfaces)
The network consists
of NAs implementing
the network access points,
routers,
and
pipelined links.
MANGO-based SoC
86
T. Bjerregaard and J. Sparsø, “Implementation of guaranteed services in the MANGO clockless network-on-chip”, IEE 2006