ZeroSkew2_Report - University of Wisconsin
Proc. 2002 IEEE Workshop on VLSI Design Automation, New Paltz, New York, Dec. 20-23, 2002
Minimum Wirelength Zero Skew Clock Routing Trees with Buffer Insertion
John Thompson, Kurt Ting, and Simon Wong
Department of Electrical and Computer Engineering
University of Wisconsin - Madison
{jdthompson, kting, wangwong}@wisc.edu
ABSTRACT
Zero skew clock routing is an issue of increasing
importance in the realm of VLSI design. As a result of the
increasing speeds of on-chip clocks, zero skew clock tree
construction has become critical for the correct operation
of high performance VLSI circuits. In addition, in an effort
to reduce both power consumption and the deformation of
clock signals at synchronizing elements on a chip, a
minimum wirelength characteristic of clock tree networks
is highly desirable.
In an effort to provide a solution to the current issues
dealing with zero skew clock tree construction, we present
an efficient two-phase algorithm based on the Elmore delay
model, which constructs zero skew clock routing trees with
buffer insertion and minimum wirelength. The results of an
implementation of this algorithm have been verified to
display zero skew characteristics in conformance with the
Elmore delay model equations. The first phase of the
algorithm is a bottom-up deferred merge embedding (DME)
with buffer insertion procedure which enumerates all of the
possible zero skew clock trees for consideration in the
second phase. In the second phase, a top-down embedding
procedure is performed with the objective of minimizing
wirelength.
1. INTRODUCTION
In a synchronous sequential circuit, the clock signal is
used to define a time reference for the movement of data
from one storage element to another, through the circuit.
Due to the extreme importance of the signal in synchronous
circuits, much attention has been paid to the characteristics
of these clock signals and the networks used to route them
on-chip. While they are often regarded in the same light as
any other control signals, closer inspection makes it clear
that special consideration must be given when dealing with
clock signals. These signals can typically be characterized
as having the largest fan-out, longest routing distances, and
the highest operational speeds of any on-chip signal, control
or data. Furthermore, since data signals are driven with a
temporal reference to the on-chip clock signals, these clock
signals must be particularly clean and sharp. In addition, as
technology scales down in feature size, clock signals in
particular become affected by increased resistance due to
their long interconnect lengths and decreasing line
dimensions. The point is illustrated by the following
equation for a wire's resistance.
$$R = \rho \, \frac{l}{A} \tag{1}$$
where R is the resistance of the wire, ρ is the resistivity of
the wire in ohm-meters, l is the length of the wire, and A is
the cross-sectional area of the wire in square meters.
From equation (1), it can be seen that, in general, as the
feature size of VLSI chips decreases by a factor of f in a
single dimension, the resistance of all wires increases by a
factor of f. This is because the length of the wire (l)
decreases by a factor of f, while the cross-sectional area of
the wire (A) also decreases, but by a factor of f^2. This
increased line resistance is one of the primary reasons for
the growing importance of clock distribution on
synchronous performance. Finally, uncontrolled
differences in the delay of the clock signals can severely
limit the maximum performance of the entire system, as
well as create catastrophic race conditions which may
cause incorrect data values to be latched into the circuit's
registers.
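Written out, the scaling argument is a one-line consequence of equation (1). The primed symbols R', l', and A' are our own labels for the scaled quantities and are not part of the original notation:

```latex
R' = \rho \frac{l'}{A'} = \rho \frac{l / f}{A / f^{2}} = f \cdot \rho \frac{l}{A} = f R
```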
In addition to the trend of decreasing feature sizes in
VLSI circuits, a second trend of heightening clock speeds is
also prevalent. Currently, clock speeds achievable in
synchronous VLSI design are mainly limited by two factors,
i) the longest delay path through any block of combinational
logic and ii) clock skew. While the first contributing factor,
that of the maximum delay through combinational logic can
only be solved by further design considerations, the notion
of clock skew can largely be dealt with using clock routing
algorithmic techniques. Clock skew can be defined as the
maximum difference in arrival times of a clocking signal at
the synchronizing elements which it triggers in a design. Its
limiting characteristics on clock cycle period can be
observed from the following well-known inequality that
governs the clock period of a clock signal net.
$$\text{clock period} \ge t_d + t_{skew} + t_{su} + t_{ds} \tag{2}$$
where td is the delay on the longest path through
combinational logic, tskew is the maximum clock skew, tsu is
the set-up time of the synchronizing elements (assuming
they are edge triggered), and tds is the propagation delay
within the synchronizing elements. Furthermore, the term
td can further be broken down into two disjoint components
according to the following equation.
$$t_d = t_{d\text{-}interconnect} + t_{d\text{-}gates} \tag{3}$$
where td-interconnect represents the portion of the delay
through the longest path of combinational logic (td) that can
be attributed to interconnect, and td-gates represents the
portion that can be attributed to the actual delay through
the gates. Increased switching speeds in VLSI circuits due
to decreasing feature sizes will decrease the td-gates term
but not the td-interconnect term.
Figure 1. H-tree over four points
From the above analysis, it can be inferred that the two
dominant terms determining the minimum clock period
(maximum clock speed) are those associated with clock
skew, tskew, and wire interconnect through the longest block
of combinational logic, td-interconnect. Bakoglu [1] noted
that clock skew may account for over 10% of the system's
cycle time in high-performance VLSI circuits. To lessen the
effect of clock skew, a vast amount of research has been
devoted solely to reducing skew in clock routing networks.
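As a small numeric illustration of inequality (2) and decomposition (3), the sketch below adds up the terms for one set of timing values. The function name and every number are hypothetical and chosen only to show how a skew term enters the bound; they are not measurements from this work.

```python
def min_clock_period(t_interconnect, t_gates, t_skew, t_su, t_ds):
    """Lower bound on the clock period from inequality (2), using (3) for t_d."""
    t_d = t_interconnect + t_gates          # equation (3)
    return t_d + t_skew + t_su + t_ds       # right-hand side of (2)


# Hypothetical timing values in nanoseconds, chosen only for illustration.
T_SKEW = 0.5
period = min_clock_period(t_interconnect=2.0, t_gates=3.0,
                          t_skew=T_SKEW, t_su=0.25, t_ds=0.25)
skew_share = T_SKEW / period                # fraction of the cycle lost to skew
print(f"minimum period = {period} ns, skew share = {skew_share:.1%}")
```

With these made-up numbers the skew term is a noticeable fraction of the cycle, which is the effect the zero skew constraint is meant to remove.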
2. RELATED WORK
In the past few decades, much research has been
devoted to the issue of minimizing clock skew to zero.
Several algorithms have been proposed which all achieve
zero clock skew at the cost of varying complexity, wire
lengths, and other measures. The simplest approach was
first proposed by Dhar, Franklin, and Wang [2]. Their
proposed H-tree algorithm is based on the construction of a
completely balanced clock tree as can be seen in Figure 1.
While the algorithm proposed by Dhar et al. will result
in zero skew clock tree construction with a minimum
amount of computational complexity, the approach results
in unacceptably large lengths of wire that are required to
route the resulting clock network. Therefore, this approach
is no longer considered for clock tree construction.
More recently, several others have proposed much more
complex algorithms which all succeed in constructing clock
trees with a characteristic zero skew and varying costs in
terms of complexity and resulting wirelengths. Examples
of these algorithms are the Method of Means and Medians
(MMM) algorithm, which was proposed by Jackson et al.
[3], and the Geometric Matching Algorithm, which was
proposed by Cong et al. [4].
In addition to development of new algorithms that
achieve overall zero skew clock tree construction, other
recent research has aimed at changing the most prominent
view that the entire clock network should have a
characteristic zero skew. The most notable research fitting
into this category was conducted by Chen et al. [5]. They
researched and formulated the notion of associative skew.
In a paper published in 1999, they proposed that it is not
necessary that all synchronizing elements receive clock
signals with relative zero skew. Instead, what is important
is that closely related synchronizing elements receive clock
signals with relative zero skew. For example, flip flops in a
shift register, which are directly connected, should all
receive identical clock signals; that is, clock signals with
the same period and zero skew. On the other hand,
synchronizing elements which are disjoint from one
another in a circuit’s s-graph may not need to receive
relative zero skew clock signals. Illustrations of these two
scenarios are provided in Figures 2(a, b). In their paper,
they also discuss strategies and algorithms to determine
which synchronizing elements should be clustered together
to have relative zero skew, and provide a version of DME
which will construct appropriate clock routing trees.
Figure 2(a). 4-bit shift register (relative zero skew
should be enforced for all of the synchronizing elements)
Figure 2(b). An unconnected s-graph (synchronizing
element s2 may have clock skew relative to elements s0
and s1)
3. PROBLEM FORMULATION
As was stated previously, the problem considered in this
paper is that of constructing a zero skew clock tree with
buffer insertion and minimum wirelength. Stated more
formally, the problem can be formulated as follows:
Given a set of clock sinks, {S}, construct a clock
network which stems from a single root node, R, and
terminates on all sinks, si ∈ {S}. In the construction
of this network, all zero skew equations according to
the Elmore delay model must be observed and
satisfied. In addition, buffers must be inserted
where needed, so as to limit the capacitance seen at
any node to some upper threshold. Finally,
wirelength should be minimized.
The Elmore delay model, which is used to ensure the zero
skew characteristic, is based on the first-order moment of
the impulse response and is developed as follows. Let α and
β respectively denote the resistance and capacitance per
unit length of interconnect wire, so that the resistance, rev,
and capacitance, cev, of a given interconnect wire, ev, are
given by α|ev| and β|ev|, respectively. For each sink si in
the tree T(S), there is a loading capacitance cL, which is the
input capacitance of the functional unit driven by si.
We let Tv denote the subtree of T(S) rooted at v, and let cv
denote the node capacitance of v. The tree capacitance of
Tv is denoted by Cv and equals the sum of the capacitances in
Tv. Cv is calculated using the following recursive formula:
$$C_v = \begin{cases} c_L & (v \text{ is a sink node } s_i) \\ c_v + \sum_{w \in children(v)} \left( c_{e_w} + C_w \right) & (v \text{ is an internal node}) \end{cases} \tag{4}$$
The Elmore delay can be calculated as follows:
$$t_{ED}(u, w) = \sum_{e_v \in path(u, w)} r_{e_v} \left( \frac{1}{2} c_{e_v} + C_v \right) \tag{5}$$
Finally, it should be noted that Elmore delay is additive.
That is, if v is a vertex on the u-w path, then tED(u, w) =
tED(u, v) + tED(v, w), and in particular, if v is a child of u on
the u-si path, then tED(u, si) = rev((1/2)cev + Cv) + tED(v, si). A
sink node si may be treated as a trivial zero skew subtree
with capacitance cL, and delay zero.
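The recursive tree capacitance and the additive property of the Elmore delay translate almost directly into code. The following is a minimal sketch, not the implementation used in this work: the `Node` class, the function names, and the unit values of α and β are our own assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    edge_len: float = 0.0      # |e_v|: wire length from the parent to this node
    cap: float = 0.0           # c_v for an internal node, c_L for a sink
    children: list = field(default_factory=list)


def tree_cap(v, beta):
    """Tree capacitance C_v, following the recursive formula (4)."""
    if not v.children:
        return v.cap
    return v.cap + sum(beta * w.edge_len + tree_cap(w, beta) for w in v.children)


def sink_delays(v, alpha, beta, t=0.0):
    """Elmore delays from v to every sink below it, accumulated edge by edge."""
    if not v.children:
        return [t]
    delays = []
    for w in v.children:
        r_e = alpha * w.edge_len           # wire resistance of e_w
        c_e = beta * w.edge_len            # wire capacitance of e_w
        delays += sink_delays(w, alpha, beta,
                              t + r_e * (0.5 * c_e + tree_cap(w, beta)))
    return delays


# Hypothetical symmetric tree: two identical sinks, each on a wire of length 2.
root = Node(children=[Node(edge_len=2.0, cap=1.0), Node(edge_len=2.0, cap=1.0)])
delays = sink_delays(root, alpha=1.0, beta=1.0)
skew = max(delays) - min(delays)
```

Recomputing `tree_cap` at every edge keeps the sketch short; a practical version would cache C_v per node so that each capacitance is computed once.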
4. DME WITH BUFFER INSERTION ALGORITHM
We propose an algorithm for zero skew clock tree
routing which simultaneously minimizes wirelength and
inserts buffers at appropriate locations. We accomplish this
by combining the well-known Deferred Merge Embedding
(DME) algorithm with a simple buffer insertion heuristic.
The DME algorithm, first proposed by Chao et al. [6], is
based on the idea that the final interconnect embedding for
a clock network should be postponed as long as possible, in
the hope that better embedding choices can be made given
a broader view of the problem. With this in mind, the DME
algorithm is performed in two phases. The first phase works
bottom-up, from the network sinks to the root tapping
point, and determines all possible zero skew tapping point
locations for the clock network. The second phase is a
top-down embedding procedure which chooses optimal
tapping point locations among those found to provide zero
skew in the first phase. More formal descriptions of the two
phases implemented in our research are provided in the
sections that follow.
4.1 Bottom-up Phase I
As was stated previously, the purpose of the first phase
of our two-phase algorithm is to determine the possible
zero skew tapping locations that are most likely to lead to a
minimum wirelength tree. This phase of our algorithm was
first described by Masato Edahiro [7] in 1993. His
bottom-up phase, which we have chosen to use, is described
below.
Let K be the current set of points and segments for
consideration in an iteration of the algorithm; initially K
will be the set of all clock sinking locations (K = {S}). On
each iteration of the procedure, the nearest neighbor pair in
the current set, K, is first found. Next, a merging segment
is constructed between the two nearest neighbors. The
process involved for merging segment construction was
first described by Chao et al. [6]. In their work, two
general cases of merging segment construction are
discussed. These two cases correspond to two scenarios,
the normal case and the case in which interconnect snaking
is required.
Before determining which merging segment scenario is
the one present, the zero skew tapping length between the
two selected Manhattan segments must be found. To do
so, let TSa and TSb represent the two trees of merging
segments to be merged. Let TSa and TSb, respectively,
have capacitances C1 and C2 and delays t1 = tED(a) and t2 =
tED(b). In addition, let pl(v) be a merging point with
minimum merging cost.
From the Elmore delay model equation given by tED(v,
a) = rea((½)cea + C1), it can be seen that pl(v) satisfies the
following equation.
1

1

rea  Cea  C1   t1  reb Ceb  C 2   t 2
2

2

Figure 3(a). Merging Manhattan segments (Case #1)
(6)
Now, let the Manhattan distance between the two
selected merging segments, d(ms(a), ms(b)), be equal to κ.
Supposing that TSa and TSb can be merged with merging
cost κ, that is |ea| = x and |eb| = κ - x for 0 ≤ x ≤ κ, then we
have the resistances rea = αx and reb = α(κ - x) and the
capacitances cea = βx and ceb = β(κ - x). After substituting
into (6) and solving for x, the following is obtained.
$$x = \frac{(t_2 - t_1) + \alpha \kappa \left( C_2 + \frac{1}{2} \beta \kappa \right)}{\alpha \left( \beta \kappa + C_1 + C_2 \right)} \tag{7}$$
From here, there are two cases. The first requires no
interconnect snaking and occurs when 0 ≤ x ≤ κ. In this
case, |ea| = x and |eb| = κ - x, and the merging Manhattan
segment can be determined following the procedure shown
in Figure 3(a). It can be seen from the figure that two
Manhattan radii are drawn, one around each Manhattan
segment or core, with radii given by the values |ea| and |eb|
obtained previously. The new merged Manhattan segment
is the overlap of the two Manhattan radii.
In the other case, in which x < 0 or x > κ, the
assumption of merging cost κ results in a negative edge
length for either ea or eb. In this case, an extended distance
κ' > κ is required to balance the delays of the two trees. If
x < 0, which means that t1 > t2, then we choose pl(a) as the
merging point and set |ea| = 0 and |eb| = κ'. Solving the
following equation for κ',
$$t_1 = \alpha \kappa' \left( \frac{1}{2} \beta \kappa' + C_2 \right) + t_2 \tag{8}$$
one obtains the following.
$$\kappa' = \frac{\left[ (\alpha C_2)^2 + 2 \alpha \beta (t_1 - t_2) \right]^{1/2} - \alpha C_2}{\alpha \beta} \tag{9}$$
Similarly, if x > κ, we set |eb| = 0 and
$$|e_a| = \frac{\left[ (\alpha C_1)^2 + 2 \alpha \beta (t_2 - t_1) \right]^{1/2} - \alpha C_1}{\alpha \beta} \tag{10}$$
In the interconnect snaking case, only one Manhattan
radius is drawn, for whichever of ea and eb is nonzero. The
resulting merged Manhattan segment is the portion of the
Manhattan core that is enclosed in the other Manhattan
core's Manhattan radius. This idea is illustrated in
Figure 3(b).
Figure 3(b). Merging Manhattan segments (Case #2)
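The two merging cases reduce to a short function that returns the edge lengths |ea| and |eb|. This is a minimal sketch under the Elmore model as stated in this section; the function names and the numeric values are hypothetical, and the closed forms are the standard solutions of equations (6) and (8).

```python
import math


def merge_lengths(t1, c1, t2, c2, kappa, alpha, beta):
    """Return (|e_a|, |e_b|) that balance the Elmore delays of two subtrees."""
    x = ((t2 - t1) + alpha * kappa * (c2 + 0.5 * beta * kappa)) / (
        alpha * (beta * kappa + c1 + c2))          # equation (7)
    if 0.0 <= x <= kappa:
        return x, kappa - x                        # normal case, no snaking
    if x < 0.0:                                    # subtree a is slower: tap on a
        ext = (math.sqrt((alpha * c2) ** 2 + 2 * alpha * beta * (t1 - t2))
               - alpha * c2) / (alpha * beta)      # equation (9)
        return 0.0, ext
    ext = (math.sqrt((alpha * c1) ** 2 + 2 * alpha * beta * (t2 - t1))
           - alpha * c1) / (alpha * beta)          # equation (10)
    return ext, 0.0


def wire_delay(length, load, alpha, beta):
    """Elmore delay of one wire of the given length driving capacitance `load`."""
    return alpha * length * (0.5 * beta * length + load)


# Hypothetical values: subtree a is much slower than b, so snaking is needed.
ea, eb = merge_lengths(t1=10.0, c1=1.0, t2=0.0, c2=1.0,
                       kappa=2.0, alpha=1.0, beta=1.0)
```

In this example `ea` is zero and `eb` exceeds κ, and adding `wire_delay(eb, 1.0, 1.0, 1.0)` to t2 reproduces t1, which is the balance condition the extended length is meant to satisfy.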
Now we present a formal description of our bottom-up
phase of clock tree construction, which is the same
algorithm originally used in Edahiro's 1993 work [7], with
modifications made to include a simple buffer insertion
heuristic.
Algorithm Find_Center([NS])
Input:  Set of clock sinks {S}
Output: Tree of merged Manhattan segments, TS
K ← {S}
while |K| != 1  (if |K| = 1, then the element in K is the
                 segment for the center vc)
    Choose the nearest neighbor pair of Manhattan
        segments, v1 and v2, from K
    Calculate the segment for v from v1 and v2 using the
        zero skew merge
    Delete v1 and v2 from K
    Add v to K
    if v's node capacitance > specified maximum node
            capacitance
        Insert a buffer at node v
        Set v's capacitance to zero
    endif
endwhile
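The loop above can be sketched in a few dozen lines. This is a simplified illustration rather than our implementation, and it rests on several labeled assumptions: segments are stored in rotated coordinates (u = x + y, v = x - y), where a Manhattan segment becomes an axis-aligned box and Manhattan distance becomes the larger of the two coordinate gaps; the nearest pair is found by brute force instead of Delaunay triangulation; and the buffer rule simply zeroes the capacitance as in the pseudocode, with no buffer delay modeled. All names are hypothetical.

```python
import math
from itertools import combinations


def merge_lengths(t1, c1, t2, c2, k, a, b):
    """Zero skew edge lengths (|e_a|, |e_b|) from the closed forms in Section 4.1."""
    x = ((t2 - t1) + a * k * (c2 + 0.5 * b * k)) / (a * (b * k + c1 + c2))
    if 0.0 <= x <= k:
        return x, k - x
    if x < 0.0:
        return 0.0, (math.sqrt((a * c2) ** 2 + 2 * a * b * (t1 - t2)) - a * c2) / (a * b)
    return (math.sqrt((a * c1) ** 2 + 2 * a * b * (t2 - t1)) - a * c1) / (a * b), 0.0


def dist(p, q):
    """Manhattan distance between two segments stored as (u_lo, u_hi, v_lo, v_hi)."""
    gap_u = max(p[0] - q[1], q[0] - p[1], 0.0)
    gap_v = max(p[2] - q[3], q[2] - p[3], 0.0)
    return max(gap_u, gap_v)


def overlap(p, rp, q, rq):
    """Intersect the two Manhattan radii (p grown by rp, q grown by rq)."""
    box = []
    for i in (0, 2):
        lo = max(p[i] - rp, q[i] - rq)
        hi = min(p[i + 1] + rp, q[i + 1] + rq)
        if lo > hi:                       # rounding left a tiny gap: collapse it
            lo = hi = 0.5 * (lo + hi)
        box += [lo, hi]
    return tuple(box)


def find_center(sinks, alpha, beta, cap_max):
    """sinks: list of (x, y, c_L). Returns the root of the merged segment tree."""
    K = [{"seg": (x + y, x + y, x - y, x - y), "cap": c, "delay": 0.0,
          "kids": [], "buffer": False} for x, y, c in sinks]
    while len(K) > 1:
        v1, v2 = min(combinations(K, 2),
                     key=lambda pr: dist(pr[0]["seg"], pr[1]["seg"]))
        kappa = dist(v1["seg"], v2["seg"])
        e1, e2 = merge_lengths(v1["delay"], v1["cap"], v2["delay"], v2["cap"],
                               kappa, alpha, beta)
        v = {"seg": overlap(v1["seg"], e1, v2["seg"], e2),
             "cap": v1["cap"] + v2["cap"] + beta * (e1 + e2),
             "delay": v1["delay"] + alpha * e1 * (0.5 * beta * e1 + v1["cap"]),
             "kids": [(v1, e1), (v2, e2)], "buffer": False}
        if v["cap"] > cap_max:            # simple buffer insertion heuristic
            v["buffer"], v["cap"] = True, 0.0
        K = [n for n in K if n is not v1 and n is not v2] + [v]
    return K[0]


# Hypothetical input: four identical sinks on the corners of a 4 x 4 square.
root = find_center([(0, 0, 1.0), (4, 0, 1.0), (0, 4, 1.0), (4, 4, 1.0)],
                   alpha=1.0, beta=1.0, cap_max=100.0)
```

For this symmetric input the root segment collapses to the single rotated point (4, 0), which is the square's center (2, 2) in chip coordinates. The brute-force pair search is quadratic per iteration, which is the cost the Delaunay-based search is meant to avoid.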
In our implementation of this bottom-up phase, the
nearest neighbor pair in a set of Manhattan segments was
found using the Delaunay triangulation graph method. This
method is not discussed further in this paper, as it is
outside the scope of the research conducted; however, an
excellent reference on the subject is the book
Computational Geometry: An Introduction, by Preparata
and Shamos [8]. Once the nearest neighbors have been
found, the zero skew merge between the selected Manhattan
segments is performed as previously described. In addition,
buffer insertion is considered at this point. This procedure
of pairing and merging with buffer insertion is repeated
until only one node, the root node, remains. The top-down
phase of our algorithm is then performed.
4.2 Top-down Phase II
At this point, the bottom-up phase has completed and a
tree of merged Manhattan segments has been obtained. The
next step is to determine the exact final embeddings of
internal nodes in the zero skew clock tree. This is
accomplished in the top-down second phase of the
algorithm.
For a node v in topology G, i) if v is the root node, then
select any point in ms(v), the root node's Manhattan
segment, to be pl(v), the new tapping or merging point; or
ii) if v is an internal node other than the root, choose pl(v)
to be any point in ms(v) that is at a distance |ev| or less from
the placement of v's parent p. Because the merging
segment ms(p) was constructed such that d(ms(v), ms(p)) ≤
|ev|, there must exist some choice of pl(v) satisfying this
condition. In case ii), the algorithm first creates a square
tilted rectangular region (TRR), trrp, with a radius of |ev|
and a core equal to {pl(p)}; then pl(v) can be any point
from ms(v) ∩ trrp. This is illustrated in Figure 4. A more
formal definition of this procedure, termed
Find_Exact_Placement by Chao et al. [6], is provided
below.
Algorithm Find_Exact_Placement
Input:  Tree of merged Manhattan segments, TS,
        containing ms(v) and |ev| for each node v in G
Output: Zero skew tree T(S)
for each internal node v in G (top-down order)
    if v is the root
        Choose any pl(v) ∈ ms(v)
    else
        Let p be the parent node of v
        Construct trrp as follows:
            core(trrp) ← {pl(p)}
            radius(trrp) ← |ev|
        Choose any pl(v) ∈ ms(v) ∩ trrp
    endif
endfor
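The top-down step can be sketched as follows. This is an illustration under our own assumptions, not the code used in this work: merging segments are stored in rotated coordinates (u = x + y, v = x - y) as boxes (u_lo, u_hi, v_lo, v_hi), in which Manhattan distance becomes the larger coordinate difference, and "any point" of ms(v) ∩ trrp is taken to be the point of ms(v) nearest to pl(p). The dictionary layout and names are hypothetical.

```python
def to_xy(u, v):
    """Map rotated coordinates (u = x + y, v = x - y) back to chip coordinates."""
    return ((u + v) / 2.0, (u - v) / 2.0)


def pick_in_trr(seg, parent_uv, edge_len, tol=1e-9):
    """Point of ms(v) closest to pl(p); it must lie inside trr_p of radius |e_v|."""
    u = min(max(parent_uv[0], seg[0]), seg[1])     # clamp onto the segment
    v = min(max(parent_uv[1], seg[2]), seg[3])
    if max(abs(u - parent_uv[0]), abs(v - parent_uv[1])) > edge_len + tol:
        raise ValueError("ms(v) does not reach trr_p; the tree is inconsistent")
    return (u, v)


def find_exact_placement(root):
    """Top-down embedding. Nodes are dicts with 'name', 'seg', and 'kids'."""
    placed = {}
    stack = [(root, (root["seg"][0], root["seg"][2]))]   # any point of ms(root)
    while stack:
        node, uv = stack.pop()
        placed[node["name"]] = to_xy(*uv)
        for child, edge_len in node["kids"]:
            stack.append((child, pick_in_trr(child["seg"], uv, edge_len)))
    return placed


# Hypothetical two-sink tree: sinks at (0, 0) and (2, 2), each wired with length 2.
a = {"name": "a", "seg": (0.0, 0.0, 0.0, 0.0), "kids": []}
b = {"name": "b", "seg": (4.0, 4.0, 0.0, 0.0), "kids": []}
root = {"name": "root", "seg": (2.0, 2.0, -2.0, 2.0), "kids": [(a, 2.0), (b, 2.0)]}
placement = find_exact_placement(root)
```

Choosing the nearest point is only one valid choice; any point of the intersection satisfies the zero skew edge lengths. In the snaking case the nearest point lies strictly inside trrp, and the remaining length has to be made up by detour wiring.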
Figure 4. Exact embedding illustration
5. IMPLEMENTATION DETAILS
In the following subsections, some of the more
noteworthy details of our implementation of the algorithm
described in Section 4 are provided. Again, the content has
been split to reflect the two phases of the algorithm.
5.1 Bottom-up Phase I
There are three points to make about our
implementation of the first phase of the algorithm. The
first is to credit the use of the DCT Delaunay triangulation
code, authored by Geoff Leach from the Department of
Computer Science at RMIT.
The second concerns the methods used to avoid
mishaps due to the rounding errors that are inherent in
floating-point computation. The primary concern about
rounding errors in our implementation was that they
would make overlapping Manhattan radii impossible to
find consistently. Therefore, rather than looking for an
exact overlap between Manhattan radii, an alternative
approach was taken. It was first noted that although the
truly overlapping segments would most likely not overlap
exactly in our representation, they would be very close to
doing so. Therefore, it was decided that it would be best to
determine which two sides of the two Manhattan radii
should be overlapping, and then approximate the resulting
merged Manhattan segment as closely as possible. In order
to determine the overlapping sides between two Manhattan
radii, all four sides of each radius are projected onto the
x-axis to find their x-intercepts. Then the differences of all
x-intercept combinations which result in valid overlapping
pairs are computed. Table 1 provides the valid overlapping
side pairs in reference to Figure 5, which gives the
reference notation we used for numbering the sides of a
Manhattan radius.
Figure 5. Side numbering convention for Manhattan
radii
Manhattan radius #1 side    Manhattan radius #2 side
1                           3
2                           4
3                           1
4                           2
Table 1. Legal overlapping Manhattan radii sides
Once the four differences, in accordance with Table 1,
are computed, the minimum of the four is found and the
two corresponding sides are determined to be the truly
overlapping sides. From here, a simple comparison of the
corner points of the corresponding sides, which searches
for two points of inclusion, can be made to determine the
newly merged Manhattan segment.
The third point to be addressed concerns the handling
of the Delaunay triangulation for line segments. Because
the triangulation package used supports the nearest
neighbor calculation for points only, each line segment was
approximated by a series of points which lie on the
corresponding line segment.
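The point approximation of a line segment can be sketched as follows. The function name and the `spacing` parameter are our own assumptions; the text does not state how densely segments were sampled in our implementation.

```python
import math


def sample_segment(p, q, spacing):
    """Evenly spaced points from p to q (inclusive), at most `spacing` apart."""
    (x0, y0), (x1, y1) = p, q
    length = abs(x1 - x0) + abs(y1 - y0)     # Manhattan length of the segment
    if length == 0:
        return [p]                           # a segment that is a single point
    n = math.ceil(length / spacing)          # number of equal steps
    return [(x0 + (x1 - x0) * i / n, y0 + (y1 - y0) * i / n)
            for i in range(n + 1)]


# Hypothetical Manhattan segment from (0, 2) to (2, 0), sampled every unit length.
points = sample_segment((0.0, 2.0), (2.0, 0.0), spacing=1.0)
```

A smaller spacing tracks the segment more closely at the price of more points in the triangulation, and each sampled point has to be tagged with the segment it came from so that two samples of the same segment are never reported as a nearest neighbor pair.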
5.2 Top-down Phase II
In the top-down phase, the exact tapping point on each
merging segment is located according to the algorithm
steps presented in Section 4.2. The implementation of this
algorithm is straightforward. However, there is one major
concern about proper wiring in the case of a snaking
situation.
When snaking is required between a parent and one of
its child segments, the wire length obtained from the
Elmore delay model is larger than the Manhattan distance
between them. Therefore, a plain Manhattan connection
would violate the wire length requirements that were
determined in the first phase of our algorithm. To fulfill
these requirements, detour wiring is implemented.
The detour wire unit is a pre-defined wire pattern
intended to provide the extra wire length required by the
Elmore model. A sample is shown in Figure 6. The unit
routes from tapping point one to point five, where the
"source" is the parent's tapping point, and the "target" is
the tapping point on a child segment. Connections b and d
constitute the actual Manhattan distance between the
source and target, while connections a and c serve as the
additional snaking wires. Detour wiring is accomplished
by concatenating the detour units into the Manhattan path,
as shown in Figure 6.
Figure 6. Notation of a detour wiring unit in the x-plane
In the above figure, a detouring example in the x-plane
is demonstrated. By defining the number of detour units as
n, the lengths of each connection l_i are calculated by the
following equations (11 - 14).
$$\Delta = \text{difference between the x-coordinates of source and target} \tag{11}$$
$$E = \text{required wire length} \tag{12}$$
$$l_b = l_d = \frac{\Delta / n}{2} \tag{13}$$
$$l_a = l_c = \frac{E / n}{2} \tag{14}$$
For a detouring example in the y-plane, Δ represents
the difference in y-coordinates between the two tapping
points, and the notation of connections is illustrated in
Figure 7.
Figure 7. Notation of a detour wiring unit in the y-plane
After connecting all of the detour units, the remaining
endpoint is connected directly to the target. The tapping
points in the snaking case are thereby connected with the
required wire length. Figure 8 shows an example of a
snaking case on tapping points with different coordinates.
Figure 8. Example of detour wiring
6. RESULTS AND OBSERVATIONS
In order to evaluate the performance of our algorithm,
we implemented a version of it on a target SPARC Ultra 10
machine. After completing the implementation, several of
the UCLA benchmarks, provided by C. W. Tsao, were run
with our implementation to observe our results and to
compare them to those of Tsao's implementation. In what
follows, several tables of data and figures are presented
along with explanations and the implications of our
findings.
Name    # Sinks   Wirelength    Tsao's Wirelength   # Buffers
Prim1   269       167,621       132,120             135
Prim2   603       388,351       313,613             300
R1      267       1,374,093     1,320,665           30
R2      598       2,689,683     2,602,907           64
R3      862       3,440,234     3,388,951           84
R4      1903      7,010,219     6,828,510           183
R5      3101      10,408,226    10,242,660          285
Table 2. Implementation results and comparison to Tsao's
implementation
From the data presented in Table 2, it can be seen that while
our implementation does not perform quite as well as Tsao's, who
has a several-year head start on us, it does perform quite well:
our wirelength is roughly 1.5% to 4% longer than Tsao's on
benchmarks R1 through R5, and roughly 24% to 27% longer on
Prim1 and Prim2. We partially attribute the differences in our
results to the approximations discussed in Section 5. Other
areas of improvement are discussed in Section 8.
In addition to obtaining raw data on wirelength and the
number of buffers inserted, Matlab plots of our merged
Manhattan segment trees and routed zero skew clock trees were
also generated in code. Examples of these plots for benchmark
Prim1 are provided in Figures 9(a, b). In figure 9(b), the blue
mark is the root node, and all boxes represent buffers.
Figure 9(a). Merged Manhattan segment tree for
benchmark R1
Figure 9(b). Zero skew clock tree for benchmark R1
7. CONCLUSIONS
Several conclusions can be drawn from our work. First,
it is clear from the preceding discussion that clock routing
is by no means trivial. The task of routing such a large
critical net so that its signal arrives at many sinks scattered
across a chip is both difficult to accomplish and, more
importantly, difficult to accomplish well. In our research,
we both formulated and implemented a zero skew clock
routing algorithm with minimum wire length and
on-the-fly buffer insertion based on the well-known
deferred merge embedding approach. From the results
obtained from our runs on the UCLA benchmarks, it is
apparent that while our implementation does quite well at
solving the specified problem, it is by no means
state-of-the-art. However, we have noted several
improvements that could be made to our implementation in
the future work section (Section 8).
8. FUTURE WORK AND IMPROVEMENTS
As one can imagine, a research project such as the one
discussed in this paper deserves a large amount of time in
order to obtain optimal results. However, we were only
able to spend roughly one month on both the theory behind
our clock routing algorithm and its implementation. As a
result, we feel that this project, while making some points
about clock routing apparent, could be improved in several
ways. First and foremost, it would be desirable to use a
version of Delaunay triangulation code specifically
designed for the task at hand. While line segment
approximation through a series of points is fairly reliable,
it is not foolproof. Secondly, we feel that a better means of
verifying the zero skew characteristic is necessary. We
have attempted to show that our trees are nearly zero skew
by reapplying the Elmore delay model to our final results.
While this can be argued to be sufficient to show roughly
zero skew (no model is exact), in reality it is redundant,
since it is the same model we use to construct the tree in
the first place. Thirdly, it would be interesting to
investigate the use of more sophisticated buffer insertion
heuristics with our algorithm. Finally, our top-down
procedure uses a simple "connect the dots" type of router
which does not attempt to share wires between interconnect
paths. A router which does could save further on
interconnect wire length and should thus be employed in
the future.
9. REFERENCES
[1] H. B. Bakoglu. Circuits, Interconnections, and
Packaging for VLSI. Reading, MA: Addison-Wesley,
1990.
[2] Dhar, Franklin, and Wang. Reduction of Clock Delays
in VLSI Structure. ICCAD, 1984.
[3] Jackson, Srinivasan, and Kuh. Clock Routing for
High-Performance ICs. DAC, 1990.
[4] Cong, Kahng, and Robins. Matching Based Models
for High-Performance Clock Routing. IEEE TCAD,
1993.
[5] Y. Chen, A. B. Kahng, G. Qu, and A. Zelikovsky. The
Associative-Skew Clock Routing Problem. Proc.
IEEE/ACM Intl. Conference on Computer-Aided
Design, 168-172, November 1999.
[6] T.-H. Chao, Y.-C. Hsu, J.-M. Ho, K. Boese, and A.
Kahng. Zero Skew Clock Routing with Minimum
Wirelength. IEEE Transactions on Circuits and
Systems - II: Analog and Digital Signal Processing 39
(1992), 799-814.
[7] Masato Edahiro. A Clustering-Based Optimization
Algorithm in Zero-Skew Routings. Proceedings of the
30th International Design Automation Conference,
pp. 612-616, June 14-18, 1993, Dallas, Texas, United
States.
[8] F. P. Preparata and M. I. Shamos. Computational
Geometry: An Introduction. Springer-Verlag, New
York, New York, 1985.