
Timing convergence techniques in digital VLSI designs

IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI-2017)
Timing Convergence Techniques in Digital VLSI
Designs
Linumon Thomas
Kiran V
Dept. of Electronics and Communication
RVCE, Bengaluru, Karnataka, India
Dept. of Electronics and Communication
RVCE, Bengaluru, Karnataka, India
Abstract-The primary goal of every design is to improve the
performance of the system. In digital designs, increasing the
operating frequency with each version of the design is very
important. As frequency increases, static timing analysis
reports a larger number of paths with negative margin.
Techniques for bringing negative-margin paths to positive
margin are explained here. Timing closure can be achieved
through cell resizing, placement optimization, clock tuning and
routing optimization.
Keywords-Static timing analysis, timing convergence, clock
tuning, placement and routing optimizations
I. INTRODUCTION
The target of all digital designs is to improve the
performance of the system. An increase in the frequency of
operation is a major contributor to performance improvement.
As the maximum frequency of operation of a circuit increases,
the number of paths with negative margin increases. The
designer's job is to bring every path to a positive margin.
Timing convergence can be obtained by proper choice of cells,
clock tuning, placement optimization and routing
optimization. These techniques are explained in the following
section.
II. TIMING CONVERGENCE TECHNIQUES
A. Proper choice of standard library cells
Each process technology comes with a set of standard
cells from which the designer can choose based on
requirements. Each cell is available in different flavors
and sizes. Proper selection of size helps in timing
convergence; both upsizing and downsizing can be useful.
If a cell is upsized, the cell delay decreases and the
output slope also improves, so the total path delay is
reduced. If a path has a setup margin violation, upsizing
reduces the path delay and helps rectify the violation.
Similarly, if a path has a hold violation, downsizing
increases the cell delay and the overall path delay, so
downsizing a cell helps in fixing hold violations.
A standard cell library also contains cells with high
threshold voltage. In paths with a hold violation, these cells
can be inserted. These cells are slow because of their high
threshold voltage and can be helpful in hold violation fixes.
B. Clock Optimization
Clock optimization, generally known as clock tuning,
can be used for timing convergence. If a path has a setup
violation, it can be fixed by clock pushing. Clock pushing is
the process of adding extra clock buffers to the clock network.
Clock pushing can also be achieved by increasing the delay of
clock cells. A setup violation occurs when the clock arrives at
a latch too early for the data to meet the setup condition.
Figure 2.1: Setup convergence by clock pushing
In Figure 2.1, flip-flop FF2 has a setup violation.
To fix the violation, the clock is pushed as shown. This is
achieved by adding clock buffer CLK BUF2 in the clock
network. It delays the clock at FF2 and gives the required
setup margin.
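The effect of clock pushing on margins can be illustrated with a small calculation. The sketch below uses the standard single-cycle setup and hold slack relations from static timing analysis. The function names and all delay values (in picoseconds) are illustrative assumptions, not figures from the design discussed in this paper.

```python
# Illustrative slack arithmetic for a launch flop -> logic -> capture flop path.
# All names and delay values (picoseconds) are hypothetical.


def setup_slack(period, launch_clk, capture_clk, clk_to_q, logic_delay, setup_time):
    """Setup slack = data required time - data arrival time."""
    arrival = launch_clk + clk_to_q + logic_delay
    required = period + capture_clk - setup_time
    return required - arrival


def hold_slack(launch_clk, capture_clk, clk_to_q, logic_delay, hold_time):
    """Hold slack = data arrival time - earliest allowed arrival time."""
    arrival = launch_clk + clk_to_q + logic_delay
    required = capture_clk + hold_time
    return arrival - required


PERIOD = 500
PATH = dict(launch_clk=100, clk_to_q=60, logic_delay=430)
BUFFER_DELAY = 40  # assumed delay of the extra clock buffer at the capture flop

# Capture clock latency before and after pushing the clock at the capture flop.
setup_before = setup_slack(PERIOD, capture_clk=100, setup_time=40, **PATH)
setup_after = setup_slack(PERIOD, capture_clk=100 + BUFFER_DELAY, setup_time=40, **PATH)
hold_before = hold_slack(capture_clk=100, hold_time=20, **PATH)
hold_after = hold_slack(capture_clk=100 + BUFFER_DELAY, hold_time=20, **PATH)

print("setup slack:", setup_before, "->", setup_after)
print("hold slack: ", hold_before, "->", hold_after)
```

With these assumed numbers the setup slack moves from negative to positive, while the hold slack at the same flop shrinks by the buffer delay, so a pushed path should be rechecked for hold.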
In paths with a hold violation, clock pulling can be
used to fix the paths. Clock pulling is the process of
decreasing clock delay. Assume that in Figure 2.1, FF2 has a
hold violation. It can be fixed by making the clock arrive
earlier at FF2. This can be achieved by removing CLK BUF2 or
by decreasing the cell delay of CLK BUF2 in Figure 2.1. Clock
pulling applied to Figure 2.1 to fix the hold violation at FF2
is shown in Figure 2.2.
978-1-5386-0814-2/17/$31.00 ©2017 IEEE
Figure 2.2: Hold violation fix by removing CLK BUF2
C. Optimization of placement of cells
One of the major aspects that decides the timing of circuit
paths is the placement of cells. Optimum placement of cells is
very important for both power and timing convergence. To
reduce path delays, drivers and receivers should be placed at
the minimum possible distance. If a driver is driving more
than one receiver, as shown in Figure 2.3, the driver should
be placed at the center of the configuration with the
receivers symmetric and equidistant to achieve proper load
balancing.
Figure 2.3: Bad placement of blocks
Figure 2.3 shows a configuration in which a driver is
driving 3 receivers. This is an example of bad placement: the
receivers are concentrated at one end and the driver is at the
opposite end. The optimum placement for this structure is
shown in Figure 2.4. Here the driver is kept at the center
with the receivers spread around it. This ensures proper load
balancing and smaller path delays.
Figure 2.4: Optimum placement of a driver and receivers
When cell blocks are placed as close as possible, the net
delay between the cells decreases and the total path delay is
reduced. Lower delay helps in achieving timing closure.
Another example of bad placement is shown in Figure
2.5. Here the driver is driving only one receiver, Receiver 1,
which in turn drives another receiver, Receiver 2. The driver
and Receiver 1 should be kept close together. This helps in
reducing net delay and also reduces the metal resources used.
Figure 2.5: Non optimum placement
An ideal placement for the above configuration is
shown in Figure 2.6. The driver and Receiver 1 are placed
closer together and Receiver 2 is placed near Receiver 1. The
total net length is then halved.
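The placement argument can be made concrete by comparing total Manhattan wirelength for different block positions. The sketch below is only an illustration: the helper names and all coordinates are assumptions chosen to mimic the two situations described (one driver with three receivers, and a driver feeding a chain of two receivers), not values read from the figures.

```python
# Manhattan wirelength comparison for hypothetical block placements.
# Coordinates are illustrative and are not taken from the figures.


def manhattan(a, b):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def star_wirelength(driver, receivers):
    """Total length when one driver connects directly to every receiver."""
    return sum(manhattan(driver, r) for r in receivers)


def chain_wirelength(blocks):
    """Total length of a chain: block 0 drives block 1, which drives block 2."""
    return sum(manhattan(a, b) for a, b in zip(blocks, blocks[1:]))


# One driver, three receivers: driver far from a cluster vs. driver at the center.
corner_total = star_wirelength((0, 50), [(90, 10), (90, 50), (90, 90)])
center_total = star_wirelength((50, 50), [(10, 50), (90, 50), (50, 90)])

# Driver -> Receiver 1 -> Receiver 2: scattered vs. compact ordering.
spread_chain = chain_wirelength([(0, 0), (60, 0), (20, 0)])
compact_chain = chain_wirelength([(0, 0), (25, 0), (50, 0)])

print("star  :", corner_total, "->", center_total)
print("chain :", spread_chain, "->", compact_chain)
```

In the centered case every receiver is also at the same distance from the driver, which is the load-balancing property the text asks for.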
Figure 2.6: Ideal placement of blocks
D. Optimization of Routing
Routing decides the actual net delay and thereby the
timing of digital circuits. Optimum utilization of metal
resources is very important. Higher metal layers, which are
faster, should not be used for short distances. If they are,
it creates resource congestion for critical paths.
Normal digital circuits have drivers with up to 32
fanouts. Optimum metal allocation and routing are very
important for capacitive load balancing. Interconnect
capacitance decides net delay. Thus, optimum metal allocation
and routing are very important for timing convergence.
Figure 2.7: Bad Routing example
In Figure 2.7, a Driver is driving Receiver 1. The two
blocks are placed close together, but the routing strategy
used is not optimum. Care should be taken to minimize routing
length and the number of metal layers. Here 3 metal layers are
used because of bad routing. It can be optimized as shown in
Figure 2.8. Here the routing length is less than in the first
case and the number of metal layers used is 2. This is an
optimum routing scenario. Saving metal layers on short
distances frees more resources for critical paths where
timing convergence is difficult.
Figure 2.8: Optimum Routing
Figure 2.9: Fish bone routing configuration
A routing configuration known as fish bone routing is
shown in Figure 2.9. Here a Driver is driving 4 Receivers. The
Driver is located at the centre and the Receivers are placed
equidistant from the Driver. A routing layer known as the
Trunk runs from the Driver towards the Receivers. The final
connections between the Trunk and the Receivers are made on
lower metal layers known as spines. This configuration can be
used when a Driver drives multiple Receivers.
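Why faster upper metal layers are better reserved for long nets can be seen from a first-order wire delay estimate. The sketch below uses the Elmore delay of a uniformly distributed RC line, 0.5 * r * c * L^2. The per-micron resistance and capacitance values are purely illustrative assumptions and do not describe any particular process or the design in this paper.

```python
# First-order wire delay using the Elmore delay of a distributed RC line.
# Per-unit resistance and capacitance values below are hypothetical.


def wire_delay_fs(r_ohm_per_um, c_ff_per_um, length_um):
    """Elmore delay 0.5 * r * c * L^2 in femtoseconds (ohm * fF = fs)."""
    return 0.5 * r_ohm_per_um * c_ff_per_um * length_um ** 2


LOWER_METAL = dict(r_ohm_per_um=2.0, c_ff_per_um=0.2)  # thin, resistive layer
UPPER_METAL = dict(r_ohm_per_um=0.2, c_ff_per_um=0.2)  # thick, low-resistance layer

for length_um in (50, 2000):
    lower = wire_delay_fs(length_um=length_um, **LOWER_METAL)
    upper = wire_delay_fs(length_um=length_um, **UPPER_METAL)
    saving_ps = (lower - upper) / 1000.0
    print(f"{length_um} um net: saving from upper metal = {saving_ps:.2f} ps")
```

Because the delay grows with the square of the length, the saving from an upper layer is negligible on a short net and large on a long one under these assumptions, which is consistent with keeping upper layers free for critical paths.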
III. RESULTS
A module inside a microprocessor, called the Broadcast
module, was designed and timing closure techniques were
applied on it for an enhanced frequency of operation. The
results are explained below. The design starts from a design
known as “Reference” and the final design is known as the
“Current” design.
Setup Margin
Static timing analysis is done on the designed circuit
to identify timing violations and to fix the paths. Figure 3.1
shows a bar chart of the number of paths in each margin
bucket for the “Reference” design.
Figure 3.1: Setup margin distribution for Reference
The x axis represents setup margin and the y axis
represents the number of paths. Similarly, the setup margin
distribution for the final design, known as the “Current”
design, is given in Figure 3.2.
Figure 3.2: Setup margin distribution for Current design
A comparison of the number of paths in each margin
bucket between the Reference and Current designs is given in
Table 3.1. From the table it is clear that the Reference
design has 373 paths with negative margin. In the Current
design the number of paths with negative margin is 9. These 9
paths are external paths; that is, they originate from other
modules. It is the responsibility of the driver modules to fix
these paths, as decided at the top section level.
Table 3.1: Comparison of “Reference” and “Current” design
for setup margin
Margin (picosecond)    Number of paths in Reference    Number of paths in Current design
[-50 < X < -25]        26                              0
[-25 < X < 0]          347                             9
[0 < X < 25]           1594                            320
[25 < X < 50]          3493                            1672
Hold Margin
Hold margin is as important as setup margin. Hold
violations should be rectified for proper operation of the
circuits.
Figure 3.3: Hold margin distribution for Reference design
Figure 3.3 shows the bar chart distribution for the
Reference design. Figure 3.4 shows the bar chart distribution
for the Current design.
Figure 3.4: Hold margin distribution for Current design
Table 3.2: Comparison of “Reference” and “Current” design
for hold margin
Margin (picosecond)    Number of paths in Reference    Number of paths in Current design
[-50 < X < -25]        1986                            0
[-25 < X < 0]          3117                            0
[0 < X < 25]           2017                            6027
[25 < X < 50]          1013                            4288
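The negative-margin totals quoted for the two designs can be reproduced by summing the buckets whose upper bound is at or below zero. The sketch below copies the bucket counts from Tables 3.1 and 3.2; the data layout and the function name are illustrative choices, not part of the original flow.

```python
# Bucket counts copied from Table 3.1 (setup) and Table 3.2 (hold).
# Each row: (lower bound ps, upper bound ps, paths in Reference, paths in Current)
SETUP = [
    (-50, -25, 26, 0),
    (-25, 0, 347, 9),
    (0, 25, 1594, 320),
    (25, 50, 3493, 1672),
]
HOLD = [
    (-50, -25, 1986, 0),
    (-25, 0, 3117, 0),
    (0, 25, 2017, 6027),
    (25, 50, 1013, 4288),
]


def negative_paths(table):
    """Return (Reference, Current) path counts in buckets with upper bound <= 0."""
    reference = sum(ref for _, high, ref, _ in table if high <= 0)
    current = sum(cur for _, high, _, cur in table if high <= 0)
    return reference, current


print("setup negative paths (Reference, Current):", negative_paths(SETUP))
print("hold negative paths (Reference, Current): ", negative_paths(HOLD))
```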
A comparison of the number of paths in each hold margin
bucket between the Reference and Current designs is shown in
Table 3.2. From the table it is clear that the Reference
design has 5103 paths with negative margin. In the Current
design all the violations are cleared; the number of paths
with negative margin is 0.
IV. CONCLUSION
A module was designed and timing convergence
methods were applied to it. The design was done by applying
low power techniques [1][2][3] and a significant saving in
power was obtained. The module was converged for timing by
applying the techniques explained in the paper, such as cell
resizing, clock tuning, placement optimization and routing
optimization. These techniques can be applied to all digital
circuits for timing closure.
REFERENCES
[1] Mayank Chakraverty, Harisankar PS and Vaibhav Ruparelia, “Low
Power Design Practices for Power Optimization at the Logic and
Architecture Levels for VLSI System Design”, IEEE conference
publications, International Conference on Energy Efficient
Technologies for Sustainability, 2016.
[2] Gary K. Yeap, “Logic” in Practical Low Power Digital VLSI
Design, 1st ed., Kluwer Academic Publishers, Norwell, MA, USA, 1998,
ISBN: 978-0-7923-8009-2.
[3] J.T. Burd and R. Brodersen, “Processor Design for Portable
Systems”, Journal of VLSI Signal Processing Systems, vol. 13, no. 2–3,
pp. 203–221, August 1996.
[4] J. Montanaro, et al., “A 160-MHz, 32-b, 0.5-W CMOS RISC
Microprocessor”, IEEE Journal of Solid-State Circuits, vol. 31, no.
11, pp. 1703–1714, November 1996.
[5] M. Takahashi, et al., “A 60-mW MPEG4 Video Codec Using Clustered
Voltage Scaling with Variable Supply-Voltage Scheme”, IEEE Journal
of Solid-State Circuits, vol. 33, no. 11, pp. 1772–1780, November 1998.
[6] Phani kumar M, N. Shanmukha Rao, “A Low Power and High Speed
Design for VLSI Logic Circuits Using Multi-Threshold Voltage CMOS
Technology”, International Journal of Computer Science and
Information Technologies (IJCSIT), vol. 3 (3), pp. 4131–4133, ISSN:
0975-9646, 2012.
[7] Ko-Chi Kuo and Hsueh-Ta Ko, “Low Power Design Flow with Static
and Statistical Timing Analysis”, 2012 IEEE International Symposium
on Intelligent Signal Processing and Communication Systems (ISPACS
2012), November 4-7, 2012.
[8] Ali Dasdan and Ivan Hom, “Handling Inverted Temperature
Dependence in Static Timing Analysis”, ACM Transactions on Design
Automation of Electronic Systems, vol. 11, no. 2, pp. 306–324, April
2006.