Quick Reference Guide to
IC Compiler II Clock Tree Synthesis
Version 2.0
CONFIDENTIAL INFORMATION
The following material is confidential information of Synopsys and is being
disclosed to you pursuant to a non-disclosure agreement between you or your
employer and Synopsys. The material being disclosed may only be used as
permitted under such non-disclosure agreement.
IMPORTANT NOTICE
In the event information in this presentation reflects Synopsys’ future plans,
such plans are as of the date of this presentation and are subject to
change. Synopsys is not obligated to develop the software with the features
and functionality discussed in these materials. In any event, Synopsys’
products may be offered and purchased only pursuant to an authorized quote
and purchase order or a mutually agreed upon written contract.
© 2016 Synopsys, Inc. 2
Contents
• Introduction
• Ensuring the Design is Ready for Clock Tree Synthesis
• Setting Up for Clock Tree Synthesis
• Performing Clock Tree Synthesis and Optimization
• Analyzing the Results
IC Compiler II Clock Tree Synthesis (CTS)
Overview
• IC Compiler II supports several types of clock tree building technologies to meet different design requirements
– Standard clock tree synthesis
  – Fully automated
  – Low power consumption
– Structural multisource clock tree synthesis (MSCTS)
  – Low skew and very high OCV tolerance
– Regular multisource clock tree synthesis (MSCTS)
  – Lower power than structural MSCTS
  – Better OCV tolerance than standard CTS
This application note covers standard clock tree synthesis in IC Compiler II
IC Compiler II Clock Tree Synthesis
Flow Overview
1. Read in placed block
2. Perform CTS setup
3. Build and route the clock tree
clock_opt \
-from build_clock \
-to route_clock
4. Analyze results
report_clock_qor
report_qor
5. Perform post-CTS optimization
clock_opt -from final_opto
Design Ready for CTS?
• Nondefault routing rules and layer list applied?
• Balance point definitions applied?
• DRC constraints set?
• References set for CTS?
• Pre-CTS sanity check completed?
Setting Up the Design for CTS
Setting Up the Scenario for CTS
• IC Compiler II clock tree synthesis works on all active scenarios that are enabled for setup or hold, and uses them for skew balancing and latency optimization
• Clock tree synthesis also performs logical DRC fixing on these scenarios if they are enabled for maximum transition or maximum capacitance with the set_scenario_status command
• The tool does not use CTS-specific scenario settings such as the cts_mode or cts_corner settings in IC Compiler
Name  Mode   Corner  Active  Setup  Hold   Leakage Power  Dynamic Power  Max_tran  Max_cap  Min_cap
----------------------------------------------------------------------------------------------------
s1    func1  BEST    true    false  true   true           true           true      true     true
s2    func1  WORST   true    true   false  true           true           true      true     true
s3    test1  BEST    true    false  true   true           true           true      true     true
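The scenario rules above can be checked offline against a scenario report. The following Python sketch is illustrative only: the Scenario class and the two helper functions are hypothetical names and are not part of IC Compiler II. It encodes the two stated rules (active and enabled for setup or hold; DRC fixing when maximum transition or maximum capacitance is enabled) and applies them to the three example scenarios from the table.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """One row of the scenario status table (hypothetical helper type)."""
    name: str
    mode: str
    corner: str
    active: bool
    setup: bool
    hold: bool
    max_tran: bool
    max_cap: bool


def used_for_cts(s: Scenario) -> bool:
    # Rule 1: CTS works on active scenarios enabled for setup or hold.
    return s.active and (s.setup or s.hold)


def gets_drc_fixing(s: Scenario) -> bool:
    # Rule 2: logical DRC fixing needs max transition or max capacitance enabled.
    return used_for_cts(s) and (s.max_tran or s.max_cap)


# Values copied from the example table above.
SCENARIOS = [
    Scenario("s1", "func1", "BEST", True, False, True, True, True),
    Scenario("s2", "func1", "WORST", True, True, False, True, True),
    Scenario("s3", "test1", "BEST", True, False, True, True, True),
]

cts_scenarios = [s.name for s in SCENARIOS if used_for_cts(s)]
drc_scenarios = [s.name for s in SCENARIOS if gets_drc_fixing(s)]
print("Used for CTS:", cts_scenarios)
print("DRC fixing:  ", drc_scenarios)
```

A scenario that is inactive, or active but enabled for neither setup nor hold, drops out of both lists.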
Setting Up Design for CTS
Constraints
• Use the set_max_transition and set_max_capacitance commands to set the DRC constraints for the clock tree
• Define the DRC constraints for each scenario. During CTS, the scenario-specific constraints are used for DRC fixing in each scenario
#DRC constraint
current_scenario func_worst
set_max_transition 0.10 -clock_path [get_clocks]
set_max_capacitance 0.300 -clock_path [get_clocks]
current_scenario func_best
set_max_transition 0.200 -clock_path [get_clocks]
set_max_capacitance 0.400 -clock_path [get_clocks]
Setting Up Design for CTS
Constraints
• By default, CTS uses a target skew and a target latency of 0 when building the clock tree
• To define relaxed targets for low-frequency clocks, use the set_clock_tree_options command
• Define the constraint for each scenario. If the constraint is not defined for one of the scenarios, CTS uses the default target setting for that scenario
#Target skew and latency
current_scenario func_worst
set_clock_tree_options \
-target_skew 0.1 \
-target_latency 0.30 \
-corner worst
current_scenario func_best
set_clock_tree_options \
-target_skew 0.1 \
-target_latency 0.150 \
-corner best
Understand the mode-specific and corner-specific constraints and apply them accordingly
Setting Up Design for CTS
Constraints
Optional settings for clock tree synthesis
# Fanout Control
set_app_options -name cts.common.max_fanout -value 2000
# Clock Cell Spacing
set_clock_cell_spacing \
-x_spacing 0.9 -y_spacing 0.4 -lib_cells lib/BUF
# Max net length
set_app_options -name cts.common.max_net_length -value 200
Setting Up Design for CTS
Setting Up the Nondefault Rule (NDR) and Layer Constraints
• IC Compiler II CTS supports several types of NDR and layer specifications when building the clock tree
1. Net-specific NDR from set_routing_rule
• The NDR and layer list are not propagated to newly created nets
2. Net-specific NDR from set_clock_routing_rules
• The NDR and layer list are propagated to newly created nets
3. Clock-specific NDR from set_clock_routing_rules
• Supports separate, clock-specific NDRs and layer lists for root, leaf, and internal nets
4. Global NDR from set_clock_routing_rules
• Supports separate NDRs and layer lists for root, leaf, and internal nets
Setting Up Design for CTS
Setting Up the Nondefault Rule (NDR) and Layer Constraints
• IC Compiler II CTS supports three levels of NDRs and layer lists:
– Root NDR:
  – Applied on the root clock net up to the first point of branching or the first preexisting cell on the clock tree, whichever comes first
– Sink NDR:
  – Applied on nets connected to the clock pins of flops
– Internal NDR:
  – Applied on nets other than root and sink nets
[Figure: Case 1, pre-CTS and post-CTS views. The root NDR applies up to the first preexisting cell on the clock tree. Net labels: Root, Internal, Leaf]
Setting Up Design for CTS
NDR and Layer List Examples
[Figure: Case 2, pre-CTS and post-CTS views. The root NDR applies up to the first point of branching. Net labels: Root, Leaf]
[Figure: Case 3, pre-CTS and post-CTS views with a CTS inserted buffer. The root NDR applies up to the first point of branching. Net labels: Root, Internal, Leaf]
Setting Up Design for CTS
Setting the Reference for Clock Tree Building and Optimization
• CTS commands use library cells that have the valid_purposes=cts and dont_touch=false attribute settings
set_lib_cell_purpose -include cts ${lib_cells}
• Ensure that every cell type on the clock tree (such as ICGs and MUXes) has logically equivalent (LEQ) library cells that are enabled for CTS, so that the tool can size those cells during optimization
– If logically equivalent library cells with valid_purposes=cts are not present, CTS does not size the cell instances and QoR might be affected
– To find the LEQ cells of preexisting cells on the clock tree and add them to the reference list, use the derive_clock_cell_references command
Setting Up Design for CTS
Defining Clock Tree Exceptions: Balance Point
• A balance point is a stop or float exception that marks a pin as a valid balancing endpoint for clock tree synthesis
• You can define the pin either as a sink pin with 0 phase delay or as a float pin with the required insertion delay
• By default, a balance point applies to all corners of the current mode
• When a corner is specified with the -corners option, the constraint applies to that corner and the current mode
Setting:
set_clock_balance_points \
-balance_points reg1/CK \
-delay 0.1 -corners corner1
Details: Defines a float delay exception of 0.1 on the register clock input pin in corner "corner1" of the current mode
Setting:
set_clock_balance_points \
-balance_points buf1/A
Details: The input of the buffer is considered a stop point for clock tree synthesis in the current mode
Apply the balance point delay in the primary corner. During CTO, the tool performs scaling and uses the scaled delay in the other corners
Setting Up Design for CTS
Primary Corner for CTS
• In MCMM designs, the tool chooses a primary corner for clock tree building during CTS
– The primary corner is selected as the corner with the worst wire delay, gate delay, and balance point delay
– Because primary corner selection also depends on the balance point delay, always define the balance point delay for each mode on the associated worst corner with the -corners option
– Optionally, you can control the primary corner selection by using the cts.compile.primary_corner application option
icc2_shell> set_app_options -name cts.compile.primary_corner \
-value corner_worst
Setting Up Design for CTS
Automatic Scaling of Balance Point During Clock Tree Optimization (CTO)
• During clock tree optimization, the tool automatically scales the balance point delay from the primary corner to all other corners of the same mode
icc2_shell> set_clock_balance_points \
-balance_points reg1/CK \
-delay 0.1 -corners C1
Log excerpt:
Total number of global routed clock nets: 28
Information: The run time for clock net global routing is 0 hr : 0 min : 0.86 sec, cpu time is 0 hr : 0 min : 0.86 sec. (CTS-104)
…
Information: Balance point auto scaling set on term 'reg1/CK' clock 'clk' corner 'func_C2' early rise 0.104088 early fall 0.104088 late rise 0.104088 late fall 0.104088. (CTS-040)
Scenario  Mode   Corner  Balance point delay considered by tool for reg1/CK
S1        Func1  C1      0.1
S2        Func1  C2      CTO automatically scales 0.1 and uses the scaled value for optimization
S3        Func2  C1      No balance point, as the mode is different
However, if the mode is similar to func1 and needs the same exceptions, you must use the copy feature to copy the exceptions
Setting Up Design for CTS
Balance Point Delay Copying and Scaling
• To copy the exceptions across similar modes, set the clock tree options as follows:
set_clock_balance_points \
-balance_points reg1/CK \
-delay 0.1 -corners C1
set_clock_tree_options -copy_exceptions_across_modes \
-from_mode func1 -to_mode {func2 func3…}
– User-defined balance points that already exist in a mode specified by the -to_mode option are not overwritten
– Ignore exceptions are not copied across modes
Scenario  Mode   Corner  Balance point delay considered by tool for reg1/CK
S1        func1  C1      0.1
S2        func1  C2      CTO automatically scales 0.1 and uses the scaled value for optimization
S3        func2  C2      The tool copies and scales the exception to the C2 corner and uses it in the S3 scenario for optimization
Setting Up Design for CTS
Defining Clock Tree Exceptions: Exclude Pins
• Ignore or exclude exception
– These exceptions exclude clock endpoints from skew and latency optimization
– An exclude exception is a mode-specific constraint and applies to the current mode
Setting:
set_clock_balance_points \
-balance_points reg1/CK \
-consider_for_balancing false
Details: The clock pin of the register is treated as an ignore pin for clock tree synthesis and optimization. For such pins, only DRC fixing is done
Setting Up Design for CTS
Defining Clock Tree Exceptions: Don’t Touch and Size Only
• Setting don’t touch on subtree
icc2_shell> set_dont_touch_network -clock_only [get_pins pin_name]
• Setting don’t touch on net and instance
icc2_shell> set_dont_touch [get_cells cell_name] true
icc2_shell> set_dont_touch [get_nets -segments net_name] true
• Preventing the router from touching a net
icc2_shell> set_attribute [get_nets $clk_net] physical_status locked
• Setting size only
icc2_shell> set_size_only [get_cells cell_name] true
• Setting physical status
icc2_shell> set_placement_status fixed [get_cells cell_name]
Setting Up Design for CTS
Defining the Clock Balance Group for Inter Clock Delay Balancing (ICDB)
• An interclock balance group specifies a group of clocks that must be balanced together for timing
• You can automatically generate the balance constraints based on the timing relationships by using the derive_clock_balance_constraints command
– Use the create_clock_balance_group command for customized balance requirements
• By default, the clock_opt -from build_clock command runs ICDB
• To run ICDB standalone, use the balance_clock_groups command
Setting Up Design for CTS
Handling of Clock Tree Exceptions During Clock Tree Synthesis : Cell Specific
Exception                        Allow sizing  Allow movement    Allow buffer removal  Allow splitting                                   Allow merging with merge_clock_gates or MSCTS
physical_status fixed or locked  No            No                No                    No                                                No
physical_status legalize_only    Yes           Only to legalize  No                    No                                                No
dont_touch                       No            Yes               No                    No                                                No
size_only                        Yes           Yes               No                    No, if user_size_only; Yes, if derived_size_only  No, if user_size_only; Yes, if derived_size_only
Setting Up Design for CTS:
Clock Skew Groups
• A design might need a few clock pins to be balanced separately from the rest of the clock pins
• Clock skew groups provide a way to meet this requirement
[Figure: clock CLK, an ICG with its enable path (en), gated clocks gclk1, gclk2, and gclk3, and register groups Group1 and Group2]
– No timing paths between Group1 and Group2
– Need minimum latency for Group1 registers
Skew groups are mode-specific constraints. Ensure that the group is created in all modes
Setting Up Design for CTS
Clock Skew Group Example
• Create the clock skew groups as follows, and then run CTS
• Skew groups are mode specific. Create them for each mode
# create clock skew groups
foreach_in_collection m [get_modes] {
  current_mode $m
  create_clock_skew_group -name sg1 -objects {sink1/CP sink2/CP}
  create_clock_skew_group -name sg2 -objects {sink3/CP sink4/CP}
  report_clock_skew_groups
}
clock_opt -from build_clock -to route_clock
Setting Up Design for CTS
Summary of Constraints
CTS Configuration                    Command/app option                                              Recommendation
Maximum transition                   set_max_transition -clock_path                                  Define for each scenario
Maximum capacitance                  set_max_capacitance -clock_path                                 Define for each scenario
Target skew                          set_clock_tree_options -target_skew                             Define for each corner
Target latency                       set_clock_tree_options -target_latency                          Define for each corner
Routing rule and layer               set_clock_routing_rules                                         Define layer and NDR together
Balance point (float and exclude)    set_clock_balance_points                                        Define the balance point delay for each mode on the associated worst corner with the -corners option
Exceptions (sizing and don’t touch)  set_dont_touch, set_size_only                                   Define the exception only on required cells and nets
References                           set_dont_touch $cts_ref false                                   Ensure that non-repeaters such as ICGs, MUXes, and so on are also included in the reference list
                                     set_lib_cell_purpose -include cts $cts_ref
Clock balance group for ICDB         derive_clock_balance_constraints / create_clock_balance_group
Setting Up Design for CTS
Example Setup Script
• Clear any unintended NDR on the clock network before CTS and apply the correct rule
• Clear don’t touch on the clock network before CTS
• Ensure that there is no dont_touch attribute on the references provided for CTS
• Define the DRC constraints for all the scenarios
• Define the target skew and latency for all the scenarios
• Define the balance point delay for each mode on the associated worst corner with the -corners option
Setting Up Design For CTS:
Pre-CTS Sanity Check: report_clock_settings
Review the settings using the report_clock_settings command
Setting Up Design For CTS:
Pre-CTS Sanity Check: report_clock_routing_rules
• Ensure that the routing rule is always defined with a layer list. If layers are not reported, set the rule again with the correct layer list
Setting Up Design for CTS
Pre-CTS Sanity Check: check_clock_trees
• Summary report provides improved clarity
• Classification of checks into multiple categories to provide a quick overview
• Detailed information about each warning for faster debugging
• Tabular form of reporting for improved clarity
Setting Up Design For CTS:
Pre-CTS sanity check: check_legality
Illegally placed cells in the input design might lead to longer runtime and poor QoR
Setting Up for CTS
Scenario setup
NDR and layer list
DRC constraints
Reference setting
Balance point definition
ICDB constraints
Skew group
Pre-cts sanity check
Quiz 1
• Which layers does CTS use for root, internal, and sink nets with the following NDR specification?
icc2_shell> set_clock_routing_rules -net_type all \
-rules 2w2s -max_routing_layer M6 \
-min_routing_layer M4
icc2_shell> set_clock_routing_rules -net_type internal -rule 2w2s
icc2_shell> set_clock_routing_rules -net_type root -rule 2w2s
icc2_shell> set_clock_routing_rules -net_type sink -rule 1w2s
Quiz1: Answer
• Because layers are not specified for the root, internal, and sink net types, the tool uses all the available layers for routing.
Always specify the layer list and NDR together for each net type
Correct Usage:
icc2_shell> set_clock_routing_rules -net_type internal \
-rule 2w2s -max_routing_layer M6 \
-min_routing_layer M5
icc2_shell> set_clock_routing_rules -net_type root -rule 2w2s \
-max_routing_layer M6 -min_routing_layer M5
icc2_shell> set_clock_routing_rules -net_type sink -rule 1w2s \
-max_routing_layer M6 -min_routing_layer M5
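The quiz rule (always give the layer list together with the NDR) can be checked on a setup script before it reaches the tool. The Python sketch below is a hypothetical offline text check, not a tool feature: the function names are made up for illustration, and it only looks for the -min_routing_layer and -max_routing_layer option strings on each set_clock_routing_rules command after joining backslash continuations. It does not parse Tcl or resolve variables.

```python
import re

LAYER_OPTS = ("-min_routing_layer", "-max_routing_layer")


def normalize(script: str) -> str:
    """Map typographic dashes to ASCII and join backslash-continued lines."""
    script = script.replace("\u2010", "-").replace("\u2013", "-")
    return re.sub(r"\\\s*\n", " ", script)


def rules_missing_layers(script: str) -> list:
    """Return set_clock_routing_rules commands that lack a layer option."""
    missing = []
    for line in normalize(script).splitlines():
        # Drop an optional shell prompt, then collapse repeated whitespace.
        cmd = re.sub(r"^icc2_shell>\s*", "", line.strip())
        cmd = " ".join(cmd.split())
        if not cmd.startswith("set_clock_routing_rules"):
            continue
        if not all(opt in cmd for opt in LAYER_OPTS):
            missing.append(cmd)
    return missing


# Example input: one complete rule and one rule without layers.
SCRIPT = """\
icc2_shell> set_clock_routing_rules -net_type root -rule 2w2s \\
    -max_routing_layer M6 -min_routing_layer M5
icc2_shell> set_clock_routing_rules -net_type sink -rule 1w2s
"""

for cmd in rules_missing_layers(SCRIPT):
    print("Layer list missing:", cmd)
```

Running the same check on the "Correct Usage" commands above reports nothing, because every command carries both layer options.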
Clock Tree Synthesis and Optimization
CTS Flow overview
1. Read in placed block
2. Perform CTS setup
3. Perform CTS using the clock_opt -to build_clock command, during which the tool performs:
• Removal of clock tree
• Auto exception generation
• Gate by gate clock tree building
• DRC fixing beyond exception
• Global routing of clock nets
• Pre-opt DRC fixing
• Skew and latency optimization
• Post opt DRC fixing
• Clock tree summary
• ICDB
4. Analyze the CTS results
5. Route clock nets
6. Perform post-CTS and postroute optimization
Clock Tree Synthesis and Optimization
Tool Derived Automatic Exceptions for Ease of Use
Cases   CTS balancing conflict                                          Tool derived automatic exception
Case 1  Conflict between two clocks due to missing generated clock      Derive exclude exception for the clock with missing generated clock
Case 2  Internal sink pins of a cell with clock output pins             Derive exclude exception on internal sink pins
Case 3  Large ETM internal skew coming from the max/min clock tree path  Derive balance point with maximum delay at the ETM input pin
Case 4  Corners with missing balance points                             Scales the user defined delays from other corners
[Figure: illustrations of Case 1, Case 2, and Case 3, with labels Unfixable skew, Exclude, Balance point, ETM, and clk]
Clock Tree Synthesis and Optimization
Clock Tree Synthesis: Analyzing Log File
• Structured log file where each step is separated by a header.
• Under each step, it provides more information about settings used, details of the operation performed, summary, and more.
************************************************************
* CTS STEP: Design Initialization
************************************************************
Information: All clock objects will be converted from ideal to propagated clock during CTS. (CTS-105)
Information: CTS will work on the following scenarios. (CTS-101)
s1 (Mode: func1; Corner: func_BEST)
s2 (Mode: func1; Corner: func_WORST)
Information: CTS will work on all clocks in active scenarios, including 1 master clocks and 0 generated clocks. (CTS-107)
CTS related app options set by user:
cts.common.verbose = 1
Buffer/Inverter reference list for clock tree synthesis:
stdcell_lib/BUFX10
CTS NDR rule list:
Design Base; Net Type: root; Rule: 2w2s; Min Layer: M6; Max Layer: M7
Design Base; Net Type: internal; Rule: 2w2s; Min Layer: M6; Max Layer: M7
Design Base; Net Type: sink; Rule: 1w2s; Min Layer: M4; Max Layer: M7
************************************************************
* CTS STEP: Existing Clock Tree Removal
************************************************************
A total of 1 buffer(s) and 0 inverter(s) have been removed.
************************************************************
* CTS STEP: Clock Tree Initialization
************************************************************
Start Auto-Exception Derivations...
No internal pin was found.
No conflict pin was found.
No macro pin was found for clock balance point settings.
Clock Tree Synthesis and Optimization
Clock Tree Synthesis: Analyzing Log File
• At each gate level, it provides the details about driver name, clocks considered for synthesis, number of loads, DRC constraint, and more.
************************************************************
* CTS STEP: Clock Cell Relocation
************************************************************
Inst 'mux1' is not movable
Information: Relocated the clock cell 'icg3' from (791.37, 352.30) to (188.19, 87.10). (CTS-106)
Information: Relocated the clock cell 'icg2' from (206.15, 373.10) to (358.97, 864.50). (CTS-106)
Inst 'b2' is not movable
Information: Relocated the clock cell 'icg1' from (795.42, 744.90) to (989.96, 370.50). (CTS-106)
A total of 3 clock cells have been relocated
************************************************************
* CTS STEP: Gate-By-Gate Clock Tree Synthesis
************************************************************
-------------------------------------------------------------
Gate level 2 clock tree synthesis
driving pin = mux1/Z
Clocks:
clk (func1)
Design rule constraints:
max transition = 0.100000
max capacitance = 0.600000
Number of load sinks = 1
Number of ignore points = 0
Warning: Gate 'mux1' is not sizable because of lib_cell dont_touch. (CTS-041)
.....
Added 5 Repeaters. Built 5 Repeater Levels
Phase delay: mux1/D0 : (lp max: 0.194 sp min: 0.194) : skew = 0.000
-------------------------------------------------------------
…
…
Information: The run time for gate-by-gate clock tree synthesis is 0 hr : 0 min : 1.35 sec, cpu time is 0 hr : 0 min : 1.35 sec. (CTS-104)
Clock Tree Synthesis and Optimization
Clock Tree Synthesis: Analyzing Log File
• User friendly log reporting, which prints the QoR information for each clock, corner, and mode at the start and end of each optimization step.
• See the CTS-037 message for details.
• Grep for the CTS-104 message to get the runtime information for each step
************************************************************
* CTS STEP: DRC Fixing Beyond Exceptions
************************************************************
…
************************************************************
* CTS STEP: Clock Net Global Routing
************************************************************
Total number of global routed clock nets: 28
Information: The run time for clock net global routing is 0 hr : 0 min : 0.86 sec, cpu time is 0 hr : 0 min : 0.86 sec. (CTS-104)
…
Information: Balance point auto scaling set on term 'sink3/CP' clock 'clk' corner 'func_WORST' early rise 0.104088 early fall 0.104088 late rise 0.104088 late fall 0.104088. (CTS-040)
…
************************************************************
* CTS STEP: Pre-Optimization DRC Fixing
************************************************************
Information: CTS QoR Pre Initial DRC Fixing: GlobalSkew = 0.0068; ID = 0.4446; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Pre Initial DRC Fixing: GlobalSkew = 0.0096; ID = 0.4640; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS-037)
Resized 0 cell(s), relocated 0 cell(s), cloned 0 cell(s) and inserted 0 buffer(s)/inverter(s)
Information: CTS QoR Post Initial DRC Fixing: GlobalSkew = 0.0068; ID = 0.4446; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Post Initial DRC Fixing: GlobalSkew = 0.0096; ID = 0.4640; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS-037)
…
…
Information: The run time for pre-optimization DRC fixing is 0 hr : 0 min : 0.00 sec, cpu time is 0 hr : 0 min : 0.00 sec. (CTS-104)
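The CTS-104 tip can be automated when a log is long. The Python sketch below is an illustration under stated assumptions, not a tool utility: the function name and the regular expression are hypothetical, and the message layout is inferred only from the CTS-104 lines shown in the log excerpts in this guide. It extracts the step name and the wall-clock run time from each CTS-104 message and accepts either an ASCII hyphen or a typographic hyphen in the tag, since logs copied from slides can contain the latter.

```python
import re

# Layout assumed from the excerpts:
# "The run time for <step> is H hr : M min : S sec, cpu time is ... (CTS-104)"
RUNTIME_RE = re.compile(
    r"The run time for (?P<step>.+?) is\s+"
    r"(?P<h>\d+) hr : (?P<m>\d+) min : (?P<s>\d+(?:\.\d+)?) sec, cpu time is"
    r".*?\(CTS[-\u2010]104\)",
    re.DOTALL,
)


def step_runtimes(log_text: str) -> list:
    """Return (step name, wall-clock seconds) for each CTS-104 message."""
    results = []
    for m in RUNTIME_RE.finditer(log_text):
        seconds = int(m["h"]) * 3600 + int(m["m"]) * 60 + float(m["s"])
        results.append((m["step"], seconds))
    return results


# Two CTS-104 messages taken from the log excerpts in this guide.
SAMPLE_LOG = """\
Information: The run time for gate-by-gate clock tree synthesis is 0 hr : 0 min : 1.35 sec, cpu time is 0 hr : 0 min : 1.35 sec. (CTS-104)
Total number of global routed clock nets: 28
Information: The run time for clock net global routing is 0 hr : 0 min : 0.86 sec, cpu time is 0 hr : 0 min : 0.86 sec. (CTS-104)
"""

for step, seconds in step_runtimes(SAMPLE_LOG):
    print(f"{step}: {seconds:.2f} s")
```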
Clock Tree Synthesis and Optimization
Clock Tree Synthesis: Analyzing Log File
• User friendly log reporting, which
prints the QoR information for each
clock, corner, and mode at the start
and end of each optimization step.
• See the CTS-037 message for
details.
************************************************************
* CTS STEP: Skew Latency Optimization and Area Recovery
************************************************************
Information: CTS QoR Pre Optimization: GlobalSkew = 0.0068; ID = 0.4446; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS‐037)
Information: CTS QoR Pre Optimization: GlobalSkew = 0.0096; ID = 0.4640; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS‐037)
Begin Network Flow Based Optimization:
Default network flow optimizer made 27 successful improvements out of 58 iterations
Resized 4, relocated 12, deleted 4, inserted 0, sizeUp Relocated 0 cells
Information: CTS QoR Post Optimization: GlobalSkew = 0.0064; ID = 0.3539; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS‐037)
Information: CTS QoR Post Optimization: GlobalSkew = 0.0024; ID = 0.3650; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS‐037)
Begin Area Recovery Buffer Removal:
AR: deleted 0 cell(s)
Information: CTS QoR Post Area Recovery: GlobalSkew = 0.0064; ID = 0.3539; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS‐037)
Information: CTS QoR Post Area Recovery: GlobalSkew = 0.0024; ID = 0.3650; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS‐037)
Begin Area Recovery Resizing:
AR: resized 0 out of 24 cell(s)
Information: CTS QoR Post Area Recovery: GlobalSkew = 0.0064; ID = 0.3539; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS‐037)
Information: CTS QoR Post Area Recovery: GlobalSkew = 0.0024; ID = 0.3650; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS‐037)
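The CTS-037 messages are regular enough to be compared across steps with a short script. The following Python sketch is hypothetical: the function names are made up, and the field layout (semicolon-separated key = value pairs after "CTS QoR <stage>:") is inferred only from the excerpts above. Values are kept as strings except where a helper converts them, and no meaning is assigned to individual fields beyond their printed names.

```python
import re

# One CTS-037 message: "CTS QoR <stage>: k = v; k = v; ... . (CTS-037)"
# The tag may contain an ASCII hyphen or U+2010 when copied from slides.
QOR_RE = re.compile(
    r"CTS QoR (?P<stage>[^:]+):\s*(?P<fields>.*?)\.\s*\(CTS[-\u2010]037\)",
    re.DOTALL,
)


def parse_qor(log_text: str) -> list:
    """Return one dict per CTS-037 message, with a 'stage' key added."""
    records = []
    for m in QOR_RE.finditer(log_text):
        record = {"stage": m.group("stage").strip()}
        for item in m.group("fields").split(";"):
            key, _, value = item.partition("=")
            record[key.strip()] = value.strip()
        records.append(record)
    return records


def skew_by_corner(records: list, stage: str) -> dict:
    """Map corner name to GlobalSkew for the given stage."""
    return {
        r["Corner"]: float(r["GlobalSkew"])
        for r in records
        if r["stage"] == stage
    }


# Messages copied from the skew and latency optimization excerpt above.
SAMPLE_LOG = """\
Information: CTS QoR Pre Optimization: GlobalSkew = 0.0068; ID = 0.4446; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Pre Optimization: GlobalSkew = 0.0096; ID = 0.4640; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 34.7490; ClockCellArea = 51.7725; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Post Optimization: GlobalSkew = 0.0064; ID = 0.3539; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Post Optimization: GlobalSkew = 0.0024; ID = 0.3650; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS-037)
"""

records = parse_qor(SAMPLE_LOG)
pre = skew_by_corner(records, "Pre Optimization")
post = skew_by_corner(records, "Post Optimization")
for corner in pre:
    print(f"{corner}: GlobalSkew {pre[corner]} -> {post[corner]}")
```

Keeping every field in the record makes it easy to compare other values, such as ClockBufArea, with the same pattern.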
Clock Tree Synthesis and Optimization
Clock Tree Synthesis: Analyzing Log File
• At the end of CTS, the tool prints the summary information, which includes information about clock tree, skew bottleneck analysis report, and summary of errors and warnings during CTS.
• Check this report as a first step when analyzing and debugging the CTS results
************************************************************
* CTS STEP: Post-Optimization DRC Fixing
************************************************************
Resized 0 cell(s), relocated 0 cell(s), cloned 0 cell(s) and inserted 0 buffer(s)/inverter(s)
Information: CTS QoR Post Final DRC Fixing: GlobalSkew = 0.0064; ID = 0.3539; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_BEST; ClockRoot = clk1. (CTS-037)
Information: CTS QoR Post Final DRC Fixing: GlobalSkew = 0.0024; ID = 0.3650; NetsWithDRC = 0; Worst Tran/Cap cost = 0.0000/0.0000;
ClockBufArea = 30.8880; ClockCellArea = 47.9115; Clock = clk; Mode = func1; Corner = func_WORST; ClockRoot = clk1. (CTS-037)
…
************************************************************
* CTS STEP: Postlude
************************************************************
Marking clock synthesized attributes
Successfully legalize placement.
************************************************************
* CTS STEP: Summary report
************************************************************
There are 25 flat clock tree nets.
There are 26 non-sink instances (total area 51.07) on clock trees including 4 instances dont_touch.
Clock tree synthesize and optimization added 2 buffers and 18 inverters (total area 29.48).
3 buffers/inverters were inserted below 1 leaf level Gates.
Skew Bottleneck Analysis:
Largest skew jumps (> 0.1 or 50 percent of the global skew) of the terms for clock clk:
Skewgroup: default_clk, Corner: func_BEST
Skew jumped by 0.010 at term icg2/GCP
Summary of messages during CTS:
===============================================================
Tag      Count  Type     Description
---------------------------------------------------------------
CTS-041  4      Warning  Gate '%s' is not sizable because of %s dont_touch.
CTS-043  1      Warning  Gate '%s' is not sizable because of fixed placement
Clock Structure: Pre-CTS
[Figure: pre-CTS clock structure with clocks clk1 and clk2, a MUX (inputs I0 and I1, output Z), clock gates (CG), gated clock gclk, and output ports clk_out1 and clk_out2]
Constraints shown in the figure:
set_output_delay 2 clk_out2 \
-clock clk1
set_case_analysis 0 SEL
(I1 -> Z is disabled)
Setting: NDR
Root: 2W3S
Internal: 2W2S
Sink: 1W2S
Attribute annotations shown in the figure:
is_clock_used_as_data=true, is_clock_used_as_clock=false
is_clock_used_as_clock=false, is_clock_used_as_data=false
Legend: cip = cts_size_in_place; B = Balance point; I = Implicit Ignore; E = Explicit Ignore
Clock Structure: Pre-CTS
Attribute Settings
• All the pins on the clock network have the is_clock_used_as_clock attribute set to true, except for the pins that go to data paths and disabled arcs
• CTS works only on paths whose pins have the is_clock_used_as_clock attribute set to true
Clock Structure: Post-CTS
[Figure: post-CTS clock structure with clocks clk1 and clk2, the MUX (inputs I0 and I1, output Z), clock gates (CG), gated clock gclk, output ports clk_out1 and clk_out2, CTS inserted buffers, and guide buffers *_bdp, *_bip, and *_btd1]
Constraints shown in the figure:
set_output_delay 2 clk_out2 \
-clock clk1
set_case_analysis 0 SEL
(I1 -> Z is disabled)
Setting: NDR
Root: root
Internal: internal
Sink: sink
Attribute annotation shown in the figure:
is_clock_used_as_data=true, is_clock_used_as_clock=false
Legend: cip = cts_size_in_place; B = Balance point; I = Implicit Ignore; E = Explicit Ignore
Clock Structure: Post-CTS
Attribute Settings
• To separate the data path, the tool inserts a guide buffer named *dp* and sets the is_clock_used_as_data attribute on its input to true
• To separate the disabled path input, the tool inserts a guide buffer named *td* and sets the is_clock_used_as_clock and is_clock_used_as_data attributes on its input to false
• All cells inserted by CTS, and all preexisting cells other than those identified as cip (in the figure on the previous page), have their physical status set to application_fixed
Clock Tree Synthesis and Optimization
Cell and Net Naming: CTS, CTO and ICDB
Name
Purpose
Example
cts_inv_<digit> or cts_buf_<digit>
Inserted by CTS clustering
cts_buf_4189301507
cts_inv_7655304984
cts_dlydt_<digit>
Inserted by CTS for delay detour
cts_dlydt_547
_*bip* or _*vip*
(v for inverter)
Inserted by CTS to separate ignore pin
clk1_bip4
_*bdp* or _vdp*
Inserted by CTS to separate clock to data path
sink3_Q_bdp3
*_vtd* or *_btd*
Inserted by CTS to separate disabled path
clk2_btd1
cto_buf_drc_<digit>
Inserted by CTO to fix the DRC
cto_buf_drc_1181
cto_buf_<digit>
Inserted by CTO for optimization
cto_buf_1416
cto_dtrdly_<digigt>
Inserted by CTO for detour
cto_dtrdly_307365
buf_drc_cln<digit> or inv_drc_cln<digit>
Cloned buffered during DRC fixing
buf_drc_cln306737
cto_buf_cln_<digit>
cto_inv_cln_<digit>
Cloned buffer during CTO
cto_buf_cln_306880
ICDB_<digit>
Cell added during balance clock group
ICDB_1457
CTS_MCSB_<digit>
Cell added by CTS for multi fanout skew balancing
CTS_MCSB_8830
© 2016 Synopsys, Inc. 48
To add a prefix to the cells, use the
cts.common.user_instance_name_prefix application option
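The naming conventions above can help when post-processing a netlist or report outside the tool. The following is a minimal Python sketch, not part of IC Compiler II, that maps a leaf cell name to the step that probably created it. The regular expressions are an assumed reading of the `<digit>` and `*` placeholders in the table, and the function name `classify` is hypothetical. The `^` anchors would stop matching if a prefix is added with cts.common.user_instance_name_prefix.

```python
import re

# Patterns transcribed from the naming table; the first match wins.
# How "<digit>" and "*" translate to regexes is an assumption of this sketch.
NAME_RULES = [
    (r"^cts_(inv|buf)_\d+$", "CTS clustering"),
    (r"^cts_dlydt_\d+$", "CTS delay detour"),
    (r"_[bv]ip\d*$", "CTS ignore-pin separation"),
    (r"_[bv]dp\d*$", "CTS clock-to-data separation"),
    (r"_[bv]td\d*$", "CTS disabled-path separation"),
    (r"^cto_buf_drc_\d+$", "CTO DRC fixing"),
    (r"^cto_(buf|inv)_cln_\d+$", "CTO cloning"),
    (r"^cto_buf_\d+$", "CTO optimization"),
    (r"^cto_dtrdly_\d+$", "CTO detour"),
    (r"^(buf|inv)_drc_cln\d+$", "DRC-fixing clone"),
    (r"^ICDB_\d+$", "balance clock group"),
    (r"^CTS_MCSB_\d+$", "CTS multi-fanout skew balancing"),
]


def classify(cell_name):
    """Return the likely origin of a cell, or None if no rule matches."""
    leaf = cell_name.rsplit("/", 1)[-1]  # drop any hierarchy prefix
    for pattern, origin in NAME_RULES:
        if re.search(pattern, leaf):
            return origin
    return None


# Names taken from the Example column of the table
for name in ("cts_buf_4189301507", "clk1_bip4", "cto_buf_drc_1181", "ICDB_1457"):
    print(name, "->", classify(name))
```

A name that matches none of the rules returns None, which is the expected result for cells that were already present before CTS.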
Clock Tree Synthesis and Optimization
Cell and Net Naming: Postroute CTO and CCD
Name | Purpose | Example
ctosc_inst_* | Inserted by post route CTO for optimization | ctosc_inst_519862
ctosc_drc_inst_* | Inserted by post route CTO for DRC fixing | ctosc_drc_inst_518456
ccd_setup_* | Inserted by CCD for setup optimization | ccd_setup_4189301507
ccd_hold_* | Inserted by CCD for hold optimization | ccd_hold_283737
Clock Tree Synthesis and Optimization
Cell and Net Naming: MSCTS flow
Name | Purpose | Example
<old_cell_name>_split_<digit> | Newly created cell by split_clock_cells command | icg_split_12321
*clk_drv_r<row_num>c<column_num> | Newly created cell by create_clock_drivers command | clk_drv_r1c1
*msgts_l<num>_* | Newly created cell by synthesize_multisource_global_clock_trees command. l<num> represents the level | msgts_l3_d1s0_1
The naming convention for cells introduced during the synthesize_multisource_clock_taps and
synthesize_multisource_clock_subtrees commands can be controlled by using the following application options:
cts.multisource.subtree_merge_concatenate_length_threshold
cts.multisource.subtree_merge_*name*
cts.multisource.subtree_split_*name*
Analyzing the CTS Results
Reports to Analyze the CTS QoR: report_clock_qor
• To report the clock tree QoR, use the report_clock_qor command
• It generates two types of reports
– Tabular reports
– Histogram reports
Options available with the tabular report:
Report Type | Metrics
Tabular Reports (-type) | summary (latency, skew, area, and DRC), latency, drc_violators, area, robustness, structure, balance_group, local_skew, power
Analyzing the CTS Results
Reports to Analyze the CTS QoR: report_clock_qor
Options available with the histogram report:
Report Type | Metrics
Histogram Reports (-histogram_type) | latency, level, robustness, wire_delay_fraction, fanout, capacitance, transition, wire_length, cell_delay, local_skew, wire_arc_length, wirelength
report_clock_qor
More Organized and Structured Reporting
The clocks are organized by modes and reported across corners
Indentation and the Attrs column show master clock, generated clock, and skew group relationships within a scenario
report_clock_qor -show_verbose_paths
Added Verbosity in Path Report Improves the Ability to Debug
Point                       Fanout  Cap     Trans   Incr    Path      Location         NDR   Layers              CTS Tags
-------------------------------------------------------------------------------------------------------------------------
source latency                                      0.0000  0.0000
clk1 (in)                   1       0.0258  0.0000  0.0000  0.0000 r  (999.92,737.10)  2w2s  {M7 113} {M6 174}   clk
cts_inv_34763893/A (INVX8)                  0.0327  0.0166  0.0166 r  (825.66,625.11)        {M7 113} {M6 174}
cts_inv_34763893/Z (INVX8)  2       0.0223  0.0210  0.0138  0.0304 f  (825.72,624.72)  2w2s
…
cts_inv_34683885/A (INVX8)                  0.0549  0.0226  0.1336 f  (310.64,1.11)          {M6 171} {M7 287}
cts_inv_34683885/Z (INVX8)  2       0.0157  0.0242  0.0177  0.1512 r  (310.70,0.71)    2w2s  {M6 300} {M7 10}
mux1/D1 (CLKMUX2X2)                         0.0485  0.0196  0.1708 r  (10.80,10.98)          {M6 300} {M7 10}
mux1/Z (CLKMUX2X2)          1       0.0215  0.0337  0.0491  0.2199 r  (11.88,10.85)    2w2s  {M6 202}            dt_lib
cts_inv_33553772/A (INVX4)                  0.0413  0.0111  0.2309 r  (214.38,16.24)         {M6 202}
cts_inv_33553772/Z (INVX4)  1       0.0149  0.0268  0.0180  0.2489 f  (214.48,16.26)   2w2s
…
cts_inv_33523769/A (INVX8)                  0.0305  0.0093  0.3169 f  (557.68,381.09)        {M6 46} {M7 230}
cts_inv_33523769/Z (INVX8)  1       0.0161  0.0165  0.0131  0.3300 r  (557.75,381.48)  1w2s  {M5 229} {M4 47}
sink4/CP (DFF1)                             0.0299  0.0117  0.3418 r  (605.61,610.54)        {M5 229} {M4 47}    sink
-------------------------------------------------------------------------------------------------------------------------
total clock latency                                         0.3418
The extra columns in the clock path report (NDRs, routing layers, location, and so on)
provide detailed information that is helpful for debugging
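When reading the verbose path report, it helps to remember that the Path column is the running sum of the Incr column. The short Python sketch below checks this on the first three points of the report; the helper name `path_column` is hypothetical, and the values are copied from the report.

```python
from itertools import accumulate

# (point, Incr) pairs copied from the first rows of the verbose path report
rows = [
    ("clk1 (in)", 0.0000),
    ("cts_inv_34763893/A (INVX8)", 0.0166),
    ("cts_inv_34763893/Z (INVX8)", 0.0138),
]


def path_column(incrs):
    """Running sum of Incr, rounded to the four decimals the report prints."""
    return [round(total, 4) for total in accumulate(incrs)]


paths = path_column([incr for _, incr in rows])
for (point, incr), path in zip(rows, paths):
    print(f"{point:<28} {incr:.4f} {path:.4f}")
```

The printed Incr values do not always add up exactly to the printed Path: near the end of the report, 0.3300 plus 0.0117 is shown as 0.3418, presumably because the tool rounds each column separately.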
Querying Clock Tree with get_clock_tree_pins
• get_clock_tree_pins returns a collection of clock tree pins
– Many clock-tree-related attributes are available for querying and filtering
get_clock_tree_pins    # Get clock tree pins by various criteria
[-clocks object_list]    (List of clocks)
[-from list_of_pins]    (Include pins on paths starting from these pins)
[-sort_by extended_attribute_expression]    (Sort by this attribute)
[-index_range start_index_end_index]    (Return a subset of the sorted pins)
[-groups_from list_of_pins]    (Return a list of collections, one for each pin listed here)
[-metrics expression_list]    (One or more expressions to evaluate for each pin selected)
[-assign_to_variable variable_name]    (Put results in associative variable)
[-total variable_name]    (Add up the known results and assign the result to this variable)
Querying Clock Tree with get_clock_tree_pins
• Some of the attributes available in the get_clock_tree_pins command:
Clock QoR: skew_max, downstream_delay_max, downstream_delay_min, latency_max, latency_min
Pin type and property: is_generated_clock_source, is_clock_source, is_sink, is_leaf, is_float_pin, is_explicit_ignore, is_ignored, is_cts_added, is_on_repeater, is_on_buffer, is_on_inverter, is_on_icg, is_inside_etm
Area and Power related: subtree_num_repeaters_max, subtree_total_area_max, subtree_total_leakage_power_max, subtree_total_internal_power_max, subtree_total_switching_power_max, cell_internal_power_max, cell_leakage_power_max, cell_switching_power_max, cell_total_power_max, net_switching_power_max, toggle_rate_max
Topological: height_max, height_min, depth_max, depth_min, is_reconvergent
Refer to the get_clock_tree_pins command man page for a full list of attributes
Analyzing the CTS Results
Clock Tree Attributes in get_clock_tree_pins
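[Figure: example clock tree with clocks clk1 and clk2, annotated with get_clock_tree_pins attributes]
Annotations in the figure:
• Along the clk1 path from root to sink: height_max=4, depth_max=0; height_max=3, depth_max=1; height_max=2, depth_max=2; height_max=1, depth_max=3; height_max=0, depth_max=4 (is_sink)
• Other labels: is_clock_source; !is_cts_added, is_on_buffer; height_max=2, height_min=1; is_cts_added, is_on_buffer; is_on_icg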
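The height and depth annotations in the figure can be reproduced with a small tree walk. The Python sketch below is only a model that treats each node as one level: depth counts levels from the clock root, and height counts levels down to the farthest (height_max) or nearest (height_min) sink. The tree, the node names, and the function `annotate` are hypothetical, and how the tool assigns levels to individual input and output pins is not modelled.

```python
# Hypothetical clock tree: each key drives the nodes in its list.
TREE = {
    "clk1": ["buf_a"],
    "buf_a": ["buf_b"],
    "buf_b": ["buf_c", "ff2"],  # one branch is a level shorter than the other
    "buf_c": ["ff1"],
}


def annotate(tree, root):
    """Return {node: {"depth", "height_max", "height_min"}} for a tree."""
    info = {}

    def visit(node, depth):
        children = tree.get(node, [])
        if children:
            below = [visit(child, depth + 1) for child in children]
            height_max = 1 + max(h_max for h_max, _ in below)
            height_min = 1 + min(h_min for _, h_min in below)
        else:
            height_max = height_min = 0  # a sink has nothing below it
        info[node] = {
            "depth": depth,
            "height_max": height_max,
            "height_min": height_min,
        }
        return height_max, height_min

    visit(root, 0)
    return info


levels = annotate(TREE, "clk1")
for name in ("clk1", "buf_a", "buf_b", "buf_c", "ff1"):
    print(name, levels[name])
```

In this model the root gets height_max=4 and depth 0, the sink ff1 gets height_max=0 and depth 4, and buf_b gets height_max=2 with height_min=1, matching the label pairs in the figure. On a plain tree there is a single depth per node; depth_max and depth_min can differ only where paths reconverge.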
Analyzing the CTS Results
Example Usage of the get_clock_tree_pins Command
• Get the top 10 sinks with the largest latency:
icc2_shell> get_clock_tree_pins -filter is_sink -sort_by latency_max -index_range 10
• Get all clock tree pins that are ignored for skew balancing:
icc2_shell> get_clock_tree_pins -filter is_ignored
• Get sinks inside an ETM:
icc2_shell> get_clock_tree_pins -filter "is_inside_etm && is_sink"
• Collect and store all latency values in a Tcl array:
icc2_shell> get_clock_tree_pins -clocks CLK -sort_by latency_max \
-assign_to_variable my_array
• Get all downstream sinks beyond a pin (traverse through dividers and ICGs):
icc2_shell> get_clock_tree_pins -filter is_sink -groups_from mux/Z
Example-1
• Get total switching power of all leaf level clock nets
[Figure: clock tree rooted at clk with icg1, mux, icg2, and icg3]
get_clock_tree_pins -sort_by net_switching_power_max \
-filter "height_max == 0 && is_net_driver" \
-total var
echo $var ==> prints "26", which is the net switching power of the nets
driven by icg2, icg3, and 2 leaf inverters
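The filter-then-total behavior in Example-1 can be modelled in a few lines of Python. This is only a sketch of the idea, not the tool's implementation: the pin records, their names, and the per-net power values are invented for illustration (chosen so that the four leaf-level drivers add up to 26, as in the example), and `total` is a hypothetical helper.

```python
# Hypothetical pin records; only the attribute names come from the guide.
pins = [
    {"name": "clk", "height_max": 2, "is_net_driver": True, "net_switching_power_max": 30.0},
    {"name": "icg2/Z", "height_max": 0, "is_net_driver": True, "net_switching_power_max": 8.0},
    {"name": "icg3/Z", "height_max": 0, "is_net_driver": True, "net_switching_power_max": 6.0},
    {"name": "inv_a/Z", "height_max": 0, "is_net_driver": True, "net_switching_power_max": 7.0},
    {"name": "inv_b/Z", "height_max": 0, "is_net_driver": True, "net_switching_power_max": 5.0},
    {"name": "ff1/CP", "height_max": 0, "is_net_driver": False, "net_switching_power_max": 0.0},
]


def total(pin_list, keep, metric):
    """Sum one metric over the pins that pass the filter."""
    return sum(pin[metric] for pin in pin_list if keep(pin))


# Same condition as the -filter expression: height_max == 0 && is_net_driver
leaf_power = total(
    pins,
    lambda pin: pin["height_max"] == 0 and pin["is_net_driver"],
    "net_switching_power_max",
)
print(leaf_power)
```

The sink pin ff1/CP also has height_max equal to 0 but is excluded because it does not drive a net, which is why the filter needs both conditions.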
Example-2
• Find gate levels with highest switching power:
– Get subtrees with toggle_rate_max > 0.1
– Get the top 2 gate levels
– Print number of repeaters inserted, and switching power
[Figure: clock tree rooted at clk with icg1, mux, icg2, and icg3]
get_clock_tree_pins -sort_by subtree_total_switching_power_max \
-filter "toggle_rate_max > 0.1 && is_net_driver && !is_on_repeater" \
-metrics subtree_num_repeaters_max,subtree_total_switching_power_max \
-assign var -index_range -2
returns {clk icg1/Z}
echo $var(clk) ==> prints {5 114.88579}
echo $var(icg1) ==> prints {8 80.59376}
get_clock_tree_pins
Case Study: QoR Debugging Examples
Skew debug: identify clock tree pins with
• Skew greater than 0.1
• Level less than 2 starting from the clock sinks
• Is an output pin
icc2_shell> get_clock_tree_pins -filter "skew_max>0.1&&height_max<2&&is_net_driver"
{icg1/ECK}
Transition violators: identify clock tree pins with
• Transition greater than constraints
icc2_shell> get_nets -of_objects [get_clock_tree_pins -filter "transition_max > 0.05"]
{cts0 tmp tmp2}
Case Study 1
• Insufficient sharing in the clock paths and a high buffer count
• Observation:
– Reviewing the clock settings shows that routing layers are not defined when setting the clock
routing rule
Original setting:
icc2_shell> set_clock_routing_rules -clocks [get_clocks] -rule RULE1
Correct setting:
icc2_shell> set_clock_routing_rules -clocks [get_clocks] -rule RULE1 \
-min_routing_layer M7 -max_routing_layer M8
With the correct setting, path sharing and buffer count improve
Quiz 2
• In what order does CTO optimize the overlapping clock trees?
Quiz 2: Answer
• In what order does CTO optimize the overlapping clock trees?
It considers all the clocks with overlaps together during CTO.
Therefore, clock QoR is not dependent on the order.
Quiz 3
• I have provided CTS buffers and inverters in the CTS reference list, but the
synthesize_clock_trees command exits, saying that no usable buffers or inverters are
available. What are the possible reasons for this?
Quiz 3: Answer
• I have provided CTS buffers and inverters in the CTS reference list, but the
synthesize_clock_trees command exits, saying that no usable buffers or inverters are
available. What are the possible reasons for this?
The library cells have the dont_touch attribute set. Clear this setting by using the
set_dont_touch [get_lib_cells] false command
Thank You