( ESNUG 393 Item 9 ) --------------------------------------------- [04/25/02]
From: Jens Michelsen <jcm@vitesse.com>
Subject: A User Tape-out Of PhysOpt w/ The New Synopsys Clock Tree Compiler
Hi, John,
Before we used PhysOpt, we had a traditional Synopsys frontend to Avanti
backend COT flow. We still use VCS, TetraMax, Design Compiler, Formality,
PrimeTime (and now PrimeTime-SI) from Synopsys and all the Avanti tools.
We use Avanti Star-RC for extraction while our ASIC vendor does the
LVS/DRC checks.
Our main problem has been the iterations between gate-level netlists and
P&R. It was taking too long and becoming more difficult to achieve timing
closure. Our inserted clocks caused a lot of uncertainty before P&R. The
clock skew margins we had to use in our pre-placed-and-routed netlist
hindered our ability to optimize for area and power as well.
When we brought in PhysOpt, we also signed up to be an early evaluator of
their Clock Tree Compiler tool. We set up our PhysOpt / Clock Tree Compiler
flow to be fully hierarchical. Every top-level block is run through PhysOpt
and Clock Tree Compiler, with Avanti Planet then serving as our block-level
floorplanner. We then take the resulting placement directly to Apollo to
complete the block-level routing. The timing after routing correlated well
with the pre-routing estimates, and no routability problems came up.
For the top level, we then used the top-level netlist together with the
extracted block-level models and took them through PhysOpt for top-level
optimization and clock tree insertion. This new flow had a significant
upside over our traditional flow: our results were now predictable and
deterministic.
While we were implementing this new flow, we were asked to help on a block
from another group that was having timing closure problems. Their block was
part of a SoC being developed in Datacom Vitesse. The block consisted of
60 K instances of logic with 5 memories, and was targeted for TSMC 0.18 um
7-layer metal. The biggest issues were timing closure in the presence of
its complex clock tree, design congestion and the routability of the design
after back-end place and route. The RTL, area and port locations were fixed
and couldn't be changed. In addition the block had low utilization (35%)
due to the fixed area constraint. All the other flows within Vitesse failed
to get closure on this problem block. They gave us 2 weeks to get it
through PhysOpt / Clock Tree Compiler and tape-out.
Clock Tree Description:

  - 4 sub clock trees driven by a top-level clock (the top-level clock
    also drives 8800 FFs), plus 5 reset trees and one scan mode tree.
  - Clock specification: 2.0 ns, 4.5 ns and 5.5 ns periods with 10%
    uncertainty.
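
As a rough illustration of what the 10% uncertainty figure means in absolute
terms, the short sketch below converts each stated period into a picosecond
budget. It assumes the uncertainty is simply 10% of the clock period; the
function name is made up for this example, and the post does not say which
period applies to which tree, so no per-tree pass/fail check is attempted.

```python
# Sketch only: assumes "10% uncertainty" means 10% of the clock period.
# Integer picoseconds are used to avoid floating-point rounding noise.

UNCERTAINTY_PERCENT = 10  # from the clock specification in the post


def uncertainty_budget_ps(period_ps: int) -> int:
    """Return the uncertainty budget in ps for a clock period given in ps."""
    return period_ps * UNCERTAINTY_PERCENT // 100


# The three periods quoted in the post: 2.0 ns, 4.5 ns and 5.5 ns.
PERIODS_PS = [2000, 4500, 5500]

if __name__ == "__main__":
    for period in PERIODS_PS:
        budget = uncertainty_budget_ps(period)
        print(f"period {period / 1000:.1f} ns -> uncertainty {budget} ps")
```

Under that assumption the budgets work out to 200 ps, 450 ps and 550 ps.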
Here is what we got.

  Clock tree    # of FFs  Latency (ns)  Buffers  Levels  Skew (ps)
  ------------  --------  ------------  -------  ------  ---------
  Sub clock 1       2200           1.3      600       6         55
  Sub clock 2        320           0.8       24       2         40
  Sub clock 3       5000           2.4      340       7        200
  Sub clock 4        175           0.7       12       2         15
  Top clock         8800           2.1      560       7        200
  Reset 1           2200           1.2      120       3         60
  Reset 2            320           0.8       18       2         20
  Reset 3           5000           1.3      275       3         75
  Reset 4            170           0.6       14       2         12
  Top reset         8800           2.0      599       7        326
  Scan Mode        16400           2.1      916       6        160

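
For readers who want to tally the results, the sketch below copies the rows
of the table into a small Python list and computes the total number of
inserted buffers, the worst-case skew, and the average number of flip-flops
per buffer for each tree. The variable and function names are illustrative
and are not part of any Synopsys tool or of the scripts used on this block.

```python
# Sketch only: rows copied from the results table in the post.
# Tuple layout: (tree name, flip-flops, latency ns, buffers, levels, skew ps)
RESULTS = [
    ("Sub clock 1", 2200, 1.3, 600, 6, 55),
    ("Sub clock 2", 320, 0.8, 24, 2, 40),
    ("Sub clock 3", 5000, 2.4, 340, 7, 200),
    ("Sub clock 4", 175, 0.7, 12, 2, 15),
    ("Top clock", 8800, 2.1, 560, 7, 200),
    ("Reset 1", 2200, 1.2, 120, 3, 60),
    ("Reset 2", 320, 0.8, 18, 2, 20),
    ("Reset 3", 5000, 1.3, 275, 3, 75),
    ("Reset 4", 170, 0.6, 14, 2, 12),
    ("Top reset", 8800, 2.0, 599, 7, 326),
    ("Scan Mode", 16400, 2.1, 916, 6, 160),
]


def total_buffers(rows):
    """Sum the buffer counts over all trees."""
    return sum(row[3] for row in rows)


def worst_skew(rows):
    """Return (tree name, skew in ps) for the tree with the largest skew."""
    worst = max(rows, key=lambda row: row[5])
    return worst[0], worst[5]


def ffs_per_buffer(rows):
    """Map each tree name to its average flip-flops per inserted buffer."""
    return {row[0]: row[1] / row[3] for row in rows}


if __name__ == "__main__":
    print("total buffers:", total_buffers(RESULTS))
    print("worst skew:", worst_skew(RESULTS))
    for name, ratio in ffs_per_buffer(RESULTS).items():
        print(f"{name:12s} {ratio:6.1f} FFs per buffer")
```

On the numbers as reported, this gives 3478 buffers in total, with the
largest skew (326 ps) on the top reset tree.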
In 5 days we taped out and met our clock skew spec on this block with room
to spare.
For our next tape-out we hope to include Power Compiler in the flow and to
reduce the number of routing iterations required to achieve timing closure.
We also need to include signal integrity effects and process antenna rules
as part of the overall placement process.
Overall we were pleased with the introduction of the new Synopsys physical
synthesis, placement and clock tree synthesis tools into our COT flow.
- Jens Michelsen
Vitesse Denmark
[ Editor's Note: The scripts Jens used with Clock Tree Compiler are
in the "Downloads" section of http://www.DeepChip.com - John ]