Editor's Note: If you're a AAA member and you think they'll tow your car
home for free, think again. It turns out that the "basic" AAA membership
only buys you the right to have your car towed *3* measly miles before the
tow truck people start charging you $3.00 a mile. (And BTW, 3 miles is
nothing. A tow truck can easily take up 3 miles just trying to turn
around on the highway.) You'll need to buy the AAA Plus "upgrade" before
they'll tow your car up to 100 miles free to get it home. And to think
it only cost me $178 to learn this cheery tidbit of worldly wisdom! :)
- John Cooley
the ESNUG guy
( ESNUG 399 Subjects ) ------------------------------------------- [08/08/02]
Item 1: Cadence User Seeking A Matlab <-> Cadence SpectreRF Interface
Item 2: ( ESNUG 395 #3 ) PhysOpt-MPC Best Usage & An insert_dft Follow-Up
Item 3: Can Synopsys Design Compiler Take Encrypted Verilog RTL Code?
Item 4: ( ESNUG 398 #1 ) Customer Reports A Failed Floorplan Compiler Demo
Item 5: ( ESNUG 398 #4 ) One User's Cadence Digital Mixed-Signal TSMC Flow
Item 6: Mentor Won't Let Us Eval Their New "Calibre CI" Extraction Tool!
Item 7: ( ESNUG 397 #2 ) Four Users Review Sequence's PhysicalStudio Tool
Item 8: ( ESNUG 398 #10 ) 2.5-D RC Estimation & Correlation In PhysOpt
Item 9: ( ESNUG 312 #1 ) Makefile/LSF Dependencies Are Driving Me Nuts!
The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com
( ESNUG 399 Item 1 ) --------------------------------------------- [08/08/02]
Subject: Cadence User Seeking A Matlab <-> Cadence SpectreRF Interface
> Is there some kind of interface between Matlab and Cadence SpectreRF
> available? Or can you give me some suggestions to start to make one
> in Matlab?
>
> - Dimitri Linten
> IMEC Belgium
From: andrewb@cadence.com (Andrew Beckett)
There is no interface between Matlab and SpectreRF as such -- it really
depends on what you're trying to do here.
If you're talking about importing and exporting waveform data, then you
can do that - use the printvs button in the calculator (don't fill in any
start, stop or step values on the form) and then save the results to a file;
use the table special function in the calculator to read data in again.
These simple ASCII formats should be readable in Matlab without too much
difficulty, and Matlab can export data this way, too. (I don't remember the
Matlab commands, as it's quite a while since I used Matlab.)
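As a rough illustration of how little is needed to move such ASCII data
around, here is a minimal Python sketch. It assumes the saved file is plain
whitespace-separated columns (x first, y second), possibly preceded by header
lines; the exact layout of a printvs dump is an assumption here, and the
function names are made up for this example.

```python
import io

def read_waveform_table(stream):
    """Return (xs, ys) from whitespace-separated two-column text.

    Lines whose first two fields are not numbers (headers, blank lines)
    are skipped. The column layout is an assumption, not a documented format.
    """
    xs, ys = [], []
    for line in stream:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            x, y = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # header or comment line
        xs.append(x)
        ys.append(y)
    return xs, ys

def write_waveform_table(stream, xs, ys):
    """Write the points back out as two plain numeric columns."""
    for x, y in zip(xs, ys):
        stream.write(f"{x:.6e} {y:.6e}\n")

# Hypothetical sample standing in for a saved results file.
sample = io.StringIO("time (s)  v(out) (V)\n0.0  0.10\n1e-9  0.25\n2e-9  0.40\n")
xs, ys = read_waveform_table(sample)
print(xs, ys)
```

A file written by write_waveform_table contains only numeric columns, which
is the kind of simple table most numeric tools can read back in.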
- Andrew Beckett
Cadence Design Systems Ltd
( ESNUG 399 Item 2 ) --------------------------------------------- [08/08/02]
Subject: ( ESNUG 395 #3 ) PhysOpt-MPC Best Usage & An insert_dft Follow-Up
> As always, caveat emptor: there certainly is a place for MPC in a flow
> but caution should be exercised (specifying *no* option is probably a
> very bad idea, and the more information one provides, the better).
> No silver bullet, I'm afraid.
>
> - Jean-Marc Calvez
> STMicroelectronics Grenoble, France
From: Neel Das <neel.das@corrent.com>
Hi John,
Back in December I reported in ESNUG 385 #11 certain issues we'd seen with
insert_dft in our PhysOpt flow. With the 2002.05 (Solaris) and 2002.05-1
(Linux) binaries, the QOR from insert_dft -physical has improved to where
we're getting results comparable to (not necessarily better than)
insert_scan. Thanks, Synopsys! We're now going to switch over to the
new command, if for no other reason than that insert_scan will eventually
stop being supported. :)
Let me dwell a little on the -mpc and -quick flows in PhysOpt. (We're using
PhysOpt 2002.05 quite a bit now.)
First, why would you want to use the -mpc switch? For prototyping, right?
OK, so for a realistic prototype, one would want to include at the very
least a rough power plan. In high performance designs, it is not uncommon
to allocate 40% or more of routing resources even on multiple layers to the
power delivery strategy. Not including the capability to include a power
plan has for us severely limited the usefulness of the -mpc switch. To take
this one step further, PhysOpt ought to be able to identify not just the
power grid but also via obstructions (if you consider fat power straps
intersecting with fat vias, you get an idea of how limited routing resources
could be in that vicinity). Currently, -mpc also produces illegally placed
pins (some pins protrude out of the core boundary), which means it's not
straightforward to take the output from such a flow into Apollo/Astro (which
we use for routing/CTS) and examine routability.
The -quick switch, which has been added for prototyping, was interesting as
well. So far, in the experiments I've run, I did try, but couldn't find a
lot of use for this feature. With all the limitations that have been built
in, I couldn't get a good handle on either timing or congestion prototyping.
I'm enclosing results from 4 runs I did on one of our larger blocks.
Platform: Linux PhysOpt Version: 2002.05-1
[in all cases, identical pin location specifications were used]
1. with mpc, auto-floorplanning, '-quick' switch:
WNS: 1.92 TNS: 2869.74 Number of Violating Paths: 3886
Nets with DRC Violations: 30241
Total moveable cell area: 3927650.2
Total fixed cell area: 0.0
Core area: (0 0 2367000 2369280)
2. with mpc, floorplan used size from older chip design, '-quick' switch:
WNS: 1.32 TNS: 1796.94 Number of Violating Paths: 2711
Nets with DRC Violations: 23450
Total moveable cell area: 36366765.
Total fixed cell area: 0.0
Core area: (0 0 4085000 2440000)
3. with mpc, floorplan used size from older chip design, '-congestion
-congestion_effort high' switch
WNS: 0.00 TNS: 0.00 Number of Violating Paths: 0
Nets with DRC Violations: 29
Total moveable cell area: 3318074.0
Total fixed cell area: 0.0
Core area: (0 0 4085000 2440000)
4. with mpc, auto-floorplan, '-congestion -congestion_effort high' switch
WNS: 0.06 TNS: 0.19 Number of Violating Paths: 22
Nets with DRC Violations: 22
Total moveable cell area: 3133408.2
Total fixed cell area: 0.0
Core area: (0 0 2234400 2234880)
Note that #4 is something we could start to work with. Good overall timing
and congestion. I would need to increase the floorplan to account for
power/ground straps.
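One way to compare the floorplans above is core utilization (moveable cell
area divided by core area). The sketch below is only a back-of-the-envelope
check: it assumes the core coordinates are in database units at 1000 per
micron and the cell areas are in square microns, neither of which is stated
in the report. Run 2 is left out because its reported moveable cell area is
roughly ten times the others.

```python
def core_area_um2(box, dbu_per_um=1000):
    """Area of a (x0, y0, x1, y1) box, converting database units to microns.

    dbu_per_um=1000 is an assumption about the units in the report.
    """
    x0, y0, x1, y1 = box
    return ((x1 - x0) / dbu_per_um) * ((y1 - y0) / dbu_per_um)

# run number -> (total moveable cell area, core area box), copied from above
runs = {
    1: (3927650.2, (0, 0, 2367000, 2369280)),
    3: (3318074.0, (0, 0, 4085000, 2440000)),
    4: (3133408.2, (0, 0, 2234400, 2234880)),
}

utilization = {n: area / core_area_um2(box) for n, (area, box) in runs.items()}
for n, u in sorted(utilization.items()):
    print(f"run {n}: utilization ~ {u:.2f}")
```

Under those unit assumptions the two auto-floorplanned runs (#1 and #4) come
out far denser than run #3, which reused the larger floorplan from the older
chip.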
I think it would be helpful if the -mpc switch *always* produces a routable
floorplan in the absence of user-provided x/y dimensions, regardless of
whether -quick is getting used or not.
The best approach in my opinion is to use the -mpc switch with a 'realistic'
compile strategy. Use the -mpc until you can put together a rudimentary
floorplan (including power straps) in a 'real' design planner: then switch
to the real FP immediately. In our case, -mpc did produce a nice, compact
floorplan, but when we used it with -quick, the design was too congested
to try a route.
People need to be especially careful with congested designs (if PhysOpt's
congestion map displays hotspots.) What happens in many cases is PhysOpt's
Steiner router gets out of sync with the 'real' router we use, thereby
resulting in lots (maybe thousands in some clusters) of minor DRC
violations (max_tran and max_cap violations). In many of our clusters,
we've seen PhysOpt report congestion that Apollo/Astro's router doesn't,
'cos it can detour around the congested hotspots. We also have seen an
outstanding issue (a STAR has been filed) with PhysOpt not being able to
fix a few hold violations on scan paths. We have to spend considerable
time verifying, binning and hand-fixing the DRCs. Until PhysOpt and the
global router being used get merged (at least to the degree of track and
layer assignments), this problem will continue to exist.
Which brings me to another issue that I'm interested in. Synopsys has been
working on the detail block-level router (Route66) and Clock Tree Compiler
(CTC) for some time now. With the merger, it now has former Avanti tools
that already have been tested and used extensively in the field. It doesn't
take a great leap of faith to imagine a scenario in the not-too-distant
future (within a year?) where PhysOpt will have the ability to read from and
write to a Milkyway database. I think that leaves the Route66 and CTC teams
about the same time to demonstrate a significant runtime/performance or QOR
advantage over the Apollo/Astro solutions. If the performance differential
is minor, which tools will survive? Hmm...
- Neel Das
Corrent Corp. Tempe, AZ
( ESNUG 399 Item 3 ) --------------------------------------------- [08/08/02]
From: Bei-Hwa Lin <bhlin@gadzoox.com>
Subject: Can Synopsys Design Compiler Take Encrypted Verilog RTL Code?
Hi, John,
By any chance can Synopsys Design Compiler take encrypted Verilog code?
- Bei-Hwa Lin
Gadzoox
( ESNUG 399 Item 4 ) --------------------------------------------- [08/08/02]
Subject: ( ESNUG 398 #1 ) Customer Reports A Failed Floorplan Compiler Demo
> Design 'Obelix' was a 1.4 M instance 0.13 um design (~5.6 M gates) with 7
> top level soft blocks, 35 K leaf cells and more than 230 embedded hard
> macros. The top level of Obelix had more than 50K nets. Its highest
> system frequency was 150 MHz with the I/O frequency exceeding 600 MHz.
> ... with Obelix we needed to apply both a top down and a bottom up
> approach. ... The floorplan completed with a very fast cycle time (10
> minutes for cluster creation and half an hour for cluster placement).
> The top-level architecture was well understood and the blocks identified
> were successfully closed in terms of timing and routability in our Apollo
> back-end flow.
>
> - Valentina Baiardo
> STMicroelectronics Agrate, Italy
From: Valentina Baiardo <valentina.baiardo@st.com>
John,
I wanted to correct a typo here. We used Magma as the backend for the
Obelix design and an Apollo flow in shadow mode as a check. Both flows
closed timing successfully.
- Valentina Baiardo
STMicroelectronics Agrate, Italy
---- ---- ---- ---- ---- ---- ----
From: [ Chicken Man ]
Hi, John,
Keep me anon. I am very curious regarding Floorplan Compiler evaluation.
We saw a Floorplan Compiler demo last week at Synopsys where their own test
data was smaller than the chip in Valentina's report. The clustering and
placement was not finished at 1.5 hrs and they did not show us the final
result in the full 2 hr demo. I do not know what was wrong. Thus, I'm kind
of curious on her 40 minute runtime quote.
Anyway, we will not spend much time on FPC till it merges with the Milkyway
database next year. They are working on merging JupiterXT and FPC (under
Ho's team that owns FPC, Jupiter, Columbia, etc.) I know there are some
key differences in Astro-Apollo/PhysOpt based placement algorithms. It's not
easy to merge both. We plan to stick to our Apollo + Jupiter flow for now.
- [ Chicken Man ]
( ESNUG 399 Item 5 ) --------------------------------------------- [08/08/02]
Subject: ( ESNUG 398 #4 ) One User's Cadence Digital Mixed-Signal TSMC Flow
> I recently downloaded the TSMC 0.18 PDK and am trying to use it to prepare
> a layout for an IEEE-754 compliant floating point unit. I have the HDL
> written and simulated for it. I was recently introduced to the Cadence
> suite and would like to know a few things.
>
> 1. What is the preferred design flow if you are starting from a HDL?
> The flow that I am following was:
>
> ncvhdl - ncelab - ncsim (simulate the design) - BuildGates
> (for netlist ) - Abstract generator ( to generate
> LEF) - Silicon Ensemble for P&R.
>
> Is this the correct design flow?
>
> 2. I am trying to create a tech.dpux file using the tsmc18 techfile.
> I give the library name and the cds.lib file properly. But when I
> start up cds2hld_4.4 it gives me an error saying it cannot read the
> default.drf file. I want the tool to read the display.drf file
> provided by tsmc which is in my current working directory. What env
> variables should be set for this?
>
> 3. Is there any other method of creating a tech.dpux file and also to
> create a LEF?
>
> Any help is greatly appreciated.
>
> - Shrirang Yardi
From: [ Cadence-a-saurus ]
> 1. What is the preferred design flow if you are starting from HDL?
My quick digital mixed-signal 12-step nanometer flow:
1. Write your RTL (add Verilog-A/Verilog-AMS for the SOC while
you're at it).
2. Functionally verify using AMS Designer (NC-Sim & Spectre) or
just NC-Sim.
3. Build your RF, Analog, & Mixed-Signal blocks (VXL, VCP, VCR,
NeoCell, etc.)
4. Create LEF for your RF, Analog, & Mixed-Signal blocks using
IC50 'abstract'.
5. Create TLF for your RF, Analog, & Mixed-Signal blocks using
MDL & Ocean.
6. Synthesize the digital portions of your design using PKS.
7. Prototype using SOC Encounter.
8. Route using Silicon Ensemble.
9. Extract using Fire & ICE QXC & use CeltIC for crosstalk analysis.
10. Sign-off delay & timing in PKS (#6 thru #10 are integral to
SOC Encounter).
11. Assura DRC & LVS & critical-path extraction/simulation via
Assura/Spectre.
12. Chip Finishing back in the famous DFII (IC50) cockpit.
My team has taken an RF/Analog/Mixed-Signal/digital/Digital design thru
this (and other) flows, using all latest released tools, and aiming for
zero lines of workaround code; then documenting every single button
press -- and passing the results to the QA folks to begin to ensure
these basic flows continue to work, from release to release.
> 3. Is there any other method of creating a tech.dpux file and also to
> create a LEF?
a. I've done it a few times on our generic IC baseline design
(containing RF, Analog, Mixed-Signal, digital, and Digital blocks).
b. If you have a standard-cell library, e.g., from Artisan, the _easiest_
way (by far) to create a tech.dpux file (do you _really_ want that as
the end goal?) is simply to start the 'abstract' program and import
the Artisan supplied TSMC technology LEF & save the results to the
tech.dpux file.
c. To spit out LEF, you can use that same 'abstract' GUI which allows you
to tweak the parameters (e.g., antenna information, pin information,
rectilinear boundaries, routing info, keep outs, etc.) as desired.
See you at ICU in September!
- [ Cadence-a-saurus ]
---- ---- ---- ---- ---- ---- ----
From: Shrirang Yardi <syardi@hotmail.com>
Thanks a million for the 12-step nanometer flow. :) My question was:
- I don't have TSMC technology LEF but just the TSMC technology file.
Does MOSIS provide the LEF too? I have just the TSMC "techfile" to
start with. I wanted to know a step-by-step procedure for converting
this techfile to a tech.dpux file so that I can feed it to abstract.
Am I correct in this or have I missed something?
- You mentioned that you have written a step-by-step guide. Is it possible
to obtain this guide from the MOSIS web-site as I am an account holder
at MOSIS?
Please do let me know. Thanks a lot for your help.
- Shrirang Yardi
( ESNUG 399 Item 6 ) --------------------------------------------- [08/08/02]
From: [ Left Out In The Cold ]
Subject: Mentor Won't Let Us Eval Their New "Calibre CI" Extraction Tool!
Hi, John,
Please keep me anonymous because this is too critical in future business
with Mentor.
After waiting & pushing Avanti (now SYNOPSYS) to improve the device extraction
flow of StarRCXT, we are more interested in bringing in another extraction
tool to integrate with our Calibre: Calibre LVS + Extraction.
The first tool we would like to evaluate is Celestry NRC, which claims to
have a tight link with Calibre LVS through the Calibre CI feature. We are told
Calibre CI is there exactly so 3rd parties can link with Calibre LVS.
Since this is the 1st time I've heard of 'Calibre CI' (it is not in ANY Mentor
document I have seen), I made a request to our Mentor salesperson for an eval.
My temp license request got response. I called to check the status and the
answer is very interesting.
The Mentor salesman said Mentor WILL NOT lend us the Calibre CI license for
evaluation, instead Mentor wants us to evaluate their xCalibre and their
to-be-released xRC. They sent me some marketing email to show me how good
xCalibre/xRC is.
It is definitely not possible to purchase the Calibre CI either.
After pushing and pushing, finally the salesman said his boss is willing to
lend us a time-based temp license but they DO NOT PROMISE Mentor will sell
"Calibre CI" to us. That is to say, even if I have decided to use (Calibre
LVS + Calibre CI + 3rd party extraction tool) in my flow, I may be blocked
because it is not possible to purchase it from Mentor.
What a joke!!! Celestry said they have contacted the Mentor CEO and still
got no clear, committed policy on opening up Calibre CI.
What I need ESNUG readers' help with is checking whether this is a new policy
or a changing policy for Mentor in the US? In Europe? In Asia?
- [ Left Out In The Cold ]
( ESNUG 399 Item 7 ) --------------------------------------------- [08/08/02]
Subject: ( ESNUG 397 #2 ) Four Users Review Sequence's PhysicalStudio Tool
> We are looking at a tool named PhysicalStudio from Sequence for SI, xtalk,
> noise, and delay analyses. What are the advantages/disadvantages of using
> PhysicalStudio based delay calculation (lumped + coupled) with PrimeTime
> based STA in an Apollo P&R environment?
>
> - Chandrani Pal
> Intel
From: Wolfgang Roethig <wroethig@el.nec.com>
Hello John,
Before I can critique PhysicalStudio, I have to explain how it works.
PhysicalStudio has two modes of operation, pre-route and post-route
optimization. In post-route mode, PhysicalStudio works on exact parasitics
and on exact physical locations. ColumbusTurbo, their extraction tool, can
generate a SPEF file containing this information. If you prefer Avanti
tools, StarRC-XT can also generate a SPEF file compatible with
PhysicalStudio. We have tested both.
The SPEF file with coordinates enables PhysicalStudio to calculate crosstalk
-aware delay and noise and to make netlist and placement changes with
minimal disturbance of your layout. For example it could figure a timing
violation due to a large coupling capacitance on a net which can be fixed by
just putting a buffer in the middle of the net. It knows the location of
the wire segments associated with that capacitance, so it places the buffer
at an optimal location and rips out the wire segment. All my ECO router has
to do is to reconnect the segments and drop vias to the pins of the buffer.
Since the ECO route is tightly controlled, the parasitics after ECO route
are also tightly controlled. In our experience, the timing predicted by
PhysicalStudio and the timing after Cadence ECO route correlate very well.
Xtalk-based STA in PhysicalStudio
---------------------------------
Conventional xtalk timing analysis is based on the min-max time window
concept. (For example, PrimeTime-SI, Mantle, Pearl ...) A min-max time
window is the interval between the earliest and the latest possible
arrival time of a signal within a clock cycle. If the min-max time windows
of two signals overlap and there is a coupling cap between them, they are
subjected to crosstalk. In contrast, PhysicalStudio supports multiple time
windows within a clock cycle. For example, signal A may switch either very
early or very late, and signal B may switch in the middle. Therefore, the
min-max time windows of A and B do overlap, but the actual time windows
do not.
These high-resolution time windows enable a more accurate xtalk analysis.
Non-overlapping signals are ruled out, and the aggressor alignment for
overlapping signals is known with more certainty. On the other hand, any
analysis based on min-max time windows must make a pessimistic assumption
that the aggressor alignment is always worst.
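The difference between the two window models can be shown with a few lines
of Python. This is only a sketch of the overlap test described above, not
how any of the named tools implement it; the window values are hypothetical
numbers in ns chosen to mirror the signal A / signal B example.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

def min_max_window(windows):
    """Collapse several switching windows into one earliest-to-latest window."""
    return (min(w[0] for w in windows), max(w[1] for w in windows))

def any_overlap(wins_a, wins_b):
    """True if any individual window of A intersects any window of B."""
    return any(overlaps(a, b) for a in wins_a for b in wins_b)

# Hypothetical switching windows within one clock cycle.
signal_a = [(0.2, 0.5), (3.8, 4.1)]  # switches very early or very late
signal_b = [(1.9, 2.3)]              # switches in the middle

coarse = overlaps(min_max_window(signal_a), min_max_window(signal_b))
fine = any_overlap(signal_a, signal_b)
print("min-max windows overlap:", coarse)
print("actual windows overlap: ", fine)
```

With these numbers the min-max test flags A and B as potential aggressor and
victim, while the multiple-window test rules the pair out.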
Another differentiator for PhysicalStudio STA is it supports the Advanced
Library Format (ALF). ALF allows us to put more accurate data into our
timing and SI library than .lib, so that our STA results correlate much
better with our SPICE runs and our internal delay calculator. However,
the .lib version of PhysicalStudio is still reasonably accurate for design
optimization purposes. We published results at the Designer's Forum of the
DATE2002 conference.
You must test your timing constraints
-------------------------------------
We evaluated PhysicalStudio in a Synopsys/Cadence flow (PrimeTime, PhysOpt,
Silicon Ensemble, Nanoroute, NEC tools), but the following could also apply
in a Synopsys/Avanti flow:
1. You've got to test your timing constraints. PhysicalStudio STA
supports SDC natively. There is no translation involved. However,
there are some corner cases, where different tools interpret the
constraints slightly differently (which is annoying.)
Example 1:
set_multicycle -setup 5
If -hold is not specified, tool A assumes -hold 4, tool B assumes
-hold 0. If -hold is specified, say
set_multicycle -setup 5 -hold 4
Tool C counts 4 cycles backwards from 5, tool D counts 4 cycles
forward from 0.
This scenario gets a lot more complicated when the launching and the
receiving clock have different frequencies.
Example 2:
A constraint for a path appears in a file. Later in the same file
*another* constraint for the same path appears.
Tool E decides to pick the more severe constraint, tool F decides to
pick the constraint that appears later in the file.
A problem can also appear if the designer's intention and the tool's
interpretation do not match. The designer thinks that the tool
interprets the constraints one way, but the tool actually interprets
them another way. (The message here is that you should ALWAYS run a
sanity check of the timing constraints, especially if you are using
more than one STA tool in the flow.)
How do designers work around such a situation? Well, the vast majority
of paths are single-cycle paths, which do not exhibit this problem.
For multicycle paths there is always a workaround:
If the designer decides that the *intended* timing on the path is
already met, the path can be declared as a false path henceforth.
Every tool understands that. (Which tools have we looked at?
PrimeTime, Pearl, PKS-STA, Mantle, PhysicalStudio-STA, Sonar-STA.)
If the timing is not met, get rid of that multicycle constraint. Then
it becomes a single cycle path. Let the tool optimize until the
designer decides that the intended timing is met.
It is possible to tweak the constraints, until PrimeTime and Studio STA
give the same answer. If you have to, you can do this test with a zero
wireload model or with a dummy SDF file.
2. If you want to use PhysicalStudio for delay calculation and PrimeTime
for STA, you have to output an SDF file from PhysicalStudio.
Theoretically, PrimeTime should give the same result as PhysicalStudio
STA with the same SDF file.
3. Test your ECO router. We have used Cadence Wroute and NanoRoute. You
probably want to use Apollo/Astro. Other Sequence customers are also
using Apollo/Astro.
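The two constraint examples in item 1 can be modeled in a few lines to show
why a sanity check is worth running. This Python sketch follows a literal
reading of the descriptions of tools C, D, E and F above; it does not claim
to reproduce any real tool, and the path names and delay limits are
hypothetical.

```python
def hold_edge_counting_back(setup, hold):
    """Tool C reading: count `hold` cycles backwards from the setup edge."""
    return setup - hold

def hold_edge_counting_forward(setup, hold):
    """Tool D reading: count `hold` cycles forward from edge 0."""
    return hold

def keep_most_severe(constraints):
    """Tool E reading: for a repeated path, keep the tightest limit."""
    chosen = {}
    for path, limit in constraints:
        chosen[path] = min(limit, chosen.get(path, limit))
    return chosen

def keep_last(constraints):
    """Tool F reading: for a repeated path, keep the limit that appears last."""
    chosen = {}
    for path, limit in constraints:
        chosen[path] = limit
    return chosen

def differing_paths(a, b):
    """Paths on which two interpretations disagree."""
    return sorted(p for p in a if a[p] != b[p])

# Example 1: -setup 5 -hold 4
setup, hold = 5, 4
edge_c = hold_edge_counting_back(setup, hold)
edge_d = hold_edge_counting_forward(setup, hold)
print("hold edge, counting back:   ", edge_c)
print("hold edge, counting forward:", edge_d)

# Example 2: the same path constrained twice in one file (hypothetical values)
constraints = [("u1/q -> u2/d", 4.0), ("u3/q -> u4/d", 6.0), ("u1/q -> u2/d", 5.5)]
severe = keep_most_severe(constraints)
last = keep_last(constraints)
print("paths needing review:", differing_paths(severe, last))
```

Any path reported by differing_paths, or any multicycle path where the two
hold edges differ, is a candidate for the manual review described above.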
Issues we found
---------------
We tried PhysicalStudio on designs ranging from >1 M gates to 5+ M gates.
The issues we found:
1. Sequence needs their own ECO router
There is a strong dependency between PhysicalStudio and external tools,
and if either one breaks, the flow breaks. We used the Wroute version in
Cadence's Silicon Ensemble 5.3.138, and NanoRoute version 2.5.6.
The biggest enhancement I wish Sequence to consider is to integrate an
ECO router into PhysicalStudio, so that it can output a DRC-clean DEF
file. An external ECO router adds its own issues and always requires
some flow tweaking.
2. Only one level of clocks
Only one level of generated clocks is supported. Constraints with a
generated clock referring to another generated clock have to be modified
in a way that a generated clock refers only to a primary input clock.
Sequence is working on an enhancement.
3. Placement legalization didn't always work
This feature did not work in one of our designs. We had to do placement
legalization with Cadence Qplace. Sequence is working on a fix.
Meanwhile, use your favorite placer.
4. TCL commands
PhysicalStudio has around 500 commands and most of them are documented.
About 10 commands are cryptic, but documented. Example:
Phase1N This does net splitting
Phase1A This does driver upsizing
About 15 commands are intuitive, but undocumented. Example:
report_clock_delay This does, well, what it says.
This sometimes makes it difficult to deviate from a pre-scripted flow
and develop customizations.
5. SDF output sometimes misses a few nets
In one design with 800 K nets, the SDF annotation was missing for a
couple of nets. However, we did not investigate this issue very deeply,
since we do not depend on the SDF output from PhysicalStudio in our
design flow.
6. Not all software patches worked
Sequence has been pretty responsive in fixing bugs and issues, but not
all their patches did work right away. Using the officially released
software and sticking to workarounds until the next major release works
better than trying out every patch.
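The workaround in issue 2 (making every generated clock refer directly to a
primary clock) amounts to multiplying the divide factors along each chain.
Here is a minimal Python sketch of that bookkeeping. It assumes simple
divide-by relationships only and ignores inversion, edge lists and
multipliers; the clock names are hypothetical.

```python
def flatten_generated_clocks(primary, generated):
    """Rewrite each generated clock so it references a primary clock.

    primary   -- set of primary clock names
    generated -- dict: clock name -> (source clock name, divide_by factor)
    Returns dict: clock name -> (primary clock name, cumulative divide_by).
    """
    flat = {}

    def resolve(name, seen):
        if name in primary:
            return name, 1
        if name in flat:
            return flat[name]
        if name in seen:
            raise ValueError(f"clock definition loop at {name}")
        if name not in generated:
            raise KeyError(f"unknown clock {name}")
        source, divide_by = generated[name]
        root, root_div = resolve(source, seen | {name})
        flat[name] = (root, root_div * divide_by)
        return flat[name]

    for name in generated:
        resolve(name, frozenset())
    return flat

# Hypothetical two-level chain: clk -> clk_div2 -> clk_div8
primary = {"clk"}
generated = {
    "clk_div2": ("clk", 2),
    "clk_div8": ("clk_div2", 4),
}
flat = flatten_generated_clocks(primary, generated)
print(flat)
```

Each entry in the result names a primary clock and a single cumulative
divide factor, which is the one-level form the tool accepted.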
PhysicalStudio offers both analysis and optimization using the same engine.
It interfaces with Verilog, LEF/DEF and SDC. It requires an external router
to complete the ECOs. The accuracy of its timing and xtalk analysis is good
with .lib and excellent with ALF.
I recommend PhysicalStudio for post-route optimization of timing and signal
integrity issues.
- Wolfgang Roethig
NEC Electronics Santa Clara, CA
---- ---- ---- ---- ---- ---- ----
From: Sudhanshu Jain <suds@broadcom.com>
Hi, John,
We've used PhysicalStudio to do extraction and delay calculations for two
large 0.13 um designs at Broadcom. One is back and working and the other is
still in fab.
Our previous standard flow was :
Star-RCXT -> PrimeTime -> Celtic -> PrimeTime -> hand fix SI problems
Often we would have to go through this multi-tool loop 3-7 times to
close timing and SI. We were excited when Sequence said that they could
handle all of these tasks within one tool and do automatic fixing of SI
problems. Our big concern was if we could trust their analyses versus
silicon since we already had a relatively successful (but painful) flow.
We spent a lot of time doing correlations of their extraction as well as
their delay calculator and timer.
We extracted with both Star-RCXT as well as Sequence's Columbus tool and
actually discovered that we were using the wrong Tech file for Star-RCXT
since the numbers weren't correlating. Once we fixed that, we got
correlation to within 5% between SPEF from Star-RCXT and Columbus (as
determined by using PrimeTime to do timing.)
Then we correlated Delay Calculators by taking SPF from Star-RCXT and
running it into ShowTime and into PrimeTime. This too correlated to
within 5%. While PrimeTime was our signoff delay calculator, we were
still constantly getting RC-009, RC-004, RC-008 warnings from PrimeTime
(due to the simplistic driver model used by PrimeTime and issues with
Broadcom libraries.) So we had some doubts about how much to trust and
to pad the PrimeTime results in the first place. But, since other
groups had successfully taped out 0.13 um chips with PrimeTime, we had
to consider those results to be "golden". So the answer to Chandrani's
question is that we found correlation to within 5% for ShowTime and
PrimeTime delay calc and timer.
We also did a correlation between SDF generated by PhysicalStudio into
PrimeTime versus SPF from PhysicalStudio into PrimeTime. We found they
correlated to within 2%.
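For readers who want to run the same kind of check, a correlation like this
comes down to a per-path percentage difference against the reference tool.
The Python sketch below shows one way to flag paths outside a tolerance; the
path names and delay values are hypothetical and are not Broadcom data.

```python
def percent_diff(reference, candidate):
    """Signed difference of candidate from reference, in percent."""
    return 100.0 * (candidate - reference) / reference

def outliers(ref_delays, cand_delays, tolerance_pct):
    """Paths whose delay differs from the reference by more than tolerance."""
    flagged = {}
    for path, ref in ref_delays.items():
        diff = percent_diff(ref, cand_delays[path])
        if abs(diff) > tolerance_pct:
            flagged[path] = diff
    return flagged

# Hypothetical path delays in ns from a reference timer and a second timer.
reference = {"p1": 2.00, "p2": 3.50, "p3": 1.20}
candidate = {"p1": 2.06, "p2": 3.40, "p3": 1.30}

flagged = outliers(reference, candidate, tolerance_pct=5.0)
for path, diff in flagged.items():
    print(f"{path}: {diff:+.1f}% vs reference")
```

Only the paths in the returned dict need a closer look; everything else is
within the chosen tolerance.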
- Sudhanshu Jain
Broadcom Corp. Milpitas, Ca
---- ---- ---- ---- ---- ---- ----
From: [ Call Me Ishmael ]
John,
I ask that you post my reply anonymously.
We have been using PhysicalStudio with NoiseIT for a number of designs.
The first thing you will notice is that PhysicalStudio is much faster
than PrimeTime. We do a lot of timing analysis with SPF files. PrimeTime
is dog slow, and capacity is very limited! PhysicalStudio can easily do
timing analysis on 1 million placed objects with full SPEF in under
15 minutes.
We found that PhysicalStudio's timing results matched PrimeTime within
+/- 1%. But Apollo's timing engine is really a piece of junk. We had
found a number of "features" in Apollo that make it quite different
from PrimeTime and PhysicalStudio. In our flow, we use Apollo to do the
initial placement. It was not timing driven since its timing engine is
no good. It's too slow as well. We use PhysicalStudio to do in place
optimization, and then ECO the changes into Apollo. The final timing
check is done using PhysicalStudio with NoiseIT. We use the tools to
optimize for setup, hold, max cap, and max transition violations. The
tools fix noise induced delay as well. Initially, we used PrimeTime
to double check the final timing results. But after a few successful
chips, we don't even bother to do that anymore.
It took us some time to integrate PhysicalStudio into our Apollo flow.
The tool has a very flexible TCL interface. The Sequence folks wrote
it in such a way that it can be customized in every possible way. This
meant that it's not a simple tool to integrate into one's flow. We needed
it to work seamlessly between synthesis and P&R. Now that our flow is
finally in place, timing closure is a 95% automated process.
There were a number of limitations when we first got started more than 2
years back. They didn't support having both a slow and a fast library
residing in memory at the same time. So, it was bit of a pain to optimize
for setup using slow lib, get out of the tool, load in the design with the
fast lib then optimize for hold. They have fixed this about a year ago.
It was not building buffer trees that were timing optimal. That has been
fixed.
To handle scan and test logic embedded in your design, you have to "false
path" every scan path during your setup of PhysicalStudio and then optimize
for hold at both slow and fast corners.
PhysicalStudio doesn't know how to automatically handle clock dividers. You
have to define generated clocks. It has no problem with propagated clock.
One shortcoming: it doesn't have the ability to back annotate SDF.
Their 64-bit version doesn't seem to be as stable as the 32-bit version.
The 32-bit version can handle up to 1 million place-able objects; which is
good enough for us.
In the first few months, we were finding bugs at a rate of 2-3 a week.
They gave us hot fixes at a rate of every 1-2 weeks. Very fast turnaround
time. Basically, if we are dead in the water, they would jump right in to
help. How a startup should be. The tool is a lot more stable now. We
have not found any bugs in the past 3 months.
- [ Call Me Ishmael ]
---- ---- ---- ---- ---- ---- ----
From: Satish Bagalkotkar <Satish.Bagalkotkar@siliconaccess.com>
Hi John,
We've used PhysicalStudio in taping out 5 chips in 0.13 um over the past 14
months. Our flow integrated:
SPC-FirstEncounter (placement), PhysicalStudio (pre/post route
timing/noise), Plato/Apollo (route), PrimeTime (timing signoff)
& Verplex (equivalency check)
Our biggest chip was 333 MHz, ~20 M gates, so we had to rely on EDA
tools to not only report but to also fix the problems. This design
had 32 separate clocks. As with any other P&R tool, the SDC commands
supported are very limited compared to DC & PrimeTime. We have a list
of all the supported syntax from each vendor and we have internal
scripts which convert the Synopsys constraints to SPC, PhysicalStudio,
Plato & Apollo format. Each vendor tool handles SDC syntax slightly
differently, so you've got to put in some effort identifying and
understanding those differences when integrating them together.
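A first-pass version of such a check can be quite small. The Python sketch
below scans an SDC file for commands that are not on a given tool's
supported list. The tool names and their supported sets are hypothetical
placeholders, not real vendor data, and the parser assumes one command per
line with no line continuations.

```python
# Hypothetical per-tool lists of supported SDC commands.
SUPPORTED = {
    "tool_x": {"create_clock", "set_input_delay", "set_output_delay",
               "set_false_path"},
    "tool_y": {"create_clock", "set_input_delay", "set_output_delay",
               "set_false_path", "set_multicycle_path"},
}

def commands_used(sdc_text):
    """Return the leading command word of every non-comment, non-blank line."""
    used = []
    for line in sdc_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        used.append(line.split()[0])
    return used

def unsupported(sdc_text, tool):
    """Commands in the SDC text that the named tool's list does not include."""
    return sorted(set(commands_used(sdc_text)) - SUPPORTED[tool])

sample_sdc = """\
# clocks
create_clock -period 3.0 [get_ports clk]
set_multicycle_path 2 -setup -from [get_pins a/q] -to [get_pins b/d]
set_false_path -from [get_ports rst_n]
"""

for tool in sorted(SUPPORTED):
    print(tool, "cannot take:", unsupported(sample_sdc, tool))
```

Anything the check reports is a constraint that needs translating or
rewriting before that tool sees the file.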
We haven't seen any new gotchas regarding data transfer between tools
because we spent lot of time to integrate these tools and ensured that
each transfer was clean before actually using it in production. This kind
of stuff cannot be done on-the-fly as each vendor tool has its own minute
differences even if they claim they are fully compatible. Integrating
several vendors' tools is a task which takes time and resources.
So far all 5 chips have worked at speed in "first silicon" which is
icing on the cake.
Here are the pros/cons of PhysicalStudio in a nutshell.
Pros:
- Lots of switches which can be tuned to get very good results if you
know what you are doing
- Can do pre and post route timing/noise optimization
- Buffer insertion and timing/noise closure engines are good
- Post-route optimization adds buffers along the wire paths (which
is a lot better than most tools where buffers are inserted and during
placement they can end up any place causing unnecessary iteration.)
- We have very good correlation with PrimeTime
- Timing and Noise analysis is extremely fast
- Columbus extraction seems to correlate well with Avanti StarXT (our
signoff tool)
- If the gate level netlist structure is good then the tool does a very
good job for timing closure
- Command set is very simple and you could write very neat scripts that
can run in batch mode. This was extremely useful as we have a fully
automated flow.
- Tool produced consistent results which is very important
Cons:
- Some of the post route optimizations which did area recovery caused
hold and setup problems. (We disabled it as area was not that critical
for us. They claimed to have fixed this problem.)
- Had problems with high fanout nets so we had to use Apollo (CTS) to fix
these nets
- If the constraints are not reasonable and achievable, PhysicalStudio
does a lot of weird stuff
- The tool has problems optimizing paths with negative slack if the top
few paths are not fixable. At times we had to "false path" these
and then it fixes the other paths. We have asked R&D to fix this.
- Optimization runtimes are very long and in a few cases it added too many
weak buffers instead of a few stronger drivers
- Their GUI sucks!
- Buffer removal is not one of the strong features of PhysicalStudio
- We had several problems where the tool used to core dump with no indication
and it used to take R&D a lot of time to identify and fix the problem
- In some rare cases we had to guide the tool to close timing
- In a previous version the legalize command had problems. We saw some
stdcells getting placed on top of memories. This problem has been fixed.
- Doesn't run on Linux
The newer version of PhysicalStudio claims to do glitch analysis but we have
not tried it. Overall, we are happy with the tool.
- Satish Bagalkotkar
Silicon Access Network San Jose, CA
( ESNUG 399 Item 8 ) --------------------------------------------- [08/08/02]
Subject: ( ESNUG 398 #10 ) 2.5-D RC Estimation & Correlation In PhysOpt
> I finally got the answer to my question from a Synopsys AE. The PhysOpt
> 2.5-D extraction engine does not need RC correlation. Actually, there is
> no way to even obtain correlation numbers in the current rev of PhysOpt.
>
> The numbers which were printed out by estimate_rc (and were confusing me)
> are there for backward compatibility and are comparing the actual DSPF to
> the numbers in the library. Basically, these numbers should be ignored.
>
> - Mahsa Vahidi
> Mindspeed Technologies San Diego CA
From: Vandana Kaul <vkaul@synopsys.com>
Hi John,
The 2.5 D extraction parameter based RC estimation in PhysOpt should
eliminate the need for RC correlation. If this extraction parameter based
estimation is working properly, your PhysOpt log file will contain a message
like the following:
Information: Extractor based RC computation is enabled. (PSYN-140)
As Mahsa Vahidi indicates in ESNUG 398 #10, when you run the "estimate_rc"
command for RC correlation, the log file continues to show values based
upon the 1D values, even if you are using the extraction based approach.
These messages look as follows:
Capacitance - horizontal 0.0002105 vertical : 0.00017
Resistance - horizontal 0.000618 vertical : 0.0002368
These numbers are calculated by comparing the post-route backannotated data
with the 1D values calculated from the physical library. This can be
confusing. For an annotated (delay/load) design, no matter which RC model
is used, the "estimate_rc" command will always report 1D based values. The
"compare_rc" command, however, will generate a plot showing the correlation
between the extraction based values and the post-route backannotated values.
This curve should show very good correlation between the post-PhysOpt and
post-route values. The goal is to eliminate the need for RC correlation
altogether.
John, some of your readers may not be familiar with the newer 2.5D flow,
so I would also like to provide the following background information on
parasitic estimation in PhysOpt, so everyone is on the same page.
The early (pre-2001.08) PhysOpt releases used RC parameters that were
directly converted from LEF parameters. The LEF syntax, however, puts a
limit on accuracy as capacitive modeling was restricted to a basic 1D
model with an area cap and an edge cap.
In the 2001.08 release of PhysOpt, the .plib started to support a more
detailed (2.5D) cap model that includes: area caps, side wall caps,
inter-layer fringe caps and intra-layer fringe caps. These new cap models
require the use of a Routing Wire Model, RWM, to give PhysOpt an idea of
what you expect for routing densities per layer, coupling between layers
and on the same layer.
Beginning in the 2002.05 release, PhysOpt supports RC models based on just
extraction parameters: field oxide thickness, field oxide permittivity,
layer thickness, layer lateral oxide thickness, layer lateral oxide
permittivity, layer oxide thickness and layer oxide permittivity.
In summary, there are three types of RC models you can use with PhysOpt:
1) LEF based plib RC models
* Limited to 1D models.
2) Area Cap, SideWall Caps, Inter-layer Fringe caps, Intra-layer Fringe caps
* Requires a RWM for layer densities
* Support started in 2001.08
3) Extraction based models
* Uses process extraction parameters
* Support started in 2002.05
The following two points might further clarify how the extraction plib
models work in a PhysOpt flow:
1) The plib defines the metal width, spacing, pitch, oxide thickness,
and permittivity. Therefore, PhysOpt can calculate under-the-hood
values for area cap, side wall cap, inter-layer fringe caps,
intra-layer fringe caps, etc. based on assumed adjacent wires.
2) Next, PhysOpt is already building routing data per gcell when it
calculates congestion maps, so this concept can be extended
under-the-hood to calculate a RWM based on your design and the
routing densities.
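To make point 1 concrete, here is a first-order sketch in Python of how area and side wall caps can be derived from geometry and permittivity using the textbook parallel-plate formula. This is only an illustration under that simplifying assumption; it is not PhysOpt's actual algorithm, it ignores fringe caps entirely, and all parameter names are made up for the example.

```python
EPS0_F_PER_UM = 8.854e-18  # vacuum permittivity, farads per micron


def plate_cap(rel_permittivity, overlap_area_um2, separation_um):
    """Parallel-plate capacitance in farads: C = eps0 * eps_r * A / d."""
    return EPS0_F_PER_UM * rel_permittivity * overlap_area_um2 / separation_um


def wire_caps(width_um, length_um, metal_thickness_um, spacing_um,
              oxide_thickness_um, oxide_eps, lateral_eps):
    """Return (area_cap, side_wall_cap) for one wire segment.

    area_cap:      bottom face of the wire to the layer below.
    side_wall_cap: one side face of the wire to a single assumed
                   neighbor at the given spacing.
    """
    area = plate_cap(oxide_eps, width_um * length_um, oxide_thickness_um)
    side = plate_cap(lateral_eps, metal_thickness_um * length_um, spacing_um)
    return area, side


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any real process.
    area, side = wire_caps(width_um=0.2, length_um=100.0,
                           metal_thickness_um=0.35, spacing_um=0.2,
                           oxide_thickness_um=0.5, oxide_eps=3.9,
                           lateral_eps=3.9)
    print(f"area cap = {area:.3e} F, side wall cap = {side:.3e} F")
```

The side wall term depends on the neighbor spacing, which is why an assumption about adjacent wires (or a routing density estimate) is needed before any number can be produced.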
Therefore, I see the new extraction based plibs as an extension of the
area cap, fringe cap based plib noted above. The advantage of this flow is
a user does not have to supply a RWM. PhysOpt will calculate the RWM
based on your design.
NOTE: We found one bug with the 2.5D extraction based method in 2002.05.
This bug has to do with accuracy when performing RC calculation around
large hard macros/RAMs. This will be fixed in version 2002.05-SP1, which
should be released in the September time frame. For designs that contain a
lot of channels between hard macros/RAMs, it is best to wait for this
release to begin using the extraction based method.
- Vandana Kaul
Synopsys, Inc. Mountain View, CA
( ESNUG 399 Item 9 ) --------------------------------------------- [08/08/02]
From: Tomoo Taguchi <ttaguchi@amcc.com>
Subject: ( ESNUG 312 #1 ) Makefile/LSF Dependencies Are Driving Me Nuts!
Hi, John,
I'm trying to set up a makefile to kick off LSF jobs in parallel, but I'm
having difficulty getting the dependencies to work with LSF. I know this
problem has been solved at a previous employer using an internal LSF-like
tool, and I'm sure that this problem has been solved numerous times with
LSF at other companies. I poked through the DeepChip ESNUG archives and
didn't find what I was looking for.
Basically, if A instantiates B & C, I want a makefile that kicks off the B
and C LSF compile jobs in parallel, but holds off the A LSF compile job
until B and C are complete. I'm using plain vanilla make, and I'm
suspecting that I need nmake or some other make flavor to support the
parallel feature. The other problem is that since bsub returns after the
B and C jobs are queued, the makefile assumes that the dependency is
satisfied and kicks off the A job before B and C dbs are available.
I did come up with a non-bullet-proof, but simple/good-enough-for-me
solution to running parallel builds with make and LSF.
I ran into several problems.
Problem 1. LSF queued up my compile jobs and came back a few seconds
later, so the dependencies had no effect and everything (including jobs
that should have been held off until lower level blocks were done) was
kicked off in parallel.
Solution 1. bsub (the command that kicks off LSF jobs) has a -K option
that waits for the job to complete. So this solves the problem of
upper-level blocks kicking off before their dependencies are done.
Problem 2. If I use the -K option, then LSF kicks off jobs on different
servers, but the process becomes completely serial, since the bsub waits
for the LSF job to complete.
Solution 2. Out of the many flavors of make (imake, nmake, gnu make,
etc), I was using old, vanilla make, which doesn't support parallel
execution. I found that nmake and gnu make support the -j option which
will kick off jobs in parallel while still following the dependencies
specified. I ended up using gnu make because it was already installed
locally and from what I could tell about nmake, you had to pay for it.
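The combination of Solutions 1 and 2 (a blocking submit plus a parallel scheduler that respects dependencies) can be sketched in Python as below. This is a stand-in for what gnu make -j does, not a replacement for it. The block names, the DEPS table, and the script file naming are hypothetical, and the sketch runs jobs in waves, which is coarser than make (make can start a target as soon as its own prerequisites finish).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical hierarchy: A instantiates B and C.
DEPS = {"A": ["B", "C"], "B": [], "C": []}


def bsub_blocking(block):
    """Submit one compile job and wait for it (bsub -K blocks until done).

    The script name '<block>.scr' is made up for this example.
    """
    subprocess.run(["bsub", "-K", "dc_shell", "-f", f"{block}.scr"],
                   check=True)


def build(deps, run_job, max_parallel=2):
    """Run every block after all of its dependencies have finished.

    Returns the list of waves that were run, in order.
    """
    done = set()
    remaining = dict(deps)
    waves = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        while remaining:
            ready = sorted(b for b, d in remaining.items()
                           if set(d) <= done)
            if not ready:
                raise ValueError("circular dependency")
            # list() waits for the whole wave and re-raises any job error.
            list(pool.map(run_job, ready))
            for block in ready:
                del remaining[block]
            done.update(ready)
            waves.append(ready)
    return waves


if __name__ == "__main__":
    # Dry run: print instead of calling bsub_blocking.
    print(build(DEPS, lambda b: print("would compile", b)))
```

Passing bsub_blocking as run_job would submit real jobs; max_parallel plays the role of the number given to make -j.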
Problem 3. We have a limited number of Design Compiler licenses, and if
I let make kick off a bunch of jobs in parallel, it has the potential to
gobble up all the licenses and make guys wanting to kick off interactive
or other jobs angry. Granted, this could be solved by completely limiting
the access to dc_shell through LSF, but that's not the situation I have
here. Even if it was, I think other designers would be upset if I ate up
all the licenses on long compile jobs. So, I needed a way to limit the
number of licenses that my make run ate up.
Solution 3. My first attempt was to write a perl script that would run
"lmstat -f Design-Compiler", and parse its output to determine how many
licenses were currently being used. bsub has an -E <command>
option that will run <command> before kicking off an LSF job. If the
command returns a 0, it kicks off the job. If it returns a 1, it
puts the job back on the queue. I specify to my perl script how many
licenses I want to leave open. If the script figures out that if I run my
job, there will still be the specified number of licenses open or more,
then it returns a 0 and the job kicks off, otherwise, it goes back to a
pending status on the queue.
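A rough sketch of that pre-exec check, in Python rather than the perl used in the original script. It assumes lmstat prints a usage line of the form "Users of <feature>: (Total of N licenses issued; Total of M licenses in use)"; check that against your own lmstat output before relying on it. The function names are invented for the example.

```python
import re

# Assumed shape of the lmstat usage line; verify against real output.
USAGE_RE = re.compile(
    r"Users of (?P<feature>\S+):\s+"
    r"\(Total of (?P<issued>\d+) licenses? issued;\s+"
    r"Total of (?P<used>\d+) licenses? in use\)"
)


def license_counts(lmstat_text, feature):
    """Return (issued, in_use) for one feature from lmstat output text."""
    for match in USAGE_RE.finditer(lmstat_text):
        if match.group("feature") == feature:
            return int(match.group("issued")), int(match.group("used"))
    raise ValueError(f"feature {feature!r} not found in lmstat output")


def pre_exec_status(lmstat_text, feature, keep_free):
    """Exit status for a pre-exec check: 0 = start job, 1 = stay pending."""
    issued, used = license_counts(lmstat_text, feature)
    free_after_start = issued - used - 1  # this job will take one license
    return 0 if free_after_start >= keep_free else 1


if __name__ == "__main__":
    sample = ("Users of Design-Compiler:  (Total of 10 licenses issued;"
              "  Total of 7 licenses in use)")
    print(pre_exec_status(sample, "Design-Compiler", keep_free=2))
```

A real pre-exec script would capture the lmstat output with subprocess and pass the status to sys.exit(); the sample string is used here so the logic can be tried without a license server.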
The first problem that I ran into with this approach is that because of
the neat -j parallel execution option in gnu make, if I kicked off a build
where I could build X jobs in parallel before running into my first
dependency, all the jobs would run my perl script simultaneously, and
since no dc_shell jobs had been kicked off yet, they would all see that
there were plenty of licenses available, and all of them would kick off
their dc_shell jobs. What I really needed to do was stagger the kick
off of jobs (by about 10-15 seconds so that dc_shell has time to grab a
license, and for lmstat to execute, which takes a few seconds), so that
as each job kicks off, it has an accurate picture of how many licenses
were being used. So, I put different-valued sleep commands before
each target's commands, so that each target would kick off at a different
time. This worked great when I initially executed the make command,
but if jobs went back on the queue because too many licenses were
being used, then LSF would determine when it would try to reexecute
the same job. Since I didn't have any control of when LSF would try to
reexecute all the jobs on the queue, most of the time, everything on
the queue eventually ended up kicking off and grabbing too many licenses.
So, the eventual solution that I came up with was to build a perl wrapper
around my make command that took advantage of the -j <parallel_jobs_num>
feature of gnu make. If a value to -j isn't specified, then it allows an
unlimited number of parallel jobs. But if a number is specified to the -j
option, then make will only kick off up to that number of parallel jobs.
So the perl wrapper runs lmstat, figures out how many licenses I can grab
given the number of installed licenses, the number of licenses used, and
the specified number of licenses to keep free, then kicks off the make job
with this number as the argument to the -j option.
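The arithmetic in that wrapper is simple enough to sketch in Python (again standing in for the original perl). The function names and the default target are assumptions for illustration. One detail worth noting: gnu make rejects "-j 0", so the sketch returns no command at all when there is no headroom.

```python
def parallel_jobs(issued, in_use, keep_free):
    """Licenses this make run may take without dipping below keep_free."""
    return max(0, issued - in_use - keep_free)


def make_command(issued, in_use, keep_free, target="all"):
    """Build the make command line, or None if no license can be spared.

    'target' is a hypothetical default; use whatever your makefile defines.
    """
    jobs = parallel_jobs(issued, in_use, keep_free)
    if jobs == 0:
        return None  # gnu make needs a positive -j value; wait and retry
    return ["make", "-j", str(jobs), target]


if __name__ == "__main__":
    # 10 installed, 4 in use, keep 2 free: up to 4 parallel jobs.
    print(make_command(issued=10, in_use=4, keep_free=2))
```

Because the count is taken once at launch, licenses grabbed by other users afterwards are not accounted for; the number passed to -j is only an upper bound on what this run can take.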
I grant that this method controls licenses by controlling make, but I figure
that unless license usage can be completely regulated (where the license
server or something with the same level of control can handle requests to
only grant licenses if a specified number is kept open), any scheme would
have some shortcoming. By specifying the maximum number of parallel runs,
I set a maximum number of licenses I would ever grab at any given time.
I have to admit that I'm a newbie to writing makefiles or running LSF, so
I'm sure that those with more experience in both would come up with more
elegant and bullet-proof schemes to optimize their compile environment.
Also, I haven't extensively used it, so there might be some scenario that
my method screws up, but for now it seems to be working.
- Tomoo Taguchi
AMCC San Diego, CA
P.S. Ron Ranauro's white paper in ESNUG 312 #1 was helpful. I tried to
email him, but it bounced. Do you have a more recent email address?
============================================================================
Trying to figure out a Synopsys bug? Want to hear how 14,063 other users
dealt with it? Then join the E-Mail Synopsys Users Group (ESNUG)!
!!! "It's not a BUG, jcooley@TheWorld.com
/o o\ / it's a FEATURE!" (508) 429-4357
( > )
\ - / - John Cooley, EDA & ASIC Design Consultant in Synopsys,
_] [_ Verilog, VHDL and numerous Design Methodologies.
Holliston Poor Farm, P.O. Box 6222, Holliston, MA 01746-6222
Legal Disclaimer: "As always, anything said here is only opinion."
The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com