( ESNUG 396 Item 2 ) --------------------------------------------- [07/11/02]
Subject: ( ESNUG 394 #3 ) Conflicting Letters On The 3 Monterey Tape-outs
> Last month Monterey claimed 30+ tape-outs. And Goering rightly reported
> that Monterey "claims just over 30 tapeouts". I know of two customers
> who've done 3 Monterey tape-outs -- therefore I'll only report 3 tape-outs
> until I hear otherwise from the Monterey customers themselves. - John
From: [ The Truth Is Out There ]
Dear John,
Please keep me anonymous. I'm an engineer who uses Monterey's Aristo
IC-Wizard. About Dolphin, I know:
1. There are currently 0 tapeouts (!) done using Dolphin only.
The tapeouts at Infineon and at Zoran used Silicon Ensemble as the
router. In addition, both tapeouts were done using Cadence tools
for the clocks.
2. All 3 "tapeouts" used huge R&D support from Monterey.
3. The tool itself is extremely slow, and requires a lot of special
tricks and bypasses.
4. There is no real connection between the Dolphin timing engine and
Aristo IC-Wizard. We have heard a lot of promises, with 0 results.
I do not know where Mr. Goering found 30 tapeouts. If it is so, why is it
so confidential??
To me it seems like someone's trying to fool ESNUG readers.
- [ The Truth Is Out There ]
---- ---- ---- ---- ---- ---- ----
From: Hiroyuki Nakamura <nakamura.hiroyuki@canon.co.jp>
Cooley-san,
Monterey asked me to tell you how many tapeouts we made. We made 2 tapeouts
last year. One is a mixed-signal ASIC for sensor peripherals in 0.35 um
technology. The other is also a mixed-signal IC, including an area sensor,
in 0.8 um technology. I can also inform you that these 2 chips worked on 1st
silicon without problems. We now believe that Monterey's Dolphin is a very
reliable EDA tool.
- Hiroyuki Nakamura
Canon, Inc. Japan
---- ---- ---- ---- ---- ---- ----
From: [ The Infineon Man ]
John,
Pls. keep me anonymous. I'll try to clearly separate my opinions from facts.
We have used a Monterey "physical synthesis" flow for 2 tapeouts so far.
Designs: Telecommunication SoCs, 700 K gates and 1M gates, each dominated by
macros (70% and 80% of chip area is macros). One design has 30 clocks, the
other 50 clocks, with 60 MHz base frequencies (one design locally up to
500 MHz).
Our simplified design flow:
1.) Designs simulated in RTL VHDL
2.) Synopsys DC (no special options) and test insertion
3.) Monterey Sonar/Dolphin Layout + "physical synthesis"
4.) Extraction, STA, formal verif., LVS/DRC
The only new tool in our flow is Monterey's "physical synthesis". In our
case this has been Sonar/Dolphin 2.0 -- which has proven reasonably
stable and very useful (opinion).
We give Design Compiler fairly simple block-level (clock, some basic
block-level IO etc.) constraints and leave chip-level timing, buffering
etc. to layout.
We have designed the system to ease timing on the interface definition of
blocks (registered outputs, etc.)
There's been only one occurrence of bad architecture selection
(arithmetic) by DC, when it thought it could get away with a simple ripple
structure where a more complex one (carry-lookahead) was needed (due to a
non-registered top-level path which had slipped by.)
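To illustrate the ripple vs. carry-lookahead trade-off mentioned above, here
is a minimal Python sketch. It is a bit-level behavioral model only, not
anything produced by DC or Dolphin; the function names are made up for this
example. In the ripple version each carry waits on the previous bit, so the
critical path grows with the word width. In the (flattened) lookahead version
every carry is written directly in terms of generate/propagate signals, which
costs more logic but shortens the carry path.

```python
def to_bits(value, width):
    """Little-endian list of bits for a non-negative integer."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Each stage needs the carry of the stage before it (serial chain)."""
    total, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        total.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return total, carry


def carry_lookahead_add(a_bits, b_bits, cin=0):
    """Every carry is computed directly from generate (g) and propagate (p)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    carries = [cin]
    for i in range(len(g)):
        # c[i+1] = g[i] | p[i]g[i-1] | ... | p[i]...p[0]cin
        carry = g[i]
        prod = p[i]
        for j in range(i - 1, -1, -1):
            carry |= prod & g[j]
            prod &= p[j]
        carry |= prod & cin
        carries.append(carry)
    total = [p[i] ^ carries[i] for i in range(len(p))]
    return total, carries[-1]


if __name__ == "__main__":
    a, b, width = 200, 100, 8
    for adder in (ripple_carry_add, carry_lookahead_add):
        s, cout = adder(to_bits(a, width), to_bits(b, width))
        print(adder.__name__, from_bits(s) + (cout << width))
```

Both functions return the same sum; only the structure of the carry logic
differs, which is the choice the synthesis tool has to get right for the
actual timing of the path.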
Dolphin usually improved on the timing expected by Design Compiler, even
when aggressive wire-load models were used.
Personal Opinion: In this flow, Design Compiler still wastes too much
time tinkering with wire-load models, drive strengths, fanout buffering
and low-level timing-driven structuring. It should leave all that up to
physical synthesis. Physical synthesis is in a better position for that
as it has more accurate information and it can do better optimization
(by not only taking placement into account but by controlling placement
to improve timing). Really aggressive timing-driven placement can
generate surprisingly fast logic for datapath designs.
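The point about "more accurate information" can be sketched as follows. A
wire-load model guesses net capacitance from fanout alone, while a
placement-aware tool can estimate it from where the pins actually sit. This
is only an illustration: the table values, the capacitance-per-micron figure
and the function names are assumptions made up for this example, and
half-perimeter wirelength is used as a common, simple placement-based
estimate, not as a description of what Dolphin does internally.

```python
# Hypothetical wire-load table: fanout -> estimated net capacitance in pF.
WIRE_LOAD_TABLE = {1: 0.010, 2: 0.018, 3: 0.027, 4: 0.037}
SLOPE_PF_PER_FANOUT = 0.011   # assumed extrapolation slope beyond the table
CAP_PF_PER_UM = 0.0002        # assumed wire capacitance per micron


def wire_load_cap(fanout):
    """Fanout-only estimate: every net with the same fanout looks the same."""
    if fanout in WIRE_LOAD_TABLE:
        return WIRE_LOAD_TABLE[fanout]
    largest = max(WIRE_LOAD_TABLE)
    return WIRE_LOAD_TABLE[largest] + (fanout - largest) * SLOPE_PF_PER_FANOUT


def placed_cap(pins):
    """Placement-based estimate from the half-perimeter of the pin bounding box.

    pins is a list of (x, y) positions in microns, driver included.
    """
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    half_perimeter = (max(xs) - min(xs)) + (max(ys) - min(ys))
    return half_perimeter * CAP_PF_PER_UM


if __name__ == "__main__":
    tight = [(0, 0), (10, 5), (4, 8), (6, 2)]           # driver + 3 loads, close
    spread = [(0, 0), (400, 50), (120, 300), (60, 20)]  # same fanout, far apart
    print("wire-load estimate:", wire_load_cap(3))
    print("placed, tight     :", placed_cap(tight))
    print("placed, spread    :", placed_cap(spread))
```

The two nets have the same fanout, so the wire-load model gives them the same
load, while the placement-based estimate separates them. A tool that also
controls the placement can pull the critical net into the "tight" case.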
To take advantage of this flow, however, new libraries are needed. We are
seeing critical paths placed so tightly that smaller cells without output
buffering would be strong enough drivers yielding more speed and less power.
Currently the smallest library cells available still "overdrive" those
paths.
Our DC effort had no extreme time/memory/compute intensive top-level
characterization, bottom-up, top-down, etc. runs. We're back to good ol'
straightforward module-level DC runs which we parallelized. Dolphin
uses the same timing constraint format as PrimeTime, thus it's no big
deal to convert between them. (Figuring out how to constrain between
50 clocks is a different story.)
On a side note: Clocks which are derived off of other clocks (in our
case most of our clocks) can be specified as such to Dolphin. It can
balance them and take the actual phase relationship between them
into account when doing setup/hold fixing. That takes the burden
(and uncertainty) of determining false paths off the user's shoulders.
Opinion: This doesn't sound like much, but we believe it greatly enhances
design reliability as false-path statements always bear the risk of being
less false than the designer believes. Since Dolphin knows about the exact
timing between the clocks it fixes whatever paths there are - no worries.
(Yes, we likely fix more paths than we really have to - but who cares when
it's free?) Another side effect is that one can quite flexibly describe
how to do the clocks (e.g. skewing flops early for fast chip-output paths)
and still get all other timing needs (hold to those flops, scan etc.) fixed
without any extra effort.
All this is -IMHO- only possible in a physical synthesis flow/tool, which
makes absolutely sure that the timing you see is what you get as there
are no further steps afterwards. (Unlike the traditional synthesis-
layout flow.)
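Why a known phase relationship between derived clocks matters can be shown
with a small sketch. When two clocks come from the same base clock, their
edges repeat over a common period, so the tightest launch-to-capture window
can simply be computed instead of guessed or declared false. The code below
is the usual textbook treatment of related clocks in static timing analysis,
assuming ideal clocks with integer periods and rising-edge flops; it is not
a description of Dolphin's actual algorithm, and the function name and the
example periods are made up.

```python
from math import gcd


def tightest_setup_window(launch_period, capture_period,
                          launch_offset=0, capture_offset=0):
    """Smallest gap from any launch edge to the next later capture edge.

    All times are integers in the same unit (e.g. ps). Offsets are the
    time of the first rising edge of each clock relative to a common origin.
    """
    # Edge patterns repeat every least common multiple of the two periods.
    common = launch_period * capture_period // gcd(launch_period, capture_period)
    best = None
    for t in range(launch_offset, launch_offset + common, launch_period):
        # Distance to the next capture edge strictly after the launch edge.
        gap = (capture_offset - t) % capture_period or capture_period
        best = gap if best is None else min(best, gap)
    return best


if __name__ == "__main__":
    # Hypothetical divide-by-2 and divide-by-3 clocks off a base period of 10.
    print(tightest_setup_window(20, 30))   # launch on /2, capture on /3
    print(tightest_setup_window(30, 20))   # the reverse direction
```

With the window known for every clock pair, paths between the clocks can be
constrained and fixed like any other path, which is the alternative to
false-path statements described above.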
After running through layout (less than 24 hours), of course we do some
DRC/LVS/extraction, independent STA, formal verification. This usually
comes out somewhat clean. (Whenever we saw major issues the causes
were human errors, library mismatches, timing mis-characterization,
wrong derating, etc.)
Opinion: With more experience/maturity of the flows/tools/methodologies,
we believe we can take some of those backend-backend steps off the
critical path and move them into parallel (sort of qualification-) paths.
Basically like years ago when we started to trust synthesis not to
produce bad logic (we still check - but not as insistently and rigorously
as years ago) and when we adopted STA to replace timing simulation.
Opinionated Summary:
Even though the Monterey flow is already clearly better, we are not
seeing the full potential of it yet:
- No clear distinction of tasks between "Front-End" and "Physical"
Synthesis. (IMHO these require extremely different key features:
Architectural vs. Physical)
- Basically no library support in place. Need: Front End libraries
without distracting information (like wire-load models, many drivers,
wire-resistance etc.)
Backend libraries need significantly more drive strengths (finer grain,
much smaller) and more aggressively designed cells (complex functions,
non-buffered cells.)
- [ The Infineon Man ]