( ESNUG 374 Item 7 ) -------------------------------------------- [06/14/01]
Subject: The Magma "Blast Fusion" Customer Tape-out Listing Before DAC
NO FOOLISH CONSISTENCY: Last week, after I announced that I was doing a
Magma customer tape-out count, 4 Magma users contacted me directly and the
management at Magma gave me a list of 12 customers who had done multiple
tape-outs. I told Magma management then that I was going to count their
customers' tape-outs the *exact* same way I did last December as outlined
in my Tape-out FAQ at http://www.deepchip.com/news/tapeoutFAQ.html
I lied.
Why? Because I had sent those 16 Magma users a fairly detailed questionnaire
and found the following running themes.
- All of them used Magma in a gates-to-placed-gates mode. That is, not
one was using Magma's RTL synthesis. Instead, they were mostly using
Synopsys Design Compiler with a few Cadence Ambit-RTL users thrown in.
- All but 1 weren't using Magma's built in DRC/LVS capabilities. Instead
they were mostly using Mentor's Calibre + a few Avanti Hercules users.
- The majority had prior bad experiences with Synopsys PhysOpt; and a
few had disappointments with Cadence PKS and Monterey Dolphin.
- All had 1 or more tape-outs using either Blast Fusion or Blast Chip.
OK, those are legitimate user experiences. No problem here. My problem was
that, on the phone, Scott Hamm of Vitesse told me he had a number of Magma
people on his site helping him. In fact, most Magma users reported they had
used Magma tools with a *lot* of help from Magma support and R&D:
"The designs were all closely cooperative design tasks between Fujitsu
and Magma. Magma had roughly 1 to 3 people working on our chips and we
had about the same number. Whatever they had with people, we matched.
We were basically working together on these tape-outs."
- Gerry Atterbury of Fujitsu Microelectronics
"Our relationship with Magma is tight, but not with its R&D department,
more the AE department. During our intense P&R stage, we had a Magma
FAE on site at least one day a week, and on the other days, the FAE
would be running jobs for us."
- Morrie Berglas of PowerVR Technologies
"Magma gives us one full time support guy on-site 2-3 days a week here at
TI. In addition, they also give us 2 to 3 others as part time local
support."
- Francis Larochelle of Texas Instruments ASIC
"Magma worked very closely with us and we have a great tie in with their
R&D group. This is such a bonus when you are working on large complex
chips such as ours."
- Paul Pontin of 3Dlabs
"Most of the work was carried out by Magma FAE's in UK (we know those guys
as they used to work for Avanti Europe.) We gave them a netlist + some
floorplan info. In parallel, people in my group worked on the same
design using our standard layout flow based on Apollo/Saturn."
- Hans-Olov Eriksson of Ericsson Radio System AB
So, from what these users are saying, Magma was in taxicab mode with a lot
of customers. "OK, such support is normal for new technologies, John.
What's the big deal?", you ask. In my Tape-out FAQ 6 months ago I wrote:
"Also, another non-tape-out is if the EDA vendor runs their tool for the
customer (i.e. "taxicab mode") instead of the user running the physical
synthesis tools themselves. Taxicab mode happens a lot in evals. It's
interesting, but it's not a customer tape-out as far as I'm concerned."
- from http://www.deepchip.com/news/tapeoutFAQ.html
Now if I keep my word and stick to that FAQ standard that I promised Magma
management that I would stick to, here's my Magma tape-out list:
                         Magma Customer Tape-Outs
    Date  Size         Clock (MHz)  Company   Location           Fab/um
-------------------------------------------------------------------------------
    7/00   57 K gates  26           QThink    San Diego, CA      0.25 TSMC
    8/00  500 K gates  200          NEC       Tokyo, Japan       0.13 NEC
    8/00    7 K gates  480          QThink    San Diego, CA      0.25 TSMC
    8/00    5 K gates  50           QThink    San Diego, CA      0.25 TSMC
    9/00    5 K gates  24           QThink    San Diego, CA      0.25 TSMC
   10/00    5 K gates  100          QThink    San Diego, CA      0.25 TSMC
   12/00    7 K gates  480          QThink    San Diego, CA      0.18 TSMC
   12/00  740 K insts  125          IMG Tech  Kings Langley, UK  0.18 TSMC
   12/00   70 K gates  183          QThink    San Diego, CA      0.18 TSMC
    3/01  300 K gates  100          Broadcom  San Jose, CA       0.18 TSMC
    5/01  336 K insts  80           STMicro   Agrate, Italy      0.18 STMicro
    6/01   24 K insts  100          Signet    Austin, TX         0.18 TSMC
    6/01  536 K gates  120          Broadcom  San Jose, CA       0.18 TSMC
But when I look at this list, I'm bothered. This gives Magma all of 13
tape-outs. Aargh! This isn't telling the whole Magma user story here.
Then I remembered one of my favorite quotes:
"A foolish consistency is the hobgoblin of little minds."
- Ralph Waldo Emerson in his "Self-Reliance" essay (1841)
Now I'm anticipating angry letters from Synopsys and Silicon Perspective.
They did really well in my tape-out count 6 months ago. For this Magma
tape-out count, I'm ignoring my original Tape-out FAQ and I'm showing all
36 Magma customer tape-outs I found. And I'm just going to report this
as a "listing" instead. It's the right thing to do.
                         Magma Customer Tape-Outs
    Date  Size         Clock (MHz)  Company   Location           Fab/um
-------------------------------------------------------------------------------
    8/99  100 K insts  100/200      Fujitsu   San Jose, CA       0.25 Fujitsu
   10/99  150 K insts  27/54/81     Fujitsu   San Jose, CA       0.25 Fujitsu
    6/00  100 K gates  110          TI        Dallas, TX         0.18 TI
    6/00  200 K gates  250          Vitesse   Col. Springs, CO   0.18 TSMC 1p6m
    7/00   57 K gates  26           QThink    San Diego, CA      0.25 TSMC
    7/00  100 K gates  125          TI        Dallas, TX         0.15 TI
    8/00  500 K gates  200          NEC       Tokyo, Japan       0.13 NEC
    8/00    7 K gates  480          QThink    San Diego, CA      0.25 TSMC
    8/00    5 K gates  50           QThink    San Diego, CA      0.25 TSMC
    9/00    5 K gates  24           QThink    San Diego, CA      0.25 TSMC
   10/00    5 K gates  100          QThink    San Diego, CA      0.25 TSMC
*  10/00  600 K gates  66           Infineon  Munich, Germany    0.18 C10
   11/00  2.5 M insts  200+         3Dlabs    Egham, Surrey, UK  0.18 IBM 7SF
   11/00  100 K gates  285          TI        Dallas, TX         0.095 TI
   11/00   75 K gates  155          Vitesse   Col. Springs, CO   0.18 TSMC 1p6m
   12/00    7 K gates  480          QThink    San Diego, CA      0.18 TSMC
   12/00  740 K insts  125          IMG Tech  Kings Langley, UK  0.18 TSMC
   12/00   70 K gates  183          QThink    San Diego, CA      0.18 TSMC
   11/00  175 K insts  100          Fujitsu   San Jose, CA       0.25 Fujitsu
    1/01  100 K insts  66           Fujitsu   San Jose, CA       0.25 Fujitsu
*   2/01  600 K gates  66           Infineon  Munich, Germany    0.18 C10
    3/01  300 K gates  100          Broadcom  San Jose, CA       0.18 TSMC
    3/01  250 K insts  150          Fujitsu   San Jose, CA       0.18 Fujitsu
    4/01  3.5 M gates  166          Vitesse   Col. Springs, CO   0.18 TSMC 1p6m
*   4/01  500 K gates  150          Infineon  San Jose, CA       0.18 C10
    5/01  114 K gates  155          TI        Ottawa, ON         0.18 TI
    5/01  336 K insts  80           STMicro   Agrate, Italy      0.18 STMicro
    5/01  466 K gates  108          TI        Dallas, TX         0.18 TI
*   5/01  500 K gates  150          Infineon  San Jose, CA       0.18 C10
    5/01  450 K gates  155          TI        Dallas, TX         0.18 TI
    6/01  400 K gates  78           TI        Dallas, TX         0.18 TI
*   6/01  500 K gates  150          Infineon  San Jose, CA       0.18 C10
    6/01   87 K gates  83           TI        Ottawa, ON         0.18 TI
    6/01   24 K insts  100          Signet    Austin, TX         0.18 TSMC
    6/01   75 K gates  155          Vitesse   Col. Springs, CO   0.18 TSMC 1p6m
    6/01  536 K gates  120          Broadcom  San Jose, CA       0.18 TSMC
** 11/01  160 K gates  125          Ericsson  Stockholm, Sweden  0.13 TI GS40
* - while my Infineon contact confirmed that it has done 5 Magma
tape-outs in its http://biz.yahoo.com/bw/010614/2023.html
press release, Infineon would not give me the exact stats
on each tape-out. The Infineon stats presented here came from
Magma, so the higher than average gate counts may be suspect.
** - November 2001 hasn't come yet. It's a planned tape-out.
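For readers who want to cross-check the two counts, here is a minimal Python sketch that tallies the listings by company. The per-company numbers are transcribed by hand from the tables above, and the variable names are made up for this illustration; treat it as a sanity check on the arithmetic, not as an official count.

```python
from collections import Counter

# Completed tape-outs per company, transcribed by hand from the full listing.
full_listing = Counter({
    "Fujitsu": 5, "TI": 8, "Vitesse": 4, "QThink": 7, "NEC": 1,
    "Infineon": 5, "3Dlabs": 1, "IMG Tech": 1, "Broadcom": 2,
    "STMicro": 1, "Signet": 1,
})

# The Ericsson 11/01 entry is only planned, so it is tracked separately.
planned = Counter({"Ericsson": 1})

# Companies that also appear in the shorter, FAQ-strict list.
strict_companies = {"QThink", "NEC", "IMG Tech", "Broadcom", "STMicro", "Signet"}

completed_total = sum(full_listing.values())
strict_total = sum(n for c, n in full_listing.items() if c in strict_companies)
rows_in_listing = completed_total + sum(planned.values())

print("completed tape-outs in full listing:", completed_total)
print("tape-outs under the strict FAQ rule:", strict_total)
print("rows in the full listing (incl. planned):", rows_in_listing)
```

The difference between the two totals is the set of tape-outs from Fujitsu, TI, Vitesse, Infineon and 3Dlabs, plus the planned Ericsson entry.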
From: Chris Faerber <chris.faerber@infineon.com>
John,
Infineon has committed a press release with Magma regarding tapeouts.
The press release states the real amount of tapeouts we did with Magma
worldwide at all Infineon development centers. Infineon doesn't want
to spread partial-information by different sources, therefore the press
release shall be the one and only source for tapeout count.
- Chris Faerber
Infineon Germany
---- ---- ---- ---- ---- ---- ----
From: Morrie Berglas <morrie.berglas@powervr.com>
Hi, John,
My company uses Magma intensively, however, my group has not yet had a
tapeout. I can't comment on behalf of the other groups in the company in
terms of their progress with the tool.
My group is really hammering Magma and, although some subblocks have
congestion, area, and/or timing issues, quite a few blocks go through the
Magma flow seamlessly. The particular subblock I'm working on is flattened,
7.4 million square microns, 300 K cell instances and runs at 250 MHz, and is
one of the blocks which meets timing closure after routing. Our company
does not traditionally do the backend flow, and considering this, we are
still managing well with the Magma tools. They really do perform well,
give good results and are not overly complicated.
Our relationship with Magma is tight, but not with its R&D department, more
the AE department. During our intense P&R stage, we had a Magma FAE on site
at least one day a week, and on the other days, the FAE would be running
jobs for us. We've got a lot of licenses installed, and a fairly high
profile chip, so we really shouldn't expect any less on the support front.
BTW, we're not using Magma's VHDL synthesis tool. In our flow we still
synthesise with DC shell and wireloads, then use Magma for CTS, P&R, etc.
However, I do believe Magma re-optimises logic as required.
Our major partner, STmicro, performs our LVS/DRC backend tasks. I believe
they use a combination of Hercules, Excalibre and Apollo.
My block P&R'ed into a rectangle measuring 2471.38 um by 3019.83 um and the
final utilisation figure from the tool was 93.85%. So obviously, Design
Compiler gave Magma a bloated margin in synthesis. This bloat can be
attributed to many reasons: DC can't be constrained with the size of the
block, nor the macro placement, nor the number of metal layers, nor the
configuration of the power mesh, and so on. We've just learned to factor
this into the floorplan from day 1.
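As a quick check on the block figures quoted above, the following Python sketch multiplies out the rectangle and applies the utilisation figure. It assumes that "utilisation" means occupied cell and macro area divided by total block area, which is the usual definition but is not spelled out in the letter.

```python
# Figures quoted in the letter above.
width_um = 2471.38
height_um = 3019.83
utilisation = 0.9385  # 93.85%

# Total area of the placed-and-routed rectangle.
block_area_um2 = width_um * height_um

# Assumption: utilisation = occupied cell/macro area / block area.
occupied_area_um2 = block_area_um2 * utilisation
free_area_um2 = block_area_um2 - occupied_area_um2

print(f"block area   : {block_area_um2 / 1e6:.2f} million um^2")
print(f"occupied area: {occupied_area_um2 / 1e6:.2f} million um^2")
print(f"free area    : {free_area_um2 / 1e6:.2f} million um^2")
```

The computed block area is consistent with the roughly 7.4 million square microns mentioned earlier in the same letter.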
In terms of the problem blocks I personally haven't compared the results
against PhysOpt, but I'm sure the same problems exist there, too. Sometimes
we just try to jam too many cells and macros into too small or too irregular
a footprint. Considering some of the blocks I've seen go through and timed
after routing in Magma, I think a few of the negative comments made about
the tool are plainly untrue. Very large and very fast designs are feasible.
On a slightly separate note, it's disappointing that you would be so quick to
dismiss what I consider a truly "new entrant" in the field. You must
respect them for what they are trying to accomplish. It takes a very long
time and lots of cash to develop such a large and complex application.
Magma's approach is different enough from PhysOpt's that it should be
allowed to continue competing. Who wants an industry where the market
leader kills all competition before they even have a chance to get off the
ground? Yeah, $87 mil is a lot, but maybe it takes $200 mil to properly
develop and grow an EDA tool from scratch? The tool is good, the support
we're getting is great, so maybe they're losing a ton of cash but, as an
engineer, I don't care one little bit.
- Morrie Berglas
PowerVR Technologies Kings Langley, UK
---- ---- ---- ---- ---- ---- ----
From: John Dyer <jdyer@qthink.com>
John,
We are a tool, foundry and IP independent design services company currently
doing place and route work with Silicon Ensemble, IC Craftsman, Apollo and
Blast Fusion 2.0. We have ~50 people. Although our company has completed
a number of large designs, the largest is over 6M gates, the 7 tape-outs
that we have done with Blast Fusion have all been small designs (5 K to
70 K gates).
For all of our designs, we used conventional synthesis tools (Synopsys or
Ambit.) For 4 of them we got RTL or earlier handoffs and did the synthesis
ourselves. For the remaining 3 of them we got netlist handoffs. So in all
cases we used Magma for gates-to-placed gates, not RTL synthesis.
We used Calibre for physical verification on all of these designs.
All of these designs were digital control logic blocks for mixed signal
chips except for one 70 K gate design which was a purely digital chip.
Obviously with new software there were some bugs; however, we were able to
complete all of these designs without any help from Magma R&D. We did have
to do our own workarounds for antenna fixes, tho.
We have also been looking at PKS and Physical Compiler, but have not reached
any conclusions.
- John Dyer
QThink San Diego, CA
---- ---- ---- ---- ---- ---- ----
From: Gerry Atterbury <gatterbu@fmi.fujitsu.com>
John,
We have taped out a total of five chips using Blast Fusion.
The designs were all closely cooperative design tasks between Fujitsu and
Magma. Magma had roughly 1 to 3 people working on our chips and we had
about the same number. Whatever they had with people, we matched. We were
basically working together on these tape-outs.
We used Synopsys Design Compiler for our RTL synthesis on these chips. Our
Magma starting point on these chips was a gate level netlist. Our end point
was GDSII.
We actually used the Magma tools for DRC and LVS. During our original eval
of BlastFusion we had carried out a very careful correlation of the physical
verification results of Magma DRC with Cadence's Dracula. After we worked
through some minor mismatches, we were satisfied that Magma's DRC was
equivalent to Cadence Dracula.
The issues we encountered using Blast Fusion were:
1. Power router causing DRC and signal shorts. This was from overlapping
VDD and VSS power rings for macros and core ring. Magma fixed it in
release 1.0.
2. Detailed placement problems. The placer was placing some standard
cells outside the core area, between Macro and pad cells. This
problem occurred in beta version and was fixed in version 1.0.
3. Clock routing bug. This was a core dump in the beta version and was fixed
in version 1.0.
4. The horizontal power strap colliding with standard cell power line.
This was fixed in version 1.0.
5. Runaway capacitance bug. This was due to low gain cells in 0.25um
library. See Appendix B for more information. This problem occurred
in beta version and was fixed in version 1.0.
6. Problem in the detailed placer where the placer was running forever.
This problem occurred in beta version and was fixed in version 1.0.
7. Crash in Mantle in 'run gate buffer load'. This was a bug in timing
analysis. This problem occurred in beta version and was fixed in
version 1.0.
8. 'recover_rising' arc on cells are named as SETUP tests in TLF. The
tlf2magma conversion was, as a result, replacing the 'recover_rising'
arc by setup arc and causing data-to-data timing tests to arise.
Magma tool was giving error message in optimization. This problem
occurred in beta version and was fixed in version 1.0.
9. Hold time:
1. The tool was incorrectly reading setup time as hold time
from the TLF.
2. Optimization for hold time check needed some tune-up, since
it was not reporting all hold time violations.
3. Simultaneous min/max analysis had a problem. This problem
occurred in version 1.0 and was fixed in version 1.1.
I realize these are mostly 1.0 bugs. The newer versions of Blast Fusion
don't have nearly as many bugs as 1.0 had.
What we liked about Magma was that it closed timing with no iterations
on a chip that had taken 5 iterations with a competitor's tools.
- Gerry Atterbury
Fujitsu Microelectronics San Jose, CA
---- ---- ---- ---- ---- ---- ----
From: Hung Hua <hung@signetdesign.com>
Hi, John,
We have used Magma to benchmark several designs that we taped out previously
with other tools. We tried PhysOpt from Synopsys. We also evaluated PKS
from Cadence and Saturn from Avanti. With PhysOpt we found it:
- deals only with placement and hence may require more iterations to
achieve timing closure.
- has serious capacity problems (on more than 150 K gates).
- the placement output by PhysOpt was not routable using Silicon/Ensemble
or Avanti tools.
So far the only design that may be counted as a Magma tape-out is a 100 K
gate block design. The block is going to be part of a tapeout of a big chip
that has lots of on-chip memories. It can be viewed as a hard IP delivered
to be integrated on the big chip.
Our engineer took the design through Blast Fusion. Reddy has learned and
used the tool for several benchmarks of his previous designs before doing
this tape-out. We also have other Magma users in-house working together
with Reddy.
The design has 24 K standard cells (approximately 100K nand2 equivalent
gates). The design also contains 4 embedded memories. Physically, the
memories occupied about 30% of the area of the block. In this case, Blast
Fusion was used to optimize a given structural netlist to obtain:
1.) Timing closure, but no focus on timing improvement.
2.) Clock tree design and skew balancing.
The block was designed to run at 100 Mhz as a whole. The block has been
delivered (as hard IP) for chip integration. The chip is planned to tapeout
the end of this month. The chip will be fabbed with TSMC 0.18.
We used Blast Fusion in a gates-to-placed-gates mode. Synopsys DC was used
to go from RTL to gates. We used Mentor Calibre for physical verification.
We found the Blast Fusion power router to be flaky. Had to fix some of the
power manually. Also the current version of the tool does not allow us to
finish the power completely before the routing.
Timing correlation between Blast Fusion and PrimeTime had to be done on some
paths manually since they each break the timing loops differently.
We liked Magma's ability to achieve timing closure easily. Blast Fusion
takes timing into account every step of the Place and Route process. It
does every thing it can to fix the timing as more detailed parasitics become
available. This is a big plus for us since we would go through 4 to 6
iterations to fix transition and setup violations which popped up with
the actual parasitics.
- Hung Hua
Signet Design Solutions, Inc. Austin, TX
---- ---- ---- ---- ---- ---- ----
From: Hans-Olov Eriksson <Hans-Olov.Eriksson@era.ericsson.se>
Hi John,
During February & March this year we did an eval of Blast Fusion 2.1. Most
of the work was carried out by Magma FAE's in UK (we know those guys as they
used to work for Avanti Europe.) We gave them a netlist + some floorplan
info. In parallel, people in my group worked on the same design using our
standard layout flow based on Apollo/Saturn.
Our actual design was a transcoder ASIC consisting of 8 identical DSP cores
plus I/O blocks plus memory. The DSP core is an Ericsson in-house design
and was used as testcase for the Blast Fusion evaluation. Size of the DSP
core is 160 K gate. The total chip size is 1.5 M gates plus 5 M-bit SRAM.
We had two options for transcoder ASIC, use 16 transcoder ASICs on the board
each running at 125 MHz or go for 8 ASIC's running at 250 Mhz. It was with
the latter track we went to Magma and asked them if they could close timing
on 250 Mhz. We have previously had problems with our Avanti flow to reach
250 Mhz in a reasonable amount of time.
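The two board options trade chip count against clock rate. A small Python sketch of the arithmetic, assuming (this is not stated in the letter) that board throughput scales as chip count times clock frequency:

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

# The two options described in the letter: (number of ASICs, clock in MHz).
options = {
    "16 ASICs @ 125 MHz": (16, 125.0),
    "8 ASICs @ 250 MHz": (8, 250.0),
}

for name, (chips, mhz) in options.items():
    # Assumption: throughput is proportional to chips * clock rate.
    aggregate = chips * mhz
    print(f"{name}: period {period_ns(mhz):.1f} ns, aggregate {aggregate:.0f} chip-MHz")
```

Under that assumption both options deliver the same aggregate rate; the faster option halves the chip count but also halves the timing budget per cycle, which is why timing closure at 250 MHz was the open question.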
We use TI as our vendor. For the 250 Mhz version, TI's SR40 technology
(high performance lib, 0.13 um) was considered. We finally decided to go
for the 125 Mhz version (because of lower risk) and we're using TI's GS40
(low power lib , 0.13 um). We're still designing the chip, tape-out is
planned for November this year.
We use Synopsys DC and Module Compiler for the RTL part of this design. We
looked at Ambit 18 months ago, but at that time Ambit didn't have a datapath
compiler.
Our Magma evaluation went fine. Initial goals ( 250 Mhz after extraction in
less than two months) were met. In my opinion, Blast Fusion seems to be a
very good block-level tool well suited for "flat" time critical blocks like
DSP cores. Don't know if it's the best hierarchical chip level assembly
tool. I heard most people are using it for block design like ours. On this
evaluation we didn't focus on area and that could be the reason the achieved
size was not so impressive.
Big positive with Magma. Their evaluation goals were fulfilled 100%. The
reason we decided not to go for Blast Fusion and Magma was due to changed
project plans (focus on low risk because of economic slowdown in the telecom
market.) Their support people in UK are very competent and focused.
Impressive overall road-map. Magma seems to be the only EDA vendor to
provide a complete RTL-GDSII flow that includes synthesis, clock tree
synthesis and SI aware P&R.
Haven't tried Magma RTL synthesis though.
Usually, our back-end activities take place on-site at Ericsson in close
cooperation with our DSP design team. With TI, they run LVS/DRC for us
using their own internal tool. With VLSI, we used to run XCalibre/Calibre.
Now with Philips, they run Hercules for us. We only do physical design on
time critical blocks like DSP cores. In my group, we provide DSP cores to
many design units within Ericsson and we always deliver as "hard" macros.
The ASIC vendors are responsible for the top level assembly and that
includes transistor level DRC/LVS. We always run the "basic" cell level
LVS/DRC built into Apollo and it seems most design rule errors are found
at that level.
Last year we spent about 6 months on an eval of PhysOpt. We never closed
timing on that tool, the interface to Avanti was very complex and we could
only run the tool with on-site support from a Synopsys FAE.
In my opinion, Saturn is a good physical synthesis tool. We've used it in
all our tape-outs since 1998.
- Hans-Olov Eriksson
Ericsson Radio System AB Stockholm, Sweden
---- ---- ---- ---- ---- ---- ----
From: [ Been There, Done That ]
John, I must be anon.
Magma problems:
Power router gets confused with non preferred preroutes. Lots of tcl
required to make sure vias are dropped in the right place. In an
unconventional design with io's and memories straddling the core
boundary, much time can be spent eliminating real and verifying false
power preroute opens and shorts.
Detail router doesn't always find a solution for off grid closely spaced
pins on macros. Similarly, the global router may see a macro pin as
accessible while the detail router says it is inaccessible.
Clock splitcells impede drc and antenna resolution. A clock splitcell
enforces an htree implementation for the clock tree. Since the clock
router places it instead of the detail router, a poor placement can make
drc or antenna violations difficult to eliminate. This results in
manually moving some of the splitcells to get drc/antenna clean.
Tie hi/lo pins handled by power router instead of detail router. The
power router was really built for meshes and rings and does not always
find the appropriate solution for tying a macro pin to power or ground.
ECO flow is not well tested for corner cases. The spare cell metal only
eco flow works very well, but macro pin changes or tie hi/lo pin changes
are difficult to implement.
GUI crashes more than it should, but that is not a major problem since
most long runs are done with batch scripts.
Magma strengths:
Fixes slew/cap/fanout violations flawlessly. Has separate buffering
commands for trees (>1000 pin nets), long wires, and large capacitance.
Running the same design on Saturn and Magma resulted in 30K slew
violations with Saturn and 6 slew violations with Magma (all 6 were due
to a placement blockage which prevented repeater insertion).
Powerful hold time fixing looks to add buffer at start point first, end
point next, and then every point in between. Minimizes number of buffers
added without damaging setup time.
Accurate RC extraction and delay calculation. RC extraction compares
well with quickcap. Ceff, slew degradation, and wire delay calculations
compare more closely to SPICE than PrimeTime does when looking at high
fanout nets with long wires.
Automatic macro placement gives the user an excellent starting point for
minimizing global wire length and creating a routable design. It is
based on force driven placement instead of quadratic placement so large
blocks can be moved simultaneously with standard cells in order to find
the minimum global wire length while eliminating cell overlap.
Incrementally improving accuracy of routing models allows for appropriate
optimization decisions to be made throughout the flow. Initial
optimization is done with manhattan mode. Additional optimization is
done with a global route mode which understands detours as well as
estimated lateral capacitance according to the number of used tracks in
each small region.
Tcl interface to data model allows tremendous easy-to-use flexibility for
adding functionality to the tool. Some examples include a simple lef pin
abstraction of a block, design rule adherence on block boundaries by
placing buffers on primary pins, river routing of busses, modifying the
via structure in the power mesh, and extracting bond pad center
coordinates. Much easier to use than any lisp based language (e.g. skill
or scheme). Fully integrated static timing, placement, routing, clock
tree, optimization, extraction under one data model allows for tremendous
flexibility.
The noise avoidance during track routing through timing window and
crosstalk analysis is certainly powerful but I'm reserving judgement
until further investigation and correlation with Cadmos.
Useful skew during clock routing helps setup time without creating hold
time problems. This is extremely powerful on high performance designs.
Antenna fixing understands diffusion area related ratios and combines
global route, detail route, and diode insertion to converge on an antenna
clean solution.
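The hold-fixing order described in the strengths list (start point first, end point next, then every point in between) can be illustrated with a toy Python sketch. This is not Magma's algorithm; the function names, the pin names and the rule "accept the first point whose setup slack can absorb the added delay" are all assumptions made for this example.

```python
def candidate_order(path):
    """Return insertion points: start first, end next, then the interior."""
    if len(path) <= 1:
        return list(path)
    return [path[0], path[-1], *path[1:-1]]


def pick_insertion_point(path, setup_slack_ns, needed_delay_ns):
    """Pick the first candidate whose setup slack can absorb the added delay.

    Returns None when no point on the path has enough setup slack.
    """
    for node in candidate_order(path):
        if setup_slack_ns[node] >= needed_delay_ns:
            return node
    return None


# Hypothetical path and per-point setup slack (ns).
path = ["ff1/Q", "u1/Z", "u2/Z", "ff2/D"]
setup_slack_ns = {"ff1/Q": 0.05, "u1/Z": 0.40, "u2/Z": 0.30, "ff2/D": 0.25}

print(candidate_order(path))
print(pick_insertion_point(path, setup_slack_ns, needed_delay_ns=0.20))
print(pick_insertion_point(path, setup_slack_ns, needed_delay_ns=0.35))
```

Trying the endpoints first means one buffer can repair every hold path through that endpoint, which is one plausible reason such an ordering keeps the buffer count down.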
Experiences with other physical synthesis tools:
Avanti has no good solution for fixing slew violations due to RC delay
except to run the clock tree synthesis tool on long wires. Saturn does
not address these slew violations.
Avanti does not handle hold time fixing very well. Saturn adds too many
buffers and doesn't always fix every path due to miscorrelation or some
other tool problem.
PhysOpt is based on the dc engine instead of PrimeTime. This is a fairly
basic delay calculator built for wire load models and not for
placed cells with high fanouts with different length wires on each
fanout. It does not include effective capacitance, slew degradation, or
accurate RC delay. Thus it is a poor tool for repeater insertion or long
wire slew violations.
PhysOpt does not correlate with the detours which are made by the backend
global router. Thus large loads and long wires can show up at the end of
the flow which PhysOpt did not fix since it did not make the same detour.
Cadence PBOPT is limited to only buffering and sizing and relies on elmore
delay calculation. A fairly basic tool which should be obsoleted by PKS if
they can ever get it to work.
- [ Been There, Done That ]
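The detour miscorrelation this user describes can be seen with a small Python sketch that compares a point-to-point Manhattan estimate with the length of a route that has to detour. The coordinates and the detour itself are invented for the illustration, and real tools model far more than wire length.

```python
def manhattan_um(a, b):
    """Manhattan (rectilinear) distance between two (x, y) points in um."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def route_length_um(points):
    """Total length of a rectilinear route given as a list of corner points."""
    return sum(manhattan_um(p, q) for p, q in zip(points, points[1:]))


# Hypothetical driver and sink locations (um).
driver, sink = (0.0, 0.0), (300.0, 400.0)

# Early estimate: straight Manhattan distance between the two pins.
estimate_um = manhattan_um(driver, sink)

# Hypothetical routed path that climbs past the sink to get around a blockage.
detoured_um = route_length_um([driver, (0.0, 500.0), (300.0, 500.0), sink])

print(f"Manhattan estimate: {estimate_um:.0f} um")
print(f"detoured route    : {detoured_um:.0f} um")
print(f"underestimate     : {detoured_um - estimate_um:.0f} um")
```

Any optimizer that sized the driver for the shorter estimate would see extra load and delay once the detoured route is extracted.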
---- ---- ---- ---- ---- ---- ----
From: Hiroaki Maruyama <h-maruyama@pi.jp.nec.com>
Hello, John Cooley,
We've taped out using Blast Fusion v2.0. And now we are using Blast Fusion
v2.1 or later beta revision. We don't use Blast Chip except for trial.
Basically, our NEC internal team ran Blast Fusion, although we often needed
Magma's help.
We used Cadence Ambit for RTL synthesis, even though this is not a standard
tool for us. And we use Magma for gate-level-to-GDSII.
For LVS/DRC, we mainly use Mentor.
In Blast Fusion, we found a lot of bugs and many differences with NEC
sign-off rules. Most of these are related to routing issues. Routing
issues are very heavy to implement in a design, but are very important to
reduce design iterations. Magma can repair these bugs right away. But
we have raised various requests related to routing.
Blast Fusion needs very long runtime. Fortunately, our first test design
was very small, but I wonder whether or not Magma can handle huge designs.
Magma says that Blast Fusion v3.0 can handle....
What I like about Magma is that it has true single database from RTL to GDS.
Magma can calculate delay with same engine during synthesis, place, CTS, and
route. Especially, Magma can handle the delay of both clock and signal
simultaneously. Therefore, we can design with less margins.
NEC and NMS (NEC MicroSystems) are using other physical synth tools. We use
Blast Fusion for only a few particular designs now. We can not compare these
tools with same design, same time, same human resources, and so on.
However, as the result of our many trials, I think that Magma competes only
with Avanti, currently -- because, the results of PhysOpt and PKS depend on
CTS and routing tools.
- Hiroaki Maruyama
NEC Tokyo, Japan
---- ---- ---- ---- ---- ---- ----
From: Francis Larochelle <francis@ti.com>
John,
We've been successful in using Magma Blast Fusion (v2.1) at our ASIC design
centers. (Enclosed is our data on 8 Blast Fusion tapeouts.) In the first
half of this year, we have transitioned 100% of new block designs to Magma
Blast Fusion and are now transitioning top-level design as well. (The first
design is almost complete).
We have also successfully closed timing on additional evaluation designs
that include cores in the 250 MHz-300 MHz range and top level designs of up
to 2M gates.
Magma gives us one full time support guy on-site 2-3 days a week here at TI.
In addition, they also give us 2 to 3 others as part time local support.
We used Blast Fusion only in a gates-to-placed-gates mode. We used Design
Compiler for all the RTL-to-gates synthesis in these tape-outs.
To answer your last question on what DRC and LVS tool we are using:
We are using Magma's built-in layout verification capabilities first,
but once the Magma database is 100% DRC/antenna clean, we re-verify
that with our signoff (TI internal/K2) layout verification flow. We
have had a few issues where the signoff flow has found additional
errors but mostly the Magma inbuilt layout verification is correlating
well with our signoff layout verification flow.
I compiled specific information on issues we have had with running Magma
over the last year. We have found more or less efficient ways to deal
with them....
* Handling Multiple Modes
-----------------------
We have had some challenges on automating a flow which fixes all hold
violations for both mission and test modes. We typically put the
design in mission mode and do hold fixing. In many cases this fixes
all the hold violations in test mode as well. However, in some
circumstances there are additional test mode hold violations which
must be addressed in Blast Fusion.
It is dangerous to put the design in test mode and do hold fixing
because the tool can not see the mission mode constraints and would
put hold buffers in places that would cause the mission mode setup
times to fail. We normally have to do some manual / interactive
fixing of these "leftover" test mode violations (violations that remain
after mission mode hold time fixing). Magma has developed a new
capability in which the mission and test modes can be optimized
simultaneously by propagating all clocks through the timing graph at one
time. We are anxious to try this out in the near future. However, we do
realize that this capability will require some additional investment in
identifying false paths which are created by this overlapping clock
scenario (i.e. paths launched by the mission mode clock and captured at
scan data pins; paths launched by the test clock and captured by normal
data pins.)
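The risk with single-mode hold fixing can be shown with a simplified Python sketch. It treats a hold fix as a delay that must be at least the hold violation but no larger than the smallest setup slack among the modes the tool can see. The function name and all numbers are hypothetical, and this is a model of the constraint, not of Blast Fusion's behavior.

```python
def hold_fix_window(hold_violation_ns, setup_slack_by_mode_ns):
    """Return (min_delay, max_delay) for a hold buffer, or None if none fits.

    min_delay: delay needed to clear the hold violation.
    max_delay: the tightest setup slack over all visible modes.
    """
    min_delay = hold_violation_ns
    max_delay = min(setup_slack_by_mode_ns.values())
    if min_delay > max_delay:
        return None
    return (min_delay, max_delay)


# A 0.30 ns test-mode hold violation on a path shared with mission mode.
test_only = hold_fix_window(0.30, {"test": 5.0})
both_tight = hold_fix_window(0.30, {"test": 5.0, "mission": 0.10})
both_loose = hold_fix_window(0.30, {"test": 5.0, "mission": 0.50})

print("test mode only      :", test_only)
print("both modes, tight   :", both_tight)
print("both modes, relaxed :", both_loose)
```

With only test mode visible, any delay up to 5.0 ns looks legal, including values that would break a mission mode path with 0.10 ns of setup slack. Seeing both modes at once either narrows the window or shows that this insertion point cannot be used.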
* High Fanout Nets
----------------
We had run into some problems with high fanout nets which
were not constrained by timing such as reset and enable lines.
This would cause the tool to build long chains of weak buffers
which resulted in some very unreasonable delays. (Sometimes
as high as 70-80 nsec.) We found that this problem could be avoided
by defining these signals as clocks. This forced the Magma clock router
to work on these nets, which builds a much better tree.
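To see why a chain of buffers on a high fanout net is so much slower than a balanced tree, here is a rough Python sketch that counts buffer stages for each structure. The sink count, fanout limit and per-stage delay are invented numbers, and the chain model (each buffer drives some sinks plus the next buffer) is an assumption about what a "long chain of weak buffers" looks like.

```python
def chain_stages(sinks, max_fanout):
    """Stages in a daisy chain where each buffer drives (max_fanout - 1)
    sinks plus the next buffer in the chain."""
    per_stage = max_fanout - 1
    return -(-(sinks - 1) // per_stage)  # ceiling division


def tree_depth(sinks, max_fanout):
    """Levels in a balanced tree where every buffer drives max_fanout loads."""
    depth, reach = 0, 1
    while reach < sinks:
        reach *= max_fanout
        depth += 1
    return depth


# Hypothetical reset net: 5000 sinks, fanout limit of 10, 0.1 ns per stage.
sinks, max_fanout, stage_delay_ns = 5000, 10, 0.1

chain = chain_stages(sinks, max_fanout)
tree = tree_depth(sinks, max_fanout)

print(f"chain: {chain} stages, about {chain * stage_delay_ns:.1f} ns to the last sink")
print(f"tree : {tree} levels, about {tree * stage_delay_ns:.1f} ns to every sink")
```

The chain depth grows linearly with the number of sinks while the tree depth grows logarithmically, which is the kind of gap a clock-style tree builder closes.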
* Global Route vs Detailed Route Timing
-------------------------------------
In a few cases we had encountered timing surprises after detailed
routing due to the global route timing estimates being too conservative.
By default the tool is more pessimistic at global route to prevent
setup time surprises. However in certain situations a timing speedup
at final route can cause setup time failures too. (This happens on
IO-to-register paths when the IO clock timing is fixed and the on-chip
clock insertion delay speeds up significantly.)
Of course, the speedup can cause hold time failures as well. We normally
can fix this with some post routing ECOs. Magma does have a capability
to use congestion information to better tune the global route estimations.
We have used this with some success, however greater margins during
optimization need to be used since some of the tool's built in margin
is lost.
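
The IO-to-register case can be worked through with a simplified setup
slack formula. This is a sketch under stated assumptions: a single-cycle
path, a fixed IO clock, and hypothetical delay values in nsec.

```python
def input_setup_slack(period, io_clock_latency, input_delay,
                      data_delay, capture_latency, setup):
    """Setup slack of an input-to-register path.

    The data arrival is tied to the fixed IO clock, while the required
    time moves with the on-chip clock insertion delay."""
    arrival = io_clock_latency + input_delay + data_delay
    required = period + capture_latency - setup
    return required - arrival


# Hypothetical numbers: only the on-chip insertion delay changes
# between the global route estimate and the detailed route result.
common = dict(period=10.0, io_clock_latency=0.0, input_delay=7.0,
              data_delay=3.5, setup=0.5)
at_global_route = input_setup_slack(capture_latency=1.5, **common)
at_detailed_route = input_setup_slack(capture_latency=0.75, **common)

print("slack at global route:  ", at_global_route)
print("slack at detailed route:", at_detailed_route)
```

With these numbers the path passes with the global route estimate and
fails once the clock insertion delay speeds up, even though the data path
itself did not change.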
* IO Timing Constraints
---------------------
In working with full-chip designs, we originally had some difficulties
in meeting the IO timing constraints. We found that the normal placement
algorithms weren't sufficient when the IO timing constraints were tight.
In response to our difficulties, Magma developed a coning algorithm which
can be activated to pull the cone of logic up to the first register close
to its associated IO cell. We have found this capability to be quite
successful in handling these types of paths.
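
For readers unfamiliar with the term, the "cone of logic up to the first
register" can be found with a simple graph walk. The sketch below is not
Magma's algorithm; it only shows which cells such a cone contains, using a
hypothetical netlist held as a fanout dictionary.

```python
from collections import deque


def io_cone(fanout, registers, io_port):
    """Walk forward from io_port and stop at the first registers.

    Returns (cone_cells, first_registers): the combinational cells
    between the port and the first register stage, and those registers.
    """
    cone_cells, first_registers = [], []
    seen = {io_port}
    queue = deque([io_port])
    while queue:
        node = queue.popleft()
        for nxt in fanout.get(node, []):
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in registers:
                first_registers.append(nxt)  # do not walk past a register
            else:
                cone_cells.append(nxt)
                queue.append(nxt)
    return cone_cells, first_registers


# Hypothetical netlist: in_a -> u1 -> {u2 -> r2, r1 -> u3 -> r3}
fanout = {"in_a": ["u1"], "u1": ["u2", "r1"], "u2": ["r2"],
          "r1": ["u3"], "u3": ["r3"]}
registers = {"r1", "r2", "r3"}

cells, regs = io_cone(fanout, registers, "in_a")
print("cells to place near the IO cell:", cells + regs)
```

Logic behind the first register stage (u3 and r3 here) is left out of the
cone, so only the cells on the IO timing path would be pulled toward the pad.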
* Interpretation of Constraints vs PrimeTime
------------------------------------------
For the most part, the interpretation of constraints between Magma and
PrimeTime has correlated pretty well. We have found a few differences
and have published a best practices guide to our users to help them
avoid these situations. One such example was the way the two tools
handled the latencies for the clocks used at the IOs. (The clocks
which determined the arrival times at the inputs and the required
times at the outputs). When the clocks were calculated in propagated
mode and a non-virtual clock was used, PrimeTime would use a latency of
zero, whereas Magma would use the original latency defined for ideal
mode timing. This often caused reported timing failures in PrimeTime
when doing STA with back-annotated SDF. We found the problem could
be avoided by always using virtual clocks to define the IO timing.
This allowed the IO clock latencies to be controlled separately from
the on-chip clock latencies and made the two tools behave the same.
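
The effect of the two latency interpretations can be seen on a
register-to-output path with a simplified slack formula. This is a sketch
that assumes only the behavior described above (an IO clock latency of
zero versus the ideal-mode latency); the delay values are hypothetical.

```python
def output_setup_slack(period, io_clock_latency, output_delay,
                       launch_latency, data_delay):
    """Setup slack of a register-to-output path.

    The launch edge uses the propagated on-chip clock latency; the
    required time at the port is referenced to the IO clock."""
    arrival = launch_latency + data_delay
    required = period + io_clock_latency - output_delay
    return required - arrival


# Hypothetical values in nsec. The IO clock was given a 2.0 ns latency
# for ideal-mode timing; the on-chip clock propagates to 2.0 ns too.
common = dict(period=10.0, output_delay=5.5,
              launch_latency=2.0, data_delay=4.0)
ideal_latency_kept = output_setup_slack(io_clock_latency=2.0, **common)
latency_zeroed = output_setup_slack(io_clock_latency=0.0, **common)

print("IO clock keeps ideal latency:", ideal_latency_kept)
print("IO clock latency taken as 0: ", latency_zeroed)
```

The same path passes under one interpretation and fails under the other.
Giving the IO timing its own virtual clock means its latency is stated
explicitly, so both calculations start from the same number.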
TI is using several other physical synthesis tools w/ good success (Synopsys
PhysOpt, Silicon Perspective First Encounter, Cadence PKS). However, we
believe that presently Magma Blast Fusion best meets our objective of
decentralizing complete/final physical design at our world wide ASIC design
centers.
- Francis Larochelle
Texas Instruments ASIC Dallas, TX
---- ---- ---- ---- ---- ---- ----
From: Marco Montalti <Marco.Montalti@st.com>
Hi, John,
Our engineer went to the Magma training class for a week in November. After
that we were supported by a Magma AE from the UK. The AE flew in every
2 weeks mostly, but there were times when he was here much more frequently.
When we ran into a Magma bug, he'd e-mail it to California. Overnight we'd
download a new version of Blast Chip that had the bug fixed. It was
quite good. The UK guy was quite skilled. We believed that if our engineer
couldn't use Blast Chip, it would be useless.
Our current production design flow is PhysOpt. We used a design that we
ran through PhysOpt as our test case. We took that design and ran it
through Magma Blast Chip 2.1, Cadence PKS, and Monterey Dolphin. Magma
gave the overall best results in terms of usability, execution time,
resource utilization such as # of CPUs, RAM and disk space, interop with
other tools, capability to import and export data at the different design
stages, and timing results. One candidate was not able to complete
the design (I can't say who.)
For the time being, we don't intend to replace the PhysOpt flow because it
is very stable. Our intention is to use Magma as a parallel solution. We
believe it is more automated than PhysOpt. With PhysOpt you must run
CTS, detailed routing, and after that, physical optimization. With Magma
you feed it a gate level netlist & timing constraints and all those steps
are executed with no more human intervention. Magma is basically CPU time.
For RTL, we use Synopsys Design Compiler, not Magma synthesis. We also
placed the blocks of our design using Synopsys Chip Architect. For
sign-off, we used Simplex for extraction, PrimeTime, and Calibre.
We liked Magma design implementation. It concurrently cares about placement,
routing, timing closure, signal integrity, net loads, parasitic extraction and
all the possible checks. We liked the ability to access a batch run remotely
to monitor the status of a job, and Magma's debugging capability at the
DRC and LVS stages.
We have also seen Magma's beta code that will introduce real hierarchical
capability and we judge it one of the most powerful and usable we have
seen so far.
We only had one show stopping bug on the management of the combination of
one hard block and timing constraints between the rest of the logic and
the block itself. The problem was solved in one week. We also had one
problem related to the parasitic extraction, quite quickly identified. The
solution still has to be verified.
There is another problem in the DEF writing currently resolved through a
work around.
In general, we have been favorably impressed because we expected more
problems from a tool this young.
We expect Magma will be an easy integration into our design flow that is
going to happen in the next month.
- Marco Montalti
STMicroelectronics Agrate Brianza, Italy
---- ---- ---- ---- ---- ---- ----
From: Paul Pontin <Paul.Pontin@3dlabs.com>
John,
We started looking at physical synthesis tools around May 1999. I talked to
Monterey and Magma at that time. Magma was the most willing to engage with
us. We checked out Synopsys some time later, but by then we were well
advanced with Magma.
As you know I am very pro Magma, it is probably worth giving you a bit of
background.
3Dlabs was moving from an ASIC flow to a COT flow. As such, we had no
legacy layout tools. Magma fitted the bill perfectly, it was a one stop
shop. The only other tools I needed were for LVS and DRC.
If I had already invested in Avanti or Cadence tools, then I may well have
leant more towards a PhysOpt type of product. PhysOpt would have fitted
in with a flow I was used to. However these flows are not unified, you are
in and out of different tools, each doing its own piece, till eventually you
end up with GDSII. As an engineer I find the purity of the Magma flow very
reassuring, you can tell when something is intrinsically right.
Magma worked very closely with us and we have a great tie in with their R&D
group. This is such a bonus when you are working on large complex chips
such as ours. I also feel that we have some steer into the direction that
the tools are developing. Magma really is listening to what the industry
wants. I just hope that they stay that way as they grow bigger.
Magma gave me a simple scripted flow that could get me from a netlist to
finished GDSII quicker, and with more predictable timing results, than
anything else we had come across.
The flow is set up with early trial netlists. Once this has been done the
GDSII can be generated very quickly (around 1 week for 300K placeable
instances). We simply re-run the scripts. This gives a huge time to market
advantage.
In general Magma S/W is very good, the problems we have fall into 3 groups:
Third party suppliers of RAMs, Standard cells, I/O's, foundry services, etc
have not done any QA with the Magma tools. We therefore come across some
compatibility issues that need sorting out. Not show stoppers, but annoying
none the less. The greater number of Avanti and Cadence seats means that
such issues are usually ironed out before the end user gets involved.
The interactive GUI is still a little flaky. This is a minor irritation
for us as we script everything in TCL, however it would be nice to have it
fixed.
The tool is still under constant development, new features are being added
all the time. My dilemma is which release of Magma to use for the duration
of a project. We have found it best to develop the scripts on one release
and stay with it, even if some new sexy feature is available in a later
release. However this is a familiar problem to most engineering managers.
Magma's QA seems robust though.
Hercules was used for our tapeout, mainly for logistical reasons. However
since then I have purchased Calibre.
Our flow is VHDL RTL and we use Synopsys DC to get us to an outgoing
netlist. The Magma tools handle the netlist from that point on. We have
no IPO, ECO type loops. All optimization for timing closure is done by
Blast Fusion.
- Paul Pontin
3Dlabs Egham, Surrey, UK
---- ---- ---- ---- ---- ---- ----
From: [ I Wear My Sun Glasses At Night ]
Hi John,
Anon please.
We have now run around 20 blocks of 100-300k instances through Magma to
extracted timing, DRC and LVS clean from 4 chips. I don't know if you
count these as tapeouts as only one set went together to make a full chip.
We used Avanti Hercules for verification.
We have found that Magma treat any failure to achieve a fully placed, routed
and timed design as a bug -- unless we agree that it's too small or fast. We
have had problems with timing and congestion solved in this way. Took some
work before we got the tool to fix all antenna violations.
Of all the tools we have tried (Avanti/Saturn, PhysOpt, Magma, PKS), Magma
has given us best timing closure. i.e. it gets the fastest clock in a
given area.
PhysOpt is being run by a partner of ours so we are able to compare results.
On the 3 precise like-for-like test cases that we have tried (2 different
technologies), Magma came out best for timing. We have also found that run
times tend to be shorter. We only compare results if the design is
routable - no point otherwise.
We used Magma gates-to-placed-and-routed-gates. DC did our RTL synthesis.
- [ I Wear My Sun Glasses At Night ]