( ESNUG 374 Item 7 ) -------------------------------------------- [06/14/01]

Subject: The Magma "Blast Fusion" Customer Tape-out Listing Before DAC

NO FOOLISH CONSISTENCY: Last week, after I announced that I was doing a
Magma customer tape-out count, 4 Magma users contacted me directly and the
management at Magma gave me a list of 12 customers who had done multiple
tape-outs.  I told Magma management then that I was going to count their
customers' tape-outs the *exact* same way I did last December as outlined
in my Tape-out FAQ at http://www.deepchip.com/news/tapeoutFAQ.html

I lied.

Why?  Because I had sent those 16 Magma users a fairly detailed
questionnaire and found the following running themes.

  - All of them used Magma in a gates-to-placed-gates mode.  That is, not
    one was using Magma's RTL synthesis.  Instead, they were mostly using
    Synopsys Design Compiler with a few Cadence Ambit-RTL users thrown in.

  - All but 1 weren't using Magma's built in DRC/LVS capabilities.  Instead
    they were mostly using Mentor's Calibre + a few Avanti Hercules users.

  - The majority had prior bad experiences with Synopsys PhysOpt; and a
    few had disappointments with Cadence PKS and Monterey Dolphin.

  - All had 1 or more tape-outs using either Blast Fusion or Blast Chip.

OK, those are legitimate user experiences.  No problem here.  My problem was
that, on the phone, Scott Hamm of Vitesse told me he had a number of Magma
people on his site helping him.  In fact, most Magma users reported they had
used Magma tools with a *lot* of help from Magma support and R&D:

  "The designs were all closely cooperative design tasks between Fujitsu
   and Magma.  Magma had roughly 1 to 3 people working on our chips and we
   had about the same number.  Whatever they had with people, we matched.
   We were basically working together on these tape-outs."

       - Gerry Atterbury of Fujitsu Microelectronics

  "Our relationship with Magma is tight, but not with its R&D department,
   more the AE department.  During our intense P&R stage, we had a Magma
   FAE on site at least one day a week, and on the other days, the FAE
   would be running jobs for us."

       - Morrie Berglas of PowerVR Technologies

  "Magma gives us one full time support guy on-site 2-3 days a week here at
   TI.  In addition, they also give us 2 to 3 others as part time local
   support."

       - Francis Larochelle of Texas Instruments ASIC

  "Magma worked very closely with us and we have a great tie in with their
   R&D group.  This is such a bonus when you are working on large complex
   chips such as ours."

       - Paul Pontin of 3Dlabs

  "Most of the work was carried out by Magma FAE's in UK (we know those guys
   as they used to work for Avanti Europe.)  We gave them a netlist + some
   floorplan info.  In parallel, people in my group worked on the same
   design using our standard layout flow based on Apollo/Saturn."

       - Hans-Olov Eriksson of Ericsson Radio System AB

So, from what these users are saying, Magma was in taxicab mode with a lot
of customers.  "OK, such support is normal for new technologies, John.
What's the big deal?", you ask.  In my Tape-out FAQ 6 months ago I wrote:

  "Also, another non-tape-out is if the EDA vendor runs their tool for the
   customer (i.e. "taxicab mode") instead of the user running the physical
   synthesis tools themselves.  Taxicab mode happens a lot in evals.  It's
   interesting, but it's not a customer tape-out as far as I'm concerned."

       - from http://www.deepchip.com/news/tapeoutFAQ.html

Now if I keep my word and stick to that FAQ standard that I promised Magma
management that I would stick to, here's my Magma tape-out list:

  Magma Customer Tape-Outs

  Date     Size     Clock (Mhz) Company    Location              Fab/um
  -------------------------------------------------------------------------

   7/00   57 K gates     26     QThink     San Diego, CA      0.25 TSMC
   8/00  500 K gates    200     NEC        Tokyo, Japan       0.13 NEC
   8/00    7 K gates    480     QThink     San Diego, CA      0.25 TSMC
   8/00    5 K gates     50     QThink     San Diego, CA      0.25 TSMC
   9/00    5 K gates     24     QThink     San Diego, CA      0.25 TSMC

  10/00    5 K gates    100     QThink     San Diego, CA      0.25 TSMC
  12/00    7 K gates    480     QThink     San Diego, CA      0.18 TSMC
  12/00  740 K insts    125     IMG Tech   Kings Langley, UK  0.18 TSMC
  12/00   70 K gates    183     QThink     San Diego, CA      0.18 TSMC
   3/01  300 K gates    100     Broadcom   San Jose, CA       0.18 TSMC

   5/01  336 K insts     80     STMicro    Agrate, Italy      0.18 STMicro
   6/01   24 K insts    100     Signet     Austin, TX         0.18 TSMC
   6/01  536 K gates    120     Broadcom   San Jose, CA       0.18 TSMC

But when I look at this list, I'm bothered.  This gives Magma all of 13
tape-outs.  Aargh!  This isn't telling the whole Magma user story here.
Then I remembered one of my favorite quotes:

    "A foolish consistency is the hobgoblin of little minds."

         - Ralph Waldo Emerson in his "Self-Reliance" essay (1841)

Now I'm anticipating angry letters from Synopsys and Silicon Perspective.
They did really well in my tape-out count 6 months ago.  For this Magma
tape-out count, I'm ignoring my original Tape-out FAQ and I'm showing all
36 Magma customer tape-outs I found.  And I'm just going to report this
as a "listing" instead.  It's the right thing to do.

  Magma Customer Tape-Outs

  Date     Size     Clock (Mhz) Company    Location              Fab/um
  -------------------------------------------------------------------------

   8/99  100 K insts  100/200   Fujitsu    San Jose, CA       0.25 Fujitsu
  10/99  150 K insts  27/54/81  Fujitsu    San Jose, CA       0.25 Fujitsu
   6/00  100 K gates    110     TI         Dallas, TX         0.18 TI
   6/00  200 K gates    250     Vitesse    Col. Springs, CO   0.18 TSMC 1p6m
   7/00   57 K gates     26     QThink     San Diego, CA      0.25 TSMC

   7/00  100 K gates    125     TI         Dallas, TX         0.15 TI
   8/00  500 K gates    200     NEC        Tokyo, Japan       0.13 NEC
   8/00    7 K gates    480     QThink     San Diego, CA      0.25 TSMC
   8/00    5 K gates     50     QThink     San Diego, CA      0.25 TSMC
   9/00    5 K gates     24     QThink     San Diego, CA      0.25 TSMC

  10/00    5 K gates    100     QThink     San Diego, CA      0.25 TSMC
* 10/00  600 K gates     66     Infineon   Munich, Germany    0.18 C10
  11/00  2.5 M insts    200+    3Dlabs     Egham, Surrey, UK  0.18 IBM 7SF
  11/00  100 K gates    285     TI         Dallas, TX         0.095 TI
  11/00   75 K gates    155     Vitesse    Col. Springs, CO   0.18 TSMC 1p6m

  12/00    7 K gates    480     QThink     San Diego, CA      0.18 TSMC
  12/00  740 K insts    125     IMG Tech   Kings Langley, UK  0.18 TSMC
  12/00   70 K gates    183     QThink     San Diego, CA      0.18 TSMC
  11/00  175 K insts    100     Fujitsu    San Jose, CA       0.25 Fujitsu
   1/01  100 K insts     66     Fujitsu    San Jose, CA       0.25 Fujitsu

*  2/01  600 K gates     66     Infineon   Munich, Germany    0.18 C10
   3/01  300 K gates    100     Broadcom   San Jose, CA       0.18 TSMC
   3/01  250 K insts    150     Fujitsu    San Jose, CA       0.18 Fujitsu
   4/01  3.5 M gates    166     Vitesse    Col. Springs, CO   0.18 TSMC 1p6m
*  4/01  500 K gates    150     Infineon   San Jose, CA       0.18 C10

   5/01  114 K gates    155     TI         Ottawa, ON         0.18 TI
   5/01  336 K insts     80     STMicro    Agrate, Italy      0.18 STMicro
   5/01  466 K gates    108     TI         Dallas, TX         0.18 TI
*  5/01  500 K gates    150     Infineon   San Jose, CA       0.18 C10
   5/01  450 K gates    155     TI         Dallas, TX         0.18 TI

   6/01  400 K gates     78     TI         Dallas, TX         0.18 TI
*  6/01  500 K gates    150     Infineon   San Jose, CA       0.18 C10
   6/01   87 K gates     83     TI         Ottawa, ON         0.18 TI
   6/01   24 K insts    100     Signet     Austin, TX         0.18 TSMC
   6/01   75 K gates    155     Vitesse    Col. Springs, CO   0.18 TSMC 1p6m

   6/01  536 K gates    120     Broadcom   San Jose, CA       0.18 TSMC

**11/01  160 K gates    125     Ericsson   Stockholm, Sweden  0.13 TI GS40

    * - while my Infineon contact confirmed that it has done 5 Magma
        tape-outs in its http://biz.yahoo.com/bw/010614/2023.html
        press release, Infineon would not give me the exact stats
        on each tape-out.  The Infineon stats presented here came from
        Magma, so the higher than average gate counts may be suspect.

   ** - November 2001 hasn't come yet.  It's a planned tape-out.
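
As a quick sanity check on the listing above, here is a minimal Python
sketch that tallies the tape-outs per company.  The (date, company) pairs
are copied from the table; the variable names are illustrative assumptions
only, and the planned 11/01 Ericsson entry is deliberately left out.

```python
from collections import Counter

# (date, company) pairs copied from the full listing above.  The planned
# 11/01 Ericsson tape-out is omitted because it has not happened yet.
LISTING = """
8/99 Fujitsu
10/99 Fujitsu
6/00 TI
6/00 Vitesse
7/00 QThink
7/00 TI
8/00 NEC
8/00 QThink
8/00 QThink
9/00 QThink
10/00 QThink
10/00 Infineon
11/00 3Dlabs
11/00 TI
11/00 Vitesse
12/00 QThink
12/00 IMG Tech
12/00 QThink
11/00 Fujitsu
1/01 Fujitsu
2/01 Infineon
3/01 Broadcom
3/01 Fujitsu
4/01 Vitesse
4/01 Infineon
5/01 TI
5/01 STMicro
5/01 TI
5/01 Infineon
5/01 TI
6/01 TI
6/01 Infineon
6/01 TI
6/01 Signet
6/01 Vitesse
6/01 Broadcom
"""

# Split each row into its date and company (company names may contain spaces).
rows = [line.split(maxsplit=1) for line in LISTING.strip().splitlines()]

# Count tape-outs per company and in total.
per_company = Counter(company for _date, company in rows)
total = sum(per_company.values())

for company, count in per_company.most_common():
    print(f"{company:10s} {count}")
print("total", total)
```

The per-company counts can then be compared against what each user reports
in the letters that follow.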


From: Chris Faerber <chris.faerber@infineon.com>

John,

Infineon has committed a press release with Magma regarding tapeouts.

The press release states the real amount of tapeouts we did with Magma
worldwide at all Infineon development centers.  Infineon doesn't want
to spread partial-information by different sources, therefore the press
release shall be the one and only source for tapeout count.

    - Chris Faerber
      Infineon                                   Germany

        ----    ----    ----    ----    ----    ----   ----

From: Morrie Berglas <morrie.berglas@powervr.com>

Hi, John,

My company uses Magma intensively; however, my group has not yet had a
tapeout.  I can't comment on behalf of the other groups in the company in
terms of their progress with the tool.

My group is really hammering Magma and, although some subblocks have
congestion, area, and/or timing issues, quite a few blocks go through the
Magma flow seamlessly.  The particular subblock I'm working on is flattened,
7.4 million square microns, 300 K cell instances and runs at 250 MHz, and is
one of the blocks which meets timing closure after routing.  Our company
does not traditionally do the backend flow, and considering this, we are
still managing well with the Magma tools.  They really do perform well,
give good results and are not overly complicated.

Our relationship with Magma is tight, but not with its R&D department, more
the AE department.  During our intense P&R stage, we had a Magma FAE on site
at least one day a week, and on the other days, the FAE would be running
jobs for us.  We've got a lot of licenses installed, and a fairly high
profile chip, so we really shouldn't expect any less on the support front.

BTW, we're not using Magma's VHDL synthesis tool.  In our flow we still
synthesise with DC shell and wireloads, then use Magma for CTS, P&R, etc.
However, I do believe Magma re-optimises logic as required.

Our major partner, STmicro, performs our LVS/DRC backend tasks.  I believe
they use a combination of Hercules, Excalibre and Apollo.

My block P&R'ed into a rectangle measuring 2471.38 um by 3019.83 um and the
final utilisation figure from the tool was 93.85%.  So obviously, Design
Compiler gave Magma a bloated margin in synthesis.  This bloat can be
attributed to many reasons: DC can't be constrained with the size of the
block, nor the macro placement, nor the number of metal layers, nor the
configuration of the power mesh, and so on.  We've just learned to factor
this into the floorplan from day 1.
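
A rough cross-check of those block figures, as a minimal Python sketch.  It
assumes "utilisation" means placed area divided by total block area, which
the letter does not spell out; the variable names are illustrative only.

```python
# Block dimensions and utilisation as quoted in the letter.
width_um = 2471.38
height_um = 3019.83
utilisation = 0.9385  # 93.85%, assumed to be placed area / block area

# Total block area in square microns (roughly the 7.4 million quoted).
block_area_um2 = width_um * height_um

# Area actually occupied under the assumed definition of utilisation.
placed_area_um2 = block_area_um2 * utilisation

# Whatever is left over is free space inside the rectangle.
free_area_um2 = block_area_um2 - placed_area_um2

print(f"block area : {block_area_um2 / 1e6:.2f} M um^2")
print(f"placed area: {placed_area_um2 / 1e6:.2f} M um^2")
print(f"free area  : {free_area_um2 / 1e6:.2f} M um^2")
```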

In terms of the problem blocks I personally haven't compared the results
against PhysOpt, but I'm sure the same problems exist there, too.  Sometimes
we just try to jam too many cells and macros into too small or too irregular
a footprint.  Considering some of the blocks I've seen go through and timed
after routing in Magma, I think a few of the negative comments made about
the tool are plainly untrue.  Very large and very fast designs are feasible.

On a slightly separate note, it's disappointing that you would be so quick to
dismiss what I consider a truly "new entrant" in the field.  You must
respect them for what they are trying to accomplish.  It takes a very long
time and lots of cash to develop such a large and complex application.

Magma's approach is different enough from PhysOpt's that it should be
allowed to continue competing.  Who wants an industry where the market
leader kills all competition before they even have a chance to get off the
ground?  Yeah, $87 mil is a lot, but maybe it takes $200 mil to properly
develop and grow an EDA tool from scratch?  The tool is good, the support
we're getting is great, so maybe they're losing a ton of cash but, as an
engineer, I don't care one little bit. 

    - Morrie Berglas
      PowerVR Technologies                       Kings Langley, UK

         ----    ----    ----    ----    ----    ----   ----

From: John Dyer <jdyer@qthink.com>

John,

We are a tool-, foundry- and IP-independent design services company
currently doing place and route work with Silicon Ensemble, IC Craftsman,
Apollo and Blast Fusion 2.0.  We have ~50 people.  Although our company has
completed a number of large designs (the largest is over 6M gates), the 7
tape-outs that we have done with Blast Fusion have all been small designs
(5 K to 70 K gates).

For all of our designs, we used conventional synthesis tools (Synopsys or
Ambit.)  For 4 of them we got RTL or earlier handoffs and did the synthesis
ourselves.  For the remaining 3 of them we got netlist handoffs.  So in all
cases we used Magma for gates-to-placed gates, not RTL synthesis.

We used Calibre for physical verification on all of these designs.

All of these designs were digital control logic blocks for mixed signal
chips except for one 70 K gate design which was a purely digital chip.

Obviously with new software there were some bugs; however, we were able to
complete all of these designs without any help from Magma R&D.  We did have
to do our own workarounds for antenna fixes, tho.

We have also been looking at PKS and Physical Compiler, but have not reached
any conclusions.

    - John Dyer
      QThink                                     San Diego, CA

         ----    ----    ----    ----    ----    ----   ----

From: Gerry Atterbury <gatterbu@fmi.fujitsu.com>

John,

We have taped out a total of five chips using Blast Fusion.

The designs were all closely cooperative design tasks between Fujitsu and
Magma.  Magma had roughly 1 to 3 people working on our chips and we had
about the same number.  Whatever they had with people, we matched.  We were
basically working together on these tape-outs.

We used Synopsys Design Compiler for our RTL synthesis on these chips.  Our
Magma starting point on these chips was a gate level netlist.  Our end point
was GDSII.

We actually used the Magma tools for DRC and LVS.  During our original eval
of BlastFusion we had carried out a very careful correlation of the physical
verification results of Magma DRC with Cadence's Dracula.  After we worked
through some minor mismatches, we were satisfied that Magma's DRC was
equivalent to Cadence Dracula.

The issues we encountered using Blast Fusion were:

  1. Power router causing DRC and signal shorts.  This was from overlapping
     VDD and VSS power rings for macros and core ring.  Magma fixed it in
     release 1.0.

  2. Detailed placement problems.  The placer was placing some standard
     cells outside the core area, between Macro and pad cells.  This
     problem occurred in beta version and was fixed in version 1.0.

  3. Clock routing bug.  This was core dump in Beta version and was fixed
     in version 1.0.

  4. The horizontal power strap colliding with standard cell power line. 
     This was fixed in version 1.0.

  5. Runaway capacitance bug.  This was due to low gain cells in 0.25um
     library.  See Appendix B for more information.  This problem occurred
     in beta version and was fixed in version 1.0.

  6. Problem in the detailed placer where the placer was running forever.
     This problem occurred in beta version and was fixed in version 1.0.

  7. Crash in Mantle in 'run gate buffer load'.  This was a bug in timing
     analysis.  This problem occurred in beta version and was fixed in
     version 1.0.

  8. 'recover_rising' arc on cells are named as SETUP tests in TLF.  The
     tlf2magma conversion was, as a result, replacing the 'recover_rising'
     arc by setup arc and causing data-to-data timing tests to arise. 
     Magma tool was giving error message in optimization.  This problem
     occurred in beta version and was fixed in version 1.0.

  9. Hold time:

       1. The tool was incorrectly reading setup time as hold time
          from the TLF.
       2. Optimization for hold time check needed some tune-up, since
          it was not reporting all hold time violations.
       3. Simultaneous min/max analysis had a problem.  This problem
          occurred in version 1.0 and was fixed in version 1.1.

I realize these are mostly 1.0 bugs.  The newer versions of Blast Fusion
don't have nearly as many bugs as 1.0 had.

What we liked about Magma was that it closed timing with no iterations
on a chip that had taken 5 iterations with a competitor's tools.

    - Gerry Atterbury
      Fujitsu Microelectronics                   San Jose, CA

         ----    ----    ----    ----    ----    ----   ----

From: Hung Hua <hung@signetdesign.com>

Hi, John,

We have used Magma to benchmark several designs that we taped out previously
with other tools.  We tried PhysOpt from Synopsys.  We also evaluated PKS
from Cadence and Saturn from Avanti.  With PhysOpt we found it:

 - deals only with placement and hence may require more iterations to
   achieve timing closure.

 - has serious capacity problems (on more than 150 K gates).

 - the placement output by PhysOpt was not routable using Silicon/Ensemble
   or Avanti tools.

So far the only design that may be counted as a Magma tape-out is a 100 K
gate block design.  The block is going to be part of a tapeout of a big chip
that has lots of on-chip memories.  It can be viewed as a hard IP delivered
to be integrated on the big chip.

Our engineer took the design through Blast Fusion.  Reddy has learned and
used the tool for several benchmarks of his previous designs before doing
this tape-out.  We also have other Magma users in-house working together
with Reddy.

The design has 24 K standard cells (approximately 100K nand2 equivalent
gates).  The design also contains 4 embedded memories.  Physically, the
memories occupied about 30% of the area of the block.  In this case, Blast
Fusion was used to optimize a given structural netlist to obtain:
    
       1.) Timing closure, but no focus on timing improvement.
       2.) Clock tree design and skew balancing.
  
The block was designed to run at 100 Mhz as a whole.  The block has been
delivered (as hard IP) for chip integration.  The chip is planned to tapeout
the end of this month.  The chip will be fabbed with TSMC 0.18.

We used Blast Fusion in a gates-to-placed-gates mode.  Synopsys DC was used
to go from RTL to gates.  We used Mentor Calibre for physical verification.

We found the Blast Fusion power router to be flaky.  We had to fix some of
the power manually.  Also, the current version of the tool does not let us
finish the power completely before the routing.

Timing correlation between Blast Fusion and PrimeTime had to be done on some
paths manually since they each break the timing loops differently.

We liked Magma's ability to achieve timing closure easily.  Blast Fusion
takes timing into account every step of the Place and Route process.  It
does every thing it can to fix the timing as more detailed parasitics become
available.  This is a big plus for us since we would go through 4 to 6
iterations to fix transition and setup violations which popped up with
the actual parasitics.

    - Hung Hua
      Signet Design Solutions, Inc.              Austin, TX

         ----    ----    ----    ----    ----    ----   ----

From: Hans-Olov Eriksson <Hans-Olov.Eriksson@era.ericsson.se>

Hi John,

During February & March this year we did an eval of Blast Fusion 2.1.  Most
of the work was carried out by Magma FAE's in UK (we know those guys as they
used to work for Avanti Europe.)  We gave them a netlist + some floorplan
info.  In parallel, people in my group worked on the same design using our
standard layout flow based on Apollo/Saturn. 

Our actual design was a transcoder ASIC consisting of 8 identical DSP cores
plus I/O blocks plus memory.  The DSP core is an Ericsson in-house design
and was used as testcase for the Blast Fusion evaluation.  Size of the DSP
core is 160 K gate.  The total chip size is 1.5 M gates plus 5 M-bit SRAM. 

We had two options for the transcoder ASIC: use 16 transcoder ASICs on the
board each running at 125 MHz, or go for 8 ASICs running at 250 MHz.  It was
with the latter track we went to Magma and asked them if they could close
timing on 250 MHz.  We have previously had problems with our Avanti flow to
reach 250 MHz in a reasonable amount of time.

We use TI as our vendor.  For the 250 Mhz version, TI's SR40 technology
(high performance lib, 0.13 um) was considered.  We finally decided to go
for the 125 Mhz version (because of lower risk) and we're using TI's GS40
(low power lib, 0.13 um).  We're still designing the chip; tape-out is
planned for November this year.

We use Synopsys DC and Module Compiler for the RTL part of this design.  We
looked at Ambit 18 months ago, but at that time Ambit didn't have a datapath
compiler.

Our Magma evaluation went fine.  Initial goals (250 MHz after extraction in
less than two months) were met.  In my opinion, Blast Fusion seems to be a
very good block-level tool well suited for "flat" time critical blocks like
DSP cores.  Don't know if it's the best hierarchical chip level assembly
tool.  I heard most people are using it for block design like ours.  On this
evaluation we didn't focus on area and that could be the reason the achieved
size was not so impressive. 

Big positive with Magma.  Their evaluation goals were fulfilled 100%.  The
reason we decided not to go for Blast Fusion and Magma was due to changed
project plans (focus on low risk because of economic slowdown in the telecom
market.)  Their support people in UK are very competent and focused.
Impressive overall road-map.  Magma seems to be the only EDA vendor to
provide a complete RTL-GDSII flow that includes synthesis, clock tree
synthesis and SI aware P&R.

Haven't tried Magma RTL synthesis though.

Usually, our back-end activities take place on-site at Ericsson in close
cooperation with our DSP design team.  With TI, they run LVS/DRC for us
using their own internal tool.  With VLSI, we used to run XCalibre/Calibre.
Now with Philips, they run Hercules for us.  We only do physical design on
time critical blocks like DSP cores.  In my group, we provide DSP cores to
many design units within Ericsson and we always deliver as "hard" macros.
The ASIC vendors are responsible for the top level assembly and that
includes transistor level DRC/LVS.  We always run the "basic" cell level
LVS/DRC built into Apollo and it seems most design rule errors are found
at that level.

Last year we spent about 6 months on an eval of PhysOpt.  We never closed
timing on that tool, the interface to Avanti was very complex and we could
only run the tool with on-site support from a Synopsys FAE.

In my opinion, Saturn is a good physical synthesis tool.  We've used it in all
our tape-outs since 1998.

    - Hans-Olov Eriksson
      Ericsson Radio System AB                   Stockholm, Sweden

         ----    ----    ----    ----    ----    ----   ----

From: [ Been There, Done That ]

John, I must be anon.

Magma problems:

Power router gets confused with non preferred preroutes.  Lots of tcl
required to make sure vias are dropped in the right place.  In an
unconventional design with io's and memories straddling the core
boundary, much time can be spent eliminating real and verifying false
power preroute opens and shorts.

Detail router doesn't always find a solution for off grid closely spaced
pins on macros. Similarly, the global router may see a macro pin as
accessible while the detail router says it is inaccessible.

Clock splitcells impede drc and antenna resolution.  A clock splitcell
enforces an htree implementation for the clock tree.  Since the clock
router places it instead of the detail router, a poor placement can make
drc or antenna violations difficult to eliminate.  This results in
manually moving some of the splitcells to get drc/antenna clean.

Tie hi/lo pins handled by power router instead of detail router.  The
power router was really built for meshes and rings and does not always
find the appropriate solution for tying a macro pin to power or ground.

ECO flow is not well tested for corner cases.  The spare cell metal only
eco flow works very well, but macro pin changes or tie hi/lo pin changes
are difficult to implement.

GUI crashes more than it should, but that is not a major problem since
most long runs are done with batch scripts.


Magma strengths:

Fixes slew/cap/fanout violations flawlessly.  Has separate buffering
commands for trees (>1000 pin nets), long wires, and large capacitance.
Running the same design on Saturn and Magma resulted in 30K slew
violations with Saturn and 6 slew violations with Magma (all 6 were due
to a placement blockage which prevented repeater insertion).

Powerful hold time fixing looks to add buffer at start point first, end
point next, and then every point in between.  Minimizes number of buffers
added without damaging setup time.

Accurate RC extraction and delay calculation.  RC extraction compares
well with quickcap.  Ceff, slew degradation, and wire delay calculations
compare more closely to spice than Primetime when looking at high fanout
nets with long wires.

Automatic macro placement gives the user an excellent starting point for
minimizing global wire length and creating a routable design.  It is
based on force driven placement instead of quadratic placement so large
blocks can be moved simultaneously with standard cells in order to find
the minimum global wire length while eliminating cell overlap.

Incrementally improving accuracy of routing models allows for appropriate
optimization decisions to be made throughout the flow.  Initial
optimization is done with manhattan mode.  Additional optimization is
done with a global route mode which understands detours as well as
estimated lateral capacitance according to the number of used tracks in
each small region.

Tcl interface to data model allows tremendous easy-to-use flexibility for
adding functionality to the tool.  Some examples include a simple lef pin
abstraction of a block, design rule adherence on block boundaries by
placing buffers on primary pins, river routing of busses, modifying the
via structure in the power mesh, and extracting bond pad center
coordinates.  Much easier to use than any lisp based language (e.g. skill
or scheme).  Fully integrated static timing, placement, routing, clock
tree, optimization, extraction under one data model allows for tremendous
flexibility.

The noise avoidance during track routing through timing window and
crosstalk analysis is certainly powerful but I'm reserving judgement
until further investigation and correlation with Cadmos.

Useful skew during clock routing helps setup time without creating hold
time problems.  This is extremely powerful on high performance designs.

Antenna fixing understands diffusion area related ratios and combines
global route, detail route, and diode insertion to converge on an antenna
clean solution.


Experiences with other physical synthesis tools:

Avanti has no good solution for fixing slew violations due to RC delay
except to run the clock tree synthesis tool on long wires.  Saturn does
not address these slew violations.

Avanti does not handle hold time fixing very well.  Saturn adds too many
buffers and doesn't always fix every path due to miscorrelation or some
other tool problem.

PhysOpt is based on the dc engine instead of PrimeTime.  This is a fairly
basic delay calculator built for wire load models and not for
placed cells with high fanouts with different length wires on each
fanout.  It does not include effective capacitance, slew degradation, or
accurate RC delay.  Thus it is a poor tool for repeater insertion or long
wire slew violations.

PhysOpt does not correlate with the detours which are made by the backend
global router.  Thus large loads and long wires can show up at the end of
the flow which PhysOpt did not fix since it did not make the same detour.

Cadence PBOPT is limited to only buffering and sizing and relies on elmore
delay calculation.  A fairly basic tool which should be obsoleted by PKS if
they can ever get it to work.

    - [ Been There, Done That ]

        ----    ----    ----    ----    ----    ----   ----

From: Hiroaki Maruyama <h-maruyama@pi.jp.nec.com>

Hello, John Cooley,

We've taped out using Blast Fusion v2.0.  And now we are using Blast Fusion
v2.1 or later beta revision. We don't use Blast Chip except for trial.

Basically, our NEC internal team ran Blast Fusion, although we often needed
Magma's help.

We used Cadence Ambit for RTL synthesis, even though this is not a standard
tool for us.  And we use Magma from gate level to GDSII.

For LVS/DRC, we mainly use Mentor.

In Blast Fusion, we found a lot of bugs and many differences with NEC
sign-off rules.  Most of these are related to routing issues.  Routing
issues are very heavy for design implementation, but are very important for
reducing design iterations.  Magma can repair these bugs right away.  But
we still have various requests related to routing.

Blast Fusion needs very long runtime.  Fortunately, our first test design
was very small,  but I wonder whether or not Magma can handle huge design.
Magma says that Blast Fusion v3.0 can handle....

What I like about Magma is that it has true single database from RTL to GDS.
Magma can calculate delay with same engine during synthesis, place, CTS, and
route.  Especially, Magma can handle the delay of both clock and signal
simultaneously.  Therefore, we can design with less margins.

NEC and NMS (NEC MicroSystems) are using other physical synth tools.  We use
Blast Fusion for only few particular designs now.  We can not compare these
tools with same design, same time, same human resources, and so on. 

However, as the result of our many trials, I think that Magma competes only
with Avanti, currently -- because, the results of PhysOpt and PKS depend on
CTS and routing tools.

    - Hiroaki Maruyama
      NEC                                        Tokyo, Japan

         ----    ----    ----    ----    ----    ----   ----

From: Francis Larochelle <francis@ti.com>

John,

We've been successful in using Magma Blast Fusion (v2.1) at our ASIC design
centers.  (Enclosed is our data on 8 Blast Fusion tapeouts.)  In the first
half of this year, we have transitioned 100% of new block designs to Magma
Blast Fusion and are now transitioning top-level design as well.  (The first
design is almost complete).

We have also successfully closed timing on additional evaluation designs
that include cores in the 250 MHz-300 MHz range and top level designs of up
to 2M gates.

Magma gives us one full time support guy on-site 2-3 days a week here at TI.
In addition, they also give us 2 to 3 others as part time local support.

We used Blast Fusion only in a gates-to-placed-gates mode.  We used Design
Compiler for all the RTL-to-gates synthesis in these tape-outs.

To answer your last question on what DRC and LVS tool we are using:
We are using Magma's built-in layout verification capabilities first,
but once the Magma database is 100% DRC/antenna clean, we re-verify
that with our signoff (TI internal/K2) layout verification flow.  We
have had a few issues where the signoff flow has found additional
errors, but mostly the Magma built-in layout verification is correlating
well with our signoff layout verification flow.

I compiled specific information on issues we have had with running Magma
over the last year.  We have found more or less efficient ways to deal
with them....

* Handling Multiple Modes
  -----------------------

  We have had some challenges in automating a flow which fixes all hold
  violations for both mission and test modes.  We typically put the
  design in mission mode and do hold fixing.  In many cases this fixes
  all the hold violations in test mode as well.  However, in some
  circumstances there are additional test mode hold violations which
  must be addressed in Blast Fusion.

  It is dangerous to put the design in test mode and do hold fixing
  because the tool cannot see the mission mode constraints and would
  put hold buffers in places that would cause the mission mode setup
  times to fail.  We normally have to do some manual / interactive
  fixing of these "leftover" test mode violations (violations that remain
  after mission mode hold time fixing).  Magma has developed a new
  capability in which the mission and test modes can be optimized
  simultaneously by propagating all clocks through the timing graph at one
  time.  We are anxious to try this out in the near future.  However, we do
  realize that this capability will require some additional investment in
  identifying false paths which are created by this overlapping clock
  scenario (i.e. paths launched by the mission mode clock and captured at
  scan data pins; paths launched by the test clock and captured by normal
  data pins).
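
  The single-mode versus multi-mode trade-off above can be sketched with
  simple slack arithmetic.  This is only an illustration under assumed
  numbers, not Blast Fusion's algorithm: the ModeSlack type, the function
  names, and all the values are hypothetical, and the model assumes the
  hold buffer lands on a net shared by the mission and test mode paths.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModeSlack:
    """Slacks (ns) of the worst path through one shared net, in one mode."""
    setup: float  # >= 0 means setup is met
    hold: float   # >= 0 means hold is met


def after_buffer(s: ModeSlack, delay: float) -> ModeSlack:
    """Extra data-path delay helps hold and costs setup by the same amount."""
    return ModeSlack(setup=s.setup - delay, hold=s.hold + delay)


def single_mode_fix(visible: ModeSlack) -> float:
    """Delay a hold fixer inserts when it can see only one mode."""
    return max(0.0, -visible.hold)


def multi_mode_fix(modes: List[ModeSlack]) -> Optional[float]:
    """Delay that clears hold in every mode without breaking setup in any.

    Returns None when no single delay on this net works, i.e. the
    violation has to be fixed somewhere else (or by hand).
    """
    needed = max(single_mode_fix(m) for m in modes)
    if needed == 0.0:
        return 0.0  # nothing to fix on this net
    budget = min(m.setup for m in modes)
    return needed if needed <= budget else None


# Made-up numbers: setup is loose in test mode but tight in mission mode.
mission = ModeSlack(setup=0.25, hold=0.5)
test = ModeSlack(setup=5.0, hold=-0.5)

blind = single_mode_fix(test)            # fixer sees test mode only
print(after_buffer(test, blind))         # test mode hold is now clean
print(after_buffer(mission, blind))      # mission mode setup goes negative
print(multi_mode_fix([mission, test]))   # None: no safe delay on this net
```

  In this toy case a fixer that sees both modes declines to buffer the
  shared net, which corresponds to the "leftover" violations that need
  manual attention.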

* High Fanout Nets
  ----------------

  We ran into some problems with high fanout nets which were not
  constrained by timing, such as reset and enable lines.  This would
  cause the tool to build long chains of weak buffers, which resulted
  in some very unreasonable delays (sometimes as high as 70-80 nsec).
  We found that this problem could be avoided by defining these signals
  as clocks.  This forced the Magma clock router to work on these nets,
  which built a much better tree.
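
  A rough way to see why a buffer chain is so much slower than a tree:
  chain depth grows linearly with the number of sinks, tree depth only
  logarithmically.  The sketch below assumes a fixed per-stage delay and
  a fixed maximum fanout per buffer.  Both are simplifications, and the
  sink count, fanout limit, and stage delay are invented for
  illustration, not taken from the TI designs.

```python
def chain_depth(sinks: int, max_fanout: int) -> int:
    """Buffer stages when each buffer drives max_fanout-1 sinks plus the
    next buffer in the chain."""
    if max_fanout < 2:
        raise ValueError("a chain needs a fanout of at least 2")
    per_stage = max_fanout - 1
    return -(-sinks // per_stage)  # ceiling division


def tree_depth(sinks: int, max_fanout: int) -> int:
    """Buffer levels of a balanced tree that can reach every sink."""
    if max_fanout < 2:
        raise ValueError("a tree needs a fanout of at least 2")
    depth, reach = 0, 1
    while reach < sinks:
        reach *= max_fanout
        depth += 1
    return depth


def worst_delay_ps(depth: int, stage_delay_ps: int) -> int:
    """Delay to the last sink, assuming every stage costs the same."""
    return depth * stage_delay_ps


SINKS, FANOUT, STAGE_PS = 4000, 9, 150   # hypothetical reset net

chain = chain_depth(SINKS, FANOUT)
tree = tree_depth(SINKS, FANOUT)
print("chain:", chain, "stages,", worst_delay_ps(chain, STAGE_PS), "ps")
print("tree: ", tree, "levels,", worst_delay_ps(tree, STAGE_PS), "ps")
```

  With these assumed values the chain needs 500 stages and the tree 4
  levels, so the gap is about two orders of magnitude in delay.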
      
* Global Route vs Detailed Route Timing
  -------------------------------------

  In a few cases we had encountered timing surprises after detailed
  routing due to the global route timing estimates being too conservative.
  By default the tool is more pessimistic at global route to prevent
  setup time surprises.  However, in certain situations a timing speedup
  at final route can cause setup time failures too.  (This happens on
  IO-to-register paths when the IO clock timing is fixed and the on-chip
  clock insertion delay speeds up significantly.)

  Of course, the speedup can cause hold time failures as well.  We normally
  can fix this with some post-routing ECOs.  Magma does have a capability
  to use congestion information to better tune the global route estimations.
  We have used this with some success; however, greater margins need to be
  used during optimization since some of the tool's built-in margin
  is lost.
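
  The IO-to-register case can be worked through with a textbook setup
  check.  This is a sketch under assumed numbers, not output from any
  tool: the function name and every value (in ps) are hypothetical.  The
  data arrival is tied to the fixed IO clock, while the capture edge
  moves with the on-chip clock insertion delay, so a faster clock tree
  at final route takes time away from the path.

```python
def input_setup_slack_ps(period: int, capture_insertion: int,
                         io_clock_latency: int, input_delay: int,
                         data_path: int, setup: int) -> int:
    """Setup slack of a chip-input-to-register path, all values in ps."""
    required = period + capture_insertion - setup
    arrival = io_clock_latency + input_delay + data_path
    return required - arrival


# Hypothetical path; only the on-chip insertion delay changes between runs.
common = dict(period=10000, io_clock_latency=0,
              input_delay=6000, data_path=5500, setup=200)

at_global_route = input_setup_slack_ps(capture_insertion=2000, **common)
at_detail_route = input_setup_slack_ps(capture_insertion=1400, **common)

print("global route estimate:", at_global_route, "ps")
print("after detailed route: ", at_detail_route, "ps")
```

  Here a 600 ps speedup of the clock tree turns 300 ps of positive slack
  into a 300 ps setup violation, even though nothing on the data path
  changed.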
 
* IO Timing Constraints
  ---------------------

  In working with full-chip designs, we originally had some difficulties
  in meeting the IO timing constraints.  We found that the normal placement
  algorithms weren't sufficient when the IO timing constraints were tight.
  In response to our difficulties Magma developed a coning algorithm which
  can be activated to pull the cone of logic up to the first register close
  to its associated IO cell.  We have found this capability to be quite
  successful in handling these types of paths.
   
* Interpretation of Constraints vs PrimeTime
  ------------------------------------------
  
  For the most part, the interpretation of constraints by Magma and
  PrimeTime has correlated pretty well.  We have found a few differences
  and have published a best practices guide to our users to help them
  avoid these situations.  One such example was the way the two tools
  handled the latencies for the clocks used at the IOs (the clocks
  which determined the arrival times at the inputs and the required
  times at the outputs).  When the clocks were calculated in propagated
  mode and a non-virtual clock was used, PrimeTime would use a latency of
  zero, whereas Magma would use the original latency defined for ideal
  mode timing.  This often caused reported timing failures in PrimeTime
  when doing STA with back-annotated SDF.  We found the problem could
  be avoided by always using virtual clocks to define the IO timing.
  This allowed the IO clock latencies to be controlled separately from
  the on-chip clock latencies and made the two tools behave the same.
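
  The effect of the two latency interpretations can be illustrated on a
  register-to-output path.  This sketch only models the behavior as it
  is described above; it does not call or reproduce either tool, and
  the function name and all values (in ps) are hypothetical.

```python
def output_setup_slack_ps(period: int, io_clock_latency: int,
                          output_delay: int, launch_latency: int,
                          clock_to_pad: int) -> int:
    """Setup slack of a register-to-chip-output path, all values in ps."""
    required = period + io_clock_latency - output_delay
    arrival = launch_latency + clock_to_pad
    return required - arrival


# Hypothetical path: propagated on-chip clock latency of 2000 ps.
path = dict(period=10000, output_delay=4000,
            launch_latency=2000, clock_to_pad=5000)

# Interpretation 1: the IO reference clock keeps its ideal-mode latency.
keeps_ideal = output_setup_slack_ps(io_clock_latency=2000, **path)

# Interpretation 2: the IO reference clock is given zero latency.
uses_zero = output_setup_slack_ps(io_clock_latency=0, **path)

# With a virtual clock the IO latency is stated explicitly, so both
# interpretations are handed the same number and must agree.
virtual_a = output_setup_slack_ps(io_clock_latency=2000, **path)
virtual_b = output_setup_slack_ps(io_clock_latency=2000, **path)

print("ideal latency kept:", keeps_ideal, "ps")
print("zero latency used: ", uses_zero, "ps")
print("virtual clock:     ", virtual_a, "ps in both cases")
```

  The same path passes by 1000 ps under one interpretation and fails by
  1000 ps under the other, which is the kind of mismatch that shows up
  as a reported failure in only one of the tools.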


TI is using several other physical synthesis tools w/ good success (Synopsys
PhysOpt, Silicon Perspective First Encounter, Cadence PKS).  However, we
believe that presently Magma Blast Fusion best meets our objective of
decentralizing complete/final physical design at our worldwide ASIC design
centers.

    - Francis Larochelle
      Texas Instruments ASIC                     Dallas, TX

         ----    ----    ----    ----    ----    ----   ----

From: Marco Montalti <Marco.Montalti@st.com>

Hi, John,

Our engineer went to the Magma training class for a week in November.  After
that we were supported by a Magma AE from the UK.  The AE flew in every
2 weeks mostly, but there were times when he was here much more frequently.
When we ran into a Magma bug, he'd e-mail it to California.  Overnight we'd
download a new version of Blast Chip that had the bug fixed.  It was
quite good.  The UK guy was quite skilled.  We believed that if our engineer
couldn't use Blast Chip, it would be useless.

Our current production design flow is PhysOpt.  We used a design that we
ran through PhysOpt as our test case.  We took that design and ran it
through Magma Blast Chip 2.1, Cadence PKS, and Monterey Dolphin.  Magma
gave the overall best results in terms of usability, execution time,
resource utilization (# of CPUs, RAM, and disk space), interop with
other tools, capability to import and export data at the different design
stages, and timing results.  One candidate was not able to complete
the design.  (I can't say who.)

For the time being, we don't intend to replace the PhysOpt flow because it
is very stable.  Our intention is to use Magma as a parallel solution.  We
believe it is more automated than PhysOpt.  With PhysOpt you must run
CTS, detailed routing, and after that, physical optimization.  With Magma
you feed it a gate level netlist & timing constraints and all those steps
are executed with no more human intervention.  Magma is basically CPU time.

For RTL, we use Synopsys Design Compiler, not Magma synthesis.  We also
placed the blocks of our design using Synopsys Chip Architect.  For
sign-off, we used Simplex for extraction, PrimeTime, and Calibre.

We liked Magma's design implementation.  It concurrently takes care of
placement, routing, timing closure, signal integrity, net loads, parasitic
extraction, and all the possible checks.  We liked the ability to access a
batch run remotely to monitor the status of a job, and Magma's debugging
capability at the DRC and LVS stages.

We have also seen Magma's beta code that will introduce real hierarchical
capability and we judge it one of the most powerful and usable we have
seen so far.

We only had one show-stopping bug, in the handling of one hard block
combined with timing constraints between the rest of the logic and the
block itself.  The problem was solved in one week.  We also had one
problem related to parasitic extraction, which was identified quite
quickly.  The solution still has to be verified.

There is another problem in the DEF writing, currently resolved through a
workaround.

In general, we have been favorably impressed because we expected more  
problems from a tool this young.

We expect Magma will integrate easily into our design flow, which is
going to happen in the next month.

    - Marco Montalti
      STMicroelectronics                         Agrate Brianza, Italy

         ----    ----    ----    ----    ----    ----   ----

From: Paul Pontin <Paul.Pontin@3dlabs.com>

John,

We started looking at physical synthesis tools around May 1999.  I talked to
Monterey and Magma at that time.  Magma was the most willing to engage with
us. We checked out Synopsys some time later, but by then we were well
advanced with Magma.

As you know, I am very pro Magma, so it is probably worth giving you a bit
of background.
 
3Dlabs was moving from an ASIC flow to a COT flow.  As such, we had no
legacy layout tools.  Magma fitted the bill perfectly, it was a one stop
shop.  The only other tools I needed were for LVS and DRC.
 
If I had already invested in Avanti or Cadence tools, then I may well have
leant more towards a PhysOpt type of product.  PhysOpt would have fitted
in with a flow I was used to.  However these flows are not unified, you are
in and out of different tools, each doing its own piece, till eventually you
end up with GDSII.  As an engineer I find the purity of the Magma flow very
reassuring, you can tell when something is intrinsically right.
 
Magma worked very closely with us and we have a great tie in with their R&D
group.  This is such a bonus when you are working on large complex chips
such as ours.  I also feel that we have some steer into the direction that
the tools are developing.  Magma really is listening to what the industry
wants.  I just hope that they stay that way as they grow bigger.


Magma gave me a simple scripted flow that could get me from a netlist to
finished GDSII quicker, and with more predictable timing results, than
anything else we had come across.

The flow is set up with early trial netlists.  Once this has been done the
GDSII can be generated very quickly (around 1 week for 300K placeable
instances).  We simply re-run the scripts.  This gives a huge time to market
advantage.

In general Magma S/W is very good, the problems we have fall into 3 groups:

Third party suppliers of RAMs, standard cells, I/Os, foundry services, etc.
have not done any QA with the Magma tools.  We therefore come across some
compatibility issues that need sorting out.  Not show stoppers, but annoying
nonetheless.  The greater number of Avanti and Cadence seats means that
such issues are usually ironed out before the end user gets involved.

The interactive GUI is still a little flaky.  This is a minor irritation
for us as we script everything in Tcl; however, it would be nice to have it
fixed.

The tool is still under constant development, new features are being added
all the time.  My dilemma is which release of Magma to use for the duration
of a project.  We have found it best to develop the scripts on one release
and stay with it, even if some new sexy feature is available in a later
release.  However this is a familiar problem to most engineering managers.
Magma's QA seems robust though.

Hercules was used for our tapeout, mainly for logistical reasons.  However
since then I have purchased Calibre.

Our flow is VHDL RTL and we use Synopsys DC to get us to an outgoing
netlist.  The Magma tools handle the netlist from that point on.  We have
no IPO, ECO type loops.  All optimization for timing closure is done by
Blast Fusion.

    - Paul Pontin
      3Dlabs                                     Egham, Surrey, UK

         ----    ----    ----    ----    ----    ----   ----

From: [ I Wear My Sun Glasses At Night ]

Hi John,

Anon please.

We have now run around 20 blocks of 100-300k instances through Magma to
extracted timing, DRC and LVS clean from 4 chips.  I don't know if you
count these as tapeouts as only one set went together to make a full chip.
We used Avanti Hercules for verification.

We have found that Magma treat any failure to achieve a fully placed, routed
and timed design as a bug -- unless we agree that it's too small or fast.  We
have had problems with timing and congestion solved in this way.  Took some
work before we got the tool to fix all antenna violations.

Of all the tools we have tried (Avanti/Saturn, PhysOpt, Magma, PKS), Magma
has given us the best timing closure, i.e. it gets the fastest clock in a
given area.

PhysOpt is being run by a partner of ours so we are able to compare results.

On the 3 precise like-for-like test cases that we have tried (2 different
technologies), Magma came out best for timing.  We have also found that run
times tend to be shorter.  We only compare results if the design is
routable - no point otherwise.

We used Magma gates-to-placed-and-routed-gates.  DC did our RTL synthesis.

    - [ I Wear My Sun Glasses At Night ]

"Relax. This is a discussion. Anything said here is just one engineer's opinion. Email in your dissenting letter and it'll be published, too."
Copyright 1991-2024 John Cooley. All Rights Reserved.


   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)