( ESNUG 420 Item 2 ) -------------------------------------------- [10/22/03]

Subject: ( ESNUG 419 #1 ) Magma, Apollo, & Hierarchy; Magma Glass Boxes

> 1) Magma could end up with long run times if your constraints weren't
>    fairly mature.  The basic constraints (clocks, false paths,
>    multicycles) were almost 100% fine tuned, but the I/O constraints
>    were poor as the physical hierarchies were being defined at the
>    floorplanning stage.  We got into a mode of garbage-in/garbage-out
>    early on without gaining much in terms of effective floorplanning
>    or timing closure.
>
>        - [ The White House Leak ]


From: [ Blues Clues ]

Hi, John,

Here's a quick response to ESNUG 419 #1.  Please keep me anonymous.

This is true with DC as well though... if you overconstrain any synthesis
engine, it overworks the optimization loop as it doesn't converge to zero
and stop.  Synthesis is not a deterministic problem, and usually involves
a fairly large convergence loop.  The more timing problems a tool has to
address, the longer it takes to converge.

As far as IO constraints, most any "new generation" physical design tool
has this "limitation".  The problem is that in synthesis, timing paths are
not as tightly coupled as they are in the physical world.  In the DC world,
you can set your IO constraints overly tight, and DC will give you the best
it can, to some degree.  In the physical domain, a tight timing budget
doesn't just mean upsize logic, it means minimize net lengths.  So what
happens is all your IO connected logic gets pulled to the pins.  Depending
on your pin placement, this can completely destroy your placement.  A few
things I do:

  a) Always get in the habit of highlighting 1 or 2 levels of logical
     hierarchy on your layout and viewing it.  In most designs, most
     high-level logical hierarchy should get pulled together.  If you see
     small bits of logic hierarchy by the sides, and the rest somewhere else
     in your layout, this is usually a sign of overly tight IO constraints. 

  b) Try running your design first with a false path from all inputs and
     to all outputs.  This will give you a good idea of a theoretical
     "best case" for your internal IO paths.  Then remove these false
     paths, and try again, and you'll see the effect your IO budgeting
     has on overall timing.
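
A minimal sketch of the check in (a), assuming you can dump placed cell
coordinates to a list of (hierarchical name, x, y) tuples.  The dump format,
the function name, and the sample numbers are hypothetical, not any tool's
API.  A logical hierarchy whose bounding box is far larger than its
neighbors' is the "small bits of logic by the sides" symptom:

```python
from collections import defaultdict

def hierarchy_spread(placements, levels=2):
    """Group placed cells by their top `levels` of logical hierarchy and
    report, per group, the cell count and bounding-box half-perimeter."""
    groups = defaultdict(list)
    for name, x, y in placements:
        parts = name.split("/")
        # Drop the leaf instance name, keep the first `levels` path elements.
        key = "/".join(parts[:-1][:levels]) or "<top>"
        groups[key].append((x, y))
    report = {}
    for key, pts in groups.items():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        report[key] = {
            "cells": len(pts),
            "half_perimeter": (max(xs) - min(xs)) + (max(ys) - min(ys)),
        }
    return report

# Hypothetical placement dump: (hierarchical instance name, x, y) in microns.
placements = [
    ("core/alu/u1", 500.0, 500.0),
    ("core/alu/u2", 510.0, 505.0),
    ("core/alu/u3", 505.0, 495.0),
    ("core/io_if/u1", 480.0, 520.0),
    ("core/io_if/u2", 5.0, 990.0),   # one cell dragged out to a pin
]
report = hierarchy_spread(placements)
for key, info in sorted(report.items()):
    print(key, info)
```

Here core/io_if has a much larger half-perimeter than core/alu because a
single cell sits at the die edge, which is the pattern worth investigating.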

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

If you have incomplete or relaxed constraints, the tool runs faster, as it
does not have to work as aggressively.  If you over-constrain the design
(or block), the runtime becomes longer.  That is because the tool tries
its best to meet the given timing constraints.

As for timing budgeting, I would use BlastPlan to do top-down timing
budgets for the blocks as a first cut.  During the early stage of the design
cycle, we usually recommend some (the keyword is "some") over-constraining
of the input/output delays on blocks, and refining those as the design
progresses towards the final netlist.

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA


> 2) Magma didn't support asynch constraints such as max_delay or min_delay.
>    It was a hit and miss routine that required an eventual timing ECO 
>    and caused a compromise in timing spec.

From: [ Blues Clues ]

This is supported, I believe.  I know we bugged Magma a year or so ago on
these and told them it was a show stopper.  Our blocks that were griping
are meeting timing now, so I assume it's been fixed for a while.  Looking
at our constraint files, I see the following Magma commands translated
from our set_max_delay and set_min_delay SDC commands: force timing maxdelay
and force timing mindelay.

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

The commands to set these are:

            force timing maxdelay [options]
            force timing mindelay [options]

These have been in BlastFusion from release 2.1.

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA


> 4) Magma, by default, flattened the logical hierarchy.  The user would have
>    to define what layers to remain intact.  Again we were told that the
>    fewer layers, the better to meet timing.  This turned out to be a major
>    problem later in our design cycle, as functional ECO's were very
>    difficult to implement.  To state the obvious, our ports within logical
>    layers go away as the hierarchies collapse. 

From: [ Blues Clues ]

Again, a pretty old problem with P&R tools.  It does maintain hierarchy now,
but of course, if you tell it to maintain all hierarchy, you are really
limiting BlastFusion capabilities -- especially if you demand this
requirement and want to run scan reordering or clock tree synthesis.
(Confining clock trees to have only one port on each module is a pretty
difficult constraint.)  Magma still isn't 100% perfect on this, but they are,
in my mind, much better than Apollo.  As a side note, I don't know if anyone
noticed, but Apollo used to reuse port names it no longer needed!  So even
though the port on a module was there, it might be logically different, and
in no way related.  It looked like Apollo kept a queue of "free ports", and
if Apollo no longer needed a port, it threw it in the queue.  If Apollo
needed to add a port, it would grab the top of the queue.
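
A toy model of the behavior described above, written only from that
description.  It is a guess at the mechanism, not Apollo's actual code, and
the class and net names are hypothetical.  It shows how a recycled port name
can end up carrying a completely unrelated signal:

```python
from collections import deque

class Module:
    """Toy module whose freed port names go into a queue and get reused."""

    def __init__(self):
        self.ports = {}       # port name -> net currently connected to it
        self.free = deque()   # names of ports that are no longer needed
        self.next_id = 0

    def add_port(self, net):
        # Reuse the oldest freed name if there is one, else mint a new name.
        if self.free:
            name = self.free.popleft()
        else:
            name = f"p{self.next_id}"
            self.next_id += 1
        self.ports[name] = net
        return name

    def drop_port(self, name):
        # The port disappears logically, but its name is kept for reuse.
        del self.ports[name]
        self.free.append(name)

m = Module()
a = m.add_port("scan_in")      # first port, originally a scan input
b = m.add_port("data_valid")
m.drop_port(a)                 # scan reordering no longer needs it
c = m.add_port("clk_gated")    # new port silently takes over the old name
print(a, c, m.ports[c])
```

Anything that keys on the port name alone (a test bench, a constraint file,
an ECO script) would still find the port but would be looking at a different
signal.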

We've tried to convince our frontend designers to relax on this requirement
(via cleaning up constraints, and reworking test bench assumptions), and the
blocks that do this have better results.  It's a short-term pain versus
long-term pain thing (a.k.a. if you clean your constraints and rework your
test bench, the problem goes away and never rears its ugly head again... or
you can complain that it's a tool bug, even though I believe it's an honest
requirement, and beat up every single vendor that wants to mess with your
logical hierarchy, and fight this problem for as long as your RTL exists;
which for us is many project generations.)

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

I am surprised to hear this.  "force keep" can be applied on a net.  As a
rule of thumb, if you are using "dont touch" on any part of the logic
during synthesis, I recommend using "force keep" on those parts.  I have
not seen any surprises with that.

  "Once we saw our adders get replaced by XOR's, which were a poor choice
   in terms of area or timing.  I am not sure how, but that problem was
   resolved by a workaround."

Not sure how this could happen unless the user of the tool explicitly
clears the adders for the mapper to map to other logic.  By default Magma
does not upmap and remap the multi-output gates (in a netlist-to-GDS flow
in BlastFusion).  Therefore, this can only happen if the user explicitly
clears the mapping of the multi output cells.

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA


> 5) We fed wireload models generated by Magma from previous runs into DC,
>    & used a setup margin up to 20% of clock period (fastest clock 138 MHz),
>    only to lose it and some more in later Magma runs!  That caused several
>    iterations of the same block by trying different block floorplans.

From: [ Blues Clues ]

Not to beat on a dead issue, but I think most people have probably seen that
wireload models just don't work any more... especially in 0.13 um.  Our
backend team doesn't even generate custom block-based models anymore.  We
went back to using very generic ones: "easy", "medium" and "hard".  Relating
back to your first issue, overly tight IO constraints are one great example
though of something that causes wireload models to not work real well.  It
pulls apart nets that should be close to each other, and causes your
standard deviation on net length to get hideous.
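
To illustrate the standard deviation point, here is a small sketch with
made-up net lengths (the numbers are hypothetical, chosen only to show the
effect).  A wireload model is essentially a statistical length estimate per
fanout, so one net stretched out to a distant pin is enough to make that
estimate useless for every other net in the group:

```python
from statistics import mean, pstdev

def wireload_stats(net_lengths):
    """Return (mean, population standard deviation) of a set of net lengths."""
    return mean(net_lengths), pstdev(net_lengths)

# Hypothetical lengths in microns for nets of the same fanout.
clustered = [40, 45, 50, 55, 60]
stretched = [40, 45, 50, 55, 900]   # one net pulled out to an IO pin

m1, s1 = wireload_stats(clustered)
m2, s2 = wireload_stats(stretched)
print(f"clustered: mean={m1:.1f} stdev={s1:.1f}")
print(f"stretched: mean={m2:.1f} stdev={s2:.1f}")
```

In the second set the mean no longer describes any real net: it is several
times too long for four of them and far too short for the fifth.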

In Magma, one thing you might want to do is look at their gain distribution
early on.  This gives you a very high-level feel for how hard it's going to
have to push to meet timing.  It's by no means a perfect "pass/fail"
criterion, but in general it gives you a decent idea.

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

As far as I know, BlastFusion does not have a command to generate wireload
models.  Not sure how the user in this particular case could generate
wireload models from Magma.  I assume they could write a script to get the
parasitic numbers from the database and format it in a way to resemble
wireload models.  From my experience with Magma, Magma does not recommend
using any wireload model.

In addition, 20% margin seems to be very high.
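
For scale, the arithmetic behind that margin, using only the 138 MHz and 20%
figures quoted above (the function name is just for illustration):

```python
def setup_margin_ns(freq_mhz, margin_fraction):
    """Return (clock period, setup margin) in ns for a given clock and margin."""
    period_ns = 1000.0 / freq_mhz
    return period_ns, period_ns * margin_fraction

period, margin = setup_margin_ns(138.0, 0.20)
print(f"period = {period:.3f} ns, margin = {margin:.3f} ns")
# prints: period = 7.246 ns, margin = 1.449 ns
```

That is roughly 1.45 ns of a 7.25 ns cycle given away before placement.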

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA


> 6) Clock trees were too long.  At times Magma inserted up to 50 layers of
>    buffering.  After a few tries, we were able to get the # down to no
>    better than 33 for networks w/ a flop count of 30K+.  It was important
>    to us to keep a few of the clock networks very short.  In one of the
>    physical hierarchies (150 K instances, a half dozen memories) one clock
>    network contained ~2500 FF's.  Our hope was an insertion delay of
>    0.8 nsec, but we settled for 1.8 nsec with 13 layers.

From: [ Blues Clues ]

This has been improved VASTLY in 4.0.  Their clocktree code was completely
rewritten and we saw our clocktree sizes drop in half.   This is obviously a
good thing.  They also added a lot of good commands to handle balancing
clocks, gated clocks, etc.  Most of the manual hand-tweaking we've always
had to do on same-block clocks has gone away.  We usually wait before
jumping to new tool versions, but this was such an important enhancement
that we jumped to 4.0 very quickly.  We even did this on a very
time-sensitive project, just a few weeks before tapeout, as the power
reduction we would achieve was quite drastic.

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

I cannot comment on this as I have not seen anything like this before in
Magma.  As you know, clock implementation is one of the most important steps
in the design.  It requires some planning to make sure clock tree
implementation satisfies the design requirements.  Sometimes the floorplan
can adversely impact the clock tree.  A few flops pulled farther away can
cause increased depth in the clock tree.
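
A rough way to see how much of the reported depth is due to distance rather
than flop count: the sketch below computes the fewest driver levels needed
purely from fanout.  The fanout limit of 16 is an assumption for
illustration, not a number from the original post or from Magma.

```python
def min_buffer_levels(num_sinks, max_fanout):
    """Fewest driver levels (root included) needed to reach num_sinks
    flops when every driver feeds at most max_fanout loads."""
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need at least one sink and a fanout of 2 or more")
    levels, reach = 0, 1
    while reach < num_sinks:
        reach *= max_fanout
        levels += 1
    return levels

# Flop counts quoted in the original complaint; fanout 16 is assumed.
print(min_buffer_levels(2500, 16))    # ~2500 FF network
print(min_buffer_levels(30000, 16))   # 30K+ flop network
```

Under that assumption the fanout-only bound is 3 levels for ~2500 flops and
4 levels for 30K flops, so the reported 13 and 33 levels point toward wire
length and placement spread, not sink count, as the likely cause.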

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA


> 7) According to Magma, there was a known top level routing problem called
>    "glass box."  This had to do with the lack of timing visibility of the
>    physical sub-hierarchies.  We ended up routing the top level w/o timing
>    by relying on PrimeTime post-process.

From: [ Blues Clues ]

"Glass Box" isn't a routing problem, it's a model, similar to an ILM model,
FRAM view, abstract, etc.  It's basically a generic data abstraction concept
Magma has to keep the information you need, to speed up your runs.  An
example would be at the top of your chip, you create a "timing glass box"
which will keep all of your inter-block logic (but no internal flop-to-flop
logic), so you can time and optimize only the top level paths, when you run
the top level.  I don't have a ton of experience with these, but they seem
to be very flexible compared to many past attempts at data abstraction.  If
there was a "glass box" problem, I'm guessing it's that the model is so
open-ended that maybe the person was having a problem capturing the proper
data in the model to get the timing to correlate with PrimeTime.  Again, I
haven't had a ton of experience with these, so I can't comment further.

As a side note on this: one of our frontend team's biggest complaints with
Magma is that we find correlation issues with PrimeTime.  (In many cases, it
turns out, PrimeTime is the one that is wrong.  Some of the bugs we've
discovered in PrimeTime via Magma are pretty scary!)  The ironic thing is
that Magma actually correlates very well to PrimeTime/STAR-RCXT for us, and
it's only because of this that we can have as part of our methodology a
requirement of running a correlation check (pre- & post- layout) on every
iteration.  Apollo and even PhysOpt correlated so poorly with PrimeTime, in
many cases, that it would be impossible to run such a correlation check,
because it would take days to track down every error, clean things up, and
file bug reports with Synopsys.
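
A minimal sketch of what such a per-iteration correlation check might look
like, assuming endpoint slacks from each tool have already been parsed into
dictionaries.  The endpoint names, the 0.05 ns tolerance, and the function
itself are hypothetical and are not taken from our flow or from either tool:

```python
def correlation_mismatches(ref_slack, test_slack, tol_ns=0.05):
    """Return endpoints whose slack differs by more than tol_ns between two
    timers, plus endpoints that appear in only one of the two reports."""
    mismatches = {}
    for ep in sorted(set(ref_slack) | set(test_slack)):
        r = ref_slack.get(ep)
        t = test_slack.get(ep)
        if r is None or t is None or abs(r - t) > tol_ns:
            mismatches[ep] = (r, t)
    return mismatches

# Hypothetical endpoint slacks in ns from the signoff timer and the P&R tool.
signoff = {"u_core/reg_a/D": 0.120, "u_core/reg_b/D": -0.030,
           "u_io/reg_c/D": 0.400}
pnr = {"u_core/reg_a/D": 0.110, "u_core/reg_b/D": 0.150,
       "u_io/reg_d/D": 0.200}

bad = correlation_mismatches(signoff, pnr)
for ep, (r, t) in bad.items():
    print(ep, "signoff:", r, "pnr:", t)
```

The check is only practical when the list of mismatches stays short, which
is the point made above about poorly correlating tools.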

    - [ Blues Clues ]


From: Gabriel Ho <gho=user  domain=fastrack-design spot calm>

"Glass box" in Magma is an abstraction technology to reduce the data size 
at the top level in a hierarchical design without sacrificing the accuracy
of crosstalk, noise, antenna, timing, etc.  I have done 5 hierarchical
designs that used glassbox and never thought it was a "problem".  Quite the
contrary, once you use it, you've got to love it.  It streamlines the top
level integration.

    - Gabriel Ho
      Fastrack Design, Inc.                      San Jose, CA

         ----    ----    ----    ----    ----    ----   ----

From: Koko Mihan <kmihan=user  domain=icinergy spot calm>

Hi, John,

In ESNUG 419 #1, there was a person who had a bad experience with Magma and
was looking for virtual prototyping as a way to solve his back end problems.
I would be very interested in interviewing this person.  Is it possible for
you to pass on my information to him/her?  If this is against your policy,
I understand.

    - Koko Mihan
      Icinergy Software Company                  Kanata, Canada


 Editor's Note: Sorry, Koko, anon means anon.  Over the years I've gotten some
 interesting phone calls from EDA vendors trying to weasel out who or what
 company an anon letter came from.  I tell them all "anon means anon", too.  - John



Copyright 1991-2024 John Cooley.  All Rights Reserved.

   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)