( ESNUG 421 Item 7 ) -------------------------------------------- [12/10/03]

Subject: ( ESNUG 419 #1 ) A Monterey User Speaks Up About Magma Complaints

>1) Magma could end up with long run times if your constraints weren't
>   fairly mature.  The basic constraints (clocks, false paths,
>   multicycles) were almost 100% fine tuned, but the I/O constraints
>   were poor as the physical hierarchies were being defined at the
>   floorplanning stage.  We got into a mode of garbage-in/garbage-out
>   early on without gaining much in terms of effective floorplanning
>   or timing closure.


From: Tim Lantz <tim=user  domain=taunetworks spot calm>

Hi John,

Here is a response to ESNUG 419 #1 from a Monterey user's perspective.

I would expect this of any tool that runs timing-based algorithms.  The tool
only understands the constraints you give it.  If your timing constraints are
incorrect then the tool will incorrectly optimize the design.

The best way to avoid long run times due to incorrect constraints is to run
prototyping.  Monterey's Sonar can quickly analyze timing, congestion,
clock, and power.  You can further decrease runtime by adjusting the effort
settings and turning off high performance physical synthesis.  Once the
constraints have been verified and the tool is working on the correct timing
paths, then rerun the design with the required effort settings.  It is
typical to iterate through this process until the constraints are correct.

This trick also works for DC.  Set map effort to low and area optimization
off during timing constraints verification.  This is critical to increase
the efficiency of chip level builds.

Another consideration is how the tool handles critical range and clock
domains.  DC treats each clock domain separately and reduces the worst
negative slack for each independently.  It relies only on the critical
range to optimize the top paths within a clock domain.  Physical design
tools have a different perspective.  They tend to optimize only the top
critical paths irrespective of the clock domain.  Therefore a -2ns path in
one clock domain will prevent optimization of a -1ns path in a separate
clock domain.  So, if your IO constraints are incorrect, the tool will
spend all its time optimizing the IOs and no time on the flop to flop
paths, even if they are on separate clock domains!
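
To make the difference concrete, here is a toy sketch in Python.  It is not
how DC or any physical design tool actually selects paths; the path names,
slacks, and the two selection functions are all made up for illustration.
It only shows how a single global "worst slack plus critical range" window
can leave a -1ns flop to flop path untouched while badly constrained IO paths
sit at -2ns in another clock domain.

```python
from collections import defaultdict

# Hypothetical timing report: (path name, clock domain, slack in ns).
PATHS = [
    ("io_in_a",  "io_clk",   -2.0),
    ("io_in_b",  "io_clk",   -1.8),
    ("io_out_c", "io_clk",   -1.6),
    ("ff2ff_x",  "core_clk", -1.0),
    ("ff2ff_y",  "core_clk", -0.4),
]


def per_domain_targets(paths, critical_range):
    """Per-domain view: in each clock domain, pick every violating path
    whose slack is within critical_range of that domain's worst slack."""
    by_domain = defaultdict(list)
    for name, domain, slack in paths:
        by_domain[domain].append((name, slack))
    targets = []
    for members in by_domain.values():
        worst = min(slack for _, slack in members)
        targets += [n for n, s in members
                    if s < 0 and s <= worst + critical_range]
    return sorted(targets)


def global_targets(paths, critical_range):
    """Global view: pick violating paths within critical_range of the
    single worst slack in the design, ignoring clock domains."""
    worst = min(slack for _, _, slack in paths)
    return sorted(n for n, _, s in paths
                  if s < 0 and s <= worst + critical_range)


if __name__ == "__main__":
    print("per domain:", per_domain_targets(PATHS, 0.5))
    print("global    :", global_targets(PATHS, 0.5))
```

With a 0.5ns range, the per-domain selection includes ff2ff_x alongside the
IO paths, while the global selection contains only the three IO paths.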

On large designs, Monterey offers time budgeting to assign IO port delays.
I have not used it, so I cannot make any claims about its effectiveness.


>2) Magma didn't support asynch constraints such as max_delay or min_delay.
>   It was a hit and miss routine that required an eventual timing ECO
>   and caused a compromise in timing spec.

Dolphin does support max_delay and min_delay.  I used this technique
on my last chip to get around the limitation of the timer only propagating
one clock through a clock gator.  Depending on the mode of our chip, a
memory could be accessed by multiple asynchronous clock domains.  Dolphin
will only time one clock through the clock gator so max delays were used
to control the other clock domains.
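
As a rough sketch of the bookkeeping involved, the Python below emits one
SDC set_max_delay line per endpoint for every clock that the timer does not
propagate through the gating cell.  The clock names, pin names, and the
choice of one clock period as the delay budget are assumptions made for
illustration; they are not the values or scripts used on the chip described
above.

```python
# Hypothetical data: endpoints reached by each clock, and clock periods (ns).
ENDPOINTS_BY_CLOCK = {
    "clk_a": ["mem0/wr_reg/D"],
    "clk_b": ["mem0/rd_reg/D", "mem0/addr_reg/D"],
}
PERIOD_NS = {"clk_a": 5.0, "clk_b": 8.0}


def max_delay_constraints(endpoints_by_clock, period_ns, timed_clock):
    """Return set_max_delay lines for every clock other than timed_clock.

    Assumption: each unpropagated clock is budgeted one full period.
    """
    lines = []
    for clock in sorted(endpoints_by_clock):
        if clock == timed_clock:
            continue  # the timer already propagates this clock
        for pin in endpoints_by_clock[clock]:
            lines.append(
                f"set_max_delay {period_ns[clock]:.3f} -to [get_pins {{{pin}}}]"
            )
    return lines


if __name__ == "__main__":
    for line in max_delay_constraints(ENDPOINTS_BY_CLOCK, PERIOD_NS, "clk_a"):
        print(line)
```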


>3) We were told that Magma needed to be let loose to resynthesize logic.
>   But the choice was there to "force keep" selected instances.  However,
>   if the list was extensive, then it would be very hard to meet
>   timing.  One could put a "force keep" on an instance but not on a net.
>
>   Once we saw our adders get replaced by XOR's, which were a poor choice
>   in terms of area or timing.  I am not sure how, but that problem was
>   resolved by a workaround.

In my opinion physical design tools return the best results when given
the most freedom.  Dolphin supports dont_touch, which should be used
carefully because it does limit optimization.  I have not had any reason to
use a dont_touch on any logic other than clock logic.


>4) Magma, by default, flattened the logical hierarchy.  The user would have
>   to define what layers to remain intact.  Again we were told that the
>   fewer layers, the better to meet timing.  This turned out to be a major
>   problem later in our design cycle, as functional ECO's were very
>   difficult to implement.  To state the obvious, our ports within logical
>   layers go away as the hierarchies collapse.

This problem has plagued the physical design world for decades.  Dolphin
handles it pretty well, but it isn't perfect.  There is no option to inhibit
optimization across a logical boundary other than a dont_touch on the
individual cells.  Our methodology allows optimization across logical
hierarchy because the physical design hierarchy is a subset of our
verification hierarchy.  Dolphin doesn't remove ports or reuse ports, but it
does move logic across the logical boundary and creates many new ports.
This does make post route gate level debug difficult.  Most of our
verification is run on the RTL and the gate level netlist from synthesis.

The biggest problem with optimization across hierarchical boundaries is the
application of timing constraints.  Our design had many false path and
multicycle exceptions specified with -through on hierarchical ports.  Dolphin
will honor these correctly; however, after logic optimization the hierarchical
port may no longer be the correct location to apply the constraint.  Using
PrimeTime, we converted the hierarchical port constraints to leaf cell
constraints and applied them during P&R.  Dolphin will only size cells that
are timing constraint points, so our timing constraints applied correctly
both pre and post route.
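
The conversion step can be sketched in Python as follows.  This is only an
illustration of the idea, not the PrimeTime flow itself: the mapping from
hierarchical pin to leaf cell pins is assumed to have been produced already
by a timing-tool query, and every instance and pin name here is hypothetical.

```python
# Hypothetical exceptions: (command, hierarchical pin used with -through).
EXCEPTIONS = [
    ("set_false_path", "u_core/u_dma/req"),
    ("set_multicycle_path 2", "u_core/u_cfg/sel"),
]

# Hypothetical mapping: hierarchical pin -> leaf cell pins that drive it.
PORT_TO_LEAF = {
    "u_core/u_dma/req": ["u_core/u_dma/req_reg/Q"],
}


def retarget(exceptions, port_to_leaf):
    """Rewrite each exception so -through points at leaf cell pins.

    Pins without a known mapping are passed through unchanged so that
    nothing is silently dropped.
    """
    lines = []
    for command, pin in exceptions:
        leaves = port_to_leaf.get(pin, [pin])
        pin_list = " ".join(leaves)
        lines.append(f"{command} -through [get_pins {{{pin_list}}}]")
    return lines


if __name__ == "__main__":
    for line in retarget(EXCEPTIONS, PORT_TO_LEAF):
        print(line)
```

Because a leaf cell pin survives cross-boundary optimization as long as the
cell itself is kept, the rewritten exception keeps pointing at the intended
path even after the hierarchical port has moved.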


>5) We fed wireload models generated by Magma from previous runs into DC,
>   & used a setup margin upto 20% of clock period (fastest clock 138 MHz),
>   only to lose it and some more in later Magma runs!  That caused several
>   iterations of the same block by trying different block floorplans.

Wire load models are a waste of time.  There are many white papers available
that explain this and the options available to you.


>6) Clock trees were too long.  At times Magma inserted upto 50 layers of
>   buffering.  After a few tries, we were able to get the # down to no
>   better than 33 for networks with a flop count of 30K+.  It was important
>   to us to keep a few of the clock networks very short.  In one of the
>   physical hierarchies (150 K instances, a half dozen memories) one clock
>   network contained ~2500 FF's.  Our hope was an insertion delay of
>   0.8 nsec, but we settled for 1.8 nsec with 13 layers.

Dolphin's CTS capabilities are the best I have seen.  Our last chip had an
amazingly complex clock structure.  Due to legacy and poor design practice
on inherited IP, our chip contained 65 clock domains of which 36 were
generated.  This does not include the many gated versions of each clock.
It took us a month to generate a clock diagram!  Dolphin was able to balance
the required clock trees through gators, tristate buffers, and clock
generators with minimal insertion delays.


>7) According to Magma, there was a known top level routing problem called
>   "glass box."  This had to do with the lack of timing visibility of the
>   physical sub-hierarchies.  We ended up routing the top level w/o timing
>   by relying on PrimeTime post-process.

On our large designs, we generate a .lib model and a LEF model for the
lower level blocks.  These are then used at the top level to model timing
and physical characteristics.  However, this gives little visibility into
the block's real structure.  A "glass box" model replaces the .lib/LEF with
detailed models that show the physical placement and full timing to the
first sequential object.  It is similar to an ILM with the addition of
physical information.
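
The "to the first sequential object" idea can be sketched as a graph walk.
The Python below is a simplified illustration, not any vendor's glass box or
ILM algorithm: the netlist is a made-up fanout dictionary, and only the input
side is shown (the output side would be the same walk backwards from the
output ports).

```python
from collections import deque

# Hypothetical block netlist: node -> cells it drives.
FANOUT = {
    "in_a": ["buf1"],
    "buf1": ["reg1"],
    "reg1": ["and2"],   # logic behind the first register is internal
    "and2": ["reg2"],
}
SEQUENTIAL = {"reg1", "reg2"}


def interface_cells(fanout, sequential, input_ports):
    """Collect cells reachable from the input ports, stopping at (and
    including) the first sequential cell on each path."""
    kept = set()
    seen = set(input_ports)
    queue = deque(input_ports)
    while queue:
        node = queue.popleft()
        for cell in fanout.get(node, []):
            if cell in seen:
                continue
            seen.add(cell)
            kept.add(cell)
            if cell not in sequential:
                queue.append(cell)  # keep walking through combinational logic
    return kept


if __name__ == "__main__":
    print(sorted(interface_cells(FANOUT, SEQUENTIAL, ["in_a"])))
```

Only buf1 and reg1 are kept here; and2 and reg2 sit behind the first register
and would not be visible to the top level timer in this kind of model.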

Monterey is working on ILM modeling today, but I have yet to use it.  I have
had great correlation between Dolphin's timing and Simplex extraction into
PrimeTime.

    - Tim Lantz
      Tau Networks                               Scotts Valley, CA

