

   Editor's Note: As expected, DAC had a serious drop in attendance:

                                2001    2002   Drop
                               -----   -----   ----
                   Attendees   5,955   4,347   -27%
                   Vendors     8,126   5,180   -36%
                   --------   ------   -----   ----
                   Total      14,081   9,527   -32%

   Interesting.  Voting with their feet, the EDA vendors had anticipated
   a 36% drop.  But there was only a 27% drop in attendance.  DAC turned
   out to be better than the EDA vendors had expected!

                                              - John Cooley
                                                the ESNUG guy
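
The percentages in the table can be re-derived from the head counts.  This
is only a minimal sketch of the arithmetic; the function and variable names
are illustrative and not part of the original note.

```python
def percent_drop(before: int, after: int) -> float:
    """Return the percentage decrease going from `before` to `after`."""
    return (before - after) / before * 100.0

# Head counts taken from the table above: (2001, 2002)
dac_counts = {
    "Attendees": (5955, 4347),
    "Vendors": (8126, 5180),
}

# The Total row is simply the column-wise sum of the two rows above it.
dac_counts["Total"] = tuple(sum(col) for col in zip(*dac_counts.values()))

for label, (y2001, y2002) in dac_counts.items():
    drop = percent_drop(y2001, y2002)
    print(f"{label:<10} {y2001:>7,} {y2002:>7,} {-drop:>5.0f}%")
```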

( ESNUG 395 Subjects ) ------------------------------------------- [06/26/02]

 Item  1: Magma Call-For-Papers & A Call For Magma Discussion In ESNUG
 Item  2: We've Had Some Really Bad Experiences with Cadence's pwrAnalysis
 Item  3: A Small Mob Brutally Pummels Cooley For Writing About PhysOpt-MPC
 Item  4: DC's Renaming Of Nets Troublesome With Verilog Gate Simulations
 Item  5: Should DC Buffer Up 40 Fanout Nets Or Should Backend Tools Do It?
 Item  6: What About Rectilinear Block Designs In PhysOpt/Jupiter Flows?
 Item  7: I Got Burned On Taxes Buying Ambit CD Instead Of Downloading It
 Item  8: ( ESNUG 393 #2 ) 12% On-Chip Timing Variations & IBM's EinsTimer
 Item  9: Any Commercial VHDL <-> Verilog Or Vera <-> Verisity Translators?
 Item 10: New Cadence User Asks "So What's A Cadence Flow That Works Now?"
 Item 11: A Chip Designer Seeking IP For DDR/SDRAM Interfacing And MMU's
 Item 12: ( SNUG 02 #7 ) SystemC Sucks -- Stop Comparing Superlog & SystemC
 Item 13: ( ESNUG 389 #8 ) Happy With Formality 2002.03 vs. Old Formality
 Item 14: Rational's Purify Can't Find Our Co-simulation Bug.  Any Ideas?

 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com


( ESNUG 395 Item 1 ) --------------------------------------------- [06/26/02]

From: Pallab Chatterjee <pallabc@siliconmap.net>
Subject: Magma Call-For-Papers & A Call For Magma Discussion In ESNUG

Hi, John,

I'm the president of the newly formed Magma User Group, FUSION.  There's
now a fairly large community of us who would like to interchange technical
ideas and comments.  As such, I am requesting current Magma Users and
those with questions about Magma air their technical comments/questions
in ESNUG.  Our inaugural event is the user group conference in Sept and I
would like to announce the Call for Papers here.  Magma users are invited
to submit technical papers discussing their experience in design methods,
conversion from other tools and interoperability with multi-vendor tools. 

         Inaugural Fusion Conference (FUSION 2002)
         September 19-20, 2002
         Westin Hotel
         Great America Parkway, Santa Clara, CA

Your conference registration is free if your paper is published.  Register
for the Inaugural Fusion Event at http://www.magma-da.com/fusion  The call
for papers is at http://www.magma-da.com/CallForPapers.html

    - Pallab Chatterjee
      SiliconMap, LLC


 [ Editor's Note: Since I encourage Avanti & Cadence discussion in ESNUG,
   I'd obviously love more user discussion of Magma in ESNUG, too.  - John ]


( ESNUG 395 Item 2 ) --------------------------------------------- [06/26/02]

From: Rob Stalker <rstalker@xtremespectrum.com>
Subject: We've Had Some Really Bad Experiences with Cadence's pwrAnalysis

Hi John,

I have an issue that perhaps your ESNUG readers could help me with...  Does
anybody else use Cadence's power analysis/IR drop tool, "pwrAnalysis" ?

We've been trying to use it here, but have been running into brick walls.
The bugs that we've uncovered have seemed so elementary, so basic, that it
makes me wonder just how many other chip companies are using this tool.
I'm suspicious that we're the only ones.

What we're stuck on at the moment is that the tool can't handle VCD toggle
files with busses.  Everything has to be scalar.  Yes, I know, you're
thinking "what real-world design doesn't use busses?"  Well, whatever
design that Cadence uses to regression test their code apparently doesn't.

The VCD parser appears to assign a 0 toggle rate to any busses -- which,
of course, doesn't exactly enhance the tool's accuracy.
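
For reference, here is a rough sketch of what per-bit toggle counting on a
VCD file with busses could look like.  It is NOT Cadence code and says
nothing about how pwrAnalysis works internally; the function name
count_toggles() and the bit-numbering convention (bit 0 is the rightmost
character of the dumped vector) are assumptions made for illustration.  It
could serve as an independent cross-check on whatever toggle counts a tool
reports for bus signals.

```python
from collections import defaultdict

def count_toggles(vcd_text):
    """Count 0<->1 transitions per bit for every signal in a VCD dump.

    Returns {"name" or "name[bit]": toggle_count}.  Assumption: bit 0 is the
    rightmost character of a dumped vector.  Transitions to or from x/z are
    not counted as toggles.
    """
    widths, names, last = {}, {}, {}   # all keyed by VCD identifier code
    toggles = defaultdict(int)

    def record(code, value):
        if code not in widths:         # not a declared signal; ignore
            return
        width = widths[code]
        # VCD left-extends short vectors with x/z if that is the leading
        # character, otherwise with 0.
        pad = value[0] if value[0] in "xz" else "0"
        value = value.rjust(width, pad)
        prev = last.get(code)
        if prev is not None:
            for i, (old, new) in enumerate(zip(prev, value)):
                if {old, new} == {"0", "1"}:
                    name = names[code]
                    key = name if width == 1 else f"{name}[{width - 1 - i}]"
                    toggles[key] += 1
        last[code] = value

    for line in vcd_text.splitlines():
        tok = line.split()
        if not tok:
            continue
        if tok[0] == "$var" and len(tok) >= 5:
            # $var <type> <width> <code> <name> [range] $end
            widths[tok[3]] = int(tok[2])
            names[tok[3]] = tok[4]
        elif tok[0][0] in "bB" and len(tok) == 2 and len(tok[0]) > 1:
            record(tok[1], tok[0][1:].lower())      # vector: b1010 <code>
        elif tok[0][0] in "01xzXZ" and len(tok) == 1 and len(tok[0]) > 1:
            record(tok[0][1:], tok[0][0].lower())   # scalar: 1<code>
    return dict(toggles)
```

This sketch ignores scopes, real-valued variables and multiple value changes
per line, so treat it as a starting point rather than a complete VCD reader.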

I have a tech support case open with Cadence, but it's been fermenting for
2 months now.  So, I'm wondering if anybody out there has gotten this tool
to work and if they have any tricks/tips/workarounds that they could share.

    - Rob Stalker
      XtremeSpectrum, Inc.                       Vienna, VA


( ESNUG 395 Item 3 ) --------------------------------------------- [06/26/02]

Subject: A Small Mob Brutally Pummels Cooley For Writing About PhysOpt-MPC

> That's why it was interesting that Synopsys, at the same SNUG, let users
> see its experimental PhysOpt-MPC (minimum physical constraints) tool.
> Basically, MPC is a prototyper for RTL jocks stumbling around in PhysOpt.
> MPC assumes a default floorplan for your block.  Your aspect ratio is
> 0.8, cell utilization is 65 percent, origin is at 0,0, corner keepouts are
> 100 um, IO margins are 20 um, and the easiest pin placement is assumed.
>
> Using PhysOpt-MPC this way, an RTL jock responsible for a 300 K gate block
> in a 5 M gate design doesn't have to muck around with DEF or PDEF 3.0
> floorplans just to get a feel for how his block's timing roughly works.
> The quick & dirty PhysOpt-MPC runs let you know if you have a few critical
> paths and what they are.  Or it says that 75 percent of your paths are
> critical and you have a serious architectural problem with that block.
>
> And by tweaking the MPC defaults, an RTL designer can also find the rough
> trade-offs in metrics like timing vs. utilization vs. aspect ratios.  He
> can see how a 2:1 aspect ratio with 50% utilization may be his fastest
> design -- while at 1:1 and 60 percent, the block has the least congestion.
> In a nutshell, MPC gives you early physical feedback on your RTL blocks.
>
>     - from "Gary's Weird Prediction"
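
To make the "default floorplan" idea concrete, the sketch below turns a
total standard-cell area into block dimensions from a utilization and an
aspect ratio.  It is an illustration only, not how PhysOpt-MPC (or any other
tool) actually computes its floorplan.  It assumes the aspect ratio means
height / width, which the column does not state, and it ignores the corner
keepouts and IO margins.

```python
import math

def default_block_size(cell_area_um2, utilization=0.65, aspect_ratio=0.8):
    """Return (width_um, height_um) of a block for a given cell area.

    Assumptions (not from the column): aspect_ratio = height / width, and
    keepouts / IO margins are ignored.
    """
    core_area = cell_area_um2 / utilization      # area needed at this utilization
    width = math.sqrt(core_area / aspect_ratio)  # since height = ratio * width
    height = aspect_ratio * width
    return width, height

# Hypothetical block with 650,000 um^2 of standard cells, column's defaults.
w, h = default_block_size(650_000)
print(f"default floorplan: {w:.0f} um x {h:.0f} um")

# Trying the trade-offs mentioned above: 2:1 at 50% versus 1:1 at 60%.
for ratio, util in ((2.0, 0.50), (1.0, 0.60)):
    w, h = default_block_size(650_000, utilization=util, aspect_ratio=ratio)
    print(f"ratio {ratio}, utilization {util:.0%}: {w:.0f} um x {h:.0f} um")
```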


From: Milan Lazich <milan.lazich@Magma-DA.COM>
To: John Cooley <jcooley@world.std.com>

John,

How come you didn't mention Magma?  We made a product announcement in this
area a month before Synopsys did.  We invited you -- and Gary Smith was
one of the speakers.

    - Milan Lazich
      Magma

         ----    ----    ----    ----    ----    ----   ----

From: John Cooley <jcooley@TheWorld.com>
To: Milan Lazich <milan.lazich@Magma-DA.COM>

Milan, I have two very good reasons:

   1.) I didn't have space in a 400 word column.  I also didn't mention
       Silicon Perspectives, either.

   2.) I honestly didn't know you had anything in that space.  I wasn't
       at your press conference, Milan.

Sorry about that.

    - John Cooley

         ----    ----    ----    ----    ----    ----   ----

From: Milan Lazich <milan.lazich@Magma-DA.COM>
To: John Cooley <jcooley@TheWorld.com>

John,

You may have two reasons, but they're not "very good" -- the first is
irrelevant, the second specious.

I personally invited you to the event and you declined.  I guess you also
didn't see Mike Santarini's coverage of the announcement.  Try reading
"EE Times" some time -- it's a pretty good rag.

    - Milan Lazich
      Magma

         ----    ----    ----    ----    ----    ----   ----

From: John Cooley <jcooley@TheWorld.com>
To: Milan Lazich <milan.lazich@Magma-DA.COM>

Milan, at the time that you invited me (and when I wrote that column), I was
buried deep in getting that massive SNUG'02 Trip Report out plus helping a
client with an EDA problem that I can't talk about.

I was lucky to make it to bed by 2 AM those days.  Again, sorry.

    - John Cooley

         ----    ----    ----    ----    ----    ----   ----

From: Jack Erickson <erickson@cadence.com>

Hi John,

We've had this in PKS since our initial release.  We highlighted it during
our demo at DAC.  If we get a chance to elaborate on this capability, here
is a good explanation from Craig Crump, one of our PKS experts:

  "You run PKS starting from RTL with the default 1x1 aspect ratio and 
   80% utilization.  You can even let PKS auto place the pins for you.
   During the initialization of the block PKS will keep the bus pins
   grouped.  Using this default floorplan you can get an idea if the
   block is anywhere close to making timing.

   Once you have decent results, on the early RTL passes, you can give the 
   gates to the physical design specialist to come up with the floorplan.
   This is better than floorplanning on estimates as there are gates with
   some hope of decent physical timing.

   As the RTL continues to refine, the physical design specialist can give 
   information that is useful in future passes of PKS.  Starting with info
   about the aspect ratio & utilization, and continuing with pin placements
   up to full DEF or PDEF floorplan specification.

   In this way the spins that you make through synthesis to refine the 
   design can get more and more refined floorplans input to them, and move
   toward timing and design closure at the same time."

I'll be interested to see the rest of the discussion on this.

    - Jack Erickson
      Cadence

         ----    ----    ----    ----    ----    ----   ----

From: Glenn Gullikson <glenng@cadence.com>

John, Jack,

Another point to note is that PKS supports "dynamic floorplanning".  The user
can set parameters like block halos, pin side assignments, power stripe
rules, etc. at the RTL level.  They get applied to the design at the point
they are needed; i.e. the pins are placed and power stripe rules are applied
after the bounding box for the block/die is calculated (since the bounding
box depends on the area of the design after preplacement optimization given
it is derived from the utilization), block halos are applied automatically
after block placement completes, rows are also automatically generated, etc.

One final point is that PKS allows the option to "grow" the block size
during post-placement optimization w/ a rules based approach to "anchoring"
macros to sides or other macros such that they can be fixed or allowed to
move in groups.  Pins, power stripes, block halos, rows, etc. are all
automatically updated based on the expanded block boundary.

   - Glenn Gullikson
     Cadence                                     San Jose, CA

         ----    ----    ----    ----    ----    ----   ----

From: Dave Reed <dave@mondes.com>

Hi John,

I'm afraid you're more than a little behind the times in the area of
silicon virtual prototyping.  Our product was first used by a customer on a
production design in December 2000.  Anyone who has been looking at this
issue seriously can quickly see the naivete of the PhysOpt-MPC approach that
you were touting in your column.  Floorplanning decisions have a profound
effect on the physical design process -- subtle floorplanning changes can
make huge differences in the final timing and area of a block.  The idea
that a "default" floorplan can somehow yield useful physical information
is completely wrong.

Today's real world prototyping tools allow designers to begin work on a
design long before a final gate-level netlist is available.  Rather than
producing a pseudo-floorplan, these tools allow users to benefit from the
best available information as they refine their designs.

It may be tempting to wish away the challenges of physical design, but to
imagine that you can replace physical domain knowledge with an approach as
trivial as the PhysOpt-MPC you described in your column widely misses
the mark.

    - Dave Reed
      Monterey Design Systems

         ----    ----    ----    ----    ----    ----   ----

From: Jean-Marc Calvez <jean-marc.calvez@st.com>

Hi, John,

We've had access to PhysOpt-MPC here.  I know that it is extremely popular
with front-end designers who positively can't stand to be waiting (a few
weeks sometimes) for a floorplan from the back-end person.  These users now
consider PhysOpt-MPC to be the best thing since sliced bread.  They all
wanted to try it when they learned of the feature.

However, in the only test I participated in, MPC was quite disappointing.
The context was a highly congested 0.18 standard cell block (no hard macro)
that had to be ported to 0.13, with an increase in clock frequency.  PhysOpt
barely managed to meet timing in 0.18, and the occupation ratio was in the
low 90% range.  I wrote a small program to convert a 0.18 um DEF floorplan
to 0.13 (basically keeping the same number of placement sites, placing the
ports at the same relative positions), and I tried that floorplan against
MPC with similar directives (aspect ratio and total area were the only
directives specified).
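
The conversion program itself is not shown, so the following is only a guess
at the idea: keep the same number of placement sites and keep every port at
the same relative position.  The Floorplan class, the scale_floorplan()
helper and the site dimensions are all hypothetical; a real converter would
read and write DEF and snap coordinates to the target library's grid.

```python
from dataclasses import dataclass

@dataclass
class Floorplan:
    width: float    # block width in um
    height: float   # block height in um
    ports: dict     # port name -> (x, y) in um

def scale_floorplan(fp, old_site, new_site):
    """Scale a floorplan so the number of placement sites is preserved.

    old_site / new_site are (site_width, row_height) in um.  Scaling both the
    outline and the ports by the same factors keeps each port at the same
    relative position on the block boundary.
    """
    sx = new_site[0] / old_site[0]
    sy = new_site[1] / old_site[1]
    ports = {name: (x * sx, y * sy) for name, (x, y) in fp.ports.items()}
    return Floorplan(fp.width * sx, fp.height * sy, ports)

# Hypothetical numbers: the site dimensions below are made up for the example.
old_fp = Floorplan(600.0, 500.0, {"clk": (0.0, 250.0), "dout[0]": (600.0, 125.0)})
new_fp = scale_floorplan(old_fp, old_site=(0.6, 5.0), new_site=(0.4, 4.0))
print(new_fp)
```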

Well, my "scaled" floorplan met timing at the increased frequency and
ended up with an occupation ratio of 60%.  (Woohoo! Look Ma, no congestion!)
The MPC-driven physical synthesis was much more disappointing, with timing
violations all over the place (WNS about 2.5ns, TNS around 400ns).  We
tentatively concluded that pre-placement of ports was really a crucial point
that shouldn't be overlooked, and our Synopsys FAE reemphasized the point.

Further experiments with various PhysOpt-MPC options yielded a no-violation
design (almost, 30ps WNS) but the area was about 20% larger than with the
scaled floorplan.

So I am not sure that MPC will be that efficient in helping the frontend
designer figure out the timing issues early in his design.  Sure, you get
results; whether they are significant or not is anyone's guess (if it
weren't for the scaled floorplan that met timing, just by looking at the
results from MPC, we'd surely consider ourselves to be in trouble as far as
timing closure was concerned).

As always, caveat emptor: there certainly is a place for MPC in a flow
but caution should be exercised (specifying *no* option is probably a very
bad idea, and the more information one provides, the better).

No silver bullet, I'm afraid.

    - Jean-Marc Calvez
      STMicroelectronics                         Grenoble, France

         ----    ----    ----    ----    ----    ----   ----

From: Nir Sever <nir@zoran.co.il>

Hi John,

I'm impressed by Synopsys marketing.  They can take the most trivial thing:
create a DEF file with a given size and random pin location and call it
"PhysOpt-MPC".  They are probably going to charge for it too.  Amazing.
This is an NCG task for a week.  Every P&R tool and floor plan editor has
had this for years.  Amazing.

    - Nir Sever
      Zoran                                      Israel

         ----    ----    ----    ----    ----    ----   ----

From: James Tyson <jtyson@altera.com>

Hi, John,

We've been doing exactly this kind of thing with PKS for a while.  It gives
us a better idea of area and speed for modules, without having to put any
effort into floorplanning, etc.

I'm surprised that this is only now a feature of PhysOpt.

    - James Tyson
      Altera European Technology Centre          High Wycombe, Bucks, UK


( ESNUG 395 Item 4 ) --------------------------------------------- [06/26/02]

From: Himanshu Bhatnagar <himanshu.bhatnagar@conexant.com>
Subject: DC's Renaming Of Nets Troublesome With Verilog Gate Simulations

Hi John,

We are running into problems with respect to gate level sims.  The Verilog
netlist produced by Design Compiler optimizes the net names and they become
"n1234" or something like that.  Our testbenches probe the internal nets and
therefore it becomes increasingly difficult to map the RTL signal names to
gate level signal names.  I was wondering if there is a variable in Synopsys
that controls the outcome of the signal names?  In other words, how can I
preserve the RTL signal names in DC netlists?

    - Himanshu Bhatnagar
      Conexant Systems, Inc.


( ESNUG 395 Item 5 ) --------------------------------------------- [06/26/02]

From: Wayne Miller <Wayne.A.Miller@smsc.com>
Subject: Should DC Buffer Up 40 Fanout Nets Or Should Backend Tools Do It?

Hi John,

I'm looking for a guideline and/or experiences for large blocks or small
chips on when to allow Design Compiler to buffer high fanout nets, versus
having your physical tool insert a buffer tree later in the design flow.

In other words, if a net has 40 loads, should the Design Compiler be
allowed to buffer the net, or should it be classed as an ideal_net?  What
about 100 loads?  200 loads?  Obviously these aren't clocks, nor a master
scan test enable that goes to every flop.  These are nets in the grey area
in between.  We've had some congestion problems with an internal reset
that had a large fanout.  It was an easy fix to have layout insert a
buffer tree, but the question came back as to what threshold should
be used.

    - Wayne Miller
      Standard Microsystems Corporation


( ESNUG 395 Item 6 ) --------------------------------------------- [06/26/02]

From: Jay Pragasam <jlk@brecis.com>
Subject: What About Rectilinear Block Designs In PhysOpt/Jupiter Flows?

Hi John,

For our upcoming hierarchical design chip, I see that our blocks will be
well utilized based on the connectivity and functionality -- if our blocks
are allowed to be rectilinear (non rectangular) in shape -- rather than the
conventional rectangular shapes.  Even though the EDA tools out there claim
that they can handle rectilinear shapes, I am very positive that I will run
into a lot of implementation and integration issues in doing so.  So I am
wondering if your readers have some experience in handling rectilinear
blocks, especially with Synopsys (PhysOpt)/ Avanti (Jupiter/Apollo) tools.

Some stuff that I am curious about are

   1. How difficult is it in getting the pins assigned when the number of
      edges exceed 4?
   2. Is power routing capable of dropping straps of different lengths
      because of different dimensions in one direction or do I have to
      manually alter the lengths?
   3. Will writing out the GDSII have any problems?
   4. Can the parasitic extractor handle arbitrary shaped blocks?

Plus is there anything else that would make my life miserable here?

    - Jay Pragasam
      Brecis Communications                      San Jose, CA


( ESNUG 395 Item 7 ) --------------------------------------------- [06/26/02]

From: Mike Dini <mdini@dinigroup.com>
Subject: I Got Burned On Taxes Buying Ambit CD Instead Of Downloading It

Hi, John,

I have Ambit for RTL synthesis.  (Synopsys won't sell me DC, but that is
another story.)  Cadence is charging me tax on the maintenance support
agreement.  So rather than downloading the updates from the web, I am
forced to pay $360 (8% of $4,500) for a single CD that is shipped
to me instead.  Synopsys does not charge tax on maintenance.  I still can't
get an answer from Cadence as to why a maintenance agreement is taxable.
And get this, I cannot change to the web based option, since I supposedly
opted for the CD option when I bought the tool 2-3 years ago.  Seems like one
expensive CD.

    - Mike Dini
      The Dini Group


( ESNUG 395 Item 8 ) --------------------------------------------- [06/26/02]

Subject: ( ESNUG 393 #2 ) 12% On-Chip Timing Variations & IBM's EinsTimer

> Our current ASIC vendor, IBM, is having us do timing analysis with an
> on-chip variation of 12%, applied to both cell delays and
> interconnect delays.
>
> This makes certain kinds of timing very difficult to pass.  For example,
> using a PLL to zero out a 5 ns insertion delay requires a 5ns feedback
> path.  But 12% variation between the two is 600 ps, which is a big chunk
> of time these days.  Source-synchronous interfaces have similar problems.
>
> How realistic is this sort of thing?  Has anyone done a paper on this?
>
>     - Paul Zimmer
>       Cisco Systems


From: Hank Walker <walker@cs.tamu.edu>

Hi, John,

There have been academic and industry papers published on intra-die 
process variation.  The majority is deterministic, due to stepper field
gradients, pattern density sensitivity and OPC imperfections.  But 
there is also a significant random component.  The deterministic 
variation in ILD thickness can be +/- 30% in 180 nm aluminum.  I have 
seen industrial data showing variation in flush delays, ring oscillator 
delays, Fmax, etc., of neighboring chips that can easily be 10%, so 
this implies that much of it is random intradie variation.  The experts 
at this are microprocessor clock tree designers.  You might look for 
papers by Sani Nassif (of IBM Austin) in IEEE conference proceedings, 
or papers on "statistical design" or "parametric yield optimization".

So the vendor wants this type of timing to guarantee high parametric 
yield for a single-bin part pushing the limits.  They could choose not to
require this, but then they would take a yield hit and just make it up
with higher wafer costs to you.

In summary, the highest performance designs have been using such 
approaches since 250 nm, and it is now starting to show up in ASICs. 
Unfortunately the tools to support statistical design are complete 
losers, except for analog circuits.  But there is a fair amount of 
university research, so hopefully the situation will improve.

    - Hank Walker
      Texas A&M University                       College Station, TX

         ----    ----    ----    ----    ----    ----   ----

From: Srinivas Kakumanu <kakumanu@time2mkt.com>

Hi John

We always use a four corner case timing closure to tape out our chips.

We get two SDF files, one contains cell delays (cell.sdf) and the other
one contains interconnect delays (connect.sdfRC).  We do STA on four
corner cases.

             cell.sdf(max), connect.sdfRC(max)  
             cell.sdf(max), connect.sdfRC(min)
             cell.sdf(min), connect.sdfRC(max)
             cell.sdf(min), connect.sdfRC(min).

This is an alternative to the on-chip-variance in PrimeTime, if not exactly
the same.  And I did see a difference of 300-500 psec variance on clock
nets in min-max and max-min cases.  And the methodology recommends fixing
all setup-hold violations in these FOUR corners before it is qualified
to tape-out.
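
The four corners listed above are just every min/max combination of the two
SDF files.  A small sketch of enumerating them for a run script follows; the
file names come from the list above, while the surrounding structure is
purely illustrative and does not call any particular STA tool.

```python
from itertools import product

SDF_FILES = ("cell.sdf", "connect.sdfRC")

# Every max/min combination of the cell and interconnect delay files.
corners = [
    dict(zip(SDF_FILES, combo))
    for combo in product(("max", "min"), repeat=len(SDF_FILES))
]

for number, corner in enumerate(corners, start=1):
    settings = ", ".join(f"{sdf}({mode})" for sdf, mode in corner.items())
    print(f"corner {number}: {settings}")
    # A real flow would launch one setup/hold STA run per corner here.
```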

    - Srinivas Kakumanu
      Time2mkt.com

         ----    ----    ----    ----    ----    ----   ----

From: Del Cecchi <dcecchi@vnet.ibm.com>

Yes, the tracking can be that bad.  I don't know if you are talking about
IBM as the vendor, but our tracking is about that.  The main contributor is
probably Leffective variation due to etch variation.

    - Del Cecchi
      IBM                                        Rochester, MN

         ----    ----    ----    ----    ----    ----   ----

From: Paul Zimmer <pzimmer@cisco.com>

Hi, John,

One explanation I got from someone else at IBM is that the variations happen
over a fairly small area.  We were picturing a slow, gradual change across
the die, but this doesn't seem to be the case.  Instead, it is more like
random variation.

There are probably both effects: a random device-to-device variation is
probably the biggest, but there is likely also a longer range trend.  Some
of the "random" stuff now seems to be geometry dependent.

Which begs the question: If your path goes through 20 or so elements, and
the variation is sort-of random, isn't all-min vs all-max a little
extreme?  But statistical analysis of this would be very complex for the
tool...
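
To put a rough number on that question, the sketch below compares the
all-min vs. all-max bound with a root-sum-square (RSS) combination.  The RSS
figure is only valid under the strong assumption that every stage varies
independently, which the posts above suggest is not entirely true.  The
5 ns path and the equal stage delays are illustrative choices, and this is
not a sign-off method.

```python
import math

def variation_bounds_ps(total_delay_ps, stages, variation=0.12):
    """Return (worst_case_ps, rss_ps) for a path of equal-delay stages.

    worst_case: every stage is off by the full variation in the same direction.
    rss:        stage variations treated as independent and added in
                quadrature (an assumption, not a measured property).
    """
    per_stage = variation * total_delay_ps / stages
    worst_case = stages * per_stage
    rss = math.sqrt(stages) * per_stage
    return worst_case, rss

worst, rss = variation_bounds_ps(total_delay_ps=5000.0, stages=20)
print(f"all-min vs all-max: {worst:.0f} ps, independent (RSS): {rss:.0f} ps")
```

With 20 equal stages in a 5 ns path, the all-min vs. all-max number is the
familiar 600 ps, while the fully independent case comes to about 134 ps.
The real answer presumably lies somewhere in between, depending on how much
of the variation is correlated.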

This die variation problem hasn't been well known in the ASIC world.  We
have used multiple vendors, and only once were we asked to turn on
"on-chip variation" in PrimeTime (equivalent to linear_comb_delay in
EinsTimer).  Even then, the SDF files they supplied us with showed almost
NO variation between early and late, so it didn't do much.

    - Paul Zimmer
      Cisco Systems

         ----    ----    ----    ----    ----    ----   ----

From: Matt Weber <matt@siliconlogic.com>

Hi John,

IBM is the only ASIC vendor I know of that requires this analysis, so the
analysis is typically done in EinsTimer rather than PrimeTime.

How realistic is this sort of thing?  Obviously, two gates on the same chip
will not turn out exactly the same, due to small variations in the mask and
lithography, adjacency effects from neighboring circuits, and other
factors.

Twelve percent sounds like a lot to me (especially for the wires), but not
unreasonable.  Using one number isn't exactly accurate anyway.  Some cells
will vary only a few percent and others may vary far more than 12%.  I think
a single number is used to reduce the modeling complexity, and it generally
gives close enough results for the effects that we are trying to model.

Cross chip variation doesn't affect most of the data signals in your chip.
You've already verified that you meet setup requirements with worst case
delays.  If some gates in the data path are a little bit faster due to cross
chip variations, no problem.  Also, you've already verified that you meet
hold time requirements with best case delays.  If some gates in the data
path are a little bit slower, no problem.  Where cross chip variation really
bites you is in the clock trees.

Consider this circuit with a three level clock tree:

                                     flop1
                    clk1a   clk2a    |D Q|--[logic]-+
                  +--|>o--+--|>o--+--|>  |          |  flop2
             clk0 | clk1b   clk2b                   +--|D Q|
        X-----|>--+--|>o--+--|>o--+--------------------|>  |
                                   

We know from worst case analysis that the logic is fast enough to make
setup at flop2.  This includes some clock skew which is calculated based
on the clock tree parasitics.  However, it does not include any clock skew
caused by process differences between the clock drivers.  If, on the real
chip, clock drivers clk1a and clk2a end up at a slower process point than
clk1b and clk2b, you may still end up with a setup failure, even though
static timing said the path was okay.  Similar problems can happen with
hold checks and clock gating checks.

The differences that can occur on a chip between two instances of the same
cell can be modeled using on-chip variation analysis.  In EinsTimer, this is
done by turning on LCD (linear combination of delays) analysis.  One way to
enable the analysis in PrimeTime is with commands such as

   set_operating_conditions -analysis_type on_chip_variation $OPCON
   set_timing_derate -min 0.88 -max 1.0

If each stage of the clock tree above has 1ns latency and on chip variation
is 12%, the tool will initially say that the arrival times at the clock pins
of flop1 and flop2 can be different by 360ps (3ns * 12%), in addition to any
skew caused by placement and loading differences.  As Paul mentioned,
hundreds of picoseconds are not easy to find these days.  Fortunately, the
situation usually isn't quite this bad.  The clock tree goes through driver
clk0 for both flop1 and flop2.  Clk0 can't be fast for one of the flops and
slow for another.  It can be fast or it can be slow, but it is the same for
both flop1 and flop2.  So the 120 ps of on-chip variation that was
originally calculated for clk0 gets credited back.
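
The arithmetic in this example can be sketched as follows.  This is only an
illustration of the bookkeeping, not how EinsTimer or PrimeTime implement
it.  The path lists mirror the circuit above, with 1000 ps per stage and
the 0.88 / 1.0 derating factors; the function names are made up.

```python
EARLY, LATE = 0.88, 1.0   # derating factors for 12% on-chip variation

def common_prefix(path_a, path_b):
    """Return the clock drivers shared by both paths, starting at the root."""
    shared = []
    for (name_a, delay_a), (name_b, _) in zip(path_a, path_b):
        if name_a != name_b:
            break
        shared.append((name_a, delay_a))
    return shared

def clock_arrival_spread(launch, capture):
    """Return (raw_ps, credit_ps, after_credit_ps).

    raw:    launch clock at LATE delays minus capture clock at EARLY delays.
    credit: pessimism on the shared drivers, which cannot be both fast and
            slow at the same time.
    """
    raw = LATE * sum(d for _, d in launch) - EARLY * sum(d for _, d in capture)
    credit = (LATE - EARLY) * sum(d for _, d in common_prefix(launch, capture))
    return raw, credit, raw - credit

# Clock paths from the circuit above as (driver, delay in ps).
to_flop1 = [("clk0", 1000.0), ("clk1a", 1000.0), ("clk2a", 1000.0)]
to_flop2 = [("clk0", 1000.0), ("clk1b", 1000.0), ("clk2b", 1000.0)]

raw, credit, remaining = clock_arrival_spread(to_flop1, to_flop2)
print(f"raw {raw:.0f} ps, credit {credit:.0f} ps, remaining {remaining:.0f} ps")
```

For the three-level tree this gives the 360 ps and the 120 ps credit
described above, leaving 240 ps of on-chip variation between the two flops.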

In EinsTimer this process is called Common Path Pessimism Removal (CPPR).
I was hoping that Synopsys would have come up with a better name, but I see
that it is called Clock Reconvergence Pessimism Removal.  The variable to
enable it is timing_remove_clock_reconvergence_pessimism.  That rolls off
my tongue about as easily as "Peter Piper picked a peck of pickled
Primetime."

Anyway, if your clock insertion is 5 ns, most of your paths will still not
see 600 ps of on-chip variation. It all depends on how far back in the clock
tree you need to go to find the common point between the startpoint and
endpoint registers.

Although at first it appears that using on-chip variation analysis steals
a bunch of performance from you, I don't think this is necessarily the case.
Other ASIC vendors still must account for on chip variation.  Without
running on-chip variation analysis, they must cover it some other way.  I
assume it gets covered through padding the setup and hold margins of the
flops, requiring some set_clock_uncertainty even after the clocks are placed
and routed, or requiring XX ps of positive slack for timing signoff.  Any of
these methods effectively penalize ALL of the paths in your design.  By
doing on-chip variation analysis instead, you are able to stuff an extra
hundred picoseconds of logic into paths which are contained in a common
branch of the clock tree.  By modeling the effect more accurately, we are
able to get more performance out of the design.

    - Matt Weber
      Silicon Logic Engineering, Inc.            Eau Claire, WI

         ----    ----    ----    ----    ----    ----   ----

From: Paul Zimmer <pzimmer@cisco.com>

Hi, John,

The specific place we had the problem was with the PLL feedback.  The
PLL is there to zero the clock insertion delay.  How much the CPPR can
help depends on how much the feedback and clock paths have in common.
They SHOULD have a lot in common - the feedback normally comes from the
end of the clock tree, with a few extra gates to zero out the pad delay.

Getting the STA tool to recognize this common path is perhaps trickier.
But you have given me an idea.  I'm not sure that the way I'm constraining
it in EinsTimer is the best.  This mode is the only place that I'm using
RAT's instead of UDT's.  Perhaps if I can recode this as UDT's somehow
I can improve the analysis...

No, that won't work, because the PLL early/late times are calculated
once and then hung on the PLL outputs.  EinsTimer doesn't understand that
the PLL early AT is based on clock tree max delay, and therefore should
be cppr'd with clock tree min delay.

Messy stuff.

    - Paul Zimmer
      Cisco Systems

         ----    ----    ----    ----    ----    ----   ----

From: Adam Shiel <adam@siliconlogic.com>

Hi John,

I work with Matt here at Silicon Logic Engineering and I do quite a bit of
work with EinsTimer.  Not knowing the system Paul is working with, I can't
figure out exactly what the problem he's seeing is based on the description.
For normal latch to latch paths, or to I/O, CPPR should find the common
clock point and compute the proper credit.  There is some extra pessimism
introduced by IBM's PLL adjust script in LCD mode that they've recently
added a command to eliminate.  This sort of sounds like what you're
describing.

In computing the feedback adjust, IBM's PLL adjust script makes some
assumptions about whether the gates in the feedback path are running fast
or slow.  The script looks either at the late Arrival Time at the PLL
feedback for early PLL or the early Arrival Time for late PLL.  Since it's
just looking at either the early or the late Arrival Time, it's assumed
that the gates on the feedback path are at that process point.  It's
inconsistent to allow the other process point delays through those gates,
since you just assumed they're at a fixed process point.  However, that's
what EinsTimer would do by default for computing setups and holds in LCD.
 
It looks like IBM's introduced a new command in the last couple of months
that fixes the delay of the cells on the feedback path to a given PVT
point, even in LCD.  Ask your IBM AE for the methodology alert about
et::remove_pll_pessimism.

We just found out about the command this week; maybe your AE is better
about keeping you informed about methodology alerts than ours is.  I'm not
sure this is what your problem is but it sounds like it's in the ballpark.
Of course this advice may be worth exactly what you paid for it.  :)

    - Adam Shiel
      Silicon Logic Engineering                  Eau Claire, WI

         ----    ----    ----    ----    ----    ----   ----

From: Paul Zimmer <pzimmer@cisco.com>

Hi, John,

That was dead on.

Please thank Adam for jumping in.  His timing is perfect.  I *just* sent IBM
an email describing this phenomenon.  Basically, early RAT and early PLL are
inconsistent in their treatment of the clktree delays (or anything else
that's shared in the feedback path).

It'll be interesting to see if my AE responds with et::remove_pll_pessimism!
I'm going to go read up on it in the meantime.

    - Paul Zimmer
      Cisco Systems

         ----    ----    ----    ----    ----    ----   ----

From: Matt Weber <matt@siliconlogic.com>

Hi, John,

I had a Jimmy Buffett CD in my car the same day I was thinking about Paul's
problem.  My new version of "Margaritaville" has intermittently stuck in my
head ever since:

              Nibblin' on static timing,
              Has me crazy and rhyming,
              All my critical paths seem to have me foiled.
              Slipping the tapeout,
              Watching my boss pout,
              See those managers--they're beginnin' to boil.

              Wasting away again in static-timing-ville,
              Searchin' for my lost multiplexed clock,
              Some people claim that there are wireloads to blame,
              So I guess....I'll take a shot with PhysOpt.

I think Paul has gotten me in trouble.  People are wondering why I was
laughing out loud in my office at 8:00 this morning.

    - Matt Weber
      Silicon Logic Engineering, Inc.            Eau Claire, WI

         ----    ----    ----    ----    ----    ----   ----

From: Paul Zimmer <pzimmer@cisco.com>

Hi, John,

By the way, I got an answer from my IBM AE.  She says that

                            remove_pll_pessimism

should be executed BEFORE the pll_adjust command is invoked.  Is that what
Matt was told as well?

    - Paul Zimmer
      Cisco Systems

         ----    ----    ----    ----    ----    ----   ----

From: Matt Weber <matt@siliconlogic.com>

Yes, remove_pll_pessimism gets executed before pll_adjust.  I actually
haven't tried it yet myself, but a couple of other people here have, and
those were the instructions.

    - Matt Weber
      Silicon Logic Engineering, Inc.            Eau Claire, WI


( ESNUG 395 Item 9 ) --------------------------------------------- [06/26/02]

From: Bill Billowitch <wdb@agere.com>
Subject: Any Commercial VHDL <-> Verilog Or Vera <-> Verisity Translators?

Hi, John,

I've recently joined Agere Systems heading up their design reuse efforts.
We, like many companies, have a variety of tools and languages in use.
Are there vendors out there offering translators for VHDL <-> Verilog
or Vera <-> Verisity?

    - Bill Billowitch
      Agere Systems                              Allentown, PA


( ESNUG 395 Item 10 ) -------------------------------------------- [06/26/02]

From: Albert Ma <ama@cag.lcs.mit.edu>
Subject: New Cadence User Asks "So What's A Cadence Flow That Works Now?"

Hi John,

I'm trying to get a complete Cadence-based tool flow up and I'm confused by
Cadence's product line.  In particular I'm referring to:

     Preview Silicon Ensemble
     DSM Silicon Ensemble
     PKS (aka Silicon Ensemble PKS)
     First Encounter
     SOC Encounter

We don't have First Encounter, but we do have PKS 4.0, DSM SE 5.3, and
Preview (IC446).

All these tools have substantial overlap.  It seems like Cadence is moving
towards First Encounter and PKS as the complete solution.  That's great for
the future, but what about now?

We're doing semi-custom design.  We have full-custom datapath and standard
cell control.  We need to do floorplanning, synthesis, P&R, and chip
assembly.  We want to do everything flat (but with clustering).  There seem
to be a couple of ways of doing things, but I wanted to ask how others
have handled it.

BTW, we use multiple power supplies on the chip (with inherited connections
in Composer).  How is the power connectivity information conveyed between
this new mix of Cadence tools?

    - Albert Ma
      M.I.T.                                     Cambridge, MA


( ESNUG 395 Item 11 ) -------------------------------------------- [06/26/02]

From: Nir Sever <nir@zoran.co.il>
Subject: A Chip Designer Seeking IP For DDR/SDRAM Interfacing And MMU's

Hi John,

We are looking for IP in the field of DDR/SDRAM memory interface and MMU.
Do you know of anyone in this business?

    - Nir Sever
      Zoran                                      Israel


( ESNUG 395 Item 12 ) -------------------------------------------- [06/26/02]

Subject: ( SNUG 02 #7 ) SystemC Sucks -- Stop Comparing Superlog & SystemC

> Superlog Superlog Superlog.  By the time SystemC is done, it combines all
> the disadvantages of C (a contrived event model and no concept of time)
> with all the disadvantages of HDLs (runtime "challenges" -- continental
> drift anyone on a 70+ million transistor design? -- and a lack of high
> level abstraction mechanisms.)  Superlog addresses these in a much
> cleaner way.  
>
> In addition, it's high time to dispel the myth of if we can use C to
> design ASICs, then C programmers can be ASIC designers.  That's BS.
>
>     - Tom Heynemann of Compaq


From: Nick Skelton <nick.skelton@freehand.se>

Hi John,

You're misfiling Superlog by lumping it into threads about SystemC.  Surely
everyone has now seen that design in C is not going to fly.  You should be
listing Superlog with the verification languages like E and Vera.

In design, and especially subblock verification, Superlog is good.  The
subblock designers, who really want to have access to all the
verification features, don't want to have to learn a new weird language.

We've been using Superlog Systemsim for 3 years now.  We like
it.  It's robust and fast, can't do VHDL, but it does simulate C together
with Verilog and Superlog rather well.  We use that a lot.

    - Nick Skelton
      Freehand DSP AB                            Sweden


( ESNUG 395 Item 13 ) -------------------------------------------- [06/26/02]

Subject: ( ESNUG 389 #8 ) Happy With Formality 2002.03 vs. Old Formality

> Another thing we liked was easily setting constraints directly in its GUI.
> Compare point matching is a major pain for LEC tools.  Ideally, you want
> the tool to do all the matching automatically.  I still haven't seen a
> tool that can automatically match all points all the time.  But Formality
> 2002.03 got pretty close.  With its new interative matching capabilities,
> we were able to quickly verify that our constraints worked before running
> a complete run.  Writing batch scripts was as easy as running the GUI.
>
>     - Pontus Pleven
>       Ericsson Technology Licensing AB           Lund, Sweden


From: Brian Coffey <brian.coffey@analog.com>

Hi John,

We've found Formality 2002.03 is certainly easier to use than past versions.
Its GUI is now flow based, similar to TetraMax.  It's also a much more
useful debugging tool than in the past with this new GUI.

Its script-driven approach has also been simplified slightly, with a
gates vs. gates comparison needing just the following commands:

  read_db -technology library.db
  set hdlin_auto_top true 
  read_verilog -r synthesised.vg 
  read_verilog -i edited.vg 
  verify 

Older scripts might not work with the latest version of Formality; however,
Synopsys has provided a translator.

Synopsys also has a command

               write_hierarchical_verification_script

which can be very useful in debugging hierarchical designs, narrowing the
scope to just the failing subblocks.

We've seen Formality 2002.03 show capacity and runtime improvements over
previous versions.  More importantly for us, the tool was able to verify a
design with large, complex arithmetic components that it wasn't able to
verify before.  It was a gate vs. gate comparison of two FLAT netlists.
Formality was not using its new multiplier solvers - which rely on
DesignWare hierarchy present in the netlists. 

    - Brian Coffey
      Analog Devices                             Limerick, Ireland


( ESNUG 395 Item 14 ) -------------------------------------------- [06/26/02]

From: Steve Tjiang <tjiang@tensilica.com>
Subject: Rational's Purify Can't Find Our Co-simulation Bug.  Any Ideas?

Hi, John,

We're trying to find a memory leak in a large co-simulation, i.e. we're
comparing a C-model with a Verilog model together with Vera as the test
bench.  The leak is probably in the C model and it shows up only when we
run in co-simulation, not standalone.  I tried to run Rational's Purify
tool on the entire co-simulation, but to date, have had no success
purifying VCS.  I am wondering if anybody on ESNUG has had any luck
doing this.

    - Steve Tjiang
      Tensilica, Inc.


============================================================================
 Trying to figure out a Synopsys bug?  Want to hear how 13,958 other users
  dealt with it?  Then join the E-Mail Synopsys Users Group (ESNUG)!
 
     !!!     "It's not a BUG,               jcooley@TheWorld.com
    /o o\  /  it's a FEATURE!"                 (508) 429-4357
   (  >  )
    \ - /     - John Cooley, EDA & ASIC Design Consultant in Synopsys,
    _] [_         Verilog, VHDL and numerous Design Methodologies.

    Holliston Poor Farm, P.O. Box 6222, Holliston, MA  01746-6222
  Legal Disclaimer: "As always, anything said here is only opinion."
 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com

Copyright 1991-2024 John Cooley. All Rights Reserved.