

  Editor's Note: My sympathies go out to all the Indian engineers who
  read ESNUG and who have friends and family affected by that horrible
  earthquake in India.  My hopes are with you.

                                               - John Cooley
                                                 the ESNUG guy

( ESNUG 364 Subjects ) ------------------------------------------- [02/01/01]

Item  1 : ( ESNUG 363 #1 )  Chip Architect (Ironically) CAN'T Do Hierarchy!
Item  2 : ( ESNUG 363 #2 )  Watch Out!  Presto Tied Gate Outputs To Ground!
Item  3 : Memory Leak In "Automated Chip Synthesis" (ACS) Design Budgeter
Item  4 : A Quickie Report Of A 0.18 TSMC Tape-Out Using Magma BlastFusion
Item  5 : Mentor Renoir vs. Innoveda's Visual HDL with ALDEC & ClearCase
Item  6 : Latch-Based Designs Are HELL In Lib Compiler, DC, & DFT Compiler
Item  7 : Is report_qor An Undocumented Synopsys DC Or PhysOpt Tcl Command?
Item  8 : VCS Newbies Have Troubles Finding Online Manuals & VirSim Commands
Item  9 : An Odd Synopsys Marriage Of DesignWare And Smartmodel Licenses...
Item 10 : ( ESNUG 360 #3 )  PhysOpt Signal Integrity & Primetime Crosstalk
Item 11 : ( ESNUG 345 #5 )  Well, Synchronicity's DesignSync Works For Us
Item 12 : A Tcl Script That Tricks PhysOpt Into Cleaning Up Net Congestion
Item 13 : Is An All NAND Or NOR Gate Lib The Best Lib For Design Compiler???
Item 14 : ( ESNUG 363 #11 )  Cadence PBOPT, Synopsys LBO, & FlexRoute Tricks
Item 15 : ( ESNUG 363 #14 )  Pass-Thrus In Hierarchical Physical Designs

 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com


( ESNUG 364 Item 1 ) --------------------------------------------- [02/01/01]

Subject: ( ESNUG 363 #1 )  Chip Architect (Ironically) CAN'T Do Hierarchy!

> ... although the placement Chip Architect produced was very good, it did
> not completely meet timing and the ECO timing improvement features of the
> tool were broken.
>
> I attempted and was able to write a complex script to have Chip Architect
> do repeater insertion, but it would only work after I flattened the entire
> design (a hierarchical attempt only produced core dumps).  This resulted
> in timing being met, but led to further misery as the tool only had the
> ability to produce a flat netlist from the flattened physical hierarchy
> and did not keep separate logical and physical views.  This in itself was
> ugly, but only a show-stopper when we attempted to run several other tools
> which couldn't handle a totally flat netlist for a design of that size.
>
>     - Jon Stahl
>       Avici Systems                              N. Billerica, MA


From: "Tom Ayers" <tomayers@believe.com>

John,

We have a copy of the Chip Architect tool and have attended the training
sessions.  While they loudly tout that it supports hierarchical placement,
what is really true and mentioned offhand by Jon Stahl is that it can ONLY
do separate placement on each and every hierarchical block.

That means that even DesignWare inserted levels of hierarchy must be
physically regioned on the die if left in the netlist.  This is absurd for
large designs with hundreds of hierarchical instances!  You can only run
placement starting with lowest levels of hierarchy and work up.  No
logical/physical variation is allowed, which is just a downright odd
choice for a company who sold Floorplan Manager suggesting that you
regularly encounter logical/physical mappings.

I would also suspect that this choice has some impact on utilized die area,
as it amounts to driving hierarchical placement down to the lowest levels
of the netlist and, as we know, there is a die area savings in running flat
P&R over hierarchical P&R.

We are currently doing our FPGA prototypes and I promise some material on
that once we have boards up and running.

    - Thomas Ayers
      Vice President,  HW Engineering
      Believe, Inc.


( ESNUG 364 Item 2 ) --------------------------------------------- [02/01/01]

Subject: ( ESNUG 363 #2 )  Watch Out!  Presto Tied Gate Outputs To Ground!

> Presto makes me nervous.  Changing the way MUXes and arrays are translated
> to gtech sounds like two different circuits to me.  Infer enumerated types
> should be defaulted to off to be compatible with earlier releases.
>
>     - Dennis Milton
>       Stratus Computer                           Marlboro, MA


From: Scott Evans <scott@sonicsinc.com>

Hi, John,

I tried the Presto option with the 2000.05 release and ran into at least one
bug, which was reported as a timing loop.  This resulted in STAR 113486.
These loops did not exist when Presto was not used.  At that point, I dumped
out a netlist to see what DC was talking about.

    GTECH_OR2 C1723 ( .A(N2188), .B(N2189), .Z(1'b0) );
    GTECH_AND2 C1724 ( .A(1'b0), .B(N2187), .Z(N2188) );

When I looked at the gates I noticed that it had tied the output of an OR
gate to 1'b0 !!!  (I'm sure the physical layout software would love to
route ground to the output of a gate.)  Furthermore, DC seemed to be
treating this constant as a real signal so when the constant was used
as an input to the gate which originally generated the 1'b0, DC thought
this was a loop.  So, I would suggest that you carefully examine the
error/warning/informational messages you get in DC if you try Presto
in 2000.05.
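
One way to look for this pattern in a dumped netlist is a small plain-Tcl
scan, sketched below.  It is not from the original post.  The helper name
find_const_outputs is hypothetical, and it assumes the output pin is named
Z (as on the GTECH cells above) and that each instance sits on one line.

```tcl
# Hypothetical helper: list gate instances whose output pin is tied to a
# constant.  ASSUMPTION: the output pin is named Z, as on the GTECH cells
# shown above; pass a different pin name for other libraries.
proc find_const_outputs {netlist_text {out_pin Z}} {
    set hits {}
    # Matches e.g. "GTECH_OR2 C1723 ( ... .Z(1'b0) ... );"
    set pattern [format {^\s*(\w+)\s+(\w+)\s*\(.*\.%s\(\s*1'b([01])\s*\)} $out_pin]
    foreach line [split $netlist_text "\n"] {
        if {[regexp $pattern $line -> cell inst value]} {
            lappend hits [list $cell $inst "1'b$value"]
        }
    }
    return $hits
}

# The two gates quoted in the post: only the OR gate drives a constant.
set netlist {
    GTECH_OR2 C1723 ( .A(N2188), .B(N2189), .Z(1'b0) );
    GTECH_AND2 C1724 ( .A(1'b0), .B(N2187), .Z(N2188) );
}
foreach hit [find_const_outputs $netlist] {
    puts "constant-driven output: $hit"
}
```

Constants on input pins, such as .A(1'b0) on C1724, are deliberately not
flagged.  Instances that wrap across several lines would need to be joined
before scanning.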

    - Scott Evans
      Sonics, Inc.                               Mountain View, CA


( ESNUG 364 Item 3 ) --------------------------------------------- [02/01/01]

From: Tom Fairbairn <tomf@pdd.3com.com>
Subject: Memory Leak In "Automated Chip Synthesis" (ACS) Design Budgeter

Hi John,

Thanks as ever for ESNUG, but I've a question for the ESNUG folks.  Has
anybody out there been using Automated Chip Synthesis (ACS)?  

A long time ago (well, about the time 99.10 was released, actually), a man
from Synopsys came on site & told us all about the new synthesis environment
called ACS.  ACS promised to eliminate the need for coming up with block
level constraint budgets.  ACS promised to automate Design Budgeting.  ACS
promised to look after our synthesis data for us.  Pretty cool, we thought.

A few months ago we tried it on a real design.  The ACS process ran out of
memory while generating the constraints for the second phase (pass 1)
compile.  The Design Budgeter has a memory leak bug.  Synopsys identified
a workaround, but that only works in 2000.05.  There's no fix for 2000.11.
We're trying to get Synopsys to commit to a fix but it's hard going.

A question I asked Synopsys when we first found the problem was "has anyone
else used ACS?".  They've been unable to come up with anything, so I was
wondering if anyone in the ESNUG community would like to share their
experiences?  How they found its compile times, how well constraints were
met, impact on layout, gotchas and successful techniques etc.

    - Tom Fairbairn
      3Com Europe Ltd.                           Hemel Hempstead, UK.


( ESNUG 364 Item 4 ) --------------------------------------------- [02/01/01]

From: Jim Whittaker <jim.whittaker@powervr.com>
Subject: A Quickie Report Of A 0.18 TSMC Tape-Out Using Magma BlastFusion

Hi John,

Read your newsletter each week - keep up the good work!  I saw your request
for tapeout info, but hesitated to respond as we have not technically taped
out yet.  We are however in good shape, and none of the remaining issues are
due to Magma.  The chip is real and will be fabbed -- you can decide if this
counts or not!  We have full chip LVS passing and 4 Antennas at top level to
fix plus a handful of DRCs - expect tapeout in the next 2 weeks.  The design
is 13 Million transistors (LVS count), core clock 125 MHz in 0.18u TSMC
generic.  The design is broken into 10 top level modules, 9 of which we did
P&R for in Magma; the 1 remaining module (for historical reasons) plus the
top level are all done in Avanti.  We get to DRC, LVS, and timing clean
blocks in 1 pass through Magma, and hit our target timing (a first for us).
The design includes approx 50 RAM macros, 3 clock domains and some analogue
macros.  Placeable instances per block: 50-150k.

    - Jim Whittaker
      Imagination Technologies                   Germany


( ESNUG 364 Item 5 ) --------------------------------------------- [02/01/01]

Subject: Mentor Renoir vs. Innoveda's Visual HDL with ALDEC & ClearCase

> My design group is considering purchasing one of the two tools in the
> subject line (Mentor Renoir vs. Innoveda's Visual HDL).  If you have
> experience with _both_ of them, please let me know your thoughts on the
> relative strengths/weaknesses of the two.
>
>     - Eric Holbrook
>       Agere


From: "Roger Boyer" <rlboyer2@home.com>

I don't know if this continues to be the case, but when I used VisualHDL to
tie together text based blocks (created as HDL rather than their state
machines, etc) it wasn't able to do the one thing this sort of tool should
be able to do - it couldn't manage propagating signal name changes through
the various levels of hierarchy.  My feeling was that if I had to do this
by hand, why use the tool at all?

Things to consider:

  - I was using VHDL, the Verilog support might be different (but I doubt
    it, if they could solve this in one language but not the other then
    you have other things to worry about.)

  - I was using this on a Sun work station

  - This was about 3 years ago (you might ask if this has been changed or
    fixed.)

Renoir?  Never used it, I'd like to know if it's any good myself!

    - Roger Boyer

         ----    ----    ----    ----    ----    ----   ----

From: "Dean Susnow" <susnow@home.com>

I have used Renoir on the last 2 large ASIC designs.  The tool is powerful
and very easy to use.  We evaluated both tools 2 years ago and decided to
purchase Renoir.  We have been very happy with our decision.  Mentor
Graphics is very receptive to customer feedback and incorporates requests
in future product releases.

    - Dean Susnow

         ----    ----    ----    ----    ----    ----   ----

From: "Roger Boyer" <rlboyer2@home.com>

One issue that I rarely see handled well in "code generation" tools is
configuration management.  Perhaps I make a bigger deal of this than others
because I come from more of a SW background than your typical HDL coder, but
when integration time hits "hot & heavy" there is a premium on being able to
reproduce a given version with a high degree of certainty.  Most of these
types of tools don't interact well with CM tools such as ClearCase.

Does anyone have any good or bad experiences with this?  In my opinion, this
is as important a feature as other "usability" issues.

    - Roger Boyer

         ----    ----    ----    ----    ----    ----   ----

From: "Markus Meng" <meng.engineering@bluewin.ch>

That's one view, however you could also give ALDEC a try.  You get the most
for your money.  In Renoir, what I hated most in the projects done so far is
the not-so-easy-to-use component mechanism.  The differentiation between a
component and a cell doesn't make too much sense to me.

Another annoyance in Renoir is that you can't display concurrent FSMs on one
sheet.  You will get several sheets.  That's something really bad, where you
can easily lose the overall view of the design.

In general Renoir is not a 'just do it' tool; it is sometimes a little bit
complicated.  Especially the printing and signal naming mechanism on the
sheets you are working on is sometimes messy.

    - Markus Meng

         ----    ----    ----    ----    ----    ----   ----

> One issue that I rarely see handled well in "code generation" tools is
> configuration management.  Perhaps I make a bigger deal of this than
> others because I come from more of a SW background than your typical HDL
> coder, but when integration time hits "hot & heavy" there is a premium on
> being able to reproduce a given version with a high degree of certainty.
> Most of these types of tools don't interact well w/ CM tools such as
> ClearCase.

From: kenkovaa@gamma.hut.fi (Kim Gunnar Enkovaara)

This has been the problem with many tools.  Fortunately Renoir 2000 has many
fixes on the version management side and directly supports many version
management packages (ClearCase, for example).  The support is better than in
the 99.x packages.

    - Kim Enkovaara
      Helsinki University of Technology


( ESNUG 364 Item 6 ) --------------------------------------------- [02/01/01]

From: [ Socks, the Clinton's Homeless Cat ]
Subject: Latch-Based Designs Are HELL In Lib Compiler, DC, & DFT Compiler

Hi John,

Please make me anon.

I'm an experienced Synopsys DC user but new to latch-based design.  My
company's design methodology is latch-based, mixed freely with FFs (which
are just std. cells containing two latches).  Clocks are dual-phase and
distributed in both true/inverted polarity.  Scan methodology is fairly
similar to clocked-LSSD.  Hence there are 4 functional-mode and 4 scan-mode
clocks.  Currently we have to wire them manually, this obviously has many
drawbacks.  I recently had negative experiences trying to raise the level
of automation using DC and Library-Compiler.  Also we tried out
DFT-Compiler.  Am I missing something or is this just the state of things?


Issue 1: Modeling of master-slave clocking, "set_signal_type" problem

I found out that DC does not propagate "set_signal_type ... clocked_on_also"
attribute in hierarchical designs (unlike the clock properties).  (This is
needed for the master/slave FF modeling style of SYNTH-481941.html)

Since we parameterise register primitives, etc, all our designs are deeply
hierarchical.  I find it absurd to have to manually propagate this attribute
down the hierarchy before a compile.  (If we omit it, DC only sees one of
our clock phases, and uses a one-phase enable-FF from our library.)

Apparently it used to propagate until 2000.05, then R&D took it out for some
reason that nobody could tell me.  This wasn't documented in the release
notes either, which leads me to think very few people use it.  Now they are
reluctant to put it back in 2000.11.  My AC raised STAR 113503 on this, to
allow users to choose the behaviour, but can anyone give me one reason why
this should not be the default?  Until then, mixed latch/FF designs will
have to be modelled as if there was only one clock.  (We don't personally
care about timing modeling inside DC, but I'm sure that raises issues, too.)

For good measure I also experimented with "create_clock -related_clock" but
that's even more flaky.  SOLVIT is just in a complete mess as regards
modeling and synthesis on this.


Issue 2: Modeling of extra clock pins, and implications

We distribute true- & inverted-sense of each clock phase, hence 4 clock
pins.  Lib-Compiler can't handle it, neither can DC, doesn't matter whether
the pin group has attribute 'clock' or not.  Obviously the inverted clocks
don't occur in the statetables.  (There was also no way DFT-Compiler would
accept these were scan-controllable.  Without at least making the statetable
unreadably long.)  It would be nice to override the tendency to blackbox
everything they can't understand.  If we could at least override, we could
script the wiring fix inside synthesis.  As it is we have to generate a
kludge library with full physical connections and wire things up, after
scan.  If I could at the very least direct all three tools to just ignore
the pins completely, let alone usefully understand the concept of >2 clocks:

  create_clock ph1
  create_clock ph2   -related_clock ph1 /*this is the slave clock*/
  create_clock ph1_b -related_clock ph1 /*just wire any subsequent clocks*/
  create_clock ph2_b -related_clock ph1

Further, not only did DC/DFT-Compiler refuse to use them, they failed to
find their scan-equivalents using either mapping method.  If two cells
are scan-equivalent, except for dangling pins which match by name, then
scan-equivalent cells should be found. e.g. non-scan FF and scan-FF with
ph2, sph1, sph2 dangling.  ("test_allow_matching_unconnected_pins = true"?)

And finally, even when I manually instantiated them for DFT-Compiler, it
automatically assumed having unconnected pins being 'X' makes a scan-cell
uncontrollable, and there was no command using something like:

  set_test_ignore / or
  set_test_hold /   to stop it.

Surely there will always be pins that DFT-C cannot be made to understand, so
isn't it about time we had a command for this?


Issue 3: Bussed signals not allowed in a Lib-Compiler statetable.

This is only slightly annoying.  (We bus the scanclocks, to make the wiring
scripting less painful.)


I'm interested in all responses, John.

Thank you.

    - [ Socks, the Clinton's Homeless Cat ]


( ESNUG 364 Item 7 ) --------------------------------------------- [02/01/01]

Subject: Is report_qor An Undocumented Synopsys DC Or PhysOpt Tcl Command?

> Does anyone know about the report_qor command in dc_shell-t?  We use it,
> but I can't find any documentation on it anywhere.  Makes me wonder how
> many secret, undocumented features exist in Design Compiler...
>
>   ...
>   report_qor > ${REPSPATH}/\${TOP_DESIGN}_hier.pass\${PASS}.qor.rpt;
>   ...
>
> Is all I found in Solvnet where report_qor was being used in a script for
> budgeting.
>
>     - Christian Cabal
>       Hewlett Packard


From: jmcalvez@club-internet.fr (Jean-Marc Calvez)

As I understand it, it is an "official" command in psyn_shell (so you want
to look it up there), which is unofficially available in dc_shell/TCL.  It
prints out various metrics related to quality of results (number of
violations/TNS/WNS per clock domain, area, synthesis runtime)...

    - Jean-Marc Calvez                           Grenoble, France

         ----    ----    ----    ----    ----    ----   ----

From: Christian Cabal <cabal@rsn.hp.com>

Thanks.... is psyn_shell the Physical Compiler product?  When we use it in
dc_shell-t after a job, it reports this:

    Compile CPU Statistics
    -----------------------------------
    Resource Sharing:              0.00
    Logic Optimization:            0.00
    Mapping Optimization:          0.00
    -----------------------------------
    Overall Compile Time:          0.00

is this a bug?

    - Christian Cabal
      Hewlett Packard

         ----    ----    ----    ----    ----    ----   ----

From: jmcalvez@club-internet.fr (Jean-Marc Calvez)

Yes, psyn_shell is the Physical Compiler product.

If, by a job, you mean a synthesis run, then it is obviously wrong (or you
have a CPU I would love to get my hands on!).  On the other hand, as long as
it remains an undocumented command, you have little ground to complain to
your Synopsys FAE (btw, I got non-zero values in psyn_shell; never tried it
in DC).
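
A run script could catch this case automatically by checking the saved
report text.  The sketch below is plain Tcl, not a Synopsys feature.  The
proc name overall_compile_time is hypothetical, and the only format it
assumes is the "Overall Compile Time" line exactly as quoted above.

```tcl
# Hypothetical check: pull the "Overall Compile Time" value out of saved
# report_qor text.  Returns "" when the line is not present.
proc overall_compile_time {report_text} {
    if {[regexp {Overall Compile Time:\s+([0-9]+\.[0-9]+)} $report_text -> value]} {
        return $value
    }
    return ""
}

# Report text as quoted in the post above.
set report {
    Compile CPU Statistics
    -----------------------------------
    Resource Sharing:              0.00
    Logic Optimization:            0.00
    Mapping Optimization:          0.00
    -----------------------------------
    Overall Compile Time:          0.00
}

set value [overall_compile_time $report]
if {$value eq ""} {
    puts "no Overall Compile Time line found"
} elseif {$value == 0} {
    puts "suspicious: Overall Compile Time is $value"
} else {
    puts "Overall Compile Time: $value"
}
```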

    - Jean-Marc Calvez                           Grenoble, France


( ESNUG 364 Item 8 ) --------------------------------------------- [02/01/01]

Subject: VCS Newbies Have Troubles Finding Online Manuals & VirSim Commands

> Is there an online VCS manual?
>
>     - Hung


From: Berend Ozceri <berend@cisco.com>

Look in $VCS_HOME/doc/UserGuide/*.pdf

where $VCS_HOME is the root of your VCS installation.

    - Berend Ozceri
      Cisco Systems, Inc.

         ----    ----    ----    ----    ----    ----   ----

From: Srinivasan Venkataramanan <srini@realchip.com>

I am using VCS with the VirSim front-end GUI.  This is the first time I am
using this tool set and am already quite impressed with it.  I am facing
one major problem though, please see if any one of you can help me out.

When I want to add signals/variables to the Waveform window I do:

   1.) Open the "Hierarchy Browser"
   2.) Go to the scope
   3.) Select signals of interest
   4.) Click on "Add" to a specific group (I create groups in the
       Waveform window)

Now, I would like to do this not via mouse clicks, but via COMMAND LINE,
makes sense - right?  Now the problem is I am unable to find a command for
this.  Please let me know if there is one (similar to ModelSim's TCL
"add wave *" or NC's "probe -all" ).  I have tried looking up in the
documentation but in vain.

    - Srinivasan Venkataramanan (Srini)
      RealChip                                   Chennai (Madras), India


( ESNUG 364 Item 9 ) --------------------------------------------- [02/01/01]

From: [ Kenny, from South Park ]
Subject: An Odd Synopsys Marriage Of DesignWare And Smartmodel Licenses...

Hi, John,

According to Synopsys sales, licenses for Synopsys LMG "Smartmodels"
(SIMMODEL-PREM) and Synopsys DC DesignWare (DesignWare-Basic) will soon
be interchangeable.(?)

The Synopsys LMG models are a library of standard components which have
been used in board level simulations for years.  An ASIC designer's
company may have licenses for these tools which they may know nothing
about.  Why Synopsys would want to integrate the licenses for these two
products, which have completely different functionality, I have no idea,
but I plan to capitalize on this in my ASIC designs.

    - [ Kenny, from South Park ]


( ESNUG 364 Item 10 ) -------------------------------------------- [02/01/01]

Subject: ( ESNUG 360 #3 )  PhysOpt Signal Integrity & Primetime Crosstalk

> In your Boston SNUG Trip Report, you mentioned that Aart de Geus gave the
> keynote address.  Two of the interesting tidbits are of special interest
> to me:
>
>      a. PrimeTime is now in beta with crosstalk analysis capability
>      b. plans to integrate signal integrity capability into PhysOpt
>
> Do you have more information on this?  Where can I learn more, in addition
> to from Synopsys source?
>
>     - Lun Ye
>       Lucent Technologies, Inc                   Allentown, PA


From: [ A Dallas Synopsys AE ]

Hi John,

I would like to comment on Lun Ye's questions about PrimeTime and signal
integrity.

When Aart mentioned PrimeTime with crosstalk analysis capability, he was
talking about PrimeTime-SI.  PrimeTime-SI adds signal integrity analysis
capabilities to PrimeTime.  The first release of PrimeTime-SI focuses on
support for crosstalk analysis which is caused by coupling capacitance
between neighboring nets in 0.18 um and below.  This coupling capacitance
leads to crosstalk and can cause timing failures (such as the speed-up or
slow-down of net delays).

PrimeTime-SI uses static timing techniques along with coupling capacitance
information in the back-annotated parasitics file to compute the speed-up
or slow-down on nets due to crosstalk.  PrimeTime-SI is integrated into the
PrimeTime static timing analysis engine, so it offers high capacity and
performance to analyze crosstalk effects at the block or full-chip level.

As Aart stated at Boston SNUG, PrimeTime-SI is currently available to
limited partners in a beta program.  The formal release of the product is
being targeted for the first half of 2001.  Currently, the focus is on
analyzing timing effects only.  In the future, PrimeTime-SI would be
integrated into the Synopsys physical synthesis solution forming the
backbone of a signal integrity solution throughout the flow. 

    - [ A Dallas Synopsys AE ]


( ESNUG 364 Item 11 ) -------------------------------------------- [02/01/01]

Subject: ( ESNUG 345 #5 )  Well, Synchronicity's DesignSync Works For Us

> We used ClearCase for an ASIC and it works great.  We used DesignSync from
> Synchronicity on one project.  It was a disaster.  DesignSync is like an
> early beta version of RCS that they charge a lot of money for!  We had
> problems with corrupted databases and lost files.  In ClearCase we see
> nothing like this.  It just works in the background as a good revision
> control system should do.  In the past we have used RCS on many projects
> but it started to consume a lot of the designers' time to manage the system
> and create scripts so we started to look around for alternatives.  After
> the DesignSync mistake we have settled on ClearCase, which is also the
> system our software designers are using.
>
>     - [ We Got Burned, Too ]


From: Grant Erwin <gwe@cypress.com>

Hi, John.

Personally, I think your readers are being a little tough on Synchronicity's
DesignSync.  It is a part of the very fabric of life here at Cypress, and
we are successful in using this tool to allow us to distribute design work
among Cypress design centers worldwide.  Sure, we have a few problems, and
sure, it takes longer than we'd like to copy data from India or the UK.  We
have to devote CAD resources to its maintenance.  But it is working for us,
and we have made dozens of chips successfully with it.

I do the Synchronicity administration for Cypress's Washington State design
center, and I came from companies which used ClearCase and before that RCS
and SCCS.  I don't have anything really bad to say about any of that other
software, except that I have never seen it work where design teams on
several continents design successfully with the data visible everywhere.

I believe that this tool does have room for improvement, but I just don't
see the horror story scenario here.  I'm not speaking for Cypress, this is
just my own viewpoint.

    - Grant Erwin
      Cypress                                    Washington State


( ESNUG 364 Item 12 ) -------------------------------------------- [02/01/01]

From: [ A Synopsys AC in Dallas ]
Subject: A Tcl Script That Tricks PhysOpt Into Cleaning Up Net Congestion

Hello John,

I wanted to share with your readers a Tcl script a fellow AC here in Dallas
developed to help deal w/ a severely congested design in Physical Compiler.

Here is a little background.  We were using Physical Compiler v1.1 to
synthesize a datapath module which was a performance AND routing headache.
The routing headache resulted from the output pin placements on the
module.  All of the outputs were packed very tightly in the lower right
hand corner of the module, producing a significant congestion "hot spot" in
that area.  Physical Compiler provided several different ways to attack
congestion in the design and all of these had a positive effect.  The issue
we faced was that these approaches work globally on the design.  We wanted
a way to focus the congestion relief algorithms to work specifically in
this one area.  Hence, the creation of the Tcl script.

The Tcl script is a procedure called routing_obs.  This procedure provides
a way to define a region on the die and reduce the routing resources
available within that area.  The user can reduce the routing resources in
the vertical and/or horizontal directions.  This is accomplished by creating
"stripes" of routing obstructions in the region, with the user determining
the width of the stripes as well as their spacing.  The result is that as
Physical Compiler optimizes the design for congestion, this region will be
seen as having less routing resources than the overall design, and the
congestion algorithm will thus work harder.

Example:

      ___________________________________________
      |                                          |
      |                                          |
      |                                          |
      |                ______ {5000 10000}       |
      |               |      |                   |
      |               |      |                   |
      |               |      |                   |
      | {1000 4000}   |______|                   |
      |__________________________________________|


Assume that you had the situation above and the region where you wish to
focus the congestion relief is the smaller box above.  The coordinates of
the lower left hand corner are {1000 4000} and the coordinates of the upper
right hand corner are {5000 10000}.   You have decided to reduce routing in
the horizontal direction only and want to create a routing obstruction 
placed every 300 microns.  The routing_obs routine would be called as:
     
    routing_obs 1000 4000 5000 10000 horizontal 300 horiz_obs 

Note that the width of the routing obstruction is determined by the
definition of the site variable in the routine.  For this example, each
obstruction would be 13.6 microns wide.  The name of each obstruction will
be horiz_obsXX, where XX is a counter value.

To use the procedure, add the section below to your .synopsys_dc.setup file
or create a file with the section and source it as part of your Physical
Compiler setup.

  proc routing_obs {bound_llx bound_lly bound_urx bound_ury method step
  prefix} {
  #
  # Notes:
  # 1) The procedure allows the user to build three types of routing 
  #    congestion grids based on the "method" used when calling the 
  #    procedure
  #      horizontal - builds a grid of obstructions on the horizontal 
  #                   metal layer
  #      vertical   - builds a grid of obstructions on the vertical 
  #                   metal layer
  #      checker    - builds a grid of obstructions on both the vertical
  #                   and horizontal metal layer
  # 2) The procedure assumes the coordinates used in the procedure  
  #    call are in 1 micron units.  This is the value expected by the 
  #    create_obstruction command.   
  # 3) The site variable below is the Y-dimension of your placement 
  #    site defined in the lef file. The site height is used to set 
  #    the width of the routing obstruction.   For this case it seemed 
  #    that a good target was to make a routing obstruction that would 
  #    cover the width of a placement row. The user can set this to 
  #    whatever they wish but this gives a good starting point.
  #
  #    ------> The 13.60 um site value in the script should be adjusted
  #            based on the user's library.
  #
  # 4) The layers used for the routing obstructions are hard-coded below.
  #    Make sure this corresponds to what is defined for your routing 
  #    layers.  In this example, the technology had 5 layers of metal 
  #    organized in an HVHVH fashion.   For this example below, I use 
  #    MET3 for horizontal and MET4 for vertical.
  #   
  #   ------> The user may also have different layer names: M4, MET4, 
  #           or METAL4, and will need to modify the script accordingly.

  #
  # setup default variables
  #
     set site 13.60
     set location_x $bound_llx
     set location_y $bound_lly
     set counter 0

     if {($method == "horizontal") || ($method == "checker")} {
  #
  # set obstructions above start point
  #    -- each obstruction is one site height ($site) tall
     while {$location_y < $bound_ury} {   
       set temp [expr $location_y + $site] 
       set x1 $bound_llx
       set y1 $location_y 
       set x2 $bound_urx
       set y2 $temp
  #
       set temp_list [concat $x1 $y1 $x2 $y2]
       puts "coordinates = $temp_list"
       create_obstruction -name "$prefix$counter" -layer MET3 -coordinate $temp_list
       set counter [expr $counter + 1]
       set location_y [expr $location_y + $step]
     }
     };# end of the horizontal method
  #
     if {($method == "vertical") || ($method == "checker")} {

  # set obstructions right of start point

     while {$location_x < $bound_urx} {
       set temp [expr $location_x + $site]
       set x1 $location_x
       set y1 $bound_lly
       set x2 $temp
       set y2 $bound_ury
  #
       set temp_list [concat $x1 $y1 $x2 $y2]
       puts "coordinates = $temp_list"
       create_obstruction -name "$prefix$counter" -layer MET4 -coordinate $temp_list 
       set counter [expr $counter + 1]
       set location_x [expr $location_x + $step]
     }
     };# end of the vertical method

     echo "The total number of obstructions created = $counter"

  };# end of the procedure
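
To see what the horizontal branch will generate before running it in
Physical Compiler, a dry run in plain tclsh can help.  The sketch below is
not part of the posted script.  preview_horizontal_stripes is a hypothetical
helper that repeats the same loop arithmetic and returns the boxes instead
of calling create_obstruction.

```tcl
# Hypothetical dry-run helper (not part of the posted script): returns the
# {x1 y1 x2 y2} boxes that the horizontal branch of routing_obs would hand
# to create_obstruction, without calling any tool command.
proc preview_horizontal_stripes {llx lly urx ury step {site 13.60}} {
    set boxes {}
    for {set y $lly} {$y < $ury} {set y [expr {$y + $step}]} {
        lappend boxes [list $llx $y $urx [expr {$y + $site}]]
    }
    return $boxes
}

# Same region and step as the example call in the post.
set boxes [preview_horizontal_stripes 1000 4000 5000 10000 300]
puts "stripes: [llength $boxes]"
puts "first:   [lindex $boxes 0]"
puts "last:    [lindex $boxes end]"
puts [format "blocked share of region height: %.1f%%" \
    [expr {100.0 * [llength $boxes] * 13.60 / (10000 - 4000)}]]
```

For the example region this gives 20 stripes, starting at y = 4000 and
ending at y = 9700, which together cover about 4.5% of the region's height.
A smaller step or a larger site value raises that share.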


Using this script, we were able to focus the congestion algorithms and
eliminate the problem.

Two caveats to your readers.  

  1) Use this procedure judiciously.  The congestion relief algorithms in
     Physical Compiler can solve most of your congestion issues, however,
     if you have a design with significant congestion problems in a
     localized area, this Tcl procedure might help!

  2) Whenever you are dealing with congestion issues, you may find that
     displaying the congestion in a GUI allows you to quickly see the
     problem.  I would encourage your readers to use the Physical Compiler
     GUI to visually look at the congestion and observe the effect of using
     this procedure.

Thanks,

    - [ A Synopsys AC in Dallas ]


( ESNUG 364 Item 13 ) -------------------------------------------- [02/01/01]

From: Sai Kishore R <skishore@virtualipgroup.com>
Subject: Is An All NAND Or NOR Gate Lib The Best Lib For Design Compiler???

Hi John,

This is an observation on one of my designs.

  1. I compiled the design with normal techniques using Design Compiler.

  2. I compiled the same design with the dont_use attribute set on all
     cells except NAND and NOR gates.  I got good timing with less area
     than in the previous case.

Is it a unique case for my library or true for all libraries?  If it is
true, why bother with other cells?  Do you have any idea about this?

I tried this with a 16-bit multiplier in WSMC 0.25 technology.
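
One way to repeat this experiment is to generate the set_dont_use commands
from a list of cell names.  The Python sketch below is only an
illustration: the library name "mylib", the cell names and the keep
pattern are all hypothetical.  In a real run you would probably need to
widen the pattern, for example to keep the sequential cells of a design
that has registers.

```python
import re


def dont_use_commands(lib, cells, keep_pattern=r"^(NAND|NOR)"):
    """Build one set_dont_use line for every cell NOT matching keep_pattern."""
    keep = re.compile(keep_pattern)
    return [f"set_dont_use {lib}/{c}" for c in cells if not keep.search(c)]


# Hypothetical cell list; a real one would come from the library report.
cells = ["NAND2X1", "NOR2X1", "AOI21X1", "XOR2X1", "INVX1", "DFFX1"]
commands = dont_use_commands("mylib", cells)
for cmd in commands:
    print(cmd)
```

The printed lines can be sourced into dc_shell before the compile.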

    - Sai Kishore
      Qualcore logic ltd.


( ESNUG 364 Item 14 ) -------------------------------------------- [02/01/01]

Subject: ( ESNUG 363 #11 )  Cadence PBOPT, Synopsys LBO, & FlexRoute Tricks

> As we get a good idea of the block sizes and shapes we floorplan the top
> level, with around 15 blocks & the I/O cells.   We ship the top level off
> to the third party & they insert clocks at the top level.  So far so good.
>
> Now they try to fix the timing on the long nets between blocks using
> QPOPT, and the results are horrendous.  At this point we would've already
> generated timing models for each of the blocks, and they are inputs to
> the QPOPT runs.  In many different attempts at top level timing
> optimization, QPOPT has not been able to put in an appropriate number of
> buffers/repeaters to achieve reasonable timing.  I did some experiments
> with long nets and various numbers of buffers and found that I should be
> able to go 5 mm in about 1.2 nS even with a less than optimum repeater
> scheme.  QPOPT isn't even getting close.  When we talked to Cadence R&D
> about this they basically said that QPOPT isn't intended to do this type
> of optimization.
>
> So my question is, how are other people doing the timing optimization
> (buffer and repeater insertion) at the top level for a hierarchical
> physical flow?
>
>    - Chris Simon
>      General Dynamics Information Systems       Minneapolis, MN


From: [ Intel Inside ]

John,

As usual, please keep me anonymous.

One challenge of any design flow is to build a methodology that works around
any weaknesses that your tools may have.  This is usually not enough.  You
need to do other things to help make your tools/methodologies have less work
to do.  For example, it's standard practice to either flop all signal inputs
or outputs at partition boundaries to get the best synthesis results and to
help top level timing issues.  I don't know if you did this for your design.
You can take this a step further and have all partition inputs and outputs
flopped.  Having no combinational logic between flop boundaries at partition
edges may have some implications on your design, but will surely give your
signals more time to traverse the top level real estate.  You can even go
further and duplicate your output flops to ensure that all signal outputs
have a fanout of one input.  This really helps solve top level timing
issues.  Naturally, these have implications for your RTL, but it's food for
thought.  If you can make the tools have an easier problem to solve, less
hand effort will be required.
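
To put a rough number on "more time to traverse the top level", here is a
back-of-envelope Python sketch.  It assumes a flop on both sides of the
partition boundary, so the path holds only top-level wire and repeaters,
and it assumes a well-repeated wire has roughly linear delay at Chris's
quoted 5 mm in 1.2 ns.  The clock period, clock-to-Q, setup and margin
values are hypothetical.

```python
def top_level_reach_mm(clock_ns, clk_to_q_ns, setup_ns, margin_ns, ns_per_mm):
    """Distance a flop-to-flop top-level net can cover in one clock cycle."""
    available = clock_ns - clk_to_q_ns - setup_ns - margin_ns
    return max(available, 0.0) / ns_per_mm


NS_PER_MM = 1.2 / 5.0   # from the quoted "5 mm in about 1.2 nS"

# Hypothetical 200 MHz clock with made-up flop and margin numbers.
reach = top_level_reach_mm(clock_ns=5.0, clk_to_q_ns=0.4, setup_ns=0.2,
                           margin_ns=0.4, ns_per_mm=NS_PER_MM)
print(f"reach = {reach:.1f} mm")
```

Any combinational logic left between the boundary flop and the port comes
straight out of the available time, which is the argument for flopping
both sides.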

    - [ Intel Inside ]

         ----    ----    ----    ----    ----    ----   ----

From: "Lee Keep" <Lee_Keep@eur.3com.com>

Hi John,

I read Chris' article on ESNUG with interest as I have been working in the
area of hierarchical physical design timing closure for a few years now.
I have a few questions for my own clarification and some suggestions that
may help.  You may well have tried some of these already - but here goes
anyway....

 1) Timing budgets for inter-block paths

    You say that your block level timing is pretty much OK - but how did you
    allocate timing budgets for paths that cross between blocks during the
    synthesis phase?  Did the timing budget include any allowance for top
    level interconnects based on your 1.2 ns per 5 mm observation?

 2) Top level routing

    Did you attempt any form of top level routing prior to closing the
    timing of the sub-blocks?  Maybe a methodology where you complete
    the routing of your top level floorplan (black box sub-blocks), followed
    by a parasitic extraction and propagation of the values to your
    sub-blocks could help.  Even a top level global route may provide a
    better starting point for your sub-block constraints during synthesis.
    That way, you pass some of the top-level effort down into the
    sub-blocks, which you know are easier to close.

 3) Sub-block I/O buffering

    I personally recommend a strategy where you insert buffers, connected to
    all signal I/O ports within each of your sub-blocks.  We use a dc_shell
    script to do this post synthesis.  These buffers should be given
    priority during sub-block placement to ensure they are placed in a cell
    row as close to the I/O port as possible.  We use Avanti P&R and their
    TDF constraint format that allows these weightings to be applied.  By
    choosing a sensible naming convention for these buffers you can also
    highlight them post-placement to ensure they've gone in the correct
    location.  It's been a while since I used SEDSM but I seem to remember
    something similar can be achieved.  Ensure you 'dont touch' these buffers
    during any subsequent optimisation passes as QPOPT may well try to
    remove them at the sub-block level.

    This approach has helped us minimize the number of repeaters required in
    the top level layout - but some of our longest top level nets still
    needed some manual work.

 4) Repeater insertion / ECO placement

    You indicate that QPOPT can't insert enough buffers to do the job.  How
    densely placed is your logic?  I've seen placement utilisations so high
    that they prevent the optimiser from inserting the number of buffers it
    wants.  However, I expect it's more to do with the tool running out of
    steam.  I also remember hearing a Cadence get-out clause that these
    tools could only provide incremental timing improvements of 5-10% -- and
    those were the days of PBOpt.  Seems this benchmark probably holds true
    today.

    How happy are you that your repeaters are being placed in a sensible
    location by QPOPT?  We use Synopsys LBO to fix our timing-broken netlist
    (as opposed to a layout-engine based optimiser such as QPOPT or Saturn).
    We found that LBO was able to add a sufficient number of repeaters, but
    when it came to the ECO placement -- they were going in the wrong place
    -- sometimes causing the timing of a particular path to get even worse.
    The suggested location in our PDEF was not honoured by the ECO placer,
    requiring some manual placement work.

 5) Sub-block pin optimisation

    When you floorplan the top level, are you doing the pin optimisation of
    sub-block interfaces or is the third party?  Just wondering if sub-block
    port locations are as optimal as possible in your floorplan?  I guess
    with 15 blocks you are constrained in many directions when it comes to
    this so finding the optimal solution is tough.
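
The post-placement check on the I/O buffers in point 3 can also be
scripted instead of eyeballed.  The Python sketch below assumes a naming
convention of "IOBUF_" plus the port name, and assumes placed cell and
port coordinates have already been pulled out of the layout database into
dictionaries.  The prefix, the 50 um threshold and the example data are
all hypothetical.

```python
def misplaced_io_buffers(cells, ports, prefix="IOBUF_", max_dist_um=50.0):
    """cells and ports map name -> (x, y) in um.

    Returns (buffer_name, distance) for each I/O buffer placed farther than
    max_dist_um (Manhattan) from its port; distance is None when no port
    matches the buffer name.
    """
    bad = []
    for name, (cx, cy) in cells.items():
        if not name.startswith(prefix):
            continue                      # not an I/O buffer
        port = name[len(prefix):]
        if port not in ports:
            bad.append((name, None))
            continue
        px, py = ports[port]
        dist = abs(cx - px) + abs(cy - py)
        if dist > max_dist_um:
            bad.append((name, dist))
    return bad


# Hypothetical placement data.
cells = {"IOBUF_data_in": (5.0, 100.0),
         "IOBUF_data_out": (400.0, 220.0),
         "U123": (50.0, 50.0)}
ports = {"data_in": (0.0, 100.0), "data_out": (0.0, 200.0)}
print(misplaced_io_buffers(cells, ports))
```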

I don't think there's a magic solution here that will save the day yet - the
best you can achieve with many of these tools is to get the amount of manual
repeater tweaks into the tens rather than hundreds/thousands.

BTW, what clock speeds and process geometry are we talking here?

    - Lee Keep
      3Com                                       UK

         ----    ----    ----    ----    ----    ----   ----

From: [ A Synopsys FlexRoute AE ]

John,

I am a member of the Synopsys FlexRoute CAE team, and have been working on
top-level repeater/buffer insertion within FlexRoute for the last 6 months
or so.  This capability just became available in our latest (Rev1.5)
FlexRoute release as of January 26, 2001.  We have done extensive in-house
testing, and are confident in our algorithms, but I must admit that no
customer has used it on a production design yet. 

FlexRoute is a gridless router, designed specifically as a top-level router
in a hierarchical system.  We knew all along that repeater insertion was
critical, and have been working on it for some time.

There are two basic modes, timing-driven and length-based, each of which
I will describe briefly.

Timing-Driven Repeater Insertion
--------------------------------

1. Requires a TBEF (Timing Based Exchange Format) constraint file that
   contains the following info for each top-level net:

   a. driver cell name, and hierarchical RC tree representation from inside
      the hierarchical block to the top-level pin of connection on that 
      block (usually on the edge of the block, but not required).
   b. receiver cell name(s), and also hierarchical RC info as described
      above.  This also includes a section to describe the arrival time
      budget and required input slew rate.

   Note: This is an ASCII format which can be easily generated with PERL
         scripts etc.  Future FlexRoute versions will derive this info
         directly from a design .db and/or PrimeTime STAMP/ILM models.

2. Requires a .db timing library database for the standard cells to be
   used for repeater insertion, and the driver and receiver cells.

3. The most useful option we have found from customer feedback is a rise 
   time (the same as a Max. Transition DRC check in DC/PC) optimization. 
   The user specifies a list of inverters and buffers which can be used, 
   and FlexRoute will insert the inverters or buffers as appropriate (will
   not change signal polarity of course).

4. The end result is legal, non-overlapping repeater locations, based on
   defined DEF ROW/SITE locations.  No placement legalization step is
   required.

5. We have tested this on a variety of net types on large designs, including
   200 pin reset/scan_enable type nets, which get reasonable solutions of
   10-20 buffers, with all receiving pins meeting the rise time spec.

Length-Based Repeater Insertion
-------------------------------

1. All that is required is specification of a single inverter cell and 
   single buffer cell.

2. The design team must select these cells, and a specified "length" which
   will meet their rise time or other timing goals.

3. This is obviously very fast, and shows good promise for top-level
   repeater insertion.

4. The end result is the same as in Timing-Driven Repeater Insertion:
   legal, non-overlapping repeater locations.
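
As a generic illustration of the length-based idea (this is not FlexRoute's
algorithm, which is not described here in that detail), the Python sketch
below walks a routed two-pin net and drops a repeater every "spacing" units
of routed length.  The path representation, the function name and the
example numbers are assumptions.  There is no snapping to DEF ROW/SITE
locations and no handling of multi-pin nets.

```python
import math


def repeater_points(path, spacing):
    """path: list of (x, y) route vertices, driver first, receiver last.

    Returns repeater locations spaced 'spacing' apart along the route, with
    none placed on the receiver pin itself.
    """
    segments = list(zip(path, path[1:]))
    total = sum(math.hypot(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in segments)
    count = max(math.ceil(total / spacing) - 1, 0)
    targets = iter(k * spacing for k in range(1, count + 1))

    points = []
    walked = 0.0                       # routed length before this segment
    target = next(targets, None)
    for (x0, y0), (x1, y1) in segments:
        seg = math.hypot(x1 - x0, y1 - y0)
        while target is not None and target <= walked + seg:
            t = (target - walked) / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            target = next(targets, None)
        walked += seg
    return points


# Hypothetical L-shaped route, 6000 um long, one repeater per 1000 um.
route = [(0.0, 0.0), (4000.0, 0.0), (4000.0, 2000.0)]
print(repeater_points(route, 1000.0))
```

Working from the routed topology rather than a straight-line estimate is
what lets the spacing follow detours around obstacles.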

One of the main advantages of both of these algorithms is that they are
based on FlexRoute coarse or detailed route net topologies, which fully take
into account all routing obstacles, as opposed to other techniques that may
use simplistic Steiner estimates for routes.  The end result is buffer
locations that take into account routability, with stable and predictable
results that lead to timing closure.

    - [ A Synopsys FlexRoute AE ]


( ESNUG 364 Item 15 ) -------------------------------------------- [02/01/01]

Subject: ( ESNUG 363 #14 )  Pass-Thrus In Hierarchical Physical Designs

> We are having our first experience with a large hierarchical design and
> are trying to find the best way to handle global routing of signals that
> cross blocks.  We don't think the top-level routing ability of Apollo will
> do a good enough job, and it also won't allow us to place buffers or flops
> where we need them.  We are thinking instead of embedding pass-thrus in
> different blocks where needed to better facilitate the travel of these
> global signals across the chip.  As best we can tell, this will require
> adding the pass-thru connectors (and flops if we use them) to the RTL of
> the blocks containing the pass-thrus.  However, we'd prefer not to mess
> directly with the RTL that the designers are working with.
>
> One idea we had was to put wrappers around the top-level RTL blocks
> and add the pass-thru's (and flops if needed) to the wrappers...
>
>       - Jeff Winston
>         Conexant Systems


From: [ Puff, the Magic Dragon ]

Hi John,

First of all, please keep me anonymous.

One proven way to handle top level net timing is to insert repeaters on a
wire-length basis.  This means that you insert a buffer every X um of wire.
The number is calculated such that the transition at the input of the
receiving buffer won't exceed your max_transition rule, assuming a
typical transition time at the input of the transmitting buffer.
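
A crude way to get a first estimate of X is sketched below in Python.  It
swaps in a simple Elmore RC model (driver resistance, distributed wire RC,
receiver input capacitance) and treats the 10-90% transition as about 2.2
time constants.  It ignores the input-slew dependence mentioned above, so
treat the answer as a starting point to check against real library data.
All the electrical values in the example are hypothetical.

```python
def transition_ns(length_um, r_drv, r_wire, c_wire, c_in):
    """Estimated transition at the receiver for a wire of length_um.

    r_drv in ohm, r_wire in ohm/um, c_wire in fF/um, c_in in fF.
    """
    tau = (r_drv * (c_wire * length_um + c_in)
           + r_wire * length_um * (c_wire * length_um / 2.0 + c_in))
    return 2.2 * tau * 1e-6          # ohm * fF = 1e-6 ns


def max_spacing_um(max_transition_ns, r_drv, r_wire, c_wire, c_in,
                   hi=1e5, tol=0.1):
    """Largest wire length (within tol) whose transition meets the limit."""
    def meets(length):
        return transition_ns(length, r_drv, r_wire, c_wire, c_in) <= max_transition_ns

    if not meets(0.0):
        return 0.0                   # driver cannot even drive the receiver pin
    if meets(hi):
        return hi                    # limit never reached in the search range
    lo = 0.0
    while hi - lo > tol:             # bisection: transition grows with length
        mid = (lo + hi) / 2.0
        if meets(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical buffer and wire parameters, 1.0 ns max_transition.
x_um = max_spacing_um(1.0, r_drv=500.0, r_wire=0.1, c_wire=0.2, c_in=10.0)
print(f"insert a buffer roughly every {x_um:.0f} um")
```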

Now, you can think of two cases:

  1. You have channels between the blocks, where the top level nets route.
     In this case you should put these repeaters in the channels.  This is
     tricky, since usually you don't have placement regions there, and
     connecting the buffers to the supplies isn't automatic.

     One way to overcome this is to always define some placement regions
     at the top level, around the blocks.

  2. You route the top level nets through the blocks.  In this case, adding
     the buffer is easy.  You just preplace the buffer where you want, and
     let the P&R do the rest.

Best Regards,

    - [ Puff, the Magic Dragon ]


( ESNUG 364 Networking Section ) --------------------------------- [02/01/01]

San Diego, CA -- Magis Networks (pre-IPO) needs layout & verification gurus
for high-speed wireless home networking chips.  "gbell@magisnetworks.com"

Marlborough, MA -- Axiowave Networks (pre-IPO) seeks ASIC and board-level
design and verification engineers. No headhunters. "ewhitney@axiowave.com"


============================================================================
 Trying to figure out a Synopsys bug?  Want to hear how 11,000+ other users
    dealt with it?  Then join the E-Mail Synopsys Users Group (ESNUG)!
 
       !!!     "It's not a BUG,               jcooley@world.std.com
      /o o\  /  it's a FEATURE!"                 (508) 429-4357
     (  >  )
      \ - /     - John Cooley, EDA & ASIC Design Consultant in Synopsys,
      _] [_         Verilog, VHDL and numerous Design Methodologies.

      Holliston Poor Farm, P.O. Box 6222, Holliston, MA  01746-6222
    Legal Disclaimer: "As always, anything said here is only opinion."
 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com