> I loved Gregg Lahti's letter the instant I read it. He wrote about
> using VCS again after not using it for 4 years. He was disappointed
> with its primitive GUI, that PLI 2.0 was missing, and that VCS lacked
> useful user documentation. ...
>
> I called Mark Warren at Synopsys. I grinned as I told him that he should
> thank Gregg for starting such a useful VCS discussion. "Yea, yea, yea,"
> Mark cynically replied. "I'll send Gregg a fruit basket from Synopsys."
>
> - from "Life In The Fast Lane" (EE Times 12/03/01)
From: "Gregg Lahti" <gregg.lahti@corrent.com>
Hi, John,
I received two fruit baskets from the Synopsys people! One from Mark
Warren (the VCS Marketing Manager) and the other from Tim Schneider (our
local AE). The baskets were great, and my coworkers closed in on both
of them like hungry piranhas. Because it was inedible, I got to keep
the yellow lightbulb VCS squishy thing that Tim threw in. I assume it's
to squeeze when I get irritated at my failing VCS sims or it's a really
odd cat toy.
Now I wonder what I would have received if I had complained really hard
about PhysOpt, BSD Compiler, DC ... ;^)
Enclosed is a picture; maybe you could put it up on DeepChip?
- Gregg Lahti
Corrent Corp. Tempe, AZ
( ESNUG 387 Subjects ) ------------------------------------------- [01/23/02]
Item 1: ( ESNUG 386 #10 ) Another Customer Pleads For C-Level's Technology
Item 2: ( ESNUG 386 #3 ) Simplex, Star-RC, HSIM, Calibre, Spectre, & SPICE
Item 3: ( ESNUG 386 #1 ) This PhysOpt Bug Only Involves Buffers/Inverters
Item 4: ( ESNUG 385 #11 ) Can't Duplicate This "insert_dft -physical" Bug
Item 5: Cadence Ambitware & Synopsys DesignWare vs. Rolling Your Own IP
Item 6: ( ESNUG 386 #5 ) Readers Review The "Advanced Chip Synthesis" Book
Item 7: Get Cadence Dracula To Accept The DFII Database Instead Of GDSII
Item 8: How Do I Eliminate Extraneous DC Clock-Gating Hierarchy/Wrappers?
Item 9: Burned By "interface_timing"; Where Can We Learn Advanced LIBERTY?
Item 10: A Free Alternate User-Written Tcl Design Compiler / PrimeTime GUI
Item 11: Where To Download Specman "e" Mode Editors For Either Emacs Or VIM
Item 12: Seeking A Code Coverage Tool That Covers Schematic Entry Blocks
Item 13: ( ESNUG 385 #14 ) Get The Absolute Minimum PLI With Cadence NC-SIM
Item 14: ( ESNUG 386 #16 ) BSD Compiler Is Getting Better In Fits & Starts
Item 15: ( ESNUG 386 #6 ) Formality And Verplex Each Have Their Own Issues
Item 16: How You Code Latches & Flip-Flops *Greatly* Impacts VCS Runtimes
The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com
( ESNUG 387 Item 1 ) --------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #10 ) Another Customer Pleads For C-Level's Technology
> C-Level Design had an incredible tool which allowed you to write your HDL
> in ANSI-C or C++. Their programming style guide showed you how to
> transform sequential C into something that could emulate the parallel
> operation of hardware. The key was that you could write the C-HDL at a
> low level (identical to Verilog or VHDL), but C simulation speeds were
> 3 orders of magnitude faster on large simulations. This C-HDL design
> could then be automatically translated to synthesizable Verilog - which
> synthesized to remarkably fast logic using Synopsys Design Compiler. ...
>
> If Synopsys takes this C technology and creates the next generation HDL
> that is capable of simulating at these remarkable speeds, I will be
> very happy.
>
> - Dan Joyce
> Compaq Austin, TX
From: "Doug Blettner" <dblettner@yottanetworks.com>
Hi John,
I just read Dan's ESNUG post and think it's right on. Thank Dan for
taking the time and effort to post it. I hope it clears up some of those
misconceptions in the engineering community about C-Level's tools. I, too,
thought it was great technology. We didn't get to complete a project with
it, but we did buy it and were starting to use it on some projects. (This
was at another company, not Yotta.) My engineers found the learning curve
to be mild, liked using the tool, and got very good performance improvement.
(The projects didn't complete because of business reasons.)
When I heard about the Synopsys purchase I had the same reaction as Dan. I
hope the technology continues, but I'm afraid it will suffer the same fate
as MOTIVE, which we owned and were using successfully until Synopsys bought
and killed it in favor of PrimeTime.
Do you know if it's still possible to buy the C Level tools now?
- Doug Blettner
Yotta Networks, Inc. Plano, TX
( ESNUG 387 Item 2 ) --------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #3 ) Simplex, Star-RC, HSIM, Calibre, Spectre, & SPICE
> There is no known Calibre/Star-RC flow. Avanti and Mentor have been
> discussing the possibility of this flow. I'm curious if other readers
> would be interested in it; we are very willing to do this if customers
> request it. Star-RC (and Star-RCXT) right now only reads LVS data
> from Hercules.
>
> - John Lee
> Avanti Fremont, CA
From: "Joel Jensen" <JensenJ@sharpsec.com>
I use Calibre for LVS and Star-RC (not XT) for extraction and would like to
use the two together for custom designs.
Count my vote.
- Joel Jensen
Sharp Microelectronics Camas, WA
---- ---- ---- ---- ---- ---- ----
From: "Grego Sanguinetti" <grego@accelerant.net>
Hi, John,
Please count us in as an interested customer.
We will soon be attempting to do an evaluation of LPE tools. The two tools
at the top of the stack are Simplex and Avanti. We currently own Calibre LVS
but not Hercules. It's going to be difficult to do an eval of Star-RCXT
without getting netname correspondence information out of an LVS run.
Simplex claims to be able to do this.
Ultimately we would like to get a backannotated netlist so that we can feed
it into either Spectre or Hsim, which is what we currently do with Xcalibre.
Our semicustom flow will be easier to deal with because it will be able to
use *SPF type output.
But we are a mixed signal house. At one point or another we have to get
down to a transistor level and get as accurate as possible.
We are currently taking the SPICE netlist output from Xcalibre LPE/LVS runs
and doing our own netlist reduction and inductor extraction. We then
rebuild the SPICE netlist (to include whichever LPE subcircuit the user
wants) and feed the recombined netlist to HSIM. If the user wants to
use Spectre or SpectreRF, then we use a tool built around Gemini to map
naming so that they can do hierarchical probing. That way we can probe
internal nodes without having to know which nodes are of interest prior
to the LPE run.
It is always tough evaluating LPE tools at the transistor level, but having
a correspondence file to start with gives you a head start. Otherwise
it is a daunting task to figure out what you are looking at.
- Grego Sanguinetti
Accelerant Networks, Inc.
( ESNUG 387 Item 3 ) --------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #1 ) This PhysOpt Bug Only Involves Buffers/Inverters
> One of the ACs in San Diego told me about a situation where PhysOpt can
> put "extra inversions" in your netlist and create bad logic. This
> problem can occur if your library has certain cells that can act as *both*
> a buffer and an inverter.
>
> - Mike Montana
> Synopsys, Inc. Dallas, TX
From: "Chris Kiegle" <ckiegle@us.ibm.com>
To: "Mike Montana" <montana@synopsys.com>
Hi Mike,
I saw your ESNUG 386 #1 post and I wanted to check on something....
You said it is only a problem if the pins are in a different order. All
libraries have to have an inverter. If they also have a buffer cell, there
is always going to be a mismatch in order between the multi-output cell and
one of the two (inverter or buffer), right? Does this problem also impact
sequential elements?? In other words, if I have 3 D flip flop cells, one
that has a D output, one that has a ~D output, and one that has both outputs,
will it also hit this problem??
I'm just wondering if there is any relation to the library functional
descriptions.
- Chris Kiegle
IBM Microelectronics Burlington, VT
---- ---- ---- ---- ---- ---- ----
From: "Mike Montana" <montana@synopsys.com>
To: "Chris Kiegle" <ckiegle@us.ibm.com>
Hello Chris,
Here are some more details regarding the bug documented in ESNUG 386 #1.
First of all, let me assure you that the bug does NOT impact optimization
of sequential cells or other combinational cells. The problem is strictly
limited to the optimization of buffers/inverters.
I've spoken with the product team to get more specifics of when this
problem occurs. The problem ONLY occurs when the PhysOpt optimization
engine tries to replace a simple buffer (one input and one non-inverted
output) or simple inverter (one input and one inverted output) with a
complex buffer (one input with a non-inverted and inverted output). All
other types of optimizations with these cells work fine.
Keep in mind that this bug will be fixed in a PhysOpt patch release
scheduled for estimated availability in the first part of February 2002.
- Mike Montana
Synopsys, Inc. Dallas, TX
( ESNUG 387 Item 4 ) --------------------------------------------- [01/23/02]
Subject: ( ESNUG 385 #11 ) Can't Duplicate This "insert_dft -physical" Bug
> 1) insert_dft seems to spend a *long* time fixing DRC violations. We
> are getting better runtimes and lower gatecounts with insert_scan
> followed by a 'physopt -incremental -eco'.
>
> - Neel Das
> Corrent Corp. Tempe, AZ
From: "Andrew Copper" <acopper@synopsys.com>
John,
So far we have not seen any such issues in-house or from other customers
using "insert_dft -physical" concerning long run-times to fix DRCs or
larger gate counts (as compared to using a combination of "insert_scan
-physical" and "physopt -inc -eco" commands.)
The "insert_dft -physical" command automatically performs any necessary
placement legalization and timing optimization apart from fixing
testability violations. It replaces the need to run three separate
commands of "insert_scan -physical" followed by "legalize_placement -eco"
and "physopt -incremental".
Maybe Neel could send us a test case?
- Andrew Copper
Synopsys, Inc. Mountain View, CA
( ESNUG 387 Item 5 ) --------------------------------------------- [01/23/02]
Subject: Cadence Ambitware & Synopsys DesignWare vs. Rolling Your Own IP
> Is there a timing performance penalty when using the Synopsys DesignWare vs.
> trying to code the module at a finer granularity? For instance, an
> adder. There are various adder architectures to trade off; I suspect
> DW uses one approach, but I have not researched this.
>
> What about comparators to check for limits reached?
>
> Have any folks gone through this trade-off? i.e., is the synthesis tool smart
> enough to use one approach over the other based on the width of the
> operands?
>
> - Jim O'Keefe
From: Andrew MacCormack <andrewm@tality.com>
DesignWare does do trade-offs of different adder architectures for you, as
does the Ambitware stuff in the Cadence synthesis tool. If you really must
use DC, you can improve its default results (if you have the appropriate
licence) by adding the following:
set synthetic_library "dw_foundation.sldb"
set dw_prefer_mc_inside true
Why isn't this the default, then? Because you need to have an extra DC
licence to use it. Don't recall which one, sorry. There is a similar
"datapath" option when running Ambit which I think is specified on the
ac_shell command line.
- Andrew MacCormack
Cadence/Tality Livingston, Scotland
---- ---- ---- ---- ---- ---- ----
From: Lars Rzymianowicz <larsrzy@ti.uni-mannheim.de>
Yup, DesignWare does do trade-offs, and it does a pretty good job of it (as
Ambit might do, too. ;-) I wouldn't try to implement arithmetic stuff on
your own. Only full-custom multi-bit cells would be better, I guess.
Concerning the special DC license you're talking about, I think it's
DC-Ultra.
Some useful things are:
- report the selected implementation after a compile w/ "report_resources"
- you can force DC to use one implementation with "set_implementation"
- you can give preferences to different DW implementations
The DesignWare User Guide is also very good, just have a look at it.
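As a rough dc_shell-t sketch of the commands Lars lists (the operator cell
name "add_31" and implementation name "cla" below are made-up placeholders,
not from his letter):

```tcl
# After a compile, report which DesignWare implementations were chosen
# for the synthetic operators in the current design:
report_resources

# Force a particular implementation (e.g. carry-lookahead) on a
# specific synthetic operator cell -- names here are hypothetical:
set_implementation cla [find cell add_31]
```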
- Lars Rzymianowicz
University of Mannheim Germany
---- ---- ---- ---- ---- ---- ----
From: "Ruei-Shiang Suen" <rueishiang1@attbi.com>
For adder/subtractor/inc/dec, DW does a very good job.
For more advanced components, if you do not use all the functions provided,
you will get timing/area penalty.
- Ruei-Shiang Suen
---- ---- ---- ---- ---- ---- ----
From: chrispy@synopsys.com (Chris Papademetrious)
Just a quick response to this.
DesignWare is specifically constructed such that, if you are not using
certain features of components (various status flags in FIFOs, particular
flags in a 6-flag comparator, etc.), the logic will automatically be
optimized away. No dangling logic will be left to implement features which
are unused/disconnected. This is also true when you don't use all the bits
of an arithmetic operator, such as just using the MSBs of an adder. The
mechanism by which this works is boundary optimization.
Also there was discussion about what licenses were needed for:
set synthetic_library "dw_foundation.sldb"
set dw_prefer_mc_inside true
The only license needed for this is a standard DesignWare-Foundation, which
is almost always packaged with a Design Compiler seat. Although the second
line enables the advanced Module Compiler datapath engine to construct the
datapath in Design Compiler, no Module Compiler license is necessary - just
a seat of DC, combined with DesignWare Foundation.
However, you do need DC-Ultra for the carry-save transform stuff which
combines multiple arithmetic operators (the 'partition_dp' command).
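Pulling Chris's points together, a minimal dc_shell-t setup might look like
the following sketch (the link_library line is a common-practice assumption,
not something Chris spelled out):

```tcl
# Enable DesignWare Foundation; dw_foundation.sldb should also be in
# the link path (the link_library handling is an assumed typical setup):
set synthetic_library "dw_foundation.sldb"
set link_library [concat $link_library $synthetic_library]

# Let the Module Compiler datapath engine build datapaths inside DC:
set dw_prefer_mc_inside true

# With a DC-Ultra license, partition_dp applies the carry-save
# transform that combines multiple arithmetic operators.
```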
- Chris Papademetrious
Synopsys, Inc. Mountain View, CA
( ESNUG 387 Item 6 ) --------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #5 ) Readers Review The "Advanced Chip Synthesis" Book
> Someone suggested to me a book named "Advanced Chip Synthesis" by Himanshu
> Bhatnagar from Conexant. Its summary says that the book explains the
> advanced chip synthesis techniques using the Synopsys tools from DC to
> PhysOpt to PrimeTime to Formality. Has anyone read this book? Is it any
> good for knowing about different synthesis methodologies? We have Synopsys
> manuals and documents; is this book any different from what we can learn
> from them? I would appreciate comments on ESNUG on this.
>
> - Jay Pragasam
> Brecis Communications
From: [ Dr. Pepper ]
John, keep me anonymous.
IMHO, the book is just not worth it. It's way overpriced and does not give
any insights at all. You are much better off reading the online SOLD docs
from Synopsys.
- [ Dr. Pepper ]
---- ---- ---- ---- ---- ---- ----
From: "Suresh Gopalrathnam" <suresh@synplicity.com>
Hello John,
I saw Jay's query in ESNUG regarding this book by Himanshu. We have it here
and I have read half of the book. I find it to be pretty outdated and some
sort of a quick-guide to Synopsys manuals. It has basically condensed many
Synopsys manuals into this one book. I don't think it will be useful for
someone who has quite a bit of experience with Synopsys DC and PhysOpt.
- Suresh Gopalrathnam
Synplicity, Inc.
---- ---- ---- ---- ---- ---- ----
From: David Tester <d.tester@iee.org>
Hi John.
We saw preliminary versions of many chapters from this book when I was at
Conexant. Like most VLSI books there are some good ideas, but they're
buried deep and covered with much that you'll already know. Probably a
better investment (time and money) to read the Synopsys manuals again.
- David Tester
---- ---- ---- ---- ---- ---- ----
From: "Jim O'Keefe" <jok@erols.com>
Hi, John,
I needed a jump start for PrimeTime. I had used LSI Logic's static path
analysis tool in the past and had been out of the full flow for a bit.
Mostly I was doing back-end work and sims for a foundry which was behind
the times. For a seasoned person, the book might not be very good. But for
a person starting out, or even as a reference if you don't do synthesis
every day, I think it is very good.
It gives you some good tips, and at least you can develop a strategy. Once
a strategy is in place, the book helps newbies understand why.
I would vote to have the book in your group's library.
- Jim O'Keefe
---- ---- ---- ---- ---- ---- ----
From: Bret Daline <BDaline@focusinfo.com>
This book may be worth reading, but it's not worth buying. I read it. I
would not recommend it, especially at $115. I generally agree with the
somewhat negative reviews at amazon.com, where I bought my copy. If I
haven't talked you out of reading it, I'll give you a great deal on a copy
in real good shape. Stick with SOLD and SNUG papers -- they are cheaper
and better.
- Bret Daline
Focus Enhancements, Inc. Hillsboro, OR
( ESNUG 387 Item 7 ) --------------------------------------------- [01/23/02]
Subject: Get Cadence Dracula To Accept The DFII Database Instead Of GDSII
> I recently modified my existing Dracula rules file to accept DFII database
> instead of GDS2. I am not able to successfully get it to work. Pasted
> below is the error message.
>
> */N* CDSIN43 ( REV. 4.8.03-2000 / SUN-4 S5R4 /GENDATE: 22-MAR-2000/ )
> */N* EXEC TIME =08:34:20 DATE = 8-JAN-2002 HOSTNAME = faith
> *INFO: This CDSIN works with 4.3.4/4.3.3 dfII
> *INFO: Loading dbRead.cxt skillCore.cxt
> *Note: Loading would fail and CDSIN would abort if the context version
> is not compatible.
> Cannot open LIB dp_mod
>
> ** ABORT AT ROUTINE : CDSIN43^@ STATEMENT # = 104
>
> I am wondering why CDSIN43 is being used instead of CDSIN44 (my version #
> is 4.4.5). Also, if the path to the library is /<path>/library, should
> the following variables in the description block be set to.
>
> 1. INDISK = /<path>
> LIBRARY = library
> or
>
> 2. INDISK =/path/library
> LIBRARY = library ?
>
> Thanks for your earliest attention.
>
> - Ashwin Krishnakumar
> University of Utah
From: andrewb@cadence.com (Andrew Beckett)
Hi Ashwin,
You need to have
SYSTEM=CADENCE_44DD
in the description block rather than SYSTEM=CADENCE. (Check this in the
manual in case my memory is failing me.)
- Andrew Beckett
Cadence
---- ---- ---- ---- ---- ---- ----
From: Gopi Bandaru <gbandaru@cadence.com>
Ashwin,
Check the SYSTEM command in the Description block of your Dracula rule
file. It should be set to 'Cadence_44dd' for reading in dfII 4.4.x data.
If SYSTEM is set to 'Cadence', it will default to 4.3.4 dfII data.
- Gopi Bandaru
Cadence
( ESNUG 387 Item 8 ) --------------------------------------------- [01/23/02]
From: "Gregg Lahti" <gregg.lahti@corrent.com>
Subject: How Do I Eliminate Extraneous DC Clock-Gating Hierarchy/Wrappers?
Hi John,
We're using the clock gating and scan insertion features in DC with a custom
clock gating cell. I've specified the clock_gating_style as a sequential
latch with control_point set to before and some appropriate bitwidth and
max_fanout settings. Works fine through synthesis, but the output result
from DC adds a boatload of unique SNPS_CLOCK_GATE* wrappers around the clock
gating cell for all instances where the clock gating cell would be added:
module SNPS_CLOCK_GATE_HIGH_hdwb_0 ( CLK, EN, ENCLK, TE );
input CLK, EN, TE;
output ENCLK;
GCKAD4 latch ( .TE(TE), .EN(EN), .CLK(CLK), .ENCLK(ENCLK) );
endmodule
Power Compiler adds this hierarchy at elaboration to attach a batch of
attributes required to map the clock gating cells, presumably for clock
gating and scan insertion. However, when going through the layout flow,
all of these modules represent an extra layer of hierarchy that needs to
get processed. Is there a way to eliminate this extra layer of hierarchy and
still preserve the clock gating and scan attributes? It seems to me that
DC should just instance the cell and attach the attributes to the instance.
Removing this layer of hierarchy would really cut down the number of
instances that need to get processed further down the layout methodology
path.
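For reference, the kind of setup Gregg describes would look roughly like
this in dc_shell-t (the numeric values below are placeholders, not his
actual settings):

```tcl
# Sketch of a latch-based clock-gating style with the control point
# before the latch; the bitwidth and fanout numbers are assumptions.
set_clock_gating_style -sequential_cell latch \
                       -control_point before \
                       -minimum_bitwidth 4 \
                       -max_fanout 16
```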
- Gregg Lahti
Corrent Corp. Tempe, AZ
( ESNUG 387 Item 9 ) --------------------------------------------- [01/23/02]
From: "Vance Harral" <vanceh@ia.nsc.com>
Subject: Burned By "interface_timing"; Where Can We Learn Advanced LIBERTY?
Hi John,
Our group has a basic understanding of how to build .lib files for our custom
cell designs, with timing and power data. However, we've been scared by a
few gotchas on our recent designs -- the latest of which was the magic
"interface_timing" attribute. These experiences led us to ask, "How does
one become a LIBERTY guru, anyway?" After all, wading through the reference
manual only gets you so far...
Synopsys has a class, but the next available one has been cancelled due to
lack of interest, and the course description seems a bit on the basic side
for our needs, anyway. So, does anyone out there in ESNUG land have info
and/or recommendations on "advanced" LIBERTY training?
- Vance Harral
National Semiconductor
( ESNUG 387 Item 10 ) -------------------------------------------- [01/23/02]
From: Alexander Gnusin <alexg12@tclforeda.net>
Subject: A Free Alternate User-Written Tcl Design Compiler / PrimeTime GUI
Hi, John,
I just put up on my site an alternate GUI for DC/PrimeTime users. It's at:
http://www.tclforeda.net/syntools/SynView.htm
You start this GUI directly from dc_shell-t by typing a "synview" command.
- Alexander Gnusin
www.TCLforEDA.net
( ESNUG 387 Item 11 ) -------------------------------------------- [01/23/02]
Subject: Where To Download Specman "e" Mode Editors For Either Emacs Or VIM
> I need a Specman e editor with highlighting and edit process completion.
> If you also know how can I link an editor with Specman, please send me
> a sign. Thanks.
>
> - Stefan
From: Srinivasan.Venkataramanan@intel.com
Have you tried Emacs with Specman Mode? It works fine for me. Also I
believe there is a VIM mode file, see: http://www.specman-mode.com
- Srinivasan Venkataramanan
Silicon Systems (Intel) Bangalore, India
( ESNUG 387 Item 12 ) -------------------------------------------- [01/23/02]
From: "Mathias Kohlenz" <mathias_news@kohlenz.com>
Subject: Seeking A Code Coverage Tool That Covers Schematic Entry Blocks
I'm currently looking for a code coverage tool which is able to handle
blocks designed with schematic entry. This is probably not possible for
the schematic design itself; however, for the synthesized design it should
show which gates were driven. So far, we used "Nccov" which came with
the latest release of the Cadence NC-Verilog simulator; however, it handles
neither primitive instantiations nor "assign" statements.
- Mathias Kohlenz
( ESNUG 387 Item 13 ) -------------------------------------------- [01/23/02]
Subject: ( ESNUG 385 #14 ) Get The Absolute Minimum PLI With Cadence NC-SIM
> By running VCS with only the Signalscan PLI compiled in VCS (but not used),
> I got a speed-up of 8 to 10 percent on a 3 Mgate RTL design and close to
> 20 percent on a 500 kgate RTL design.
>
> By not compiling in any PLI routines I got a speed-up of 42 to 48 percent
> on both my small and large design.
>
> - Anders Nordstrom
> Nortel Networks, Ltd. Ottawa Canada
From: "Kathleen Meade" <meade@cadence.com>
Hello John,
Thought I'd respond to ESNUG 385 #14 and explain how to use the options in
NC-Sim to limit access control for optimal performance but allow access
that is required for Tcl scripts and PLI applications.
In general, NC-Sim runs with minimal debugging capability. To access signals,
modules and lines of code during simulation, global options can be applied
for read, write and connectivity access:
% ncelab -access r|w|c <other_options>
% ncverilog +access+[rwc] <other_options>
If you want to allow/limit access for specific objects, modules or instances,
you create an access file and specify access for those objects. Then you
use the -afile option when running the elaborator.
% ncelab -afile access.txt test.top <other_options>
If you know that your simulation run will require some access for Tcl, PLI,
or probing that isn't known in advance (i.e., PLI calls or probes that
aren't in your code), but you don't know which objects are affected, you can
automatically generate an access file. To do this, use the -genafile option
when you invoke the elaborator. When you simulate, the objects that are
accessed by Tcl or PLI applications are monitored along with the types of
access required for each object, and when you exit the simulation, an access
file is created. For example:
% ncelab -genafile access.txt test.top <other_options>
% ncsim test.top <other options>
After the simulation completes, the access.txt file is generated. You then
use the -afile option in future runs to control access (like above):
% ncelab -afile access.txt test.top <other_options>
If you are using ncverilog, the equivalent command line options are:
% ncverilog +ncgenafile+access_filename <other_options>
% ncverilog +ncafile+access_filename <other_options>
I hope this helps your readers!
- Kathleen Meade
Cadence Design Systems Atlanta, GA
( ESNUG 387 Item 14 ) -------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #16 ) BSD Compiler Is Getting Better In Fits & Starts
> I wouldn't call it perfect as of yet, but BSD Compiler works and it is now
> automated enough that even a co-op student can use it in our design flow.
>
> - Shin Wu, ASIC Division
> Atmel Corporation Columbia, MD
From: [ The Cat In The Hat ]
Hi, John,
We, too, use BSD Compiler for JTAG insertion on our ASICs and have had our
share of problems. I thought I might share some comments and, if you don't
mind, ask some questions. I must remain anonymous here.
> 1.) BSD Compiler was not able to insert Boundary Scan with the design core
> in the netlist, contrary to the BSD Compiler User Guide's claims. We
> overcame this by removing the core before inserting Boundary Scan.
I did not try to run the insertion with the full core. I used a core shell
module to speed up the compile time. For simulation, I used assigns for
driven/driving ports in the core.
> 2.) There was the problem of not being able to pass the simulations with
> the vectors generated by BSD Compiler to verify synthesized 1149.1
> components. We overcame this by breaking up vectors to verify
> individual components, such as TAP, BSR, TDR and reset mechanism.
> Somehow vectors generated this way did not have any problems in
> simulations.
What tests in particular failed when run as a whole?
I did not see any problems, but then I am not sure that I have covered
everything as there doesn't seem to be any fault coverage type numbers
to be had.
> 3.) BSD Compiler had problems with our Synopsys Atmel libraries as well.
> It could not synthesize functional 1149.1 components when input
> buffers supported both true and complementary outputs going into the
> array. It couldn't implement the HIGHZ instruction with bidi buffers.
We are having trouble getting BSD to verify our LVDS pad. When running
check_bsd, the tool complains about the pad having no path from a-to-y. We
are using TI as a vendor. The pad works fine in simulation, but the tool
cannot seem to work with it. In order to go on to generate BSDL and test
vectors, I have to replace the LVDS with dummy non-differential pads to get
past check_bsd. Our Synopsys AE is working on finding a solution.
Unfortunately, BSD Compiler does not have as much visibility as Design
Compiler or PrimeTime.
Do you have any problems with differential input pads, or ones that have
power-down inputs and float outputs?
Our bidirectional pads work okay if one is careful about how they handle
the tristate control pin on the output pad cell. The tool needs to be
able to control the tristate control pin from the tap for the HIGHZ test
to work.
The tool also cannot support multi-channel pad modules, such as those
used in highspeed interfaces. We are using SERDES modules which have up
to eight channels, and the tool cannot interpret their function. I had to
create dummy pad cell arrays to do the JTAG insertion and then restore
the SERDES modules afterwards. I have asked them to support these modules
in future releases and they said they would try, but no real commitment.
I am presently using 2001.08-SP1 on the advice of my Synopsys AE.
Anyway, just thought I would share my experiences and hopefully the ESNUG
readers can help with my BSD Compiler questions.
- [ The Cat In The Hat ]
---- ---- ---- ---- ---- ---- ----
From: "Gregg Lahti" <gregg.lahti@corrent.com>
Hi, John,
I had the opportunity to use BSD Compiler on our last chip. I agree with
Shin Yuan Wu that BSD Compiler is usable and getting better. Some things
that I found that made life more interesting using the 2001-08 rev of
BSD Compiler:
1) Insertion of Boundary Scan with the core wasn't an option. We built a
dummy of the core module for it to work on instead. We also had issues
with our pad cells, where we used RTL-style models for functionality in
the BSD test bench creation. Unfortunately, BSD Compiler also included
the guts-level functionality modules of the pad cells in the output
Verilog netlist. A minor tweak which corrected this problem was to
remove all pad designs before we wrote out the Verilog.
2) Convincing BSD Compiler to not insert MUXing in the input path
(observe-only function) wasn't as clear cut as it could have been. Even
though I specified the type 4 input Boundary Scan cells to use, BSD
Compiler wouldn't use them unless I specified a huge input delay to
"persuade" the tool to use the correct type of Boundary Scan cells.
This behavior is spelled out in one page of the docs, but jumping
through this hoop seems redundant and somewhat broken, IMHO.
3) I had a case somewhere in the debugging of #2 where setting:
set_fix_multiple_port_nets -all -buffer_constants
wasn't being honored, and assign statements still appeared in the Verilog.
Of course this is a kludge to fix the real issue of Apollo's crappy
netlist reader, but it still caused me to hand-insert buffers on nets
so I could get the netlist into layout. At some point while fixing #2
and replacing the type-2 & type-7 cells with type-4 cells, the problem
vanished.
4) It would be a nice feature in BSD Compiler to allow multiple TAP
controllers. It wasn't clear how we could achieve this using BSD
Compiler, so we built a separate muxing block to route the JTAG
pins appropriately to the second TAP controller in our block for our
embedded uP's.
My local AE (Taimei DeZeeuw) and the San Diego AE (Robert Moussavi) spent
some time helping us get the initial scripts in place which got us going
pretty quickly and saved us from initial headaches. Overall, BSD Compiler
worked as expected and provided a test bench which worked well. Some
minor annoyances were found, but it did the job and didn't cause me to
lose too much sleep in the process of using it.
- Gregg Lahti
Corrent Corp. Tempe, AZ
( ESNUG 387 Item 15 ) -------------------------------------------- [01/23/02]
Subject: ( ESNUG 386 #6 ) Formality And Verplex Each Have Their Own Issues
> I was successful in running both tools and they both gave me the correct
> results. However, Formality was easier to use. It had better debugging
> capabilities, and I liked the fact that it has a TCL interface. The
> performance results also clearly favored Formality. In verifying a 700K
> design, approximately 40% RAM (using Magma P&R), Formality was 25% faster
> and 4X better in capacity over Verplex on the same design.
>
> Our team did decide to buy Formality because of our eval results.
>
> - [ From The Land Of Pokemon ]
From: Jay Pragasam <jlk@brecis.com>
Hi John,
Based on our usage I think Formality and Verplex LEC both have advantages
and drawbacks on their own.
1. On flat netlists, both Formality and LEC take more or less the same
time, but setting up Formality is simpler if you are a person
who hates the renaming culture of LEC.
2. The debugging capabilities to me are not drastically different. Both
EC tools present the same debugging utilities with different wrappers.
3. But doing hierarchical verification with Formality is a pain. It is
so nice with LEC that we can write out a hierarchical script; the tool
takes care of keeping the whole design in memory and traverses into
every submodule for verification. In Formality you must manually
create scripts to verify every submodule. Since black boxing isn't
that easy with Formality, we have to take extra effort to tell the
tool that we have verified a submodule and so it should be treated as
a black box. Also, if you have clock trees, the user has to declare
that the different clock nets going into the submodule are all one and
the same clock when going hierarchical with Formality!!
4. Formality is BAD at handling parameterized designs. It can't do the
   job if there are a lot of mathematical manipulations of parameters
   whose values are used inside the design. You have to tweak the RTL to
   make the tool happy. Synopsys claimed that this was fixed in
   2001.08-SP1, but that doesn't appear to be the case. Maybe in the SP3
   version.
Both EC tools sweat big time if there are additions like A+B+C+D+E. They're
not really smart about the implementation and often fail if it's implemented
as (A+D)+(B+C)+E for example.
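For example, the two expressions below compute the same sum, but an EC
tool has to prove the rearranged carry chains equivalent bit-by-bit (a
sketch; the signal names are made up):

    // RTL: sum written as a linear chain of adders
    assign sum = a + b + c + d + e;

    // Netlist: synthesis may restructure the adder tree for timing
    assign sum = (a + d) + (b + c) + e;

Proving these two equivalent requires reasoning about the restructured
carry logic, which is where both tools tend to sweat.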
- Jay Pragasam
Brecis Communications
( ESNUG 387 Item 16 ) -------------------------------------------- [01/23/02]
From: "Mark Warren" <mwarren@synopsys.com>
Subject: How You Code Latches & Flip-Flops *Greatly* Impacts VCS Runtimes
Hi, John,
With the last few versions of VCS, the coding style used for *synchronous*
devices has become critical. If VCS clearly understands all your flip-
flops and latches, by default it will use its aggressive cycle-based
algorithms (which can result in large speedups and even memory
reductions.) But if VCS does not like your coding style for flops or your
clock-gen circuitry, then VCS's old event-driven algorithms dominate.
You'll then be stuck simulating at those much slower Cadence NC-Verilog
speeds. This holds true for both RTL and gate-level (UDP) designs.
This letter is to show how minor changes in coding styles can have dramatic
effects on VCS simulation performance.
VCS Optimizations
=================
The algorithms VCS uses are split into two camps. The first is Language
optimizations. (These are also known as Front-End optimizations.) These
optimize the logic to be evaluated and minimize the events in the VCS
event queue.
queue. All event-driven simulators use an event queue to schedule and
propagate events. Event queues work well on asynchronous behavior (like
timing through random logic), but they are inherently slow due to the
overhead of this flexibility.
The second type of VCS optimization is called RoadRunner, which can be
thought of as cycle-based algorithms that trigger on synchronous logic.
Many synchronous devices only need a clock edge to propagate events so they
do not need the flexibility of the event queue. This is the concept behind
the VCS RoadRunner algorithms; minimize the flexibility for these types
of constructs in order to get a bigger runtime speedup.
Language and RoadRunner algorithms are on by default in VCS. No special
compile-time flag is needed. A much more aggressive family of optimizations
within VCS (called Radiant optimizations) also exist, but I don't have the
time to discuss them here. Radiant optimizations are enabled with the VCS
compile switch +rad.
Flip-Flop Coding Styles
=======================
It is very important that the coding style of synchronous elements in your
design is *race* free and written in a way that VCS can understand. If
VCS recognizes the latches/FFs in your design, then it will automatically
turn on a RoadRunner Cycle optimization and give a good speedup. VCS will
accelerate most flip-flop coding styles accepted by Design Compiler. (These
are documented in the on-line VCS UsersGuide.)
Here are the general rules for coding synchronous devices in VCS:
* Standard flip-flop coding looks like:
    always @(posedge clk)
      a <= b;
This is a perfect coding style for VCS. Non-Blocking Assignments (NBA)
ensure that no race conditions exist between flip flops, and since there
is no delay, VCS can simulate this flop using cycle-based algorithms.
* Adding a delay slows down evaluations:
    always @(posedge clk)
      a <= #1 b;
This example adds an unnecessary delay on the right-hand side of the NBA.
Some users like to use this delay so waveforms are staggered and you can
see which outputs are caused by which flops. Others believe the delay
helps avoid race conditions. This is not true. The nature of how NBAs
propagate data ensures that ordering is correct, thus the delay is not
needed. The main problem with this delay is that it inhibits VCS from
using cycle-based optimizations. When VCS sees the delay, it assumes
that the designer used it because some other logic relies on the delay
to work properly (such as asynchronous feedback.)
For regressions, you can force VCS to ignore any delay on the RHS of
NBAs by using the compile-time "vcs +nbaopt" flag. This will allow more
RoadRunner optimizations (and a resulting speedup), but it may
expose race conditions if NBAs were used in inappropriate places (such
as inside "initial" blocks.)
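For example, an "initial" block that relies on NBA delays for reset
timing would misbehave under +nbaopt (a hypothetical sketch; the signal
name is made up):

    initial
      begin
        rst_n <= 1'b1;
        rst_n <= #100 1'b0;  // with +nbaopt these delays are ignored,
        rst_n <= #200 1'b1;  // so the reset pulse collapses to time 0
      end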
* Blocking assigns without delays are prone to races:
    always @(posedge clk)
      a = b;
This coding style is prone to race conditions. The IEEE LRM states that
all "always" blocks act in parallel and you can not guarantee the order
of execution. Therefore you can not guarantee that b will or will not
be updated before it is propagated to a.
* Blocking assigns with delays are not recommended:
    always @(posedge clk)
      a = #1 b;
Blocking assignments do not ensure proper ordering of events in daisy-
chained flip-flops, so they require a #1 on the RHS to avoid race
conditions. Since this inhibits VCS cycle-based optimizations, this
coding style is also not recommended.
* Adding asynchronous resets:
    always @(posedge clk or negedge rst)
      begin
        if (rst == 0)
          a <= 0;
        else
          a <= b;
      end
Adding resets to flops generally will not adversely affect VCS
optimizations.
Latch Coding Styles
===================
There are many ways to code a latch in Verilog. These examples all work well
and will be properly accelerated in VCS:
    always @(clk or d)
      if (clk)
        q <= d;

    always @(clk or d)
      if (~clk)
        q <= d;

    always @(clk or d)
      q <= clk ? d : q;

    always @(clk or scan_clk or d or scan_d)
      if (clk)
        q <= d;
      else if (scan_clk)
        q <= scan_d;

    always @(clk or enable or d)
      if (clk)
        if (enable)
          q <= d;
Clock Drivers
==============
There are many ways to drive clocks in the Verilog language. VCS is
usually happy with any clock coding style used. Since VCS cycle-based
optimizations occur at the block level, there is no performance penalty for
having multiple clocks (even if they share paths or are asynchronous). An
important rule to follow is that clock signals should never be driven by a
non-blocking assignment (NBA) -- it nullifies the benefits of using NBAs
inside of flops and is very prone to race conditions.
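For example (a sketch of a typical clock generator; the period is
arbitrary):

    // Prone to races: clock driven by an NBA
    always #5 clk <= ~clk;

    // Preferred: blocking assignment for the clock
    initial clk = 1'b0;
    always  #5 clk = ~clk;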
Coding Styles To Avoid
======================
A few inefficient constructs used locally usually don't impact the
performance of VCS. Use whatever you need at a local level to achieve the
needed functionality. But blocks that get instantiated thousands of times
or "always" blocks that get triggered thousands of times have a MUCH
larger impact on overall performance. Here are some other coding styles
which should be avoided:
 1) Delays inside "always" blocks
 2) Signal display/monitor calls inside "always" blocks that are not
    really necessary (i.e. a display to flag an ERROR message is OK, but
    displaying signal values at every edge can seriously hurt performance.)
 3) fork/join constructs
 4) force/release and assign/deassign
 5) Multiple event controls -- multiple "@" or triggers inside the
    "always" block
 6) Verilog "task" calls that have event/delay control in them
 7) Memories inside "always" blocks
 8) Clocks driven by an NBA
 9) Flops modeled with transistors
10) Strength-based modeling
11) Named blocks
UDPs Coding Style
==================
Many vendor-supplied UDPs are well written, so generally VCS can infer the
latches and flops properly. Trouble comes when a user includes table
entries only for particular input combinations in his UDP. Uncovered
input combinations in a UDP definition are defined by the Verilog LRM to
output X, which can result in non-optimized UDPs. This can become a VCS
performance problem if the ambiguity keeps VCS from recognizing the clock
input to the UDP. For illegal input combinations that are not expected to
happen, a good practice is to define "no change" explicitly.
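For example, a D-flop UDP can use "-" (no change) entries to cover the
falling clock edge and data changes while the clock is steady (a sketch;
the primitive name is made up):

    primitive my_dff (q, clk, d);
      output q;  reg q;
      input  clk, d;
      table
        // clk  d  : q :  q+
           r    0  : ? :  0 ;  // rising edge captures d
           r    1  : ? :  1 ;
           f    ?  : ? :  - ;  // falling edge: no change
           ?    *  : ? :  - ;  // d changes, clk steady: no change
      endtable
    endprimitive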
Use The VCS Built-In Profiler
=============================
Since synchronous devices are often instantiated thousands of times in a
typical design, they can have a major impact on performance. If a flop
model does not get accelerated, it will show up high in the VCS profile
list. It's wise to occasionally compile and run VCS with the "vcs +prof"
flag to see which constructs are taking the most CPU time in your
simulation. This is often the best way to find an offending module.
- Mark Warren
Synopsys, Inc. Cupertino, CA
============================================================================
Trying to figure out a Synopsys bug? Want to hear how 12,000+ other users
dealt with it? Then join the E-Mail Synopsys Users Group (ESNUG)!
!!! "It's not a BUG, jcooley@TheWorld.com
/o o\ / it's a FEATURE!" (508) 429-4357
( > )
\ - / - John Cooley, EDA & ASIC Design Consultant in Synopsys,
_] [_ Verilog, VHDL and numerous Design Methodologies.
Holliston Poor Farm, P.O. Box 6222, Holliston, MA 01746-6222
Legal Disclaimer: "As always, anything said here is only opinion."
The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com