  > From: Andrew MacCormack
  >
  > Fridays just aren't the same without your pearls of wisdom, John.  I see
  > that you are keeping DeepChip.com up to date with news.  Lots of people
  > on the Yahoo CDN bulletin board are waiting for your take on the latest
  > Cadence SP&R tape-out story and, of course, we've all forgotten how to
  > synthesize since the last installment of ESNUG!  Come back!
  >
  >     - Andrew MacCormack, Senior Design Engineer
  >       Tality                                 Livingston, Scotland


  Watch what you ask for, Andrew, because you just got it!  :)  It's normal
  for me to 'disappear' immediately after a conference (like a SNUG or a DAC
  or an OVI) because I usually have a lot of catch up work to do *plus*
  write a Trip Report during that time following a conference.  But, like a
  really bad recurring nightmare for the EDA vendors, ESNUG is back up and
  running again, Andrew!  :)   It's good to be back.

                                               - John Cooley
                                                 the ESNUG guy


( ESNUG 360 Subjects ) ------------------------------------------- [11/02/00]

Item  1 :  Wall St. Asks "PhysOpt & Silicon Ensemble -- With Or Without PKS?"
Item  2 :  Give PhysOpt Rectangular Areas To Place In & Disable Area Recovery
Item  3 :  Questions On PhysOpt Signal Integrity & Primetime Crosstalk News
Item  4 :  We're In Purgatory with Sun Workstations For EDA; What About HP?
Item  5 :  Synopsys Optimal DC Library Guidelines Documentation Disappeared!
Item  6 :  Any Users Have Experiences w/ GARDS from Silicon Valley Research?
Item  7 :  "W484" Logic-Munging Bug Involving DC 00.05-1 & DW Foundation Libs
Item  8 : ( ESNUG 359 #5 )  DC/PT Script To Detect Paths Across 2 Clocks
Item  9 :  How To Get A Mirror Copy In A Cadence Virtuoso Layout Drawing
Item 10 : ( BSNUG 00 #5 )  7 Silicon Perspectives First Encounter Tape-outs
Item 11 :  Genesys Replies To Industry Gadfly "Memory BIST Follow-Ups" Column
Item 12 :  Design Compiler VHDL Parser Problem with Aggregates In Expression
Item 13 :  User Seeks A Detailed Comparison Of Synopsys SystemC vs. CynLib
Item 14 :  PrimeTime QTM Documentation Typo For create_qtm_constraint_arc
Item 15 : ( ESNUG 359 #6 )  Design A 64-bit+ Multiplier-Accumulator (MAC)
Item 16 :  Trying To Find The PrimeTime Tcl Load Command, Not DC's Load_Of
Item 17 :  Synopsys Tcl 'Collections' in DC Not Like Tcl 'Lists'!  Why!???
Item 18 :  Customer DC Headaches w/ 0.01% Limitation on Fixing Violations

 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com


( ESNUG 360 Item 1 ) --------------------------------------------- [11/02/00]

From: Garo Toomajanian <GToomajanian@dainrauscher.com>
Subject: Wall St. Asks "PhysOpt & Silicon Ensemble -- With Or Without PKS?"

Hi John,

Many of us Wall Street watchers read ESNUG because we're monitoring the
adoption of physical synthesis tools.  From the many customer tape-out
stories we've read in ESNUG, it appears to those of us in the financial
community that Physical Compiler is catching on with designers.  With
Cadence's SE-PKS now released, are designers seeing (or do they anticipate)
any advantage to using SE-PKS with Physical Compiler as compared to using
Physical Compiler and SE *without* PKS features?  Are Physical Compiler
users interested in SE having PKS features?

I could ask Aart and Ray these questions, but I'd rather hear what the users
have to say about this.

    - Garo Toomajanian, Research Analyst
      Dain Rauscher Wessels                      Boston, MA


( ESNUG 360 Item 2 ) --------------------------------------------- [11/02/00]

From: James Andolfo <James.Andolfo@matrox.com>
Subject: Give PhysOpt Rectangular Areas To Place In & Disable Area Recovery

Hi, John,

We just taped out our largest chip yet using a hierarchical PhysOpt G2PG
flow.  This is our second PhysOpt tape-out.

I lead the softblock "placement through routing" effort in which PhysOpt
was used to perform the initial placement, placement of ECO cells (both
timing and functional), and to incrementally fix setup and hold violations
for all 15 softblocks.  The largest softblock was 1.4 Mgates and the
smallest was about 40 Kgates.  The total size of our 0.18 chip was just over
5 Mgates.

How did PhysOpt perform?  Well, we were not disappointed.  PhysOpt performed
significantly better than our previous Cadence placement flows.  For most
nets we saw good correlation between PhysOpt estimated timings and the
backannotated post-placement Primetime timings (a far cry from our past
wire_load_model experiences).  One key, though, was to follow some basic RAM
placement guidelines during the floorplanning stage (i.e. to place RAMs such
that the remaining placement area in your design was as rectangular as
possible.)  If it wasn't possible to leave a rectangular area for PhysOpt,
we defined the rows such that the resources given PhysOpt were close to
rectangular in shape.  Why does this help?  Well, one reason could be that
the Steiner estimations do not compensate for the reduced routing resources
when crossing sections of RAMs.

A technique we used to save run time for the 1.4 Mgates module during
placement was to turn off the area recovery mode.

Also a good rule of thumb to improve run time is to define the power
structure such that it isn't necessary to set partial placement blockages
(which, for example, are needed when cells use common metal layers with the
power straps.)

All in all we only experienced a few "teething problems", most of which were
tool database conversion related.  Since a lot of our placement was
performed with beta versions and since we froze PhysOpt at revision 1.1d,
these may already have been resolved in the newest version of PhysOpt.

    - James Andolfo
      Matrox                                   Montreal, Canada


( ESNUG 360 Item 3 ) --------------------------------------------- [11/02/00]

From: lye@mhcnet.micro.lucent.com (Lun Ye)
Subject: Questions On PhysOpt Signal Integrity & Primetime Crosstalk News

Hi John,

I have the following two Primetime questions:

(1) In your Boston SNUG Trip Report, you mentioned that Aart de Geus gave
    the keynote address.  Two of the interesting tidbits are of special
    interest to me:

      a. PrimeTime is now in beta with crosstalk analysis capability
      b. plans to integrate signal integrity capability into PhysOpt

    Do you have more information on this?  Where can I learn more, other
    than from Synopsys sources?

(2) How can I speed up these commands (in any order)?

    report_timing -nosplit -nets -through net_abc -delay_type min_rise
    report_timing -nosplit -nets -through net_abc -delay_type min_fall
    report_timing -nosplit -nets -through net_abc -delay_type max_rise
    report_timing -nosplit -nets -through net_abc -delay_type max_fall

Many thanks in advance,

    - Lun Ye
      Lucent Technologies, Inc                   Allentown, PA


( ESNUG 360 Item 4 ) --------------------------------------------- [11/02/00]

From: Jon Stahl <jstahl@avici.com>
Subject: We're In Purgatory with Sun Workstations For EDA; What About HP?

Hi John,

I was wondering if I could poll the ESNUG community for *recent* experiences
regarding the use of HP workstations instead of Sun as a platform for EDA.

The reason I bring this up is that we have had a ridiculous amount of
trouble with our Sun hardware and software over the last few years, and we
are at the point of researching alternatives.

Now, before I get a bunch of responses along the vein of "these people must
not know what they are doing because we have used Sun for X+ years and have
never had a problem"; I ask the reader to postulate the following things:

  1) We also have X+ years of Sun hw/sw experience and started out as Sun
     bigots a few years ago.

  2) Without going into details, we have had an unreasonable amount of
     trouble with our current Sun hw/sw (hardware being big E6000 and E6500
     servers.)

  3) We have explored all reasonable avenues of resolving our problems with
     Sun, paying top dollar for the highest level of support, and even
     replacing all of our hardware in spite of the fact that diagnostics
     did not show errors.

Our top concerns are that we may switch and find ourselves with the same set
of problems with HP workstations -- that our EDA software may not be as
stable, or supported as well, or even exist on the HP platform.

(Yes, we've seen the recent news of the Sun memory failure "cover-up".  Some
of our problems may have this root cause, but most do not.)

Thanks for any feedback.

    - Jon Stahl
      Avici Systems, Inc.                        N. Billerica, MA


( ESNUG 360 Item 5 ) --------------------------------------------- [11/02/00]

From: Chris Kiegle <ckiegle@us.ibm.com>
Subject: Synopsys Optimal DC Library Guidelines Documentation Disappeared!

Hey John,

I was looking for the DC Ultra Library Guidelines documentation that was
mentioned in ESNUG 343 so I could point someone to it for reference.  It was
titled "DC Ultra Library Guidelines, dated Oct 99," and had a link at
Synthesis-625.html on SolvNet.  I know I accessed it before, but it seems
to be AWOL from SolvNet now.  I know they might be working on some new
documentation, but it would be nice to access the old one until they
finish.  Any idea where I can find it?

    - Chris Kiegle
      IBM                                        Burlington, VT


( ESNUG 360 Item 6 ) --------------------------------------------- [11/02/00]

From: John Redford <jredford@chipwrights.com>
Subject: Any Users Have Experiences w/ GARDS from Silicon Valley Research?

Hi John,

Does anyone have experience with using the GARDS place and route tool from
Silicon Valley Research (www.svri.com) ?  It looks nice on the surface
(and inexpensive!), but I'd feel better about it if people have actually
built chips with it.

    - John Redford
      ChipWrights


( ESNUG 360 Item 7 ) --------------------------------------------- [11/02/00]

Subject: "W484" Logic-Munging Bug Involving DC 00.05-1 & DW Foundation Libs

> Hi, John,
>
> There appears to be a logic-munging bug in dc_shell 2000.05-1.  I don't
> fully understand the bug and can't give you a small and non-confidential
> test case yet, but here's what I know:
>
> SYMPTOMS: Writing Verilog out gets error messages due to internal database
> corruption - the number of bits of a bus attached to a port of a
> designware component does not equal the number of bits of the port.
> The Verilog netlist written is also broken and cannot be legally read
> back in to dc_shell.  Neither link nor check_design finds anything wrong
> with the corrupted database.
>
> At first we thought this might be due to the user using the new Verilog
> readers described in Synthesis-713.html, since he had set the variables
> (enable_verilog_netlist_reader & hdlin_enable_presto) to enable them.
> However later we saw the bug in cases where those variables were not set.
> So, this is probably irrelevant.
>
> The bug seems to be related to DesignWare add/subtract/increment/decrement
> elements, and has only been seen in situations where the RTL gives a
> Verilint W484 warning.  We were using DW Foundation, so that could also be
> a factor.
>
> W484 warnings are seen when the receiving variable for an add or subtract
> result does not have enough bits to hold all possible resulting values
> (carry or borrow is lost).  For example, I often see coding like:
>
>      reg [7:0] a, b, c;
>
>      c = a + b;
>
> This isn't quite right because the sum of 2 8-bit numbers may take 9 bits
> to hold, and Verilint correctly complains.  If you change it to:
>
>      reg [7:0] a, b;
>      reg [8:0] c;
>
>      c = a + b;
>
> then the warning goes away.  Changing our Verilog source code to eliminate
> the W484 warnings eliminated the problem, i.e. it apparently no longer
> triggered the bug.
>
> RECOMMENDATIONS: Until we know exactly what's causing this bug, I would
> recommend that you:
>
>   (1) Run Verilint on your input Verilog RTL, and fix it to eliminate any
>       W484 warnings.
>   (2) Write out a Verilog netlist even if you don't need it, and make sure
>       you have no error messages.
>   (3) As a double check, read the Verilog netlist back in and make sure it
>       succeeds.
>   (4) If you're still nervous, go back to an earlier dc_shell.  This bug
>       does not seem to occur in 1999.* versions.
>
> Those of us who routinely get our Verilog RTL squeaky clean with respect
> to Verilint, and thus don't have any W484 warnings, can feel smug compared
> to those who routinely ignore such warnings because they can't be bothered
> with picky details like that when they have a chip to get out.  :-)
>
> More as soon as I have time.  Unfortunately, since we've worked around
> this bug already, it's not my top priority.
>
>     - Howard A. Landman
>       Vitesse Semiconductor                    Longmont, Colorado
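
A minimal Python sketch (not part of Howard's e-mail) of the arithmetic
behind a W484 warning.  It only models unsigned addition truncated to the
width of the receiving register; the function name add_truncated is
hypothetical, and nothing here reproduces the dc_shell bug itself.

```python
def add_truncated(a, b, result_bits):
    """Model 'c = a + b' where c is an unsigned reg of result_bits bits.

    Returns (stored_value, carry_lost).
    """
    full = a + b                               # mathematically exact sum
    stored = full & ((1 << result_bits) - 1)   # keep only the low bits
    return stored, stored != full


# reg [7:0] c;  the carry out of bit 7 is silently dropped
print(add_truncated(200, 100, 8))   # (44, True)

# reg [8:0] c;  9 bits hold any sum of two 8-bit values
print(add_truncated(200, 100, 9))   # (300, False)
```

The largest possible sum of two 8-bit values is 255 + 255 = 510, which fits
in 9 bits, so widening c by one bit is enough to keep the carry.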


From: [ The Synopsys Support Center ]

Hi Howard,

This is regarding your DW problem.  I've searched our database, and I
believe I've found a match on your case.  I believe you got the error
message VO-5 when you wrote out your Verilog netlist.  Is that true?  If
so, there are a number of STARs filed on this and the problem is fixed
in our next release 2000.11.  The current workaround is to set the
variable: compile_disable_hierarchical_inverter_opt to true.  Hope this
also solves your problem.

    - [ The Synopsys Support Center ]


( ESNUG 360 Item 8 ) --------------------------------------------- [11/02/00]

Subject: ( ESNUG 359 #5 )  DC/PT Script To Detect Paths Across 2 Clocks

> I have a world with two clock domains:
>
>             SysCLK Domain                   PCI CLK Domain
>                                       
>        ------                         
>        |    |                        
>     -->| ff |-------->|----------|   
>    |   ------         |          |
>    |                  | comb.    |         Sync  FF's
>    |   ------         | logic    |        ------    ------
>    |   | ff |         |          |        |    |    |    |
>    |-->|    |-------->|          |------->|    |--->|    |----> FSM
>    |   ------         |          |        |    |    |    |
>    |                  |          |        ------    ------
>    |   ------         |          |          |__________|________ PCI_CLK
>    |   | ff |-------->|__________|
>    |-->|    |                        
>    |   ------                        
>    |                                 
>  SYSCLK                             
>
> As you can see from the above diagram, the outputs of the SYSCLK FFs go into
> some combinational logic and feed into FFs clocked by PCI_CLK.  The outputs
> of the PCI_CLK FFs go into a FSM.
>
> Let's say a glitch were to occur at the posedge of SYSCLK:
>
>  1) This could trigger a false (high) start signal and the output of
>     combinational logic would register a high.  Because PCI_CLK has no
>     way to verify that this is a glitch, not a valid signal, it would
>     start the PCI_CLK FSM.
>
>  2) The correct solution is to put another FF after the combinational logic
>     clocked to the SYSCLK, thereby preventing this glitch.
>
> Are there any DRC tools out there that might catch this type of problem?
> Do your users have any DC scripts that could test for this type of
> problem?
>
>     - Subha K. Pindiproli
>       Emulex Network Systems                     Costa Mesa, CA


From: Ken Rose <kenr@cisco.com>

John,

It would be nice if there were scripts to identify this situation, and I
hope someone will post said script.

Barring the provenance of such a script, one alternative is to use a careful
design methodology to preclude the problem situation from occurring.

In the case Subha brings up, hierarchy can be very helpful.  If you have a
module that contains both clock domains, introduce another level of
hierarchy in that module and require that only a single clock domain exists
in any sub-module.  At that point it becomes easy to inspect the sub-modules
and make sure any signals leaving come from flops.

    - Ken Rose  
      Cisco Systems                              San Jose, CA

         ----    ----    ----    ----    ----    ----   ----

From: Siegfried Weidelich <SHW0889@mcdata.com>

John,

I have run into the same situation, and what I do in DC is a timing report
something like:

    report_timing -from find(clock, SYSCLK) -to find(clock, PCI_CLK)

This will give you a timing report with all paths from SYSCLK domain to 
PCI_CLK... then you can look at the paths to see if they are flop-to-flop
(i.e., no unexpected gates in the path).  I suppose you could redirect to
a file and parse for gates in the path to automate it, but crossing clock
domains is tricky, so I like to manually investigate every path thoroughly.

Also, if you make it a rule to call the clock domain boundary flops a 
special name in your RTL like "rdy_s2p" so that your netlist has
"rdy_s2p_reg", then you can do the same report with the "-path short" 
option to check that all the paths crossing the clock domain start and
end with a flop with  "_s2p" contained in the name.  It can be a much
smaller file to check!  Hope this helps.

    - Siegfried Weidelich
      McData Corporation
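
The naming-convention check Siegfried describes could be automated along
these lines.  This is only a sketch in Python: it assumes the startpoint
and endpoint names of each crossing path have already been pulled out of
the timing report (that step is not shown, since the report layout depends
on the tool version), and the instance names below are made up.

```python
def unmarked_crossings(paths, tag="_s2p"):
    """Return the (startpoint, endpoint) pairs where either end lacks the tag."""
    return [(start, end) for start, end in paths
            if tag not in start or tag not in end]


# Hypothetical startpoint/endpoint names for SYSCLK -> PCI_CLK paths
paths = [
    ("u_sys/rdy_s2p_reg", "u_pci/rdy_s2p_sync_reg"),   # follows the rule
    ("u_sys/start_reg",   "u_pci/rdy_s2p_sync_reg"),   # startpoint not tagged
]

for start, end in unmarked_crossings(paths):
    print("review:", start, "->", end)
```

Anything the check prints still needs a manual look; a clean result only
says the names follow the convention, not that the crossing is safe.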

         ----    ----    ----    ----    ----    ----   ----

From: Larrie Carr <Larrie_Carr@pmc-sierra.com>

John,

Sadly, I have seen this structure before...  To catch these types of
mistakes, I use PrimeTime to report all interfaces between 2 clock domains.

   create_clock <one domain>
   create_clock <other domain>

   report_timing -from <one clock> -to <other clock>

If you don't see flop to flop or flop/buffer/flop, you've got a problem.  But
of course, this doesn't tell you if you got all your sync flops.

Never got around to automating it.

    - Larrie Carr
      PMC-Sierra

         ----    ----    ----    ----    ----    ----   ----

From: Dirk Luckett <dluckett@raytheon.com>

John,

First, let's assume the SYSCLK and PCI_CLK are asynchronous.  You should never
try and synchronize multiple signals from one domain to the other and into
an FSM.  You cannot count on them all arriving at the FSM the same cycle
in the "REAL" world. (they will in the artificial simulation world of course).
The layout of each signal trace will cause the timing of each signal across
that boundary to be different and unique.

Second, to provide some thoughts on your question.  I have had a design
with asynchronous clocks.  I generally set a false path from SYSCLK to
PCI_CLK to prevent any setup or hold checking on those paths.  This also
allows PrimeTime to classify those endpoints as "untested".  You can get
a list of all untested endpts (which you should already be reviewing)
by executing the command:

  report_analysis_coverage -status_details {untested} -sort_by {check} \
        -significant_digits 3 -nosplit > ./untested.rpt

This isn't a real direct approach but it is workable.  Other users may have
better suggestions; but I employed this one in the last month.

    - Dirk R. Luckett
      Raytheon

         ----    ----    ----    ----    ----    ----   ----

From: Will Leavitt <leavitt@giganet.com>

John,

I got this right out of Solvit:

    QUESTION:

    How can I find all of the paths in my design with start and 
    end flip-flops triggered by different clocks?

    ANSWER:

    For the following example, the assumption is that there are 
    only two clock domains; however, you can easily modify this 
    script to include designs with more clocks.

      1) create_clock CLK_25 -period 25 
         create_clock CLK_10 -period 10

      2) group_path -default -to all_clocks()
     
      3) set_false_path -from find(clock,CLK_25) -to find(clock,CLK_25)
         set_false_path -from find(clock,CLK_10) -to find(clock,CLK_10)
         set_false_path -from all_inputs() -to find(clock)
         set_false_path -from find(clock) -to all_outputs()

      4) report_timing -path only -max_paths 1000

    EXPLANATION:

      1) Create the clocks that your design will use. You can
         optionally use the "-name" option to name your clock something
         other than the port name. If you use this option, make sure
         to refer to this new name when using set_false_path. For
         example

             create_clock CLK_10 -period 10 -name phase_1
             set_false_path -from find(clock, phase_1) 

      2) By default, each time you create a clock, a new path group is
         also created containing the endpoints relating to this clock.
         If you create two clocks named CLK_10 and CLK_25, you will
         have three path groups, "default, CLK_10, and CLK_25".  You
         want all of the paths to be in just one path group, so use the
         group_path command to group them all into the "default" group.

      3) Now that all of the paths are grouped into one path group,
         eliminate all paths triggered by the same clock by 
         setting a false path on them. Similarly, eliminate all paths
         originating or ending at ports.

      4) Finally, report all of the remaining paths. These are
         all paths that cross clock domains. If you anticipate more
         than 1000 paths, change the value of "-max_paths" accordingly.
         This command only returns one path per endpoint, so if you
         want an exhaustive list of all paths, add the "-nworst 100" 
         option to the report_timing command.

I never would have thought of this myself...  Good luck.

    - Will Leavitt
      Giganet
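
To make the effect of step 3 concrete, here is a small Python model of the
filtering that the set_false_path commands perform.  It is not Synopsys
code; the Path record and the instance names are assumptions made up for
this illustration, with None standing in for "starts or ends at a port".

```python
from typing import NamedTuple, Optional


class Path(NamedTuple):
    start: str
    end: str
    launch_clock: Optional[str]    # None: path starts at an input port
    capture_clock: Optional[str]   # None: path ends at an output port


def cross_domain(paths):
    """Keep only the paths that survive the false-path settings of step 3."""
    kept = []
    for p in paths:
        if p.launch_clock is None or p.capture_clock is None:
            continue               # paths originating or ending at ports
        if p.launch_clock == p.capture_clock:
            continue               # paths launched and captured by one clock
        kept.append(p)
    return kept


paths = [
    Path("a_reg", "b_reg", "CLK_25", "CLK_25"),   # same domain
    Path("in1",   "c_reg", None,     "CLK_10"),   # from an input port
    Path("d_reg", "out1",  "CLK_10", None),       # to an output port
    Path("e_reg", "f_reg", "CLK_25", "CLK_10"),   # crosses domains
]

for p in cross_domain(paths):
    print(p.start, "->", p.end)
```

Only the last path is printed, which is the set step 4 would report.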

         ----    ----    ----    ----    ----    ----   ----

From: London Jin <jinl@taec.toshiba.com>

Hi John,

This well-known problem has well-known solutions.

  1. Synopsys does have a tcl script released with PrimeTime:

     set clock_list [all_clocks]
     foreach_in_collection from_clk $clock_list {
       foreach_in_collection to_clk $clock_list {
         if { [get_object_name $from_clk] != [get_object_name $to_clk] } {
           set_false_path -from [get_clocks $from_clk] -to [get_clocks $to_clk]
         }
       }
     }

     Changing set_false_path to report_timing gives you all possible paths
     between clocks.

  2. Avanti has a tool called Clock Domain Checker FDRC.

     http://www.avanticorp.com/product/1,1172,62,00.html

  3. IKOS emulation compiler can also specifically report them.

Hope this helps.

    - London Jin
      Toshiba                                    San Jose, CA
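
The nested loop in London's first suggestion visits every ordered pair of
distinct clocks.  A short Python sketch of the same enumeration follows; it
assumes clock names are unique, the helper name clock_pairs is made up, and
it only prints report_timing command lines for review rather than talking
to PrimeTime.

```python
from itertools import permutations


def clock_pairs(clocks):
    """All ordered (from_clk, to_clk) pairs with from_clk != to_clk."""
    return list(permutations(clocks, 2))


for from_clk, to_clk in clock_pairs(["SYSCLK", "PCI_CLK"]):
    print(f"report_timing -from [get_clocks {from_clk}] "
          f"-to [get_clocks {to_clk}]")
```

With n clocks this gives n*(n-1) pairs, so the number of reports grows
quickly on designs with many clock domains.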


( ESNUG 360 Item 9 ) --------------------------------------------- [11/02/00]

Subject: How To Get A Mirror Copy In A Cadence Virtuoso Layout Drawing

> Could anyone teach me how to make a mirror copy of a layout in a Cadence
> layout drawing?  I could find the rotation but could not find how to make
> a mirror copy.  A mirror copy looks like the following:
>
>          |                                              |
>          |                                              | 
>          -----          After mirror   --->         -----
>          |               copy                           |
>   ---------------                                 -------------
>
>     Original                                         Copy one
>
>
>     - Lee Chi Wai
>       The Chinese University of Hong Kong


From: Kholdoun Torki <torki@rhin.imag.fr>

If you are using Virtuoso layout editor, then just press the F3 key, and you
will get the options of any command you execute.  For the copy command,
there are the possibilities to make the "mirror" in X or Y direction, by
clicking "sideways" or "Upside Down" in the Copy options menu.

You can also configure by default to have the menu options prompted each
time you start a command :

     CIW -> Options -> User Preferences :
     Options Displayed When Commands Start  :  *

Hope this helps.

    - Kholdoun Torki
      CMP-TIMA                                   France

         ----    ----    ----    ----    ----    ----   ----

From: Smartchi <smartchi@hotmail.com>

If I have flattened the layout called from the instance, how can I make a
mirror copy of it?

Could anyone teach me how to make a mirror copy of a Cadence layout in the
layout drawing?  Actually, I know how to mirror it when calling it from the
instance.  Sometimes, though, I need to modify it after flattening.  So how
can I make a mirror copy of a layout which has already been flattened?

    - Smartchi

         ----    ----    ----    ----    ----    ----   ----

From: Kholdoun Torki <torki@rhin.imag.fr>

When flattening an instance you lose the selection of the objects that were
inside.  Then you are right -- we cannot make any move or rotation or
mirroring.

One way is to first flatten the layout in an empty space, and then
select the objects (polygons, and rectangles, etc..), make the mirroring,
and move those objects on the desired location.

The other way is to first make the mirroring and then make the flattening.

Another way is to make a copy of the cell with the necessary mirroring, then
instantiate it and modify the content with the "Edit in Place" command.
(The modifications will affect only the copy of the cell, not the original).

    - Kholdoun Torki
      CMP-TIMA                                   France

         ----    ----    ----    ----    ----    ----   ----

From: jws@space.eng.intersil.com (James Swonger)

After you hit the "c" bindkey (or the "Copy" menu button), and before you
click for the destination, press the F3 key.  This is the standard options
button for almost all commands.  In copy ("c") and move ("e") commands, the
options form offers mirror and rotate selections. Select your option, hide
the form, and proceed.

    - James Swonger
      Intersil                                   Melbourne, FL


( ESNUG 360 Item 10 ) -------------------------------------------- [11/02/00]

Subject: ( BSNUG 00 #5 )  7 Silicon Perspectives First Encounter Tape-outs

> We had a visit from our friendly Silicon Perspective sales team today and
> the demo they showed sparked some interest.  They claimed to have several
> tape-outs under their belt at this point, which is when I thought, "Yeah,
> but have you told Cooley about them yet?".   Heard any more from the
> masses on this one?  Their "don't flatten the hierarchy" approach to floor
> planning certainly has some major benefits.  Just thought I'd look for
> some further references before engaging in any kind of benchmark.
>
>     - Pete Churchill of Conexant Systems


From: Stefan Thiede <Stefan.Thiede@sv.sc.philips.com>

Hello John,

We were one of the customers which used Encounter for our tapeout.

Our chip contained 1,200,000 stdcells which we repartitioned with in-house
Philips tools into 8 "chiplets" (Blocks large enough to be called chips :-)
plus 2 CPU-cores.  The chiplets varied in size from 65,000 to 210,000
stdcells.

We've used Silicon Perspective to find the optimal placement of our 200+
macros, for placement, and IPO.

The feedback from the "amoeba" placer was extremely helpful to find good
macro placement due to its unique visualization of hierarchy.  The placer,
though slower than competing products, produces a more porous placement
that allowed us to use their real gem, their IPO capability, without
screwing up the routability.  This also helped during clk-tree insertion.

The IPO resized (up/down) ~25% of all cells in the design based on
Encounter's estimated routing and was able to converge on timing.  The
runtimes per chiplet for IPO were very short compared to the other options
we had.
This is a placement-based optimization that actually works and it's fast. 

Encounter's timing engine reads constraints from PrimeTime and is extremely
fast.  Verilog-in/out/flattening/defout/tdfin/out all run in ~10 minutes on
the full chip.

For the next tapeout, we'll use Encounter for full-chip placement-based
timing analysis.  It is capable of holding the whole placed and routed
design in 1.7GB process size.  This compares to 1.8GB for a 210,000 stdcell
block in Apollo.

We are going to have a look at Encounter's hierarchical pin assignment and
their clk-tree synthesis.  We are specifically eager to lay our hands on
their feedthru capabilities, due later this year.

In our experience, Encounter did what it promised, does it fast, and has the
capability to handle full-chips of 1+ million stdcell instances.

    - Stefan Thiede 
      Philips Semiconductors                     Sunnyvale, CA

         ----    ----    ----    ----    ----    ----   ----

From: Mano Vafai <mano@klsi.com>

Hi, John,

I would like to invite you or any serious ASIC designer of your community to
visit Kawasaki LSI's Silicon Valley office so we could explain our usage of
Silicon Perspective's First Encounter tool in our production ASIC flow.

Silicon Perspective floorplanning and placement tools have helped establish a
missing link between Synopsys DC, and Cadence Silicon Ensemble (SE).  This
allows our ASIC customers to leverage their existing investment in front-end
Synopsys environment and us at KLSI to utilize SE more efficiently.  We also
make use of floor planning tools like Avanti Compass, Cadence PDP, and
Synopsys Gambit.  However their scope is limited to CBA (Synopsys's Cell
Based Array) or some other low complexity Standard Cell designs.

Kawasaki has benchmarked a few other tools such as the ones developed by
Monterey and Magma. However we particularly selected Silicon Perspective
because of its front-end design orientation, super fast execution, and
excellent timing correlation with both SE and PrimeTime.  In doing so,
Silicon Perspective tools really helped us to tape out a very complex
Networking ASIC on time.

    - Mano Vafai
      Kawasaki LSI                               San Jose, CA

         ----    ----    ----    ----    ----    ----   ----

From: Tony Liu <tonyliu@sis.com.tw>

Hi, John,

We were one of the customers which used Encounter.  We've successfully
taken advantage of First Encounter on two taped-out projects.  One of
them has 1.2 million placed objects, and the other has more than 800K
placed objects.  We appreciate the hierarchical partitioning and pin
assignment of First Encounter.

It really does a good job on those.

The placement of First Encounter also gave our projects a good starting
point for timing closure.  We appreciate this tool in many respects, and
continuously find more advantages.

    - Tony Liu
      SiS                                        Taiwan

         ----    ----    ----    ----    ----    ----   ----

From: Sunil Malkani <SunilM@teralogic-inc.com>

John,

I manage the Physical Design and CAD team at TeraLogic.  I wanted to confirm
that we have used SPC's First Encounter for two of our chips, one of which
taped out about a year ago and the other taped out in May 2000.

The first chip was running at 108 MHz worst case, process was TSMC 0.25
4-Metal.  The second chip was running at 120 MHz, TSMC 0.25 5-Metal.

In the first chip, which was hierarchical, we used Encounter on one of the
blocks which was having problems in achieving timing convergence.  The block
was about 80K placeable cells and also included 4 large memories.  Encounter
worked pretty well in floorplanning the block and achieving better timing
results than our existing tool at that time -- it had a big advantage in
turn around time so we could do multiple what-if scenarios.

The second chip had 450K placeable cells and FE was used just for the IPO
part using the existing placement.  Also the front end team used the tool
to evaluate different floorplans to get a quick timing view.

    - Sunil Malkani
      TeraLogic, Inc.

         ----    ----    ----    ----    ----    ----   ----

From: Tom McKeone <Tom.McKeone@amd.com>

Hi, John,

I am a product development manager who works in the AMD chipset department.
We have successfully used First Encounter on two different tapeouts in the
last 9 months.  The tool did a fantastic job for us.  The design contained
over 650K placeable instances & had multiple clock domains that ranged from
66 MHz to over 533 MHz.  Encounter handled this complex design expertly.  It
was able to optimize timing as well as provide estimated timing in a very
short period of time. The estimated timing was amazingly accurate.  We found
it to be within a few percentage points of the timing data extracted from
the final routed design.  (We used Apollo (version 1999.4.3.3.0.14) from
Avanti for routing this chip.)

The tool is continuously improving.  On the last tapeout we used their spare
gate "shotgunning", scan re-ordering, and clock tree synthesis.  Overall
the results were very good and the time savings significant.

    - Tom McKeone
      Advanced Micro Devices                     Texas


( ESNUG 360 Item 11 ) -------------------------------------------- [11/02/00]

From: Bejoy Oomman <bejoygo@genesystest.com>
Subject: Genesys Replies To Industry Gadfly "Memory BIST Follow-Ups" Column

Hi John,

Thank you very much for your kind coverage of Genesys Testware memory BISTDR
solutions.  Let me annotate my answers to the questions raised.

John Cooley wrote:

> Last month I gushed about GeneSys Test's BISTDR.  BISTDR is a
> parameterized, synthesizable mem BIST cell that, on power-up, tests all
> your memory and remaps all bad addresses to working spare address spaces.
>
> The neat thing about e-mail and having a column in EE Times is that it
> gives 165,000 readers a chance to add their 2 cents.  The first to write
> was Cliff Whitmore of 3D Labs.  "I'm trying to figure out the actual
> mechanics of how they make remapping work.  Do they rely on a specific
> memory technology (from a particular vendor), or do they insert logic
> between my address generation logic and the RAM for the remapping, or do
> they blow a fuse map at wafer test, or what???"

Our BISTDR works with any memory compiler and does bad-cell remapping
fully externally.  This entails some performance penalty due to the muxes
on the data input, data output, and address lines used for remapping.  If
the memory has redundant cells and repair circuitry (fuses or muxes) built
in, BISTDR can take advantage of them to eliminate the performance penalty
of external muxes.  We officially announced this capability this week at
the International Test Conference.
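
To make the external remapping idea concrete, here is a minimal behavioral
sketch in Python.  It is an illustration only, not the BISTDR
implementation: the class name, the two test patterns, and the
first-free-spare allocation policy are all assumptions, and the spare
words are assumed to be good.

```python
class RemappedMemory:
    """Toy model of a RAM wrapped by external remap logic (hypothetical)."""

    def __init__(self, size, spares, stuck_cells=None):
        # Physical storage: the main array followed by the spare words.
        self.size = size
        self.cells = [0] * (size + spares)
        self.stuck = dict(stuck_cells or {})   # physical address -> stuck value
        self.free_spares = list(range(size, size + spares))
        self.remap = {}                        # bad address -> spare address

    def _physical(self, addr):
        # The "address mux": redirect known-bad addresses to a spare word.
        return self.remap.get(addr, addr)

    def write(self, addr, value):
        self.cells[self._physical(addr)] = value

    def read(self, addr):
        phys = self._physical(addr)
        # A stuck cell ignores whatever was written to it.
        return self.stuck.get(phys, self.cells[phys])

    def power_up_bist(self, patterns=(0x55, 0xAA)):
        """Write and read back each pattern; remap any address that fails.

        Returns False if there are more bad words than spares.
        """
        for addr in range(self.size):
            for pattern in patterns:
                self.write(addr, pattern)
                if self.read(addr) != pattern:
                    if not self.free_spares:
                        return False
                    self.remap[addr] = self.free_spares.pop(0)
                    break
        return True


mem = RemappedMemory(size=8, spares=2, stuck_cells={3: 0x00})
repaired = mem.power_up_bist()   # address 3 is redirected to a spare word
mem.write(3, 0x7F)               # user logic still addresses word 3
```

The lookup in _physical() sits on every access, which is the software
analogue of the mux delay described above.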


> The second to write was Hank Walker of Texas A&M.  (Hank has this way of
> sometimes tersely cutting to the technical chase.)  He sent me a one
> sentence letter.  "IBM has done this in their memory BIST engines for
> many years."

As far as I know, IBM has used this for embedded DRAM macros for years.  In
fact, we developed this technology 2 years ago for embedded DRAMs.  However,
embedded DRAMs never really took off, and third-party embedded DRAM IP
developers became networking IC companies (Silicon Access, Mosaid).  We
re-architected the solution for large embedded SRAMs this year.  Fortunately
for us, all networking IC vendors have very large SRAMs that need this now.
Compaq (formerly Digital) has used this technique for its Alpha cache SRAMs.
So has Lucent.  IBM, Compaq, and Lucent use only fuses for the actual repair.
We can use either fuses or muxes for repair.  Most of our customers like
muxes for repair, since fuses increase the unit cost of the part
significantly due to the additional fuse-blowing steps required.  Fuses are
also notoriously unreliable and do not scale well with technology.  However,
muxes have a small performance penalty compared with fuses.  Fuses also make
BISTDR transparent to the user.  People typically use fuses for the highest
performance or the highest volume memories.


> And Roderick McConnell of Infineon half-humorously warned me to "Beware:
> many cells fail only after a while.  If you are trying to repair these at
> power-on, you may end up doing a lot of rebooting.  Perhaps you're happy
> that a "bad bit" in an SRAM causes the connection to be lost when you are
> using a mobile phone with a long-winded salesperson who just doesn't stop
> talking -- but that's tough to predict."

This is the main technical objection raised by our potential customers.  It
is true that some SRAM failures are temperature and voltage sensitive.  Some
failures also occur in the field after some time.  We are not addressing
these failures with BISTDR.  We are trying to improve yield by correcting
static failures that are visible during packaged-part testing.  Static cell
failures are the most common source of yield loss for ICs with large
embedded memories.  During memory characterisation testing or volume
manufacturing testing we have to make sure that the repair signature is
consistent at different voltage and temperature points to ensure that the
chip has only correctable static failures.
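
The consistency check described above can be sketched as follows.  This is
a hypothetical illustration, not Genesys' test flow: the function names,
the corner tuples, and the idea of comparing plain address lists are
assumptions made for the example.

```python
def repair_signature(failing_addresses):
    """Canonical form of one BIST run: the sorted set of failing addresses."""
    return tuple(sorted(set(failing_addresses)))


def is_statically_repairable(runs, spare_count):
    """runs maps a (voltage, temperature) corner to the failures seen there.

    The part qualifies only if every corner reports the same signature and
    that signature fits within the available spares.
    """
    signatures = {repair_signature(fails) for fails in runs.values()}
    if len(signatures) != 1:
        return False            # failures move with voltage or temperature
    (signature,) = signatures
    return len(signature) <= spare_count


# Hypothetical corners: (supply voltage in V, temperature in C).
stable_part = {(1.62, -40): [17, 200], (1.80, 25): [200, 17],
               (1.98, 125): [17, 200]}
drifting_part = {(1.62, -40): [17], (1.80, 25): [17],
                 (1.98, 125): [17, 93]}
```

Here stable_part shows the same two bad addresses at every corner, while
drifting_part picks up an extra failure when hot, which is the kind of
part this screening is meant to reject.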


> "Have you looked at Virage's STAR (self-test and repair) capability yet?
> We are beginning to use their SRAMs, and this capability looks very
> interesting," wrote Jeff.  "Basically, you get a scan interface to disable
> addresses and remap them.  It's very fine-grained.  The overhead is very
> low, because it is built into the hard macros."
>
> (Why this was odd was that Virage isn't really known in the test world.
> See for yourself.  http://www.VirageLogic.com  Virage makes COT embedded
> memory compilers for fabs like TSMC, UMC, and Chartered.  But it's not
> all that surprising to be testing memories since they make them.)
>
> I phoned Virage to see what was up.
>
> It turns out that Virage's STAR generates memories sized from 64 k to 4 meg.
> "We have added redundancy with spare rows and columns in our embedded STAR
> memories," said Alex Shubat, the CTO of Virage.  "It can do off-chip test
> with laser fuse-box repair at manufacture.  It lets you skip using a memory
> tester in fab.  STAR can also do built-in self-repair on power-up."
>

We work closely with Virage Logic to add support for their STAR compiler in
our products.  I believe the self-repair at power-up mentioned by Alex is
our BISTDR product.  STAR contains the dynamic repair circuitry already,
so we need not synthesize the remapping circuitry.  This eliminates the
performance penalty of BISTDR.  Other memory compilers like Dolphin Tech's
RAMpiler are developing similar capabilities.

    - Bejoy Oomman
      Genesys Testware                           Fremont, CA


( ESNUG 360 Item 12 ) -------------------------------------------- [11/02/00]

From: Michael Dotson <mwdotson@vnet.ibm.com>
Subject: Design Compiler VHDL Parser Problem with Aggregates In Expression

Hi, John,

I have a piece of VHDL that represents a problem with the DC parser and
aggregates.  Strangely enough, the same code is accepted with the VSS,
MTI and Fusion parsers.  I was wondering if DC needs help in figuring out
the code without having to rewrite all the code in the first place.  This
source has already been built with a different synthesizer but I had a need
to try DC with it.  The error msg I get is a VHDL type mismatch from the
expression aggregate because it says the aggregate is std_logic....  which
is bogus because the workaround with temp doesn't have a problem.

   library ieee;
   use ieee.std_logic_1164.all;

   entity top is

   port (a_vector_1, a_vector_2      : in  std_logic_vector (1 downto 0);
         a_bit_1, a_bit_2            : in  std_logic;
         output_1, output_2, output3 : out std_logic_vector (1 downto 0)
        );

   end top;

   architecture arch1 of top is

   signal temp_1, temp_2 : std_logic_vector (1 downto 0);

   begin

   temp_1 <= (1 downto 0 => a_bit_1);
   temp_2 <= (1 downto 0 => a_bit_2);

   output_1 <= (a_vector_1 and temp_1) or
               (a_vector_2 and temp_2);
   output_2 <= (a_vector_1 and (1 downto 0 => a_bit_1)) or
               (a_vector_2 and (1 downto 0 => a_bit_2));
   end arch1;

Both output_1 and output_2 should be equivalent, but they're not!  Design
Compiler and Tuxedo both have problems reading in output_2.

    - Mike Dotson
      IBM


( ESNUG 360 Item 13 ) -------------------------------------------- [11/02/00]

From: Satyajit Chowdhury <snagchow@cisco.com>
Subject: User Seeks A Detailed Comparison Of Synopsys SystemC vs. CynLib

Hi, John,

Our group has been evaluating SystemC and CynApps and we plan to embrace one
of them.  For that, it is very important to get a comparison chart between
these two tools/languages which are built around the same concept.

Can anybody point me to some website or give me some pointers regarding why
one is better than the other?  I am looking at criteria like:

   * licensing mode
   * generation of waveforms
   * which is more popular and more used in industry and universities
   * which has more 3rd party vendor and tool support
   * simplicity of use, resemblance to Verilog
   * do we have Verilog "to" and "from" translators
   * most important, how is the PLI interface architected and how it works

Thank you for your help.

    - Satyajit Chowdhury
      Cisco Systems                              San Jose, CA


( ESNUG 360 Item 14 ) -------------------------------------------- [11/02/00]

From: Alex Kumets <kumets@hotmail.com>
Subject: PrimeTime QTM Documentation Typo For create_qtm_constraint_arc

John,

If you plan to use QTM for PrimeTime, be aware that the command

                create_qtm_constraint_arc

must have the argument '-edge rise/fall'.  The example on page 10-14 of the
'SOLD 2000.05' "PrimeTime User Guide: Advanced Timing Analysis" omits it.

Another issue: Synopsys did not give an example of how to use variables when
designing QTM models (remember, you are using Tcl, not dc_shell).

    - Alex Kumets


( ESNUG 360 Item 15 ) -------------------------------------------- [11/02/00]

Subject: ( ESNUG 359 #6 )  Design A 64-bit+ Multiplier-Accumulator (MAC)

> I would love to hear from your readers how they'd design a large
> Multiplier-Accumulator (MAC) for over 64-bit operands.  I'm considering
> Module Compiler.  Our implementation technology is not decided yet, but
> I'm guessing 0.18um or smaller.  We're targeting in excess of 133MHz.
>
> In terms of speed/power, pointers/numbers would be greatly appreciated,
> as also techniques to verify this type of circuit.  (Obviously, we're not
> circuit designers, and would probably do a very poor job at a custom
> multiplier.)  Ideas?  Pointers?
>
>     - Neel Das


From: Gil Herbeck <gilherbeck@home.com>

John,

There are a number of factors that can have a big influence on this design.
How big are the operands and the accumulator?  Do you need saturation logic?
If so, are there multiple (programmable) saturation points?  Do you have
both integer and fixed point, or just one data type?  Can you have latency
from the inputs to the accumulator?  Is your process / cell library
optimized for area and power?  It's hard to say much without more specific
info.

If performance becomes a problem, MC can provide a big advantage for
non-interleaved accumulators.  You may be able to leave the accumulator
itself in carrysave format and push the carry propagation to after the
accumulation register.

    - Gil Herbeck
      Radix20                                    Livermore, CA

         ----    ----    ----    ----    ----    ----   ----

From: [ A Synopsys Module Compiler CAE ]

Hi John,

It is fairly straightforward to implement a simple MAC in Module Compiler
(MC).  You get full operator merging (a single carry-save reduction/Wallace
tree with just one carry-propagate adder for the entire multiply and add
operation, as well as any other addends), a choice of multiplier
(Booth/non-Booth) and final adder (fast-carry-lookahead, carry-lookahead,
carry-select, the Synopsys proprietary carry-lookahead-select, and ripple)
micro-architectures to trade off area/timing.

You can also parameterize these options along with the input operand widths
and different implementations of the MAC to perform fast architectural
exploration.  This is shown in the first architecture (arch==0) of the 
following piece of Module Compiler Language (MCL) code.

   module MAC (Z,X,Y,R,w,ovf,mult,fa,arch);
   integer w    = 64;                      // Input width
   integer ovf  = 2;                       // Overflow accum. bits
   integer accw = 2*w + ovf;               // Accumulator width
   integer arch = 0;                       // MAC architecture
   string  mult = "booth";                 // Multiplier type
   string  fa   = "cla";                   // Final adder type

   directive(multtype=mult,fatype=fa,pipeline="off");

   input signed [1] R;                     // Accumulator reset
   input  [w] X,Y;
   output [accw] Z;

   if (arch==0){
       wire   [accw] ACCin = X*Y + (Z&R);
       Z = sreg(ACCin);
   }

   // arch-1 is also implemented by an MC built in function maccs()
   // arch-1 can be modified slightly to pipeline the multiplier
   // and the final adder to further speed up the MAC.

   if (arch==1){
      wire   [accw] ACC0,ACC1,ACCin0,ACCin1;
      directive local (carrysave="convert");
      wire   [accw] ACCin = X*Y + (ACC0&R) + (ACC1&R);
      csconvert(ACCin0,ACCin1,ACCin);
      ACC0 = sreg(ACCin0);
      ACC1 = sreg(ACCin1);
      Z = ACC0+ACC1;
   }

   endmodule

As all of us know, the critical path is from the inputs, thru' the merged
multiplier and propagate adder in the accumulator.  You can individually
access the accumulator's "carry" and "sum" output terms to 'push' the
final propagate adder out of the sequential feedback loop.  This speeds up
the design and may be done by setting the carrysave attribute in MC to
"convert" and using the csconvert() function, as shown in the second
architecture (arch==1).
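
As a sanity check on the idea, here is a small behavioral model in Python.
It is a sketch under stated assumptions, not output from Module Compiler:
the function names are made up, the operands are treated as unsigned, and
Python's own multiply stands in for the merged multiplier tree.  It keeps
the accumulator as a (sum, carry) pair inside the loop and does a single
carry-propagate add outside it.

```python
def csa(a, b, c, mask):
    """3:2 carry-save adder: reduce three addends to a (sum, carry) pair."""
    partial_sum = (a ^ b ^ c) & mask
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask
    return partial_sum, carry


def mac_carry_save(pairs, width=16, ovf=2):
    """Accumulate x*y products, keeping the accumulator as two registers."""
    mask = (1 << (2 * width + ovf)) - 1
    acc_sum, acc_carry = 0, 0                  # stand-ins for ACC0 and ACC1
    for x, y in pairs:
        # Inside the loop: carry-save reduction only, no carry propagation.
        acc_sum, acc_carry = csa(x * y, acc_sum, acc_carry, mask)
    # Outside the loop: the single carry-propagate add (Z = ACC0 + ACC1).
    return (acc_sum + acc_carry) & mask


def mac_reference(pairs, width=16, ovf=2):
    """Plain MAC with a full add every cycle, for comparison."""
    mask = (1 << (2 * width + ovf)) - 1
    total = 0
    for x, y in pairs:
        total = (total + x * y) & mask
    return total


samples = [(3, 5), (65535, 65535), (1234, 4321), (7, 9)]
same = mac_carry_save(samples) == mac_reference(samples)
```

Both functions return the same value modulo the accumulator width, which
is why the carry-propagate adder can leave the feedback loop without
changing the MAC's function.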

   o The second architecture can be easily modified to isolate the
     multiplier, so that it can be pipelined and retimed by MC along
     with the final adder.  This will further speed up the design
     without changing the basic functionality of the MAC.

   o After synthesis, MC will write out a bit and cycle exact RTL
     simulation model, in either Verilog or VHDL.  This can be used for
     running your fast functional simulations to verify your design.
     Of course, you'll use the gate-level netlist for full simulation.

To give you a flavor of what the results look like, I used the Synopsys
DesignWare Silicon Library (std. cell) developed for TSMC's 0.18G process
to run a couple of tests.  This is for a 64-bit operand MAC without any
pipelining.  Of course, your results will vary depending on the technology
library you use.

    Arch-0: # of instances= 5910; delay= 7.58ns (~132 MHz)
    Arch-1: # of instances= 6275; delay= 5.33ns (~188 MHz)

The above delay numbers can be reduced significantly by pipelining the
multiplier and the final propagate adder, until you hit the limit of
the loop delay, which then will be the critical path.  Here are the results
for a pipelined and retimed MAC with 2 pipe stages in the multiplier and
one in the final carry propagate adder (for a total of 3 in the design):

    Modified Arch-1: # of instances=8999; delay=3.37ns (~300 MHz)
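
The frequencies quoted above are simply the reciprocals of the reported
delays.  A short Python sketch of that arithmetic, plus the instance-count
growth relative to Arch-0 (the helper names are made up for this example):

```python
def max_frequency_mhz(delay_ns):
    """Maximum clock frequency in MHz for a critical-path delay in ns."""
    return 1000.0 / delay_ns


def extra_instances_pct(baseline, count):
    """Instance-count growth over the baseline, in percent."""
    return 100.0 * (count - baseline) / baseline


# (instances, delay in ns) as reported in the results above.
results = {
    "Arch-0": (5910, 7.58),
    "Arch-1": (6275, 5.33),
    "Modified Arch-1": (8999, 3.37),
}

baseline_instances = results["Arch-0"][0]
for name, (instances, delay_ns) in results.items():
    freq = max_frequency_mhz(delay_ns)
    growth = extra_instances_pct(baseline_instances, instances)
    print(f"{name}: {freq:.1f} MHz, {growth:+.1f}% instances vs. Arch-0")
```

This works out to roughly 131.9, 187.6, and 296.7 MHz, so the "~300 MHz"
figure is a rounding of about 297 MHz, and the pipelined version costs
about 52% more instances than Arch-0.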

You may get more aggressive delays with smaller process tech. libraries, but
without changing the functionality of the MAC, you'll always be bound by
the feedback loop delay.

Hope this helps.

    - [ A Synopsys Module Compiler CAE ]


( ESNUG 360 Item 16 ) -------------------------------------------- [11/02/00]

From: Robert Gonzalez <robert.gonzalez@siliconmetrics.com>
Subject: Trying To Find The PrimeTime Tcl Load Command, Not DC's Load_Of

John,

I have been trying to find the "load" command in PrimeTime Tcl, which has
been mapped to the "load_of" command.  The Tcl "load" command is designed
to load object files, such as those compiled from C or C++.  They must have
hidden it somehow, but I have been unsuccessful in finding it using the UNIX
"strings" command.  Is there any way you could ask around and see if anyone
knows the name (if any) of the hidden PrimeTime command that does what
"load" does in Tcl?

    - Robert Gonzalez
      Silicon Metrics, Inc.


( ESNUG 360 Item 17 ) -------------------------------------------- [11/02/00]

From: Rajkumar Kadam <rkadam@asic.qntm.com>
Subject: Synopsys Tcl 'Collections' in DC Not Like Tcl 'Lists'!  Why!???

Hi John!

I thought this might be useful for people using Tcl as their scripting
language for DC.  Synopsys has a new list-like type in Tcl called a
'collection'.  I assumed that a 'collection' worked like a 'list' with
their custom commands, but it did not turn out that way.

Following is the example which I thought should have worked logically.

  set in_list [all_inputs]
  foreach_in_collection element [all_clocks] {
     set in_list [remove_from_collection $in_list  $element]
  }

I was trying to remove a collection item from a collection, but it did not
work.  The following workaround, which I arrived at with the help of a
Synopsys FAE, makes it work.

  set in_list [all_inputs]
  foreach_in_collection element [all_clocks] {
     set clock_name [get_port  $element]
     set in_list [remove_from_collection $in_list  $clock_name]
  }

Reasoning: You cannot remove a collection item from a collection that has
a different object class.  When you run all_clocks, it returns a collection
whose object class is clock.  When you run all_inputs, it returns a
collection whose object class is port.
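
One way to picture this is the toy Python model below.  It is only an
analogy, not how the Synopsys tools are implemented: each object is
modeled as an (object_class, name) pair, the remove_from_collection()
function here is a plain set difference, and the port names are made up.
It also assumes each clock is named after the port it is defined on.

```python
def remove_from_collection(base, to_remove):
    """Toy stand-in: drop members of to_remove that also appear in base."""
    return base - to_remove


# Each object is identified by (object_class, name), so a clock and a port
# that share a name are still different objects.
inputs = {("port", "clk"), ("port", "rst_n"), ("port", "data_in")}
clocks = {("clock", "clk")}

# Removing clock objects from a port collection matches nothing.
unchanged = remove_from_collection(inputs, clocks)

# Converting each clock to the port of the same name first makes it work.
clock_ports = {("port", name) for (_cls, name) in clocks}
filtered = remove_from_collection(inputs, clock_ports)
```

In this model, unchanged still holds all three ports, while filtered has
lost the clk port, which mirrors the failing script and the workaround.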

I understand this logic, but I really do not understand the need for having
such a limitation in Synopsys tools.  I always wanted it to be like Tcl
'list'.  Why not?

    - Rajkumar Kadam
      Quantum Corp.


( ESNUG 360 Item 18 ) -------------------------------------------- [11/02/00]

From: Matt Gavin <mtgavin@collins.rockwell.com>
Subject: Customer DC Headaches w/ 0.01% Limitation on Fixing Violations

ESNUG gurus,

Could someone please give me a hand here...

I found the following (quoted below) on the SolvNET site, in an article
(SYNTH-432604.html) dealing with fanout violations (titled "Problems with
huge fanout violations?"):

  "According to SOLV-IT article STAR-36489, Design Compiler will not fix
   design rule violations less than 0.01 percent."

Is this true?  It might seem to explain why Synopsys is not fixing a handful
of hold time failures of about 0.017ns in my design (it is fixing many,
many others -- I have set the fix_hold attribute for all clocks.)  Seems
quite strange that DC would have this limitation though.

FYI, I can't seem to find this reference to STAR-36489, anywhere on the
SolvNET site.

Help!

    - Matt Gavin
      Rockwell Collins


( ESNUG 360 Networking Section ) --------------------------------- [11/02/00]

Nevada City, CA -- TDK Semiconductor seeks HDL-based engineers with DSP
experience.  No headhunters, please.  "Martin.Gravenstein@tsc.tdk.com"


============================================================================
 Trying to figure out a Synopsys bug?  Want to hear how 11,000+ other users
    dealt with it?  Then join the E-Mail Synopsys Users Group (ESNUG)!
 
       !!!     "It's not a BUG,               jcooley@world.std.com
      /o o\  /  it's a FEATURE!"                 (508) 429-4357
     (  >  )
      \ - /     - John Cooley, EDA & ASIC Design Consultant in Synopsys,
      _] [_         Verilog, VHDL and numerous Design Methodologies.

      Holliston Poor Farm, P.O. Box 6222, Holliston, MA  01746-6222
    Legal Disclaimer: "As always, anything said here is only opinion."
 The complete, searchable ESNUG Archive Site is at http://www.DeepChip.com


Copyright 1991-2024 John Cooley.  All Rights Reserved.