( SNUG 01 Item 16 ) -------------------------------------------- [ 3/28/01 ]

Subject: Synopsys Design Compiler vs. Cadence Ambit vs. Get2chips.com

OLD FAITHFUL:  Yes, Cadence tried its $260 million Ambit push into the
RTL synthesis market, but Cadence would be lucky if they got back 10 cents
on the dollar for that "investment".  Design Compiler still completely
owns that market and even the last ditch "Ambit fire sale" pricing scheme
tried by Cadence didn't change the EDA landscape one iota.  Too many
Synopsys customers have too much experience, scripts, and know-how
invested in Design Compiler -- and they'd rather not risk their chips (or
their careers) on a new RTL synthesis tool.  Even the managers at LSI are
afraid to do designs relying solely on Ambit -- and LSI was a big backer
of Ambit before it was acquired by Cadence!  Get2chips suffers from
this lack of real customer interest, too.  Why?  Because Ambit-RTL and
get2chips are "me, too" tools.  The best they can do is to catch up with
DC.  PhysOpt, PKS, and Magma are where the real battle is being fought
these days.  RTL-to-gates is old news.  Yawn.  It's in RTL-to-GDSII where
the big money 0.18 um crowd is thinking and going.

On the technical front, Synopsys customers loved the idea of marrying DC
with Module Compiler.  Also, judging from the recent bug talk exchanges in
ESNUG and the SNUG trip reports, customers are now starting to use DC's
Automated Chip Synthesis (ACS) add-on tool.


    "We use DC exclusively; however we have done significant benchmarking
     and Ambit kicked DC's ass up and down the street -- 6 months ago.  It
     was faster (there was no ACS at that point, but even with just one
     license apiece) and resulted in better area (usually 7%) and better
     timing (2-5%), and could easily meet timing for designs that we had to
     manually take through DC.  You could just throw the whole design at
     Ambit and let it go.  DC needed more tweaking and guidance.

     So, why don't we use Ambit?  A fine question.  Mostly because we have
     too much infrastructure setup for DC.  All the managers want to see
     Ambit used on a production design, but only if it's not their chip.
     Can you say 'chicken'?"

         - an anon LSI engineer


    "I was surprised when I came back from SNUG and had a customer request
     for new vendor inquiries on Ambit, Get2Chip, and Synplicity in an
     ASICs flow."

         - an anon engineer


    "In my opinion DC is the best and the most complete.  Don't have first
     hand experience with Ambit.  Didn't hear good things about it."

         - an anon engineer


    "I think Cadence will always suffer from the "scripts need converting"
     problem.  It is a shame that there isn't a truly generic way to have
     a script that would run on any synthesis tool: it would be so good
     for competitiveness and spur innovation by other companies.  It is
     good that get2chip now does Superlog synthesis."

         - an anon engineer


    "Ambit played their part a couple of years ago, when they forced
     Synopsys to make a major improvement in the QoR and compile time.
     Now it's the turn of Get2chip to do the same...  As for ACS, Steve
     Golson's paper showed very good results with it, so it's definitely
     worth a try."

         - Oren Rubinstein of Nvidia


    "When I tried to use Ambit BuildGates a year ago, it was poorly
     integrated with Cadence's hodgepodge of tools.  It was even a pain to
     use standalone, because a lot of the on-line help and documentation
     was nowhere near as developed as Synopsys'."

         - an anon engineer


    "Don't know anything about ACS yet, but my past experience with
     DC vs. Ambit is that DC was better at forming gate logic and
     Ambit was better at optimization (for power or area).  In fact some
     folks were using DC to get the basic circuit and then optimizing
     with Ambit.  Having Module Compiler also gave DC the edge."

         - Grego Sanguinetti of Accelerant Networks


    "Session 3: Synthesis and other fun stuff

     A couple papers on some synthesis ideas using the currently available
     tools (Yay - no Physical synthesis propaganda!)

     In Paper #1, Bob Wiegand gave his methodology for keeping area small
     while still getting things to pass timing in a 3 pass synthesis
     methodology.  Some interesting ideas there.

     In Paper #2, Steve Golson did some interesting experiments comparing
     bottom up to top down, to "simple compile mode", to "ACS", the new
     automatic full chip synthesis method from Synopsys.  He came to some
     interesting points, namely:

       1. ACS is promising, it consistently had good timing and area
          results.  But sometimes took a while to run.

       2. "simple_compile_mode" which looks really dumb (you set the mode,
          give it ALL your RTL, then say compile)  has good run times, area,
          and just about got decent timing.  In terms of least engineering
          effort, this seems to be the way to go as a first attempt.  If it
          works, ship it!

     Cliff Cummings wrote up a summary of fun considerations when designing
     multiple clock domains, and getting data across them.  His talk
     seemed to be a good summary of methods; I assume his paper is equally
     good.  I'll probably route that one around to the ASIC designers for
     looking at."

         - Paul Gerlach of Tektronix


    "Interestingly enough, we use both DC and Ambit.  DC is better at first
     time synthesis because it gives the best logic mapping.  Later we use
     Ambit for both timing analysis as well as ECO's because it is better
     in timing, area, and runtime."

         - an anon engineer


    "As an old-school engineer, I still think Synopsys is synthesis.  I've
     never really taken to Ambit, especially after the "interesting" results
     on a previous chip.  ACS (when used with LSF) is a powerful idea that
     probably needs more time to mature (cf physical synthesis!!)"

         - Chris Byham of Philips Semiconductors


    "Verilog-2001 Synthesis

     Lance Leong (I think) from Synopsys presented what they would support
     in the Verilog-2001 std.

     Loops (unrollable) - For and While loops will be treated the same.
     Arrays of module instances
     Module parameters (defparam m1.n1.param = foo) - you will be able to
          access parameters in different layers of the hierarchy
     +: and -: added to indexes.
     sig[i +: c] is the same as sig[(i+c-1) : i] for a vector declared
          [msb:lsb] - this allows for constant widths in a bit slice
          operation.
     2 dimensional arrays **!!
     $signed and $unsigned conversion functions
     Recursion"

         - Dan Joyce of Compaq
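
A small sketch of the Verilog-2001 indexed part-select rule mentioned
above, written in Python purely to model the bit arithmetic.  It is not
Verilog and not anything Synopsys presented.  The helper names
part_select_bounds and extract are made up for illustration, and the
sketch assumes a vector declared with a descending range such as
"reg [31:0] sig".

```python
def part_select_bounds(base, width, ascending=True):
    """Return the (msb, lsb) pair picked by sig[base +: width] (ascending=True)
    or sig[base -: width] (ascending=False).

    Assumption: sig is declared with a descending range, e.g. reg [31:0] sig.
    """
    if width <= 0:
        raise ValueError("width must be a positive constant")
    if ascending:
        # sig[base +: width] covers bits base .. base+width-1
        return (base + width - 1, base)
    # sig[base -: width] covers bits base-width+1 .. base
    return (base, base - width + 1)


def extract(value, base, width, ascending=True):
    """Model the part-select on a Python int standing in for the vector."""
    msb, lsb = part_select_bounds(base, width, ascending)
    if lsb < 0:
        raise ValueError("part-select runs below bit 0")
    return (value >> lsb) & ((1 << width) - 1)


if __name__ == "__main__":
    print(part_select_bounds(8, 8))                    # sig[8 +: 8]
    print(part_select_bounds(15, 8, ascending=False))  # sig[15 -: 8]
    print(hex(extract(0xABCD, 8, 8)))
```

Both forms select exactly "width" bits, which is why the width must be a
constant while the base index is allowed to vary.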


    "DesignVision - Replaces Design Analyzer - looks like a much nicer
     interface.  Allows you to view the hierarchy of your design and can
     show you nets from the "report" command.  This license is a zero cost
     upgrade from a DA license.  Available in version 2000.11.

     "Presto" - new HDL compiler.  5x faster, 33% of the memory of analyze.
     To enable this feature use the command "set hdlin_enable_presto true".

     Verilog compiler bug - When Synopsys put up a slide to show that they
     supported multidimensional arrays in Verilog, they put up the following
     example "reg [0:3][0:7]t;" which, according to Cliff Cummings, should be
     "reg [0:3]t[0:7];".  Chalk up another Presto bug for Cliff.

     New VHDL netlist reader (3x faster than the old one) use the command
     "enable_vhdl_netlist_reader = true" to enable it.  The command to
     invoke it is "read -netlist -f vhdl ".  This is off by
     default.

     ACS - Automated chip synthesis

       acs_read_hdl "top" - Reads all of the HDL files in a given directory
         (looks a lot like Hsurfer).

       Then "source top_constraints.dc" - loads your top level constraints.

       acs_compile_design "top" - Compiles hierarchically.

       acs_refine_design "top" - recompiles the design.

     Case_analysis - ignore timing to tied inputs.  Use "set_case_analysis 0
     [get_ports scan_enable]" to invoke it and "remove_case_analysis" to
     undo it.  You can report it with "report_disable_timing".

     There is also a clock gating check.  "set_clock_gating_check -rise
     -setup 1.5 -fall -hold 1.4 [get_clocks ck]" to invoke.

     New automatic ideal nets. "set_auto_ideal_nets -scan true" to enable.

     New balance_buffer command.  Has a "-prefer (cell)" option to use a
     specific cell to balance a buffer tree.

     Module Compiler - works by writing TCL scripts.  "read_mcl" and
     "compile_mcl" are the basic commands.

     DC-Ultra - this license provides you with the following extras:
       - Critical path re-synthesis
       - BOA Behavioral optimization of Arithmetic
       - BRT - Behavioral retiming synthesis, does not work with gated
         clocks.

     Some new commands "set_ungroup design_c false" (or true) to tell a
     subdesign to be ungrouped or stay grouped.  Use "compile -auto_ungroup"
     to use this command.

     "report_timing -attributes" gives the timing attributes on cells.

     Synopsys Xilinx Libraries xcb_virtex and xdcs_virtex - Note that
     xdcs_virtex results in bad logic for FC2.  To get around this Coregen
     was used to create multipliers.  "hlo_resource_implementation =
     constraint_driven" - default "none" fastest.

     "hdlin_dont_infer_mux_for_resource_sharing = true", "hdlin_infer_mux
     = default" and "hdlin_mux_size_limit = 256" are used to infer muxes
     rather than AND trees (last depends on the technology).

     Steve Golson paper - Compared several compile modes in Synopsys.  Turns
     out that the best one was to use "set_simple_compile_mode true -verbose"
     to enable it, then just do a compile.

     Cliff Cummings paper - To ignore setup in synchronizers use the
     following command "set_annotated_check 0 -setup -hold -from clk1 -to
     xxx/D".  In synchronizers you don't need to worry about setup, just
     hold violations.  Use naming conventions - append the clock name to
     fast signals, this makes it easy to set constraints on them.  Also
     presented a nice section on how gray encoding works.

     Timing closure talk - Floorplan manager didn't work due to the lumping
     of delays in the SDF on input pins, (which is what most vendors do).
     Used Formality and hand edits to fix the design.  IPOFIX is a SNUG tool
     that finds long nets.

     Genove - VHDL timing tool, written in German

     "set_propagated_clock clk" - do this only on propagated clocks, saves
     you from doing lots of constraints."

         - [ Kenny, from South Park ]
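
The letter above mentions the gray encoding section of Cliff's paper
without spelling it out.  As a rough illustration only, here is the
standard binary-reflected Gray code in Python.  It is not taken from the
paper, and the function names bin_to_gray and gray_to_bin are made up
for this sketch.

```python
def bin_to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Inverse of bin_to_gray: XOR together g, g>>1, g>>2, ... until zero."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Print a 3-bit counter next to its Gray-coded value.
    for i in range(8):
        print(f"{i:03b} -> {bin_to_gray(i):03b}")
```

Consecutive counts differ in exactly one bit after encoding, which is
the property that makes Gray-coded counters attractive for crossing
clock domains: a synchronizer that samples during a transition sees
either the old count or the new one, never a mix of several changing
bits.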


    "Design Vision - new GUI with a lot of very nice features.  Right now
     it only works on sun, not on Linux or NT.  Linux support should be
     available shortly.  We can't use Design Analyzer as a fall back on
     the other platforms because we only have Design Vision license."

         - Bill Lawrie of InfiniCon Systems


    "Cool stuff in DC2000.11/Presto:

     I'm really hot to trot for MC in DC.  I've waited for the bugs to get
     worked out, so I'm skipping straight to 2000.11-SP1 (I believe there
     were some issues with 2000.05).  I've used transform_csa extensively
     in the past, so I'm familiar with those results.  I've been exposed to
     the use of Module Compiler for heavy datapath stuff and optimizing the
     results in DC, so I'm familiar with those results as well.  Combining
     the two by using partition_dp on verilog RTL is on the very top of my
     hit list, I can't wait to get started.

     In the Synthesis highlights in DC2000.11 session, I heard a user say
     that sometimes transform_csa gives better results than partition_dp on
     smaller paths.  I wonder how this can be if partition_dp uses timing
     and area constraints while transform_csa does not.  I would love to hear
     more from that user, please post some specifics on ESNUG!  I also heard
     that transform_csa will be obsoleted sometime in the future.  I hope
     this issue is resolved before then!

     Another new goodie is the hdlin_use_syn_shifter variable to implement
     shift functions with DesignWare instead of gates.

     Case analysis is a welcome addition to DC.  Of course, you don't want
     to compile this way, but it would be great for analysis and debug.

     Presto:

     I attended the Presto session, and picked up some cool stuff.  Presto
     will allow the use of the infer_mux directive with if/else constructs.
     I asked the presenter what about the Verilog conditional operator
     construct (I've been pointing out for years to various designers that
     the conditional operator will NOT infer a MUX).  I could see the light
     go on in the presenter's eyes as he said "yeah, we could make that
     work."  (John, it's awesome when Synopsys R&D staff members do
     tutorials!)  I hope to see this feature implemented in the future.

     The variable hdlin_vrlg_std can be set to 1995 or 2000 to control the
     support of Verilog 2000 constructs.

     Cliff Cummings attended this session and was answering questions about
     Verilog 2000.  The combination of Cliff and an actual R&D person made
     this a highly informative and worthwhile session.

     Automated Chip Synthesis (ACS):

     My presentation this year ("Have Your Cake And Eat It Too: How To
     Compile For Area AND Timing") was kind of a part 2 to my presentation
     two years ago ("Using MIN/MAX Compile In A Multi-pass Synthesis Flow").

     Back then, the idea of compiling in 3 passes, not over constraining,
     and fixing holds pre-layout seemed highly controversial.  Based on the
     Q&A afterwards, and also at the poster session, these ideas seem a bit
     more acceptable now.  The three pass idea has been adopted by ACS and
     could actually become mainstream!  I received some particular interest
     in the way I use 'characterize' instead of budgeting to refine
     constraints between compile passes, and how that affects DesignWare
     implementation and area.  Some individuals from Intel and some past
     and present Synopsys folks involved with ACS were particularly
     interested in this.

     I've been keeping my eye on ACS and budgeting for a while now.
     Budgeting was being developed when I was beta testing DC 9802, so I got
     some early insight into it.  When the first app note about budgeting
     came out, it described a three pass flow almost identical to mine.
     I'm not so sure RTL budgeting is such a good idea because of capacity
     issues that Steve Golson's presentation pointed out.  It may be better
     just to stick with the 3 passes.

     It was very cool to see Steve Golson's ACS area and timing results
     agree with my 3 pass results.  I still think I have built a better
     mouse trap compared to ACS (no bias here, of course!) since I don't
     have to recompile everything for a small RTL change, my capacity is
     only limited by the largest compiled design I can read in for
     characterization and incremental compiling, and I retain the ability
     to pick the smallest implementations of DW that still meet timing."

         - Bob Wiegand of NxtWave Communications


I put a copy of both of Bob's DC papers in DeepChip's "Downloads".

The Synopsys CAEs for Design Compiler also did something I thought was
really neat; they wrote down all the customer questions asked during the
DC 2000.11 update and they researched answers for all the questions.
Here's the DC 2000.11 Q&A.


Copyright 1991-2024 John Cooley.  All Rights Reserved.