( SNUG 01 Item 16 ) -------------------------------------------- [ 3/28/01 ]
Subject: Synopsys Design Compiler vs. Cadence Ambit vs. Get2chips.com
OLD FAITHFUL: Yes, Cadence tried its $260 million Ambit push into the
RTL synthesis market, but Cadence would be lucky if they got back 10 cents
on the dollar for that "investment". Design Compiler still completely
owns that market and even the last ditch "Ambit fire sale" pricing scheme
tried by Cadence didn't change the EDA landscape one iota. Too many
Synopsys customers have too much experience, scripts, and know-how
invested in Design Compiler -- and they'd rather not risk their chips (or
their careers) on a new RTL synthesis tool. Even the managers at LSI are
afraid to do designs relying solely on Ambit -- and LSI was a big backer
of Ambit before it was acquired by Cadence! Get2chips also suffers from
this lack of real customer interest. Why? Because Ambit-RTL and
get2chips are "me, too" tools. The best they can do is to catch up with
DC. PhysOpt, PKS, and Magma are where the real battle is being fought
these days. RTL-to-gates is old news. Yawn. It's in RTL-to-GDSII where
the big money 0.18 um crowd is thinking and going.
On the technical front, Synopsys customers loved the idea of marrying DC
with Module Compiler. Also, judging from the recent bug talk exchanges in
ESNUG and the SNUG trip reports, customers are now starting to use DC's
Automated Chip Synthesis (ACS) add-on tool.
"We use DC exclusively; however we have done signigicant benchmarking
and Ambit kicked DC's ass up and down the street -- 6 months ago. It
was faster (there was no ACS at that point, but even with just one
license apiece) and resulted in better area (usually 7%) and better
timing (2-5%) but could easily meet timing for designs that we had to
manually take through DC. You could just throw the whole design at
Ambit and let it go. DC needed more tweaking and guidance.
So, why don't we use Ambit? A fine question. Mostly because we have
too much infrastructure setup for DC. All the managers want to see
Ambit used on a production design, but only if it's not their chip.
Can you say 'chicken'?"
- an anon LSI engineer
"I was surprised when I came back from SNUG and had a customer request
for new vendor inquiries on Ambit, Get2Chip, and Synplicity in an
ASICs flow."
- an anon engineer
"In my opinion DC is the best and the most complete. Don't have first
hand experience with Ambit. Didn't hear good things about it."
- an anon engineer
"I think Cadence will always suffer from the "scripts need converting"
problem. It is a shame that there isn't a truly generic way to have
a script that would run on any synthesis tool: it would be so good
for competitiveness and spur innovation by other companies. It is
good that get2chip now does Superlog synthesis."
- an anon engineer
"Ambit played their part a couple of years ago, when they forced
Synopsys to make a major improvement in the QoR and compile time.
Now it's the turn of Get2chip to do the same... As for ACS, Steve
Golson's paper showed very good results with it, so it's definitely
worth a try."
- Oren Rubinstein of Nvidia
"When I tried to use Ambit BuildGates a year ago, it was poorly
integrated with Cadence's hodgepodge of tools. It was even a pain to
use standalone, because a lot of the on-line help and documentation
was nowhere near as developed as Synopsys'."
- an anon engineer
"Don't know anything about ACS yet, but my past experience with
DC vs. Ambit is that DC was better at forming gate logic and
Ambit was better at optimization (for power or area). In fact some
folks were using DC to get the basic circuit and then optimizing
with Ambit. Having Module Compiler also gave DC the edge."
- Grego Sanguinetti of Accelerant Networks
"Session3: Synthesis and other fun stuff
A couple papers on some synthesis ideas using the currently available
tools (Yay - no Physical synthesis propaganda!)
In Paper #1, Bob Wiegand gave his methodology for keeping area small
while still getting things to pass timing in a 3 pass synthesis
methodology. Some interesting ideas there.
In Paper #2, Steve Golson did some interesting experiments comparing
bottom up to top down, to "simple compile mode", to "ACS", the new
automatic full chip synthesis method from Synopsys. He came to some
interesting points, namely:
1. ACS is promising, it consistently had good timing and area
results. But sometimes took a while to run.
2. "simple_compile_mode" which looks really dumb (you set the mode,
give it ALL your RTL, then say compile) has good run times, area,
and just about got decent timing. In terms of least engineering
effort, this seems to be the way to go as a first attempt. If it
works, ship it!
Cliff Cummings wrote up a summary of fun considerations when designing
multiple clock domains, and getting data across them. His talk
seemed to be a good summary of methods; I assume his paper is equally
good. I'll probably route that one around to the ASIC designers for
looking at."
- Paul Gerlach of Tektronix
"Interestingly enough, we use both DC and Ambit. DC is better at first
time synthesis because it gives the best logic mapping. Later we use
Ambit for both timing analysis as well as ECO's because it is better
in timing, area, and runtime."
- an anon engineer
"As an old-school engineer, I still think Synopsys is synthesis. I've
never really taken to Ambit, especially after the "interesting" results
on a previous chip. ACS (when used with LSF) is a powerful idea that
probably needs more time to mature (cf physical synthesis!!)"
- Chris Byham of Philips Semiconductors
"Verilog-2001 Synthesis
Lance Leong (I think) from Synopsys presented what they would support
in the Verilog-2001 std.
Loops (unrollable) - For and While loops will be treated the same.
Arrays of module instances
Module parameters (defparam m1.n1.param = foo) - you will be able to
access parameters in different layers of the hierarchy
+: and -: added to indexes.
sig[i +: c] is the same as sig[(i+c-1) : i] on a descending range -
this allows for constant widths in a bit slice operation.
2 dimensional arrays **!!
$signed and $unsigned conversion functions
Recursion"
- Dan Joyce of Compaq
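The "+:" and "-:" indexed part-selects in Dan's list are easy to get off by
one, so here is a minimal Python sketch of the bit ranges they pick out. It
assumes a vector declared with a descending range (e.g. reg [31:0] sig) and
models the vector as a plain integer. The function names part_select_bounds
and part_select are made up for illustration; they are not part of any
Synopsys tool or of the Verilog standard.

```python
def part_select_bounds(base, width, plus=True):
    """Return (msb, lsb) selected by sig[base +: width] (plus=True)
    or sig[base -: width] (plus=False).

    Assumption: sig is declared with a descending range, e.g. reg [31:0] sig.
    """
    if width < 1:
        raise ValueError("width must be a positive constant")
    if plus:
        # "+:" counts upward from base: width bits, base is the LSB
        return base + width - 1, base
    # "-:" counts downward from base: width bits, base is the MSB
    return base, base - width + 1


def part_select(value, base, width, plus=True):
    """Extract the selected bits from an integer standing in for the vector."""
    msb, lsb = part_select_bounds(base, width, plus)
    if lsb < 0:
        raise ValueError("select runs below bit 0")
    return (value >> lsb) & ((1 << width) - 1)


sig = 0xDEADBEEF
print(part_select_bounds(8, 8))                 # (15, 8)
print(hex(part_select(sig, 8, 8)))              # 0xbe
print(part_select_bounds(15, 8, plus=False))    # (15, 8)
```

The point of the syntax is that the width stays a constant even when the
base index is a variable, which is what lets a synthesis tool size the
slice at compile time.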
"DesignVision - Replaces design analyzer - look like a much nicer
interface. Allows you to view the hierarchy of your design and can
show you net from the "report" command. This license is a zero cost
upgrade from a DA license. Available in version 2000.11
"Presto" - new HDL compiler. 5x faster, 33% of the memory of analyze.
To enable this feature use the command "set hdlin_enable_presto true".
Verilog compiler bug - When Synopsys put up a slide to show that they
supported multidimensional arrays in Verilog, they put up the following
example "reg [0:3][0:7]t;" which, according to Cliff Cummings, should be
"reg [0:3]t[0:7];". Chalk up another Presto bug for Cliff.
New VHDL netlist reader (3x faster than the old one) use the command
"enable_vhdl_netlist_reader = true" to enable it. The command to
invoke it is "read -netlist -f vhdl ". This is off by
default.
ACS - Automated chip synthesis
acs_read_hdl "top" - Reads all of the HDL files in a given directory
(looks a lot like Hsurfer).
Then "source top_constraints.dc" - loads your top level constraints
acs_compile_design "top" - Compiles hierarchically
acs_refine_design "top" - recompiles design.
Case_analysis - ignore timing to tied inputs. Use "set_case_analysis 0
[get_ports scan_enable]" to invoke it and "remove_case_analysis" to
undo it. You can report it with "report_disable_timing".
There is also a clock gating check. "set_clock_gating_check -rise
-setup 1.5 -fall -hold 1.4 [get_clocks ck]" to invoke.
New automatic ideal nets. "set_auto_ideal_nets -scan true" to enable.
New balance_buffer command. Has a "-prefer (cell)" option to use a
specific cell to balance a buffer tree.
Module Compiler - works by writing TCL scripts. "read_mcl" and
"compile_mcl" are the basic commands.
DC-Ultra - this license provides you with the following extras:
- Critical path re-synthesis
- BOA Behavioral optimization of Arithmetic
- BRT - Behavioral retiming synthesis, does not work with gated
clocks.
Some new commands "set_ungroup design_c false" (or true) to tell a
subdesign to be ungrouped or stay grouped. Use "compile -auto_ungroup"
to use this command.
"report_timing -attributes" gives the timing attributes on cells.
Synopsys Xilinx Libraries xcb_virtex and xdcs_virtex - Note that
xdcs_virtex results in bad logic for FC2. To get around this Coregen
was used to create multipliers. "hlo_resource_implementation =
constraint driven" - default "none" fastest.
"hdlin_dont_infer_mux_for_resource_sharing = true", "hdlin_infer_mux
= default" and "hdlin_mux_size_limit = 256" are used to infer muxes
rather than AND trees (last depends on the technology).
Steve Golson paper - Compared several compile modes in Synopsys. Turns
out that the best one was to use "set_simple_compile_mode true -verbose"
to enable it, then just do a compile.
Cliff Cummings paper - To ignore setup in synchronizers use the
following command "set_annotated_check 0 -setup -hold -from clk1 -to
xxx/D". In synchronizers you don't need to worry about setup, just
hold violations. Use naming conventions - append the clock name to
fast signals, this makes it easy to set constraints on them. Also
presented a nice section on how gray encoding works.
Timing closure talk - Floorplan manager didn't work due to the lumping
of delays in the SDF on input pins, (which is what most vendors do).
Used Formality and hand edits to fix the design. IPOFIX is a SNUG tool
that finds long nets.
Genove - VHDL timing tool, written in German
"set_proppagated_clock clk" - do this only on propagated clocks, saves
you from doing lots of constraints.
- [ Kenny, from South Park ]
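For readers who missed the gray encoding section mentioned in the notes
above, here is a minimal Python sketch of the usual binary-reflected Gray
code. It is not taken from Cliff's paper; the function names bin_to_gray
and gray_to_bin are made up for illustration. The property that matters
when passing a counter across clock domains is that consecutive values
differ in exactly one bit, so a synchronizer that samples mid-transition
sees either the old value or the new one, never a mix.

```python
def bin_to_gray(n):
    """Convert a non-negative binary count to binary-reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert Gray code back to binary by XOR-ing in every right shift."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 3
codes = [bin_to_gray(i) for i in range(1 << WIDTH)]
for i, g in enumerate(codes):
    print(f"{i}  {i:0{WIDTH}b}  {g:0{WIDTH}b}")

# Each step, including the wrap from the last count back to zero,
# flips exactly one bit.
steps = [bits_changed(codes[i], codes[(i + 1) % len(codes)])
         for i in range(len(codes))]
print(steps)
```

The wrap-around case only holds when the counter length is a power of two,
which is one reason pointer depths in such designs are usually kept to
powers of two.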
"Design Vision - new GUI with a lot of very nice features. Right now
it only works on Sun, not on Linux or NT. Linux support should be
available shortly. We can't use Design Analyzer as a fall back on
the other platforms because we only have a Design Vision license."
- Bill Lawrie of InfiniCon Systems
"Cool stuff in DC2000.11/Presto:
I'm really hot to trot for MC in DC. I've waited for the bugs to get
worked out, so I'm skipping straight to 2000.11-SP1 (I believe there
were some issues with 2000.05). I've used transform_csa extensively
in the past, so I'm familiar with those results. I've been exposed to
the use of Module Compiler for heavy datapath stuff and optimizing the
results in DC, so I'm familiar with those results as well. Combining
the two by using partition_dp on Verilog RTL is on the very top of my
hit list, I can't wait to get started.
In the Synthesis highlights in DC2000.11 session, I heard a user say
that sometimes transform_csa gives better results than partition_dp on
smaller paths. I wonder how this can be if partition_dp uses timing
and area constraints while transform_csa does not. I would love to hear
more from that user, please post some specifics on ESNUG! I also heard
that transform_csa will be obsoleted sometime in the future. I hope
this issue is resolved before then!
Another new goodie is the hdlin_use_syn_shifter variable to implement
shift functions with DesignWare instead of gates.
Case analysis is a welcome addition to DC. Of course, you don't want
to compile this way, but it would be great for analysis and debug.
Presto:
I attended the Presto session, and picked up some cool stuff. Presto
will allow the use of the infer_mux directive with if/else constructs.
I asked the presenter what about the Verilog conditional operator
construct (I've been pointing out for years to various designers that
the conditional operator will NOT infer a MUX). I could see the light
go on in the presenter's eyes as he said "yeah, we could make that
work." (John, it's awesome when Synopsys R&D staff members do
tutorials!) I hope to see this feature implemented in the future.
The variable hdlin_vrlg_std can be set to 1995 or 2000 to control the
support of Verilog 2000 constructs.
Cliff Cummings attended this session and was answering questions about
Verilog 2000. The combination of Cliff and an actual R&D person made
this a highly informative and worthwhile session.
Automated Chip Synthesis (ACS):
My presentation this year ("Have Your Cake And Eat It Too: How To
Compile For Area AND Timing") was kind of a part 2 to my presentation
two years ago ("Using MIN/MAX Compile In A Multi-pass Synthesis Flow").
Back then, the idea of compiling in 3 passes, not over constraining,
and fixing holds pre-layout seemed highly controversial. Based on the
Q&A afterwards, and also at the poster session, these ideas seem a bit
more acceptable now. The three pass idea has been adopted by ACS and
could actually become mainstream! I received some particular interest
in the way I use 'characterize' instead of budgeting to refine
constraints between compile passes, and how that affects DesignWare
implementation and area. Some individuals from Intel and some past
and present Synopsys folks involved with ACS were particularly
interested in this.
I've been keeping my eye on ACS and budgeting for a while now.
Budgeting was being developed when I was beta testing DC 9802, so I got
some early insight into it. When the first app note about budgeting
came out, it described a three pass flow almost identical to mine.
I'm not so sure RTL budgeting is such a good idea because of capacity
issues that Steve Golson's presentation pointed out. It may be better
just to stick with the 3 passes.
It was very cool to see Steve Golson's ACS area and timing results
agree with my 3 pass results. I still think I have built a better
mouse trap compared to ACS (no bias here, of course!) since I don't
have to recompile everything for a small RTL change, my capacity is
only limited by the largest compiled design I can read in for
characterization and incremental compiling, and I retain the ability
to pick the smallest implementations of DW that still meet timing."
- Bob Wiegand of NxtWave Communications
I put a copy of both of Bob's DC papers in DeepChip's "Downloads".
The Synopsys CAEs for Design Compiler also did something I thought was
really neat; they wrote down all the customer questions asked during the
DC 2000.11 update and they researched answers for all the questions.
Here's the DC 2000.11 Q&A.