( DAC'19 Item 9 ) ------------------------------------------------- [05/21/20]
Subject: SNPS DC/ICC2/Fusion vs. CDNS Genus/Innovus war is Best of 2019 #9
PROXY WAR: Waaay back in 2015 when Anirudh had just launched his assault
on Aart's Design Compiler monopoly, I mentioned how this was NOT a clean
Genus vs DC fight -- but a synthesis-is-married-to-PnR battle. (DAC'15 #6)
Then, in the "Best of 2017" report, the users made it clear that Genus, when
paired with Innovus PnR, was gaining a foothold as a credible threat to Aart's
Design Compiler monopoly... (See DAC'17 #4.)
Now, 5 years later, I get to publicly brag how my insights were soooo very
spot on! [gloating Cooley here]
It's Anirudh's Genus + Innovus vs. Aart's DC + ICC2 - or - Fusion Compiler
which is Shankar Krishnamoorthy's -- Aart's true Head of SNPS PD R&D -- attempt
to "fuse" DC to ICC2 under one database to do a straight RTL-to-GDSII flow that
locks the PD users into a SNPS-only world.
NOT WITHOUT A FIGHT: But if you look at the user comments, both the SNPS and
CDNS sides are having a mix of both "wins" and "losses".
Here's Cadence Genus + Innovus "winning":
"Correlation between the Cadence tools has greatly improved over the
last few years. We believe it's related to the common engines
they've been working on, including for Genus and Innovus. All our
tapeouts now are with Genus."
"We would have to take DC-Topo's netlist, throw away a lot of the
optimization, then optimize again in ICC2 PnR because it knew
actual loads and distance. Instead, we are using Cadence Genus
plus Innovus more and more (vs DC plus ICC2)."
"Genus and Innovus are 'thread-safe' across multiple CPUs, meaning
that we get 100% repeatable results. That is, when we launch a run
on different threads, we still get the same result. It's great
to not have to worry about run-to-run variations."
"We currently get better results from Cadence Innovus versus using
Synopsys Fusion Compiler. CDNS's results are typically 8-10% better
than SNPS in terms of timing and power, with about the same area."
Here's Cadence Genus + Innovus "losing":
"For IP such as microprocessors, Genus gets consistently 2-5% better
area results than Synopsys Design Compiler. But for some datapath
intensive IPs we have found CDNS Genus being out-performed by SNPS
DC-Topo. We do not have comparisons to DC-NXT. Innovus PnR cannot
recover the synthesis QOR difference between Genus and DC-Topo in
these cases."
"... and Cadence continues to make incremental improvements. Software
quality is sometimes a struggle as many releases seem to have
corner-case QC escapes."
And the user benchmark wars this year lean heavily in Anirudh's favor -- not
an absolute 100% "win" -- more of a Genus/Innovus is 70% ahead type thing.
User benchmarks DC-ICC2 vs Fusion Compiler vs Genus-Innovus flows
12 good and 4 bad switches in new Genus/Innovus/Tempus 19.1 flow
2nd Fusion Compiler vs. CDNS 19.1 benchmark plus 3 CDNS 19.1 bugs
AART NEEDS FUSION COMPILER CONVERTS: for Synopsys to be safe from Anirudh's
physical design attack, Aart desperately needs his big money customers to
wholeheartedly convert from being DC/ICC2 users over to being Fusion Compiler
users.
That's the only way he can once again lock in a monopoly-ish position like
what he used to have in the olde days with Design Compiler. But if you look
at the user comments below it's not clear that conversion is happening yet;
but there's obvious SNPS effort under way to make it happen.
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
QUESTION ASKED:
Q: "What were the 3 or 4 most INTERESTING specific EDA tools
you've seen this year? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
SYNOPSYS DC/ICC2/FUSION READER COMMENTS
Either going full Genus/Innovus or going Fusion Compiler this year.
I cannot tell you how it is going because our engineering mgmt changes
their mind after each visit from the SNPS or CDNS R&D folks.
---- ---- ---- ---- ---- ---- ----
We use Fusion Compiler.
It has the usual beta code bugs, but Shankar is on top of it.
---- ---- ---- ---- ---- ---- ----
We don't like that Fusion Compiler's db won't let us compare its
layout results against Innovus. But beyond that, FC is a step up
from using DC-NXT with ICC2. Fusion is significantly faster with
mostly similar results.
---- ---- ---- ---- ---- ---- ----
We're migrating from DC-NXT + ICC2 over to Fusion Compiler. The
Synopsys FAEs help only somewhat. We have 100s of man-years of
Tcl to migrate.
---- ---- ---- ---- ---- ---- ----
1. Cadence Protium
2. Synopsys Fusion Compiler
3. Mentor Calibre
---- ---- ---- ---- ---- ---- ----
After hearing that Intel guy talk, my manager wants us to give
Fusion Compiler a second closer look.
---- ---- ---- ---- ---- ---- ----
At 7nm. Done with DC + ICC2. Going straight Fusion Compiler.
---- ---- ---- ---- ---- ---- ----
still trying to get Fusion to work. ugh.
---- ---- ---- ---- ---- ---- ----
It only took Aart how many years to finally get one unified db
with Fusion Compiler???
---- ---- ---- ---- ---- ---- ----
Pardon the pun but Fusion Compiler is not ready for prime time.
---- ---- ---- ---- ---- ---- ----
We did some test runs of Fusion Compiler at 16FF.
It works. Mostly.
---- ---- ---- ---- ---- ---- ----
Fusion Compiler
---- ---- ---- ---- ---- ---- ----
We're happy with ICC2 and DC-NXT, thank you.
---- ---- ---- ---- ---- ---- ----
Our old timers still like their ancient DC scripts, but they're
begrudgingly switching over to Fusion Compiler.
They see the writing on the wall.
---- ---- ---- ---- ---- ---- ----
We're holdouts against Fusion Compiler. We know DC-NXT + ICC2 and
fear what new set of headaches a new tool will bring.
---- ---- ---- ---- ---- ---- ----
Our purchasing department thinks we can get a better deal with an
all-SNPS flow as compared to an all-CDNS flow.
We try to tell our upstairs that these are not interchangeable, but
they don't listen to us engineering pukes.
---- ---- ---- ---- ---- ---- ----
Dumped ICC + DC-Topo
ICC2 + DC-NXT work very well now
---- ---- ---- ---- ---- ---- ----
I don't think you give enough credit to DC-NXT, John.
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
CADENCE GENUS/INNOVUS READER COMMENTS
Cadence Genus + Innovus
Our engineers want their synthesis to understand PnR, so they aren't
forced to over-design. For example, if your synthesis tool knows PnR
will put two elements close together, it wouldn't have to add buffers,
or alternatively PnR wouldn't have to add drivers and delete unnecessary
buffers.
Our team used to use DC Topo for synthesis. Unfortunately, DC Topo had
a big weakness because it didn't have direct influence to drive ICC/ICC2
PnR with assumptions made during its optimization -- since Synopsys
bought Avanti PnR, they actually had a similar gap for their own SNPS
tools as they did with SNPS DC + CDNS Innovus.
We would have to take DC-Topo's netlist, throw away a lot of the
optimization, then optimize again in PnR because it knew actual loads
and distance.
Instead, we are using Cadence Genus + Innovus more and more (vs DC+ICC2).
---- ---- ---- ---- ---- ---- ----
We use Cadence Genus with Innovus.
Genus' performance scaling is good for our blocks, which are upwards of
2 million instances.
Genus and Innovus are "thread-safe" across multiple CPUs, meaning that
we get 100% repeatable results. That is, when we launch a run on
different threads, we still get the same result. It's great to not have
to worry about run-to-run variations.
WARNING: Cadence Genus' QoR is mixed.
For IP such as microprocessors, Genus gets consistently 2-5% better area
results than Synopsys Design Compiler.
But for some datapath intensive IPs we have found CDNS Genus being out-
performed by SNPS DC-Topo. We do not have comparisons to DC-NXT.
Innovus PnR cannot recover the synthesis QOR difference between Genus
and DC-Topo in these cases.
In those cases, we used DC-Topo to synthesize the block. The good news
is that Innovus works beautifully with that DC netlist.
Genus + Innovus has multiple synthesis modes:
1. full physical synthesis using Innovus as placement engine.
2. intermediate modes like spatial and hybrid modes use Genus
placement engine for faster turnaround time.
3. fully logical synthesis can be used for prototyping.
Once Anirudh's R&D gets this datapath synthesis problem fixed, we'll
go 100% Cadence. But until then, we need to keep DC around.
---- ---- ---- ---- ---- ---- ----
Cadence Genus
I've done some test runs on Cadence Genus. The runtime was good with
the multi-threading.
Cadence is integrating Genus and Innovus so that they share the same
engine and the correlation is good.
---- ---- ---- ---- ---- ---- ----
Cadence Genus
Genus is a solid tool. We view it as a mature solution, with no major
issues or development.
Our most challenging issue is actually test integration. We would
like to see more native Modus DFT + Genus integration.
---- ---- ---- ---- ---- ---- ----
Genus
With Genus, we're able to do synthesis runs overnight for most of our
designs. (16nm/14nm) The runtime is based on how tight our constraints
are -- such as frequency of operation and tightness of I/O paths.
Genus appears to have at least a 2x faster turnaround time compared to
its Cadence predecessor RTL Compiler. (It's hard to apply a specific
performance metric because Genus does a lot of physical stuff that
RTL Compiler never could do, making Genus + Innovus very powerful.)
Our blocks are 1M to 2M instances and we work with the minimum die area
possible. Generally, Genus synthesis is good for area efficiency.
We do our power analysis and performance analysis outside of Genus
through architecture considerations. That is, most of our PPA analysis
is determined outside of our tool flow, and we then use Genus/Innovus for
reporting rather than for decision-making.
We've stayed with Genus + Innovus instead of Genus Physical. This is
because for our blocks, the Genus + Innovus runtimes are the same, and
we've seen only marginal benefit from Genus Physical.
We have started using Genus Spatial lately. It has the advantages of
Genus Physical plus a fast runtime. It predicts congestion which we
would not have seen otherwise using pure logic synthesis.
The correlation between the Cadence tools has greatly improved over the
last few years. We believe it's related to the common engines they've
been working on, including for Genus and Innovus.
All our tapeouts now are with Genus. We're able to use it without
any problems, and at this point, we don't see the need to use any other
tool.
---- ---- ---- ---- ---- ---- ----
Cadence Genus
Cadence Genus has been solid for many years, and Cadence continues to
make incremental improvements. Software quality is sometimes a
struggle as many releases seem to have corner-case QC escapes.
Two years ago, Cadence made major scalability and turnaround time
improvements for large blocks. Since then, it has been good enough,
though our block sizes have also remained in the same range (up to 5M).
We are also happy with Genus' PPA. We used to be able to find gaps,
but Genus is getting increasingly better results than human designers.
We've been kicking the tires for Genus early physical.
- It's useful for certain workflows and users
- However, it requires flow changes, so the value for the pure
physical design engineer is not as high as for other
investments.
- It would have a higher value for IP than for the backend.
Looking forward:
One significant SW change that we need from Anirudh is for scan/ATPG test
to be more tightly coupled with Genus, for low DPPM (defective parts per
million) designs such as automotive.
We've been doing test integration after synthesis. But it needs to be
thought of earlier in flow, as changes are needed during synthesis; as
well as during place and route.
---- ---- ---- ---- ---- ---- ----
We use Genus plus Innovus to keep both our frontend and backend
guys happy.
---- ---- ---- ---- ---- ---- ----
I'd say Innovus/Genus correlation is my #1 for this year.
---- ---- ---- ---- ---- ---- ----
Genus plus Innovus is great.
Genus plus ICC2 is crap.
---- ---- ---- ---- ---- ---- ----
Two years ago we switched to Genus and haven't looked back.
---- ---- ---- ---- ---- ---- ----
We tried to do a Genus vs. DC-NXT vs. Fusion benchmark, but
couldn't get the Fusion data to break out cleanly.
---- ---- ---- ---- ---- ---- ----
Aart wants users to use Fusion Compiler + PrimeTime.
Anirudh wants users to use Genus + Innovus + Tempus.
Other than both companies accepting Calibre DRC/LVS, they're making
it difficult for us to do Best in Class point tools now.
---- ---- ---- ---- ---- ---- ----
Genus has now supplanted DC in the Cadence flows my company uses;
because it works so tightly with Innovus.
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
A CADENCE GENUS VS. RTL COMPILER USER BENCHMARK
We're a long time Cadence house.
We've used Cadence Genus for synthesizing with 14nm process nodes.
The largest block we have synthesized with Genus was:
- 2.8M instances, 50 clocks, 900 MHz max frequency
runtime: 12.5 hours; Memory: 30MB
We've used CDNS RTL Compiler for years and know its quirks. For us
the question is: "Should we switch over to Genus RTL synthesis?"
We ran a benchmark comparing Genus with RTL Compiler runtimes.
            synthesis    RTL Compiler     Genus
  block     instances    runtime (sec)    runtime (sec)   speed-up
  ------    ---------    -------------    -------------   --------
  Block1       332K          13,851            7,185        1.9x
  Block2       527K          38,675           25,909        1.5x
  Block3       722K          20,025           15,561        1.3x
  Block4     1,350K          46,141           24,696        1.9x
  Block5     1,971K          72,636           35,805        2.0x
Cadence RTL Compiler: v14.20
Cadence Genus: v18.11 (legacy UI mode)
Synthesis on SystemVerilog RTL was run on:
- Host: (x86_64 w/Linux 2.6.18-194.el5)
(6cores*12cpus*1physical CPU*Intel Xeon CPU X5670
@ 2.93GHz 12288KB) (296976936KB)
- OS: Red Hat Enterprise Linux Server release 5.5 (Tikanga)
- one machine with 8-CPUs (threads) was used.
We found up to a 2x runtime improvement running Genus with only *one*
machine with 8-CPUs (threads).
With *multiple* machines and CPUs, we have no doubt that Genus can
perform 3-5X better.
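
For readers who want to re-derive the speed-up column, here is a minimal
Python sketch. It assumes speed-up is simply RTL Compiler runtime divided
by Genus runtime; the numbers are copied from the runtime table above, and
the names (RUNTIMES, speedup, overall) are illustrative only. It covers
the measured single-machine, 8-thread runs; the 3-5X multi-machine figure
is the user's expectation and is not something this data can check.

```python
# Runtimes in seconds, copied from the runtime table above:
# block -> (RTL Compiler v14.20, Genus v18.11).
RUNTIMES = {
    "Block1": (13_851, 7_185),
    "Block2": (38_675, 25_909),
    "Block3": (20_025, 15_561),
    "Block4": (46_141, 24_696),
    "Block5": (72_636, 35_805),
}


def speedup(old_sec, new_sec):
    """Speed-up factor: how many times faster the new run is."""
    return old_sec / new_sec


# Per-block speed-up, rounded to one decimal like the table.
speedups = {
    name: round(speedup(rc, genus), 1)
    for name, (rc, genus) in RUNTIMES.items()
}

# Aggregate speed-up over all five blocks (total time / total time).
total_rc = sum(rc for rc, _ in RUNTIMES.values())
total_genus = sum(genus for _, genus in RUNTIMES.values())
overall = speedup(total_rc, total_genus)

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
print(f"All blocks combined: {overall:.2f}x")
```

The per-block factors land between 1.3x and 2.0x, so "2x" is the top of
the measured range rather than the typical case.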
GENUS VS. RTL COMPILER PPA
With Genus, we are seeing the area reduction generally increase with
block size. The area reduction ranged from 20% to 37%.
            synthesis    RTL Compiler     Genus
  block     instances    area (um2)       area (um2)      delta
  ------    ---------    -------------    ----------      ------
  Block1       332K        2,220,825       1,772,984       -20%
  Block2       527K        2,887,113       2,045,311       -29%
  Block3       722K        3,216,537       2,227,261       -31%
  Block4     1,350K        5,386,935       3,395,812       -37%
  Block5     1,971K        8,929,858       5,985,940       -33%
We have not tried many of the new features in Genus, but the up-to-2X
improvement in runtime and 20% to 37% area reductions definitely keep
us from switching to Synopsys Design Compiler variants on our current
or future designs.
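
The area deltas in the PPA table above can be re-derived the same way. A
minimal Python sketch, assuming delta is (Genus area - RTL Compiler area)
divided by RTL Compiler area; the areas are copied from the table, and the
names (AREAS, area_delta_pct) are illustrative only.

```python
# Areas in um2, copied from the PPA table above:
# block -> (RTL Compiler area, Genus area).
AREAS = {
    "Block1": (2_220_825, 1_772_984),
    "Block2": (2_887_113, 2_045_311),
    "Block3": (3_216_537, 2_227_261),
    "Block4": (5_386_935, 3_395_812),
    "Block5": (8_929_858, 5_985_940),
}


def area_delta_pct(old_area, new_area):
    """Percent change in area; negative means the new result is smaller."""
    return 100.0 * (new_area - old_area) / old_area


# Per-block delta, rounded to a whole percent like the table.
deltas = {
    name: round(area_delta_pct(rc, genus))
    for name, (rc, genus) in AREAS.items()
}

for name, pct in deltas.items():
    print(f"{name}: {pct}%")
```

Note that the largest block (Block5, -33%) shows a smaller reduction than
Block4 (-37%), so the trend with block size is general rather than strict.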
---- ---- ---- ---- ---- ---- ----
Related Articles
2nd Fusion Compiler vs. CDNS 19.1 benchmark plus 3 CDNS 19.1 bugs
12 good and 4 bad switches in new Genus/Innovus/Tempus 19.1 flow
User benchmarks DC-ICC2 vs Fusion Compiler vs Genus-Innovus flows
Genus RTL synthesis gaining traction vs. DC is #4 of Best of 2017
Costello on SNPS PnR "still in catch up mode" in 2 years from now
Synopsys layoffs means ICC2 rewrite is unknown for 3 to 4 years out
Engineering comments point to SNPS vs. CDNS PNR shakeout at Apple