( DAC'18 Item 4b ) ------------------------------------------------ [01/23/19]
Subject: Cadence Innovus dominates Synopsys ICC/ICC2 is #4b "Best of 2018"
EXECUTIVE SUMMARY: here's the status of the 7nm digital PnR market in 2018
What follows below is the backstory and user comments proving it. -John
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
AART CORRECTS HIS 2ND PNR MISTAKE: Aart's first PnR mistake was prematurely
launching his unready IC Compiler 2 tool at SNUG'14 as a way to counter the
early momentum that Anirudh was getting with his brand new Innovus PnR tool.
ESNUG 537 #10 & ESNUG 538 #1 give the Day 1 details of what ICC2 had and
what it lacked (placer not working, no MCMM, old router, old CTS, no unified
database) on its SNUG'14 launch. (As I said, it was premature.)
And from 2014 to 2018, IC Compiler II just went downhill from there.
Readers on ICC II, ATOP, CDNS EDI, upcharges, Z-Route, 24 months
User benchmarks new CDNS Innovus vs. SNPS ICC/ICC2 workaround
ICC2 patch rev, Innovus penetration, and the 10nm layout problem
Qualcomm, Nvidia, ST join as Innovus users in 2Q15 earning call
AART'S 2ND MISTAKE: So instead of just righteously chewing out his highly
experienced 20 year EDA R&D veterans like Antun Domic and Michael Jackson,
in 2016 Aart instead "promoted" Antun and Michael out of PnR R&D, while also
gutting 200 of his senior PnR R&D staff...
"I don't know what the exact numbers are, but Synopsys laid off
200 people between 55-60 years old. So, they cleaned house of
everybody in the prior generation."
- Jim Hogan of Vista Ventures
DAC'17 Troublemakers Panel in Austin, TX
... and then (Aart's 2nd mistake) he put Sassine Ghazi, the former VP of
North American Sales, to be the head of all SNPS PnR R&D! (Gadfly 161201)
As an 11 year sales guy, Sassine is 100% all aces when it comes to managing
customer relationships, etc. That's what sales guys do. But putting a
sales guy in charge of all your long term PnR R&D decisions -- even though
he's loyal and a nice guy -- is a very stupid move.
EDA R&D nerds are super sharp guys who will come up with 40 different -- yet
very clever -- ways to solve any specific EDA problem. And these EDA R&D
nerds will get into bitter internal wars over which one of their 40 different
particular solutions is the one "best" way to fix a chip design SW problem.
Having Sassine, an 11 year sales guy, adjudicating deeply involved PnR R&D
quandaries is like having me deciding the future of cancer research. I hate
how cancer is killing off my friends. I can parrot cancer research stuff
I've heard works and doesn't work. But if I'm the one making the cancer
research decisions, I'm vulnerable to just following what the most recent
cancer R&D guy tells me "because what he said sounds right to me right now."
How to fix this? Roughly 7 months after appointing Sassine to publicly
"lead" SNPS PnR R&D, in August 2017 you very quietly bring in a real 25 year
EDA PnR R&D veteran, Shankar Krishnamoorthy, who's not an EDA sales guy to
do all the actual PnR R&D heavy lifting. Shankar is the SNPS Dick Cheney
to make Sassine's SNPS George W. Bush look good.
Smart move, Aart. But now it begs the question "is waiting till August
2017 (16 months ago) to get a competent leader to manage a *total*
restaffing of your ICC2 R&D enough to catch up with Anirudh's Innovus PnR
dominance?"
"Because we forget sometimes, these are really hard problems and when
you actually pull together a team, you know a large team, with the
diverse skills to actually solve these problems -- you know it's not
quite a miracle -- but it's a work of art to make that happen.
And it's not a guarantee that you're going to be able to do it. So I
hope for Aart's company that he's able to do that -- but it's not a
guarantee that he can make it happen.
The second thing is I doubt there will be that much trouble because I
don't think there's a real phase shift going on at that time, [2 years
from now] too, so [Aart] will still be in catch up mode.
I mean that's the cycle that goes on..."
- Joe Costello, former CEO of Cadence
DAC'18 Troublemakers Panel in San Francisco
---- ---- ---- ---- ---- ---- ----
COOL STORY BRO, BUT ABOUT THAT 7NM CLAIM... Do you recall last year's Best
of 2017 when Cadence Genus plus Innovus got more traction against Synopsys
DC plus ICC2 due to CDNS's tight integration at the database level?
"What makes this interesting is since Innovus has been eating ICC2's
lunch in PnR over the past few years -- we might now be seeing a
tipping point happening in the RTL synthesis market, too. From the
comments below, it appears that Cadence Genus RTL, when paired with
Innovus, is now a credible threat to Aart's Design Compiler monopoly."
- from http://www.DeepChip.com/items/dac17-04.html
Well apparently that was a major clue as to what was going to happen at 7nm.
Users doing 7nm designs came out of the woodwork this year. Not only did
they say Cadence is better than Synopsys at 7nm, they are also saying that
great P&R isn't enough at 7nm -- you need integrated tools, also.
On top of that, Cadence users' volume got louder. Cadence tool word count
this year in the survey was up by 116%.
Cadence User word count on Innovus + Genus + Tempus
2017 : :################ 1566 words
2018 : :################################## 3387
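As a quick sanity check on that 116% figure, here is a minimal Python sketch. The helper name `percent_increase` is made up for illustration; the only inputs are the two word counts from the bar chart above.

```python
def percent_increase(old: int, new: int) -> float:
    """Return the percentage growth going from old to new."""
    return (new - old) / old * 100.0


words_2017 = 1566  # Innovus + Genus + Tempus user word count, 2017 survey
words_2018 = 3387  # same three tools, 2018 survey

growth = percent_increase(words_2017, words_2018)
print(f"Cadence user word count grew by {growth:.0f}%")
```

Put another way, the 2018 comments ran a bit more than twice the length of the 2017 ones.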
---- ---- ---- ---- ---- ---- ----
IT'S EASIER TO TRACK WHERE INNOVUS IS NOT: to get a sense of where Innovus
has deeply penetrated, I'm just going to red XXXX out where Innovus is NOT
significant in this old Top 20 semi listing.
Last I looked, Intel, which makes up 11% of all SNPS revenues, is still very
much an ICC2 house (partially due to Sassine's leadership as VP of Sales)
along with Samsung being a mostly ICC2 house. (Sorry, Anirudh.)
On the flip side, Qualcomm, Broadcom, Apple, NXP, Nvidia, TSMC, MediaTek,
GlobalFoundries, TI, and even ARM are all big Innovus users; most either at
7nm or 5nm now. What's embarrassing here is 3 years ago practically all
of those logos were in Aart's ICC camp. (Sorry, Aart.)
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
QUESTION ASKED:
Q: "What were the 3 or 4 most INTERESTING specific EDA tools
you've seen this year? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
We are doing 7nm designs and find Innovus leads for PnR, especially
at 7nm.
(We are a services company, so we also have experience with ICC2.)
Innovus does 5M to 10M instance blocks easily. Its key advantages
are speed and accuracy. We also bring in ARM IP, and Cadence's flow
for that is superior (better PPA) against ICC2.
Innovus has good PPA for our tough blocks like CPU, GPU, modem, and
overall SoC control logic. What we like:
- for most blocks, Innovus has 3-5X faster TAT than ICC2
- massively parallel architecture (shout out to Anirudh!)
- Innovus scales. We see near-linear performance (nice!)
- We like GigaPlace and GigaOpt
- We can do tight mixed-signal with Innovus + Virtuoso.
At 7 nm, integration is perhaps even more important than Innovus' point
tool performance. PnR alone is not enough for successful 7nm silicon.
You need your synthesis and signoff tools (e.g. Genus and Tempus) to be
tightly integrated with your P&R tool because of 7nm physics.
Cadence does this well -- in particular, we see a lot of benefit from
its integrated physical synthesis, timing aware ECO, and EMIR. This
integration sets Cadence apart from Synopsys and Avatar (ATOP).
In terms of 7nm scan and 7nm DRC/LVS interoperability, we have used
the Cadence flow with Mentor Tessent and Calibre DRC -- and they work
really well together.
---- ---- ---- ---- ---- ---- ----
We've used Cadence Innovus for close to a year now.
We used to have a full Synopsys digital flow (DC, ICC2, Primetime).
Then I heard from my colleagues that Cadence was gaining share in place
and route -- and that all of the A-list chip design companies have fully
bought into the overall CDNS full-flow ecosystem.
This opened the door with my MGMT to our looking at Innovus. We always
push our designs hard, and needed a flow with good PPA. My MGMT wanted
to see if Cadence would take us there. In particular, Innovus' reported
performance with Genus RTL synthesis vs DC-Graphical in ESNUG 582 #1 was
one of the things that attracted us.
We are currently using Innovus for production work at a 7 nm process
node. Here is my feedback.
Performance - Innovus vs. IC Compiler II
- Overall, for the same timing results, we've been getting a 2X
runtime performance speed up with Innovus over ICC2.
- Our largest blocks have ~2 million instances.
- We use Cadence's multi-threading -- we use 8 cores for most of
our design work, and 16 cores for our larger blocks. In the
future we will look at using Cadence's multi-distribution
capabilities. But right now we can't speak to it.
Timing - Primetime + Tempus
Innovus' timing results have met our overall needs. We use Primetime
for our signoff environment; we've found there is a good correlation
between Innovus and Primetime. Also impressive is the ability to SPICE
a path when using Tempus. Seeing the timing reports with Tempus and
SPICE timing results interleaved makes finding the outliers very easy.
Synopsys Design Compiler compatibility
- We compared DC + ICC2 with DC + Innovus. The PPA is similar
but Innovus is faster.
- Innovus can read the DC Verilog netlist and SDC constraints
(which are industry standard formats.)
- Someday we hope to test Genus RTL synth + Innovus to see the
impact of the tighter integration. For now, just folding
Innovus into our flow is a big enough step for us. So, we
are sticking to Innovus with Synopsys Design Compiler.
DRC - Calibre with Innovus vs. ICC2
We use Calibre for DRC signoff, and there is some integration of Calibre
back into the Cadence GUI. One of the things I've been impressed with
for Cadence is that the DRC violation count for our blocks is always lower
with Innovus vs ICC2. This better DRC result is compelling as it will
reduce our ECO iterations -- not all iterations are timing-based.
Power - Innovus vs. ICC2
We see similar results, with a small edge to Cadence on clock power.
Utilization
Innovus utilization numbers are within 2% of the ICC2 results.
GigaPlace global placement
We are using Innovus' GigaPlace as part of our flow. GigaPlace does the
initial global placement, which is an especially important first
step for 7nm implementations. If the global placement is bad, it's
game over.
Things are extremely sensitive at 7nm, e.g. moving large cells later
in the flow can cause quite a bit of disturbance which can be hard to
recover from, so getting the initial global placement to honor all the
7nm timing requirements will go a long way to avoiding these bumps.
Overall we see good correlation of TNS as the Innovus flow proceeds.
CDNS Project Virtus - IR-Aware Placement
There are many areas of voltage drop in a power grid that can impact
performance. Innovus can strengthen the power grid or move cells out
without impacting your grid. It ensures that routing tracks are
available and that your performance is not degraded.
We continue to push Cadence R&D for more technology advancements;
especially as we expect to move to designing at 5nm in 2019.
So far, we are satisfied. Cadence's digital tools are moving in the
direction Anirudh wants to take them. (See ESNUG 584 #3)
For us, Innovus' most differentiating feature today is the 2X
runtime improvement, as well as great repeatability from run to run.
---- ---- ---- ---- ---- ---- ----
Cadence Innovus
We do services, so we use both Cadence Innovus and Synopsys ICC2 for
PnR, depending on what is best for our customers' situation. We
constantly compare tools so we can use the ones that get us the best
possible result for our customers.
When our customers let us choose the PnR tool, we've recently been
choosing Cadence Innovus (with Genus & Tempus during PnR). Below
is my feedback.
Innovus Speed up & Scalability
Our designs have gotten 1.5X-2X bigger over the last year or so, and
I've noticed the speed up for Innovus.
- We break up our designs hierarchically. Our maximum target
block size is 1.5M-2.0M instances, so that we can better manage
it and have nightly turnaround. The more you break up your
design, the more overhead -- so we must balance the
runtime/turnaround tradeoff.
- Innovus can still do them overnight.
We use both the multi-CPU & multi-threaded distributed approaches with
Innovus. 8 to 16 CPUs is typical for us, and this hasn't changed over
the past year. I always want to get results faster, and have been happy
with the improved performance we've gotten over the past year -- without
having to add more machines.
Innovus scales well. In particular, Innovus routing scales well. Its
placement does also, though to a lesser degree. Those are the two areas
where we most take advantage of the distributed architecture.
PPA / QUALITY-OF-RESULTS
We've also seen improvements in Innovus' quality. We do nightly runs to
close our designs for sign-off. To minimize over-designing, we use
smaller margins during PnR optimization and spend more time at sign-off
to close a design with more accurate data. It takes us months to
iterate through the loops to meet our specs.
- Better quality-of-results for each run means fewer iterations
for us to meet our specs.
- It used to take months of runs to get closure. Now, in
addition to the Innovus runs taking the same time for larger
blocks, its quality-of-results is also better.
- Equivalently it takes us only about 50% as many runs to
converge with Innovus. So now we can do bigger designs in
the same timeframe.
- We are currently experiencing faster convergence with Innovus
than ICC2.
Cadence has a QoR edge in great part because of Genus + Innovus. The
tools drive each other in a closed loop, so we get better results faster.
(See ESNUG 582 #1) I've also heard there are big improvements in
GigaPlace and GigaOpt. We haven't done a test case yet, but hope to do
so after our next tapeout.
TIMING CORRELATION
Tempus for STA has greatly improved over the last several years. We
use it during our RTL and PnR design for maximum efficiency, and then
do final golden signoff with Primetime.
Tempus correlates with PnR pretty well. In particular, we now get good
correlation for our corners, so we don't get surprises. Synopsys
Primetime used to have this same problem with their PnR tools too,
but they fixed it and also do a better job with correlation now.
INNOVUS AT 7NM
We've been doing designs at 10nm for the past 2 years, and are pushing
to 7nm next. At 7nm, most problems are in routing and timing closure.
We did a test chip at 7nm with one customer early this year. We needed
to figure out:
- Would a Cadence Innovus flow actually work at 7nm?
- Which elements (analysis, tuning) would we need to add
to Innovus to avoid any surprises?
- Is the 7nm node itself ready for production?
Our conclusion was that Cadence is ready for 7nm, as is the process
node at TSMC. We feel we are now ready to use both for a production
project.
We looked at both Genus + Innovus and DC + ICC2. Our official choice
is Cadence Genus + Innovus (+ Tempus during PnR), plus we will signoff
with Mentor Calibre and Synopsys Primetime.
I'd recommend Cadence if you have a choice. The momentum is there.
They seem to be gaining share over Synopsys. About 3-4 years ago, we
started noticing customers talk about Cadence more, and it's continued
to increase over the last couple of years.
---- ---- ---- ---- ---- ---- ----
We've used Cadence Innovus for three years. We've worked with it in
7nm, 10nm, and 14nm, and my group has had tapeouts & silicon in 7nm,
10nm, and 28nm (ICC2) and 40nm (ICC).
With Innovus, we've had no dead chips -- a 100% success rate with
very strong silicon correlation across the board.
We do relatively small blocks (<500k gates), but our blocks are very
complex blocks with 10's of clock domains running at multi-GHz. We've
closed timing on paths as fast as 8 GHz with Innovus.
Innovus' PPA tends to be ~5% better than ICC2. From what we see, this
seems to be a result of:
- Cadence's end-to-end integration of data in various SoC design
steps, e.g. the full integration from logic synthesis (Genus)
all the way up to timing sign off with Tempus and reliability
with Voltus
- Cadence's concurrent clock optimization, CCopt.
Below are some examples of how Innovus' common engines with the other
Cadence tools help TAT and/or PPA.
1. Genus RTL + Innovus
Having Genus + Innovus integration significantly accelerated
our flow development since the majority of commands are shared
across Genus & Innovus.
We have previously used the DC synthesis + Innovus combination
from two different vendors without data integration, and it
created extra flow maintenance overhead. Worked, but awkward.
2. Tempus + Innovus
Same as with Genus RTL. The ability to share commands and
pass extraction results from Innovus to Tempus saves us manual
and/or scripted steps that would be required otherwise.
3. Innovus + Voltus
Integration for power grid optimization and Power ECO is very
useful. It's very important to develop a proper power plan
early in the SoC design flow process. Foundry recommendations
are generally either too conservative or too barebones.
We run sample PnR with Innovus and then do IR analysis with
Voltus. A proper power methodology requires many iterations
of running PnR + EMIR analysis, so the Innovus + Voltus
integration saves us many manual steps of handing off
data from PnR to EMIR. We like Anirudh's Project Virtus idea
at DAC (see ESNUG 584 #3) but we haven't tried it yet.
We like that Anirudh has integrated Innovus + Virtuoso -- but we have
not yet used it. We'll definitely be exploring it in the coming months,
as it would (hopefully) remove a couple manual stages from our overall
design flow.
RUNTIME COMPARISONS -- 3X OR 20% FASTER?
It's simple to run Innovus in single or multi-thread mode from a single
command. We typically only run one or two threads, so I don't have data
on the scalability beyond that. We've heard from industry colleagues
that Innovus' turnaround time (TAT) is 2-3X faster than Synopsys ICC2;
but in our direct experience, Innovus' TAT is only about 20% faster.
Since TAT is a function of the design size, complexity and how hard we
push utilization and timing paths, our lower number might be because our
designs are bigger and more complex than an average design.
Innovus has an advantage in advanced FinFET nodes. This is based on my
conversations with 3 large high performance SoC teams that do 400+ mm2
chips for very high-performance applications committed to 5nm. These
guys are pushing the performance envelope.
Based on our interactions with customers and partners, Innovus has been
gaining steam since 2015, and their advantage has grown in 7nm and 5nm.
ICC2 continues to do well because Aart's historically had this market
and some of the large customers have millions of lines of Synopsys PnR
flows written that they would not let go of anytime soon.
However, the 3 big customers that we know who are now designing at
5nm are all using Innovus.
---- ---- ---- ---- ---- ---- ----
Cadence Innovus
We are currently using Innovus for applications such as automotive at
28nm and below process nodes.
1. We've seen significant improvements in Innovus'
"through-the-flow" run times over the past 1-2 years
- We typically run up to 3-5M instance blocks
- Innovus' turnaround time is in the 2-3 day range for
such blocks.
2. We are able to achieve almost out-of-the-box clean designs --
including timing and DRC on designs at 16nm.
3. Our most significant challenge is early blocks and getting
critical feedback before the blocks are in a good state for
final closure. This is a general problem with physical
design tools -- we need them to be more resilient to early
blocks with constraint issues, deep critical paths, etc.
4. Some parts in the flow -- notably routing and CTS -- have good
multi-CPU efficiency. (Other areas are not as efficient).
We generally don't run more than 8-12 CPUs, even though
Cadence advertises higher; this is due to CPU usage
efficiency, or lack thereof.
Qualitatively, Innovus' robustness has improved and its
quality seems to be better. It also appears to be very
strong in the market.
---- ---- ---- ---- ---- ---- ----
Cadence Innovus is gaining market share in place and route.
- We run 4-5M instances in 28nm and 16nm with a turnaround time
of about 5 days -- from floorplan to post route hold.
- Our largest blocks were 5.5M instances in 28nm technology.
- Most of our blocks are 1-2M instances, with a turnaround time
of 2 days, and 3-4M instances with turnaround time of 5 days.
- We have used multi-CPU options with 8, 12 and 16 core machines.
We've run Innovus for 16nm with a smaller number of metal layers, where
the density varied from 50 to 75% depending on the block congestion.
We've used Innovus' GigaPlace option "place_opt_design" successfully on
28nm and 16nm technologies. Sometimes the module placement is not very
intuitive, in which case we need to split placement from optimization.
GigaPlace tends to cluster placement and not spread it uniformly, which
sometimes leads to congestion during hold fix phase.
Looking forward, I expect machine learning in place and route will help
to estimate run time, number of cores from previous jobs, as well as be
able to make decisions regarding cores and licensing; which jobs to run
first. (I already know a couple of companies that are already using
machine learning concepts to predict tape-out.)
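A minimal sketch of the "estimate run time from previous jobs" idea, assuming nothing fancier than a straight-line least-squares fit (a stand-in for real machine learning, not any vendor's method). The three data points are only midpoints of the block sizes and turnaround times quoted above (1-2M instances in 2 days, 3-4M in 5 days, 4-5M in about 5 days); the names `fit_line` and `predict_days` are hypothetical, and a usable predictor would need many more jobs plus features such as core count, node, and utilization.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_days(instances_m, slope, intercept):
    """Estimate turnaround in days for a block of instances_m million instances."""
    return slope * instances_m + intercept


# (block size in millions of instances, turnaround in days)
# Illustrative midpoints of the ranges quoted above, not measured job logs.
past_jobs = [(1.5, 2.0), (3.5, 5.0), (4.5, 5.0)]

sizes = [size for size, _ in past_jobs]
days = [d for _, d in past_jobs]
slope, intercept = fit_line(sizes, days)

estimate = predict_days(2.5, slope, intercept)
print(f"Estimated turnaround for a 2.5M instance block: {estimate:.1f} days")
```

The same fitted numbers could then feed a scheduler that decides which jobs to launch first, which is the licensing and core-allocation decision described above.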
Cadence's most important improvement for Innovus was the Innovus-to-
Tempus correlation and the use of automatic ECO. The combination has
cut down our timing closure cycle.
---- ---- ---- ---- ---- ---- ----
Cadence Innovus
The fundamentals clearly favor Cadence for implementation and sign-off
because of its common database. Of course, Cadence has put incredible
work into the algorithms and the interfaces, but it is the common
database that allows all this integration.
- I think the industry learned when Anirudh first released his
SPICE tool while with Magma that he is the master of combining
high accuracy algorithms with best available distributed
multiprocessing.
- Nanometer design is impossible if the implementation tools
(synthesis, placement, routing) and sign-off analysis software
(timing, power) cannot pass information seamlessly (same
database).
With these fundamentals, it is no surprise that Cadence now is the clear
leader in implementation and a fully accepted sign-off option for design
teams and foundries alike.
- We now have the best possible flow for nm design -- especially
designs under 10nm. We use Innovus as our implementation
tool, with its specific features (GigaOpt, GigaPlace,
integration with Virtuoso, ...) and signoff tools (Tempus and
now Virtus).
- It is hard to substantiate all rumors, but I am not aware of
any sub-10nm designs being implemented with any design flow
other than Innovus.
Only Cadence has this must-have combination today:
- The full and intelligent integration of the Cadence
implementation and sign-off flow is crucial to deep nanometer
design. Genus is a perfect example of this integration. The
development team for Genus has now delivered a very capable
synthesis capability -- again, its integration with the rest of
the flow is its most important "feature".
- The fully distributed multiprocessor capability which
facilitates the needed capacity for today's SOCs.
We will continue to expect small feature enhancements throughout the
entire flow when possible (and larger breakthroughs when required), but
in today's designs, I believe the overall flow/integration is much more
important than small advantages of one point tool over another.
---- ---- ---- ---- ---- ---- ----
I am personally inclined to buy only state-of-the-art software from
Cadence. I was a user of Nanoroute which is the router that got bought
by Cadence, and to this day it has set the standard for being the best
and fastest router.
I have historically thrown out all software from Synopsys, and have been
more partial to Cadence because they believe in win-win. Synopsys
doesn't care; they are a monopolistic company.
---- ---- ---- ---- ---- ---- ----
Cadence Innovus
We use Cadence Innovus for AI and IoT applications.
Cadence's fully distributed architecture (multi-CPU & multi-threading)
worked well for us. Our turnaround time with Innovus -- with the
parallel architecture -- is ~3X faster; our turnaround time dropped from
3 days to only 1 day. For our smaller designs, Innovus is about 1.5x faster
than their old Encounter.
The largest block we have done is 1 million instances. Cadence's
GigaPlace and GigaOpt improve Innovus performance by about 15%.
Cadence has now integrated Innovus with Virtuoso. However, we've not
found a suitable case for this feature; for now it's better for us to
continue to partition and handle separately.
The LVF support in Innovus and Tempus is particularly useful for Near
Threshold Voltage (NTV) design success. The timing and accuracy
problems at advanced nodes are replicated at NT voltages. We taped out
a 55nm NTV chip and the measured performance matched our simulations
very well (within 5%). We have not observed any timing violations in
silicon to date.
We also had a very complicated power domain structure, and used CPF to
implement and verify in Innovus and Tempus. Our silicon worked well
with no problems, so we were very pleased with the CPF macro-model
support features, which were key to our design success.
---- ---- ---- ---- ---- ---- ----
Innovus outperforms ICC.
We have both tools and see faster runtimes for Innovus due to how
Innovus leverages parallel architecture.
- Innovus does multi-threading using multiple cores from the
same CPU sharing the same memory
- It also does multi-CPU
- The scaling isn't quite linear, but close.
Synopsys does some multi-core but it's not "massively parallel".
Our primary focus is power and performance. Both products can come up
with good solutions. Cadence comes up with it faster, and perhaps a bit
better, but we haven't tried to quantify it. Our needs for placement
quality are mostly geared to IP performance, and GigaPlace does a good
job there.
We've been using Innovus for 3 years now. My sense is that Cadence is
in the lead in place and route - they've now passed Synopsys ICC, and
Synopsys ICC2 has not caught up.
Innovus -- and actually all of Anirudh's tools -- are architected to
leverage distributed computing, e.g. multi-core and distribution. It
makes them suited for the future.
---- ---- ---- ---- ---- ---- ----
Related Articles
Real Intent smacks Synopsys CDC & RDC signoff as #3 "Best of 2018"
Avatar/AtopTech's big comeback in digital PnR is #4a "Best of 2018"
Cadence Innovus dominates Synopsys ICC/ICC2 is #4b "Best of 2018"