( DAC'17 Item 8 ) ------------------------------------------------- [03/06/18]
Subject: Badru's Calypto Catapult HLS and PowerPro get #8 for Best of 2017
BADRU'S DOUBLEHEADER: Roughly 3 years ago, Badru Agarwala became the new GM
of Calypto after it was reabsorbed back into the Mentor (now Siemens)
mothership. It was a fun news scoop in what was then the "high drama" world
of C/C++/SystemC based chip design from 2011 to 2015.
2015: SCOOP! -- rumor is Calypto is merging back into Mentor Graphics!
2015: 46 readers on Calypto, Gary Smith, Veloce, Ansys, SNPS C Compiler
2011: Carl Icahn accountability forced MENT to trade away Catapult C
2011: Mentor rumored to have traded its Catapult C division for Calypto
2011: Calypto finally spells out the details behind Catapult C merger
It was good drama that involved Carl Icahn (when he tried that hostile
takeover of MENT) forcing Mentor to sell Catapult C over to Calypto -- only
to then 4 years later have Calypto itself merge back into mama Mentor! WHOA!
Since then, Forte went to Cadence, and after some initial protest drifted
off into obscurity.
2014: Cadence to acquire Forte Cynthesizer at a rumored fire sale price
2014: Brett lectures on "Cooley facts" vs. real facts on Forte/CDNS
The last user mention of Forte Cynthesizer (now "Stratus HLS") was in 2015.
"... the direction Calypto should take now that both Cadence and
Synopsys are both so weak in the High Level Synthesis space."
- Badru Agarwala, Mentor Calypto (ESNUG 560 #5)
FAST FORWARD TO CALYPTO NOW: It's now 3 years later and judging from the
user "Best of" comments Badru has a pair of pocket aces with Calypto HLS
for the C/C++/SystemC-to-RTL synthesis space -- and Calypto PowerPro for
the RTL power optimization business.
Or said another way: I got a ton of Calypto HLS user comments, but nothing
on its rival Stratus HLS. And a ton of Calypto PowerPro user comments, but
nothing on its rival Ansys PowerArtist or Synopsys Spyglass Power tools.
"Eighty percent of success is showing up."
- Woody Allen, American actor (1935 - present)
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
QUESTION ASKED:
Q: "What were the 3 or 4 most INTERESTING specific EDA tools
you've seen this year? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
We get killer TTM with Catapult. Our business is all about being
first to market.
---- ---- ---- ---- ---- ---- ----
Mentor Catapult is a great tool.
Its functionality is amazing, knowing you now can generate RTL instead
of handwriting it.
I work in the video field -- that application domain is a good match for
the tool.
Writing hardware in the C language was new to me.
- Catapult and the HLS methodology had a steep learning curve.
- But concepts such as schedule, latency and throughput are common
knowledge for digital people
Other comments:
- The GUI is convenient. All the information that you need is
shown.
- The schedule of the hardware generated is helpful.
- The generated hardware schematic is also good.
- The script interface and the Tcl comments are very helpful to
set up your script for automating the build.
The tool is not yet as mature as you might expect in the industry.
Error messages are not always clear, so it's not always clear what the
problem is in the design.
Good support is essential, and that is what we get. There are always
things to improve, and we have close contact with Mentor FAEs to
discuss our difficulties.
And that works fine.
I really think HLS is the future.
---- ---- ---- ---- ---- ---- ----
Mentor Catapult HLS -- does C++/SystemC to RTL synthesis
- Catapult lets you create and test different architectures to
find the best one. This would be one of the main reasons for us
to add it to our design environment.
- Automating the coverage closure using Questa CoverCheck is a
useful feature.
- Catapult's power optimization enables users to make smart power
decisions. It would be great to be able to do power optimization
early in the design.
- Catapult's integration with PowerPro power reduction, and SLEC
equivalency checking would be a good benefit if the customers
have SLEC for equivalency check. (Although customers who use
Cadence's Conformal LEC or Synopsys' Formality as a signoff LEC
tool are likely unwilling to replace their signoff tool with
SLEC.)
---- ---- ---- ---- ---- ---- ----
This is what I learned about Catapult HLS at DAC. (I'm not a user)
- Mentor was promoting using Catapult HLS to make sure your actual
HDL was compliant with certain code standards.
- If you use it, your synthesized RTL will follow the codes, e.g.
using certain variable names and formatting it a certain way.
- Also, using Catapult HLS lets folks without an HDL background
come up to speed more quickly.
Mentor also showed some design examples. I asked about timing
considerations -- they said the tool itself doesn't provide timing, but you
can do timing analysis after you synthesize.
Catapult looked useful.
---- ---- ---- ---- ---- ---- ----
Mentor Catapult HLS
I will have to say that, in general the future of hardware design seems
to be going to high-level design. Many of the EDA tool providers have
developed solutions for higher level design.
My focus was on Mentor Catapult HLS. I participated in multiple
meetings and presentations at Mentor's booth. The events were all
very well organized.
The most valuable part for me was the presentations from companies that
had hands-on experience with Catapult.
Catapult has a big advantage in reducing design and verification time.
The time saved on porting the design from one technology to another can
overcome the gate count increase compared to hand written code.
Mentor's customer support is remarkable, i.e. their level of engagement
and understanding.
---- ---- ---- ---- ---- ---- ----
Calypto Catapult HLS
My Catapult use case was compiling a DSL (higher-level language than
C++/SystemC) into HLS C that is supported by Catapult.
One of Catapult's key advantages is being able to create a larger number
of targets (Xilinx, Altera, as well as ASIC). Using Catapult, I was
able to create and compare designs across the space of FPGAs as well as
ASIC designs.
Catapult also let me get a quick analysis and a more in-depth
understanding of the designs I was creating.
---- ---- ---- ---- ---- ---- ----
Mentor Catapult HLS
Catapult supports both SystemC and C++, including a new C++ mixed
language & class-based hierarchy.
This sounds like a good idea for system level activities.
---- ---- ---- ---- ---- ---- ----
I attended a Mentor Catapult DAC session where a customer discussed
using Catapult for image sensor processing.
It appeared that Catapult was well-suited for image processing chips.
However the talk didn't make it clear which HLS features were
unique to Mentor.
---- ---- ---- ---- ---- ---- ----
Mentor Calypto Catapult
---- ---- ---- ---- ---- ---- ----
Mentor respected the promises of Badru's vision for Catapult HLS.
- I was pleased by the native debug and visualization features as
well as the extended support for coverage metrics.
- It appears his team has worked hard to improve the underlying
database and hook system.
It was nice to see the 3D TV prototype outside the Mentor stand and hear
the experience of the customers.
---- ---- ---- ---- ---- ---- ----
We liked the Catapult demo at DAC. We're thinking about it.
---- ---- ---- ---- ---- ---- ----
If it can get our chips significantly faster to market, we might
consider adopting this Catapult C-to-RTL flow.
---- ---- ---- ---- ---- ---- ----
Badru's a happier guy in real life. It was fun to finally meet him.
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
CADENCE STRATUS HLS -- oops! got no user comments!
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
MENTOR/SIEMENS POWERPRO
PowerPro.
Anything that cuts even 5% of our chip power gets respect from us.
---- ---- ---- ---- ---- ---- ----
We use Mentor PowerPro -- below are details of my experience.
POWERPRO GUIDED OPTIMIZATION
We primarily use PowerPro guided optimization. The new visualizer/GUI
is good, though their previous visualizer was also fine.
When PowerPro makes a power reduction recommendation, it also gives you
a schematic to click on, so you can look at RTL to speed up your ability
to understand the suggestion and make the change.
- Clock Gating. Our designers are good at finding clock gating
opportunities. PowerPro can find more complex clock gating
candidates that are several levels of sequential logic down.
PowerPro has settings such that it won't do clock gating
optimization that logic synthesis can find, so that you don't
change your RTL code if you don't have to. Instead it accounts
for it and gives you the power reduction estimate for the
expected synthesis changes, and looks for clock gating
opportunities that synthesis won't find.
- Memory Gating. They added more classifications of memory
cases they can detect.
- Data Gating. Mentor has added data gating recently, which is
pretty unique. It doesn't recommend changing the data unless
you have to because it's being consumed -- so you only eliminate
unnecessary switching activity in the design.
It detects cases such as unnecessary changes on data busses
that could otherwise hold a steady-state value.
For example, for a state machine, when you are switching to
zero every time when you are not sending the real data, in
reality, that zeroization was an unnecessary artifact of RTL
coding style, as you could have just kept it at last value
until something new had to change.
Also, turning off memory clocks if the memory is not active.
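The zeroization case above can be illustrated with a small toggle count.
This is only a minimal sketch with made-up 8-bit traffic; the function names
and sample data are hypothetical and it says nothing about how PowerPro
itself detects these cases.

```python
def count_bit_toggles(values):
    """Total number of bit flips between consecutive bus values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(values, values[1:]))


def drive_bus(samples, hold_last_value):
    """Return the bus value per cycle for a list of (valid, data) samples.

    With hold_last_value=False the bus is forced to zero on idle cycles
    (the coding-style artifact); with True it keeps the last real value.
    """
    bus, last = [], 0
    for valid, data in samples:
        if valid:
            last = data
            bus.append(data)
        else:
            bus.append(last if hold_last_value else 0)
    return bus


# Hypothetical traffic: real data only on every other cycle.
samples = [(True, 0xFF), (False, 0), (True, 0xFF),
           (False, 0), (True, 0x0F), (False, 0)]

toggles_zeroized = count_bit_toggles(drive_bus(samples, hold_last_value=False))
toggles_held = count_bit_toggles(drive_bus(samples, hold_last_value=True))

print("zeroized bus toggles:", toggles_zeroized)
print("held bus toggles:    ", toggles_held)
```

In this toy trace the held bus flips far fewer bits than the zeroized one,
while the consumer sees the same valid data in both cases.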
ACCURACY
PowerPro's RTL power estimation correlation is good -- we get to within
15-20%. The synthesis and SPEF integration is under the hood; we
provide some information for set up, such as the number of levels of
logic, and the number of flops that will automatically have inferred
clock gating that we are targeting in physical design. Once we set up
this physical library information, we don't need to think about it moving
forward; we just work at the RTL level.
The accuracy is good enough to make the right decisions. An example
of this is one small chip we worked on, where we had a tight power
budget and were very aggressive with the power reductions we did in
the RTL. When we ran PowerPro afterwards, it turned out the highest
block for potential power savings -- 100s of milliwatts -- was in a
3rd party purchased IP, which we couldn't tweak vs 10s of milliwatts
in our design. So we are thinking about changing that IP. (And maybe
that IP vendor needs better tools.)
This showed us PowerPro's ability to paint an accurate picture, as it
didn't have big power reduction numbers on logic that is already
fairly well power-linted. So PowerPro didn't mislead us in terms of
power reduction potential -- and we avoided doing any unnecessary RTL
changes when we needed to tapeout, where there would have been a low
return for the disruption and schedule hit.
OTHER
Mentor has a PowerPro Designer tool which added new features for
early RTL creation and architectural exploration that could be
helpful when designing more complex blocks. However, we don't use
it as we focus more on incremental modifications and optimization of
our internal legacy IP.
We have also not used PowerPro's micro-architectural optimizations,
but will look at it as we work on new IPs. This could be useful if
you are designing a large engine with lots of new architectural
exploration, but less so if your design is constrained by
pre-defined best practices.
Area for improvement:
We've asked Mentor -- and I believe they are working on it -- to create
a bridge that takes power analysis results and provides input to synthesis
tools to synthesize a more power-aware design, i.e. figuring out how
to come up with appropriate new constraints and pass that information to
be used during RTL synthesis.
Following PowerPro's guided optimization, you know which high frequency
signals are more contained in terms of their toggling, and those high
frequency signals that need to toggle more often per the design and
can't be controlled/gated much. That data is useful when you try to
meet timing during synthesis. E.g. you can have 2 GHz busses that toggle
most of the time at 2 GHz, and others that are gated and only toggle 30%
of the time at 2 GHz. So, when you route the busses during synthesis,
you could get a better result with a tighter route for the bus that is
always toggling at 2 GHz. The net result would be lower power than if
the RTL synthesis tool was unaware of it.
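The reasoning above can be sketched with the usual first-order dynamic power
relation, P = activity * C * Vdd^2 * f. The capacitance and voltage values
below are assumptions picked only for illustration; they are not from the
user's design, and the helper name is hypothetical.

```python
def dynamic_power_watts(activity, capacitance_f, vdd_v, freq_hz):
    """First-order switching power: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


FREQ_HZ = 2e9    # 2 GHz, as in the comment above
VDD_V = 0.8      # assumed supply voltage
CAP_F = 1e-12    # assumed 1 pF of routed bus capacitance

always_toggling = dynamic_power_watts(1.0, CAP_F, VDD_V, FREQ_HZ)
gated_30_percent = dynamic_power_watts(0.3, CAP_F, VDD_V, FREQ_HZ)

# Power saved by a tighter route that trims 10% of the wire capacitance.
saving_always = always_toggling - dynamic_power_watts(1.0, 0.9 * CAP_F, VDD_V, FREQ_HZ)
saving_gated = gated_30_percent - dynamic_power_watts(0.3, 0.9 * CAP_F, VDD_V, FREQ_HZ)

print(f"always-toggling bus: {always_toggling * 1e3:.3f} mW")
print(f"30%-activity bus:    {gated_30_percent * 1e3:.3f} mW")
print(f"saving ratio (always / gated): {saving_always / saving_gated:.2f}")
```

Under these assumptions the same capacitance reduction saves roughly 3.3X
more power on the always-toggling bus, which is why passing activity data
to synthesis could help it decide where a tighter route matters most.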
---- ---- ---- ---- ---- ---- ----
Mentor PowerPro
What we like:
1. Microarchitectural optimizations for early RTL design, such as
block level & memory access profiling based gating. And
sequential gating, such as redundant reset elimination.
It could help our RTL designers spot potential power improvements.
2. Mentor says PowerPro's RTL power estimates are within 15%
accuracy due to its new synthesis and SPEF integration for
physical prototypes.
This raises the question for me as to whether the user needs to
provide and update the SPEF file as netlists are released. Will
the accuracy be maintained during the project as time passes?
3. Mentor claims PowerPro's gate-level power analysis is within 1-2%
of final post place and route.
What I'd like to know is whether PowerPro's clock tree power analysis
correlates well with gate-level after the CTS stage.
---- ---- ---- ---- ---- ---- ----
Mentor PowerPro
Its PowerLeaks cuts our debug effort by a great amount.
Its power regressions are helpful as we can analyze the design for even
simple bug fixes.
---- ---- ---- ---- ---- ---- ----
Mentor PowerPro
We evaluated Mentor PowerPro. Our eval primarily measured whether
PowerPro enhanced our clock gating efficiency for a core design that we
had already taped out. Mentor uses deep sequential analysis to do this,
and the power reduction results were strong.
Below are my comments on the eval.
CLOCK GATING
PowerPro's automatic mode methodology and results:
- We ran the RTL through PowerPro. Our initial results showed a
15% clock gating improvement.
- We then measured the impact on timing. First, we ran the
post-PowerPro RTL through Synopsys DC. We collected all the
violating paths. PowerPro has a replay flow, where you feed
it the timing (violating paths) report, and PowerPro can
remove those endpoints from its clock gating improvement.
- Our final power reduction after removing *all* those paths was
7-8% to meet our exact timing specs. Mentor said we could
tweak this, i.e. we didn't have to remove all paths, and could
get an 11% reduction. We didn't do this because this was just
an eval, and 7-8% was enough to prove its value.
- To assess PowerPro's power analysis accuracy, we compared it
with Synopsys PT-PX. PowerPro's power estimates were within
5% of PT-PX for the clock gating efficiency numbers.
- We then ran the post-PowerPro RTL through SLEC checking to
verify the functionality. The equivalency check showed the
results were valid.
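The replay step described above boils down to filtering gating candidates
against the endpoints in a timing report. The sketch below is a conceptual
illustration only: the data layout, endpoint names, and per-endpoint savings
are hypothetical, and it does not reflect PowerPro's actual report formats
or commands.

```python
def replay_filter(gating_candidates, violating_endpoints):
    """Keep only gating candidates whose endpoint has no timing violation.

    gating_candidates: dict of endpoint name -> estimated power saving (%).
    violating_endpoints: iterable of endpoint names from the timing report.
    """
    violating = set(violating_endpoints)
    return {endpoint: saving
            for endpoint, saving in gating_candidates.items()
            if endpoint not in violating}


# Hypothetical numbers chosen to mirror the shape of the eval above.
candidates = {
    "u_core/reg_a": 7.5,
    "u_core/reg_b": 4.0,
    "u_core/reg_c": 3.5,
}
violations = ["u_core/reg_b", "u_core/reg_c"]

kept = replay_filter(candidates, violations)
initial_saving = sum(candidates.values())
final_saving = sum(kept.values())

print(f"initial estimate: {initial_saving:.1f}%")
print(f"after removing all violating endpoints: {final_saving:.1f}%")
```

Removing only the worst violators instead of every violating endpoint would
land somewhere between the two totals, which matches the trade-off the
vendor suggested.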
DATA GATING
PowerPro assesses how datapath flop-output-to-input can be optimized
for power. It can reduce the toggle for that cone of logic, so the data
does not switch, while keeping the functionality unchanged.
We had PowerPro's data gating turned on during our optimizations.
However, we did not do the analysis needed for me to be able to comment
specifically on its direct impact on our power results.
CAPACITY, HIERARCHICAL OPTIMIZATION
- Capacity/performance.
In general, 250-400 K flops blocks would be run overnight.
- Hierarchical optimization.
Mentor recommends considering breaking up a block that is more
than 300K flops. You create black boxes for the submodules.
We broke ours up into blocks with ~200 flops, did the
optimization extractions and pulled the results into the
higher-level design.
Since our eval, we are using PowerPro on a production chip design.
PowerPro's set-up wasn't too complicated -- we had a new person pick
it up and run it. Mentor's support was good.
---- ---- ---- ---- ---- ---- ----
Mentor PowerPro
PowerPro was extremely helpful in getting quick energy efficiency
numbers at the RTL on our ASICs.
Because our design test cases were also integrated into Catapult,
verifying the HLS C and generated behavioral Verilog against a golden
model could all be done in the same environment very easily.
---- ---- ---- ---- ---- ---- ----
PowerPro
---- ---- ---- ---- ---- ---- ----
It's interesting how Calypto PowerPro is supported like it's from a
start-up; yet it's inside an $80 billion Siemens. We like that.
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
SYNOPSYS ATRENTA SPYGLASS POWER -- oops! got no user comments!
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
ANSYS APACHE POWERARTIST -- oops! got no user comments!
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
Related Articles
MENT Calypto Catapult single handedly gets #4 for Best of 2016
Badru replies to Cooley's "Cheeky" question about Catapult "noise"
Calypto PowerPro -- but no Ansys PowerArtist nor Synopsys Spyglass