( DAC'17 Item 8 ) ------------------------------------------------- [03/06/18] 

Subject: Badru's Calypto Catapult HLS and PowerPro get #8 for Best of 2017

BADRU'S DOUBLEHEADER: Roughly 3 years ago, Badru Agarwala became the new GM
of Calypto after it was reabsorbed back into the Mentor (now Siemens)
mothership.  It was a fun news scoop in what was then the "high drama" world
of C/C++/SystemC based chip design from 2011 to 2015.

 2015: SCOOP! -- rumor is Calypto is merging back into Mentor Graphics!
 2015: 46 readers on Calypto, Gary Smith, Veloce, Ansys, SNPS C Compiler
          
 2011: Carl Icahn accountability forced MENT to trade away Catapult C
 2011: Mentor rumored to have traded its Catapult C division for Calypto
 2011: Calypto finally spells out the details behind Catapult C merger

It was good drama that involved Carl Icahn (when he tried that hostile
takeover of MENT) forcing Mentor to sell Catapult C over to Calypto -- only
to then 4 years later have Calypto itself merge back into mama Mentor!  WHOA!
     
Since then, Forte went to Cadence, and after some initial protest drifted
off into obscurity.
 
 2014: Cadence to acquire Forte Cynthesizer at a rumored fire sale price
 2014: Brett lectures on "Cooley facts" vs. real facts on Forte/CDNS

The last user mention of Forte Cynthesizer (now "Stratus HLS") was in 2015.

    "... the direction Calypto should take now that both Cadence and
     Synopsys are both so weak in the High Level Synthesis space."

         - Badru Agarwala, Mentor Calypto (ESNUG 560 #5)

FAST FORWARD TO CALYPTO NOW: It's now 3 years later and judging from the
user "Best of" comments Badru has a pair of pocket aces with Calypto HLS
for the C/C++/SystemC-to-RTL synthesis space -- and Calypto PowerPro for
the RTL power optimization business.
          
Or said another way: I got a ton of Calypto HLS user comments, but nothing
on its rival Stratus HLS.  And a ton of Calypto PowerPro user comments, but
nothing on its rival Ansys PowerArtist or Synopsys Spyglass Power tools.

    "Eighty percent of success is showing up."

        - Woody Allen, American actor (1935 - present)

        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

      QUESTION ASKED:

        Q: "What were the 3 or 4 most INTERESTING specific EDA tools
            you've seen this year?  WHY did they interest you?"

        ----    ----    ----    ----    ----    ----    ----

    We get killer TTM with Catapult.  Our business is all about being
    first to market. 

        ----    ----    ----    ----    ----    ----    ----

    Mentor Catapult is a great tool.

    Its functionality is amazing, knowing you now can generate RTL instead
    of handwriting it.

    I work in the video field -- that application domain is a good match
    for the tool.

    Writing hardware in the C language was new to me.  

        - Catapult and the HLS methodology had a steep learning curve.  

        - But concepts such as schedule, latency and throughput are common 
          knowledge for digital people

    Other comments:

        - The GUI is convenient.  All the information that you need is 
          shown.

        - The schedule of the hardware generated is helpful.
 
        - The generated hardware schematic is also good.

        - The script interface and the Tcl comments are very helpful to 
          set up your script for automating the build.

    The tool is not yet as mature as you might expect in the industry.  
    Error messages are not always clear, so it's not always clear what the 
    problem is in the design.

    Good support is essential, and that is what we get.  There are always 
    things to improve, and we have close contact with Mentor FAEs to 
    discuss our difficulties.

    And that works fine.

    I really think HLS is the future.

        ----    ----    ----    ----    ----    ----    ----

    Mentor Catapult HLS -- does C++/SystemC to RTL synthesis

        - Catapult lets you create and test different architectures to 
          find the best one.  This would be one of the main reasons for us
          to add it to our design environment.

        - Automating the coverage closure using Questa CoverCheck is a 
          useful feature.

        - Catapult's power optimization enables users to make smart power 
          decisions.  It would be great to be able to do power optimization
          early in the design.

        - Catapult's integration with PowerPro power reduction, and SLEC
          equivalency checking would be a good benefit if the customers 
          have SLEC for equivalency check.  (Although customers who use 
          Cadence's Conformal LEC or Synopsys' Formality as a signoff LEC
          tool are likely unwilling to replace their signoff tool with 
          SLEC.)

        ----    ----    ----    ----    ----    ----    ----

    This is what I learned about Catapult HLS at DAC.  (I'm not a user)

        - Mentor was promoting using Catapult HLS to make sure your actual
          HDL was compliant with certain code standards.
  
        - If you use it, your synthesized RTL will follow the codes, e.g.
          using certain variable names and formatting it a certain way.

        - Also, using Catapult HLS lets folks without an HDL background
          come up to speed more quickly.

    Mentor also showed some design examples.  I asked about timing 
    considerations -- they said the tool itself doesn't provide timing, but
    you can do timing analysis after you synthesize.  

    Catapult looked useful.  

        ----    ----    ----    ----    ----    ----    ----

    Mentor Catapult HLS

    I will have to say that, in general the future of hardware design seems
    to be going to high-level design.  Many of the EDA tool providers have 
    developed solutions for higher level design.

    My focus was on Mentor Catapult HLS.  I participated in multiple 
    meetings and presentations at Mentor's booth.  The events were very
    well organized.

    The most valuable part for me was the presentations from companies that
    had hands-on experience with Catapult.

    Catapult has a big advantage in reducing design and verification time. 
    The time saved on porting the design from one technology to another can
    overcome the gate count increase compared to hand written code.

    Mentor's customer support is remarkable, i.e. their level of engagement
    and understanding.

        ----    ----    ----    ----    ----    ----    ----

    Calypto Catapult HLS

    My Catapult use case was compiling a DSL (higher-level language than 
    C++/SystemC) into HLS C that is supported by Catapult. 

    One of Catapult's key advantages is being able to create a larger number
    of targets (Xilinx, Altera, as well as ASIC).  Using Catapult, I was 
    able to create and compare designs across the space of FPGAs as well as
    ASIC designs.

    Catapult also let me get a quick analysis and a more in-depth
    understanding of the designs I was creating. 

        ----    ----    ----    ----    ----    ----    ----

    Mentor Catapult HLS

    Catapult supports both SystemC and C++, including a new C++ mixed 
    language & class-based hierarchy. 
 
    This sounds like a good idea for system level activities.

        ----    ----    ----    ----    ----    ----    ----

    I attended a Mentor Catapult DAC session where a customer discussed 
    using Catapult for image sensor processing.

    It appeared that Catapult was well-suited for image processing chips.
    However the talk didn't make it clear which HLS features were
    unique to Mentor.

        ----    ----    ----    ----    ----    ----    ----

    Mentor Calypto Catapult

        ----    ----    ----    ----    ----    ----    ----

    Mentor respected the promises of Badru's vision for Catapult HLS.  

        - I was pleased by the native debug and visualization features as
          well as the extended support for coverage metrics.  

        - It appears his team has worked hard to improve the underlying
          database and hook system.  

    It was nice to see the 3D TV prototype outside the Mentor stand and hear
    the experience of the customers.  

        ----    ----    ----    ----    ----    ----    ----

    We liked the Catapult demo at DAC.  We're thinking about it.

        ----    ----    ----    ----    ----    ----    ----

    If it can get our chips significantly faster to market, we might
    consider adopting this Catapult C-to-RTL flow.

        ----    ----    ----    ----    ----    ----    ----

    Badru's a happier guy in real life.  It was fun to finally meet him.

        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

CADENCE STRATUS HLS -- oops!  got no user comments!

        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----

        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

MENTOR/SIEMENS POWERPRO

    PowerPro.  

    Anything that cuts even 5% of our chip power gets respect from us.

        ----    ----    ----    ----    ----    ----    ----

    We use Mentor PowerPro -- below are details of my experience.

    POWERPRO GUIDED OPTIMIZATION

    We primarily use PowerPro guided optimization.  The new visualizer/GUI
    is good, though their previous visualizer was also fine.  

    When PowerPro makes a power reduction recommendation, it also gives you
    a schematic to click on, so you can look at RTL to speed up your ability
    to understand the suggestion and make the change.

        - Clock Gating.  Our designers are good at finding clock gating 
          opportunities.  PowerPro can find more complex clock gating 
          candidates that are several levels of sequential logic down.

          PowerPro has settings such that it won't do clock gating 
          optimization that logic synthesis can find, so that you don't
          change your RTL code if you don't have to.  Instead it accounts
          for it and gives you the power reduction estimate for the 
          expected synthesis changes, and then looks for clock gating
          opportunities that synthesis won't find.

        - Memory Gating.  They added more classifications of memory 
          cases they can detect.

        - Data Gating.  Mentor has added data gating recently, which is 
          pretty unique.  It doesn't recommend changing the data unless
          you have to because it's being consumed -- so you only eliminate
          unnecessary switching activity in the design.

          It finds cases with unnecessary value changes, such as data
          busses that could otherwise hold a steady state value.  

          For example, for a state machine, when you are switching to 
          zero every time when you are not sending the real data, in 
          reality, that zeroization was an unnecessary artifact of RTL 
          coding style, as you could have just kept it at last value 
          until something new had to change.  

          Also, turning off memory clocks if the memory is not active.  
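
    The zeroization case above can be illustrated with a toy toggle count.
    This is a minimal sketch, not PowerPro's algorithm; the 8-bit bus, the
    sample traffic, and the function names are all made up for illustration.
    It compares a bus that is forced to zero between valid words against one
    that simply holds its last value.

```python
def bit_toggles(prev, curr):
    """Number of bit positions that differ between two bus values."""
    return bin(prev ^ curr).count("1")


def bus_toggles(values):
    """Total bit toggles over a sequence of bus values, one value per cycle."""
    return sum(bit_toggles(a, b) for a, b in zip(values, values[1:]))


def zeroized_bus(samples):
    """Bus driven to 0 whenever there is no valid data (None)."""
    return [0 if s is None else s for s in samples]


def held_bus(samples, reset_value=0):
    """Bus that keeps its last valid value when there is no valid data."""
    out, last = [], reset_value
    for s in samples:
        if s is not None:
            last = s
        out.append(last)
    return out


# Hypothetical traffic: None marks an idle cycle with no real data.
traffic = [0xFF, None, 0xFF, None, 0xF0, None, None, 0xF0]

zero_toggles = bus_toggles(zeroized_bus(traffic))
hold_toggles = bus_toggles(held_bus(traffic))
print(zero_toggles, hold_toggles)
```

    For this made-up traffic the zeroized bus toggles 36 bits and the held
    bus toggles 4, even though both deliver the same valid words.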

    ACCURACY

    PowerPro's RTL power estimation correlation is good -- we get within
    15-20% accuracy.  The synthesis and SPEF integration is under the hood;
    we provide some information for set up, such as the number of levels of
    logic, and the number of flops that will automatically have inferred 
    clock gating that we are targeting in physical design.  Once we set up
    this physical library information, we don't need to think about it
    moving forward; you just work at the RTL level.

    The accuracy is good enough to make the right decisions.  An example
    of this is one small chip we worked on, where we had tight power 
    budget and were very aggressive with the power reductions we did in
    the RTL.  When we ran PowerPro afterwards, it turned out the highest 
    block for potential power savings -- 100s of milliwatts -- was in a
    3rd party purchased IP, which we couldn't tweak vs 10s of milliwatts
    in our design.  So we are thinking about changing that IP.  (And maybe
    that IP vendor needs better tools.) 

    This showed us PowerPro's ability to paint an accurate picture, as it 
    didn't have big power reduction numbers on logic that is already 
    fairly well power-linted.  So PowerPro didn't mislead us in terms of
    power reduction potential -- and we avoided doing any unnecessary RTL 
    changes when we needed to tapeout, where there would have been a low
    return for the disruption and schedule hit.  

    OTHER

    Mentor has a PowerPro Designer tool which added new features for 
    early RTL creation and architectural exploration that could be 
    helpful when designing more complex blocks.  However, we don't use
    it as we focus more on incremental modifications and optimization of
    our internal legacy IP.  

    We have also not used PowerPro's micro-architectural optimizations, 
    but will look at it as we work on new IPs.  This could be useful if 
    you are designing a large engine with lots of new architectural 
    exploration, but less so if your design is constrained by 
    pre-defined best practices.  

    Area for improvement:

    We've asked Mentor -- and I believe they are working on it -- to create
    a bridge from power analysis results to synthesis tools, so they can
    synthesize a more power-aware design, i.e. figuring out how to come up
    with appropriate new constraints and pass that information to be used
    during RTL synthesis.  

    Following PowerPro's guided optimization, you know which high frequency
    signals are more contained in terms of their toggling, and those high 
    frequency signals that need to toggle more often per the design and 
    can't be controlled/gated much.  That data is useful when you try to 
    meet timing during synthesis.  E.g. you can have 2 GHz busses that toggle
    most of the time at 2 GHz, and those that are gated and only toggle 30%
    of the time at 2 GHz.  So, when you route the busses during synthesis, 
    you could get a better result with a tighter route for the bus that is
    always toggling at 2 GHz.  The net result would be lower power than if 
    the RTL synthesis tool was unaware of it.
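
    The routing argument above can be put into rough numbers with the usual
    simplified dynamic power relation (activity x C x Vdd^2 x f).  This is
    only a back-of-envelope sketch: the supply voltage and the two route
    capacitances are assumed values, not data from the user or from any
    tool.  Only the 2 GHz clock and the 30% toggle rate come from the text.

```python
def switching_power_watts(activity, cap_farads, vdd_volts, freq_hz):
    """Simplified dynamic power: activity * C * Vdd^2 * f."""
    return activity * cap_farads * vdd_volts ** 2 * freq_hz


FREQ_HZ = 2e9            # 2 GHz, from the comment above
VDD = 0.8                # assumed supply voltage
TIGHT_CAP = 1e-12        # assumed total bus capacitance on the short route
LOOSE_CAP = 2e-12        # assumed total bus capacitance on the long route
BUSY, GATED = 1.0, 0.3   # toggles all the time vs. 30% of the time


def total_power(busy_cap, gated_cap):
    """Combined switching power of the busy bus and the gated bus."""
    return (switching_power_watts(BUSY, busy_cap, VDD, FREQ_HZ)
            + switching_power_watts(GATED, gated_cap, VDD, FREQ_HZ))


# Activity-aware: the always-toggling bus gets the tight route.
activity_aware = total_power(busy_cap=TIGHT_CAP, gated_cap=LOOSE_CAP)
# Activity-blind: the routes happen to be assigned the other way around.
activity_blind = total_power(busy_cap=LOOSE_CAP, gated_cap=TIGHT_CAP)

print(f"aware: {activity_aware * 1e3:.3f} mW, blind: {activity_blind * 1e3:.3f} mW")
```

    With these assumed values the activity-aware assignment burns less
    switching power, which is the effect the user is asking synthesis to
    exploit.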

        ----    ----    ----    ----    ----    ----    ----

    Mentor PowerPro 

    What we like:

    1.  Microarchitectural optimizations for early RTL design, such as 
        block level & memory access profiling based gating.  And 
        sequential gating, such as redundant reset elimination.  

        It could help our RTL designers spot potential power improvements.

    2.  Mentor says PowerPro's RTL power estimates are within 15% 
        accuracy due to its new synthesis and SPEF integration for 
        physical prototypes.

        This raises the question for me as to whether the user needs to 
        provide and update the SPEF file as netlists are released.  Will 
        the accuracy be maintained during the project as time passes?

    3.  Mentor claims PowerPro's gate-level power analysis is within 1-2%
        of final post place and route. 
 
    What I'd like to know is if PowerPro's clock tree power analysis
    correlates well with gate-level after the CTS stage.

        ----    ----    ----    ----    ----    ----    ----

    Mentor PowerPro

    Its PowerLeaks cuts our debug effort involvement by a great amount.

    Its power regressions are helpful as we can analyze the design for even 
    simple bug fixes.

        ----    ----    ----    ----    ----    ----    ----

    Mentor PowerPro

    We evaluated Mentor PowerPro.  Our eval primarily measured whether 
    PowerPro enhanced our clock gating efficiency for a core design that we 
    had already taped out.  Mentor uses deep sequential analysis to do this,
    and the power reduction results were strong.  

    Below are my comments on the eval.

    CLOCK GATING

    PowerPro's automatic mode methodology and results:

        - We ran the RTL through PowerPro.  Our initial results showed a
          15% clock gating improvement.

        - We then measured the impact on timing.  First, we ran the 
          post-PowerPro RTL through Synopsys DC.  We collected all the 
          violating paths.  PowerPro has a replay flow, where you feed 
          it the timing (violating paths) report, and PowerPro can 
          remove those endpoints from its clock gating improvement.  

        - Our final power reduction after removing *all* those paths was
          7-8% to meet our exact timing specs.  Mentor said we could 
          tweak this, i.e. we didn't have to remove all paths, and could 
          get an 11% reduction.  We didn't do this because this was just
          an eval, and 7-8% was enough to prove its value.

        - To assess PowerPro's power analysis accuracy, we compared it 
          with Synopsys PT-PX.  PowerPro's power estimates were within 
          5% of PT-PX for the clock gating efficiency numbers. 

        - We then ran the post-PowerPro RTL through SLEC checking to 
          verify the functionality.  Equivalence checking showed the
          results were valid.
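
    The replay step described above amounts to filtering gating candidates
    against a list of timing-violating endpoints.  The sketch below is a toy
    model of that idea only.  It is not PowerPro's report format or API; the
    endpoint names, the savings numbers (arbitrary units), and the function
    name are hypothetical.

```python
def replay_filter(candidates, violating_endpoints):
    """Drop gating candidates whose endpoint appears in the timing report.

    candidates maps an endpoint name to its estimated power saving.
    """
    blocked = set(violating_endpoints)
    return {ep: saving for ep, saving in candidates.items() if ep not in blocked}


# Hypothetical gating candidates and their estimated savings.
candidates = {
    "u_core/reg_a": 40,
    "u_core/reg_b": 35,
    "u_core/reg_c": 50,
    "u_core/reg_d": 25,
}

# Hypothetical endpoints that failed timing after synthesis.
violations = ["u_core/reg_b", "u_core/reg_c"]

kept = replay_filter(candidates, violations)
saving_before = sum(candidates.values())
saving_after = sum(kept.values())
print(saving_before, saving_after, sorted(kept))
```

    Removing every violating endpoint is the conservative choice; keeping
    some of them and fixing timing another way is the kind of tweak Mentor
    suggested for recovering more of the savings.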

    DATA GATING

    PowerPro assesses how datapath flop-output-to-input can be optimized
    for power.  It can reduce the toggle for that cone of logic, so the data
    does not switch, while keeping the functionality unchanged.

    We had PowerPro's data gating turned on during our optimizations.  
    However, we did not do the analysis needed for me to be able to comment 
    specifically on its direct impact on our power results.  

    CAPACITY, HIERARCHICAL OPTIMIZATION

        - Capacity/performance. 

          In general, 250-400K flop blocks would be run overnight.

        - Hierarchical optimization. 

          Mentor recommends considering breaking up a block that is more
          than 300K flops.  You create black boxes for the submodules.  
          We broke ours up into blocks with ~200K flops, did the 
          optimization extractions and pulled the results into the 
          higher-level design. 
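
    The black-boxing approach above can be sketched as a simple walk over
    the design hierarchy.  This is an illustrative guess at how one might
    plan the runs, not Mentor's flow; the Module class, the plan_runs
    function, and the flop counts are all hypothetical.  Only the 300K flop
    threshold comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    local_flops: int                              # flops here, excluding children
    children: list = field(default_factory=list)

    def total_flops(self):
        return self.local_flops + sum(c.total_flops() for c in self.children)


def plan_runs(module, limit):
    """Return (name, flops) pairs, one per optimization run.

    A module at or under the limit is run whole.  Otherwise its children
    are planned separately and black-boxed, and the module's own logic
    gets a run of its own.  A leaf over the limit is still run whole.
    """
    if module.total_flops() <= limit or not module.children:
        return [(module.name, module.total_flops())]
    runs = []
    for child in module.children:
        runs.extend(plan_runs(child, limit))
    runs.append((module.name, module.local_flops))
    return runs


# Hypothetical hierarchy with made-up flop counts.
top = Module("top", 20_000, [
    Module("dsp", 180_000),
    Module("ctrl", 60_000),
    Module("mem_if", 10_000, [Module("bank0", 170_000), Module("bank1", 170_000)]),
])

runs = plan_runs(top, limit=300_000)
for name, flops in runs:
    print(name, flops)
```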

    Since our eval, we are using PowerPro on a production chip design.  
    PowerPro's set-up wasn't too complicated -- we had a new person pick
    it up and run it.  Mentor's support was good.

        ----    ----    ----    ----    ----    ----    ----

    Mentor PowerPro

    PowerPro was extremely helpful in getting quick energy efficiency 
    numbers at the RTL on our ASICs.  

    Because our design test cases were also integrated into Catapult, 
    verifying the HLS C and generated behavioral Verilog against a golden
    model could all be done in the same environment very easily.

        ----    ----    ----    ----    ----    ----    ----

    PowerPro

        ----    ----    ----    ----    ----    ----    ----

    It's interesting how Calypto PowerPro is supported like it's from a
    start-up; yet it's inside an $80 billion Siemens.  We like that.

        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

SYNOPSYS ATRENTA SPYGLASS POWER -- oops!  got no user comments!

        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

ANSYS APACHE POWERARTIST -- oops!  got no user comments!

        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----


        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----
        ----    ----    ----    ----    ----    ----    ----

Related Articles

    MENT Calypto Catapult single handedly gets #4 for Best of 2016
    Badru replies to Cooley's "Cheeky" question about Catapult "noise"
    Calypto PowerPro -- but no Ansys PowerArtist nor Synopsys Spyglass


Copyright 1991-2024 John Cooley.  All Rights Reserved.

   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)