User gets 37% Calypto PowerPro RTL sequential logic power savings

( ESNUG 535 Item 2 ) -------------------------------------------- [12/20/13]

Subject: User gets 37% Calypto PowerPro RTL sequential logic power savings

> CALYPTO KICKS ASS: Last year there was a very public 3-way battle at DAC
> between Apache PowerArtist, Atrenta Spyglass Power, and Calypto PowerPro
> in RTL power optimization.  See #1 in my Cheesy Must See List for DAC'12.
>
> This year, judging from user comments, it appears that Calypto PowerPro
> has won this fight, with Atrenta Spyglass Power coming in at #2 with a
> few supporting user votes -- and with Apache PowerArtist completely
> Missing In Action (M.I.A.) -- with not one customer advocating for it.
>
>     - from http://www.deepchip.com/items/dac13-03.html


From: [ Anindya Saha of Saankhya Labs ]

Hi, John,

We recently completed our eval of Calypto's PowerPro tool, where we looked
at the power reduction on a hierarchical block within our larger RTL design.
We used one of our internal DSP IP subsystems as a test case; it consists
of 2 cores and significant memory.

Our management wasn't happy with just RTL power numbers, so we took 3 months
to do RTL power work plus netlist power work (with and without Calypto)
through P&R to see the final power improvement.  

Our final post-route results, using 2 different power reduction choices,
ranged from 9% to 42% less power overall -- and 37% to 60% less power for
sequential logic.

         ----    ----    ----    ----    ----    ----   ----

As a first cut, we did an idealized zero-wireload RTL power analysis with
Calypto PowerPro.

    - We used a minimum bit-width parameter of 32 with zero wireload models.

    - We used Calypto's "guided optimization" mode with VCD as input.  

    - The DSP subsystem circuit size was ~1.5 M gates.  

With wireload models set to zero, a generic library, and no clock tree
annotation:

                                          Calypto 
                          Original RTL    Optimized RTL     Power 
    Parameters            Power (mWatts)  Power (mWatts)    Savings
    ----------            -----------    ---------------    -------
    Sequential Logic         36.9            20.0            45.7%
    Total Power              58.5            41.8            28.5%

It took us 45 minutes to do this "guided optimization".  To get this type
of power reduction manually would have taken us 6 months, because some moves
that PowerPro suggested were not obvious.  (I can't name the moves as they
are proprietary to our design.)

         ----    ----    ----    ----    ----    ----   ----

It was very important to us to know the final power improvements after place
and route.  So we ran both our original RTL and the Calypto-optimized RTL
through the same synthesis and place and route flow, through timing closure
to a fully routed netlist.

These were the tools we used in our flow:

    1. Power reduction using Calypto PowerPro 

    2. Logic synthesis with Cadence RTL Compiler 

    3. Cadence Encounter P&R.  We had no routability issues with the 
       Calypto netlist. 

    4. We took the post-routed netlist and ran simulations using Cadence
       NC-Sim and found no RTL errors during verification.  

    5. We also generated representative vectors which we had benchmarked
       on silicon and used them to generate VCDs on both netlists.

    6. We then used EPS (Encounter Power System) to compute the power
       dissipation for both netlists.

Here are our Calypto post-route power reduction results using a minimum
bit-width of 32, including layout-extracted parasitics.

                                          Calypto 
                          Original RTL    Optimized Netlist  Power 
    Parameters            Power (mWatts)  Power (mWatts)     Savings
    ----------            -----------    ---------------     -------
    Sequential Logic         186.5           117.7            36.9%
    Total Power              269.9           244.7             9.3%

We were happy to see a 9% power savings on our existing netlist for our
IP Subsystem.  
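
The savings percentages in the table above are simply (original - optimized)
divided by original.  A minimal Python sketch of that arithmetic, using the
post-route numbers from the table (the function name `savings_pct` is only
for illustration and is not part of any tool in our flow):

```python
def savings_pct(original_mw, optimized_mw):
    """Percentage power reduction relative to the original power."""
    return 100.0 * (original_mw - optimized_mw) / original_mw

# Post-route numbers at a minimum bit-width of 32, in mW, from the table above.
sequential = savings_pct(186.5, 117.7)
total = savings_pct(269.9, 244.7)

print(f"Sequential logic savings: {sequential:.1f}%")
print(f"Total power savings:      {total:.1f}%")
```

Rounded to one decimal, this reproduces the 36.9% and 9.3% in the table.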

Even though Calypto's PowerPro RTL power estimates were idealized (zero
wireload, generic library, no clock tree information annotated), the
percentage power savings for sequential logic correlated fairly well with
the post-route result: 45.7% estimated at RTL versus 36.9% post-route.

PowerPro was easy to use -- in particular the GUI/visualization for specific
guided optimization moves was a standout feature.  As designers we could
immediately recognize Calypto was suggesting the right moves to cut power.

For example, Calypto would show the tradeoff between introducing additional
clock-gating versus the power savings we would obtain by varying the
minimum bit-width of register banks that would be clock-gated.  (You can use
it to automatically generate various RTLs with different minimum bit-width
parameters.)
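
To illustrate the kind of tradeoff described above, here is a toy Python
model.  It is not how PowerPro works internally; the register banks, idle
fractions, and per-cell power constants are all made-up numbers, chosen only
to show the shape of the tradeoff: a lower minimum bit-width gates more
banks and saves more, but each extra clock-gating cell adds its own
overhead, so the returns diminish.

```python
from dataclasses import dataclass

@dataclass
class RegisterBank:
    name: str
    width: int            # number of flops sharing one enable
    idle_fraction: float  # fraction of cycles the enable is inactive

# Hypothetical constants for illustration only (not tool or library data).
FLOP_CLOCK_POWER_UW = 1.0  # clock power per ungated flop
GATE_CELL_POWER_UW = 4.0   # overhead per inserted clock-gating cell

def evaluate(banks, min_bit_width):
    """Return (gates_inserted, net_saving_uw) when only banks that are at
    least min_bit_width wide get a clock gate."""
    gates = 0
    saving = 0.0
    for bank in banks:
        if bank.width < min_bit_width:
            continue  # too narrow to gate at this setting
        gates += 1
        saving += bank.width * FLOP_CLOCK_POWER_UW * bank.idle_fraction
        saving -= GATE_CELL_POWER_UW
    return gates, saving

# Hypothetical register banks.
banks = [
    RegisterBank("acc", 64, 0.5),
    RegisterBank("coef", 32, 0.75),
    RegisterBank("ctrl", 16, 0.5),
    RegisterBank("flag", 8, 0.75),
]

for min_bit_width in (32, 16, 8):
    gates, saving = evaluate(banks, min_bit_width)
    print(f"min bit-width {min_bit_width:2d}: {gates} gates, "
          f"net saving {saving:.1f} uW")
```

In this toy data, going from 32 to 16 to 8 adds one gate each time while the
extra saving shrinks.  What the model leaves out entirely is the effect of
the added gates on the routed clock tree.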

         ----    ----    ----    ----    ----    ----   ----

WHERE POWERPRO COULD IMPROVE:

Generating a netlist with aggressively lower minimum bit-widths is always
preferred from a designer's perspective, but PowerPro had little ability to
predict how a lower minimum bit-width would impact the routed
clock tree in terms of jitter and signal integrity -- given an area and
frequency constraint.

It would be good to see Calypto R&D work on some of these metrics so that
they are highlighted upfront to the designer for making more intelligent
choices which will ensure timing and reliability closure in addition to
power reduction.

Even so, after our initial eval at a minimum bit-width of 32, we later used
PowerPro to change our minimum bit-width parameter for this design to 8.
Here are the post-route power reduction results we saw:

                                          Calypto 
                          Original RTL    Optimized Netlist  Power 
    Parameters            Power (mWatts)  Power (mWatts)     Savings
    ----------            -----------    ---------------     -------
    Sequential Logic         186.5           74.5             60.0%
    Total Power              269.9          154.7             42.7%

This change to a minimum bit-width of 8 worked without any issues.  As we
expected, our 8-bit version had the lowest power consumption overall.

That is, the 60% sequential logic power savings gave us an additional 33.4
percentage points of total power reduction on top of the 9.3% we saw at a
minimum bit-width of 32, for an overall total power reduction of 42.7%.
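
To keep the two ways of stating this apart, here is a small Python sketch
(the names are only for illustration) using the total power numbers from the
two post-route tables.  The 33.4 is a difference in percentage points of the
original total power; measured against the 32-bit netlist itself, the 8-bit
netlist comes out a bit under 37% lower.

```python
ORIGINAL_MW = 269.9   # total power, netlist from the original RTL
MIN_BW_32_MW = 244.7  # total power, Calypto netlist, minimum bit-width 32
MIN_BW_8_MW = 154.7   # total power, Calypto netlist, minimum bit-width 8

def savings_pct(reference_mw, optimized_mw):
    """Percentage power reduction relative to reference_mw."""
    return 100.0 * (reference_mw - optimized_mw) / reference_mw

# Savings versus the original netlist, rounded to one decimal as in the tables.
savings_32 = round(savings_pct(ORIGINAL_MW, MIN_BW_32_MW), 1)
savings_8 = round(savings_pct(ORIGINAL_MW, MIN_BW_8_MW), 1)

# Difference between the two rounded table percentages, in percentage points.
extra_points = round(savings_8 - savings_32, 1)

# Relative reduction of the 8-bit netlist versus the 32-bit netlist.
relative_to_32 = round(savings_pct(MIN_BW_32_MW, MIN_BW_8_MW), 1)

print(f"32-bit: {savings_32}%  8-bit: {savings_8}%")
print(f"additional reduction: {extra_points} percentage points")
print(f"8-bit vs 32-bit netlist: {relative_to_32}% lower")
```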

    - Anindya Saha
      Saankhya Labs                              Bangalore, India


"Relax. This is a discussion. Anything said here is just one engineer's opinion. Email in your dissenting letter and it'll be published, too."
Copyright 1991-2024 John Cooley. All Rights Reserved.


   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)