( ESNUG 464 Item 3 ) -------------------------------------------- [03/30/07]

Subject: ( ESNUG 463 #5 ) We got 2X design at less power with PowerTheater

>> How much PowerTheater / early power analysis are you REALLY selling
>> with your ESL partners: Bluespec, CoWare, Forte, Synfora, etc...
>
> We work with ESL vendors because it is a key productivity multiplier,
> and keeping up with Moore's Law requires us to move to higher levels of
> abstraction.  We continue to see interest in the design community
> although adoption is slower than expected, but we view ESL support as an
> investment in the future.  We are here to support these partners and
> take advantage of the customer design trends!  And proud to be the
> de-facto standard for power analysis at ESL and RTL.
>
>     - Vic Kulkarni, CEO of Sequence


From: Jack Choquette <jackc=user domain=azulsystems not palm>

Hi, John,

We needed a tool to allow us to analyze power distribution at the RTL level,
so we could reduce the power consumption of our chip by exploring different
architectures.  Sequence's PowerTheater is the only commercial tool I've
seen that does vector-based RTL level power analysis.

Our company sells network access to massive numbers of processors and GBs
of heap space.  To do this we use an internally designed multi-core chip.

I used PowerTheater on "Vega 2", our second-generation processor chip,
which has several million gates, Mbytes of cache, and 48 processor cores.
(My boss won't let me release the specifics here.  Suffice it to say Vega 2
is very big.)

With the Vega 2 chip, we set out to not only double the number of cores of
our Vega 1 design, but to further reduce the chip's power in order to meet
our system power targets.  That is, 2X capacity with less power.  Here's
how it worked:

We fed PowerTheater the RTL version of our Vega 2.  It did power estimation
and categorized the design elements to show which areas in the chip were
using the most power.  This categorization was very useful, enabling us to
tweak our clocks, the logic, or the RAM with respect to power.

It was also useful to look at power consumption *within* the different
blocks of the Vega 2.

We were able to identify that the majority of our chip's power consumption
was in:

  - clock distribution
  - clocking of the flops
  - accessing the RAMs
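
A minimal sketch of the kind of roll-up behind a breakdown like this.  The
(block, category, watts) report format and all names here are hypothetical
and are not PowerTheater's actual output; the idea is just to sum
per-block numbers by category and rank the categories by share of total.

```python
from collections import defaultdict


def power_by_category(entries):
    """Sum power per category and rank categories by share of the total.

    entries: iterable of (block, category, watts) tuples.  This layout is
    an assumption made for illustration, not a real tool report format.
    Returns a list of (category, watts, percent_of_total), largest first.
    """
    totals = defaultdict(float)
    for _block, category, watts in entries:
        totals[category] += watts

    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (category, watts, 100.0 * watts / grand_total if grand_total else 0.0)
        for category, watts in ranked
    ]
```

Feeding it per-block entries tagged "clock", "flops", and "ram" would give
the same sort of ranked list as above, one line per category.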

Based on this, the most useful techniques we found to reduce power were:

  - Reduce accessing of the RAM except when it was needed.
  - Clock gating.  For the core, we were able to chop the active power
    to less than 10% of its original level by turning off the clock.
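
As a rough illustration of the arithmetic behind clock gating (a sketch
under stated assumptions, not our actual flow): assume a block burns its
full active power while clocked and a fixed residual fraction of that
while gated.  The 0.10 default is taken from the "less than 10%" figure
above and is an upper bound; the function names are made up.

```python
def average_power(active_watts, idle_fraction, gated_residual=0.10):
    """Average power when the clock is gated off during idle time.

    active_watts   : power with the clock running (any hypothetical value)
    idle_fraction  : fraction of time the block can be gated, 0..1
    gated_residual : power while gated, as a fraction of active_watts
                     (assumed; 0.10 is an upper bound from the text)
    """
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    running = (1.0 - idle_fraction) * active_watts
    gated = idle_fraction * gated_residual * active_watts
    return running + gated


def idle_fraction_for_saving(saving, gated_residual=0.10):
    """Idle fraction needed to cut average power by `saving` (0..1)."""
    fraction = saving / (1.0 - gated_residual)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("saving not reachable with this residual")
    return fraction
```

Under these assumed numbers, a block would have to sit gated roughly 56%
of the time to halve its average power.  That is arithmetic on the model
only, not a measurement from Vega 2.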

To generate the vectors we would run each "low", "typical", and "high" test
case on our RTL, then dump out the vectors.  PowerTheater would read in the
vectors and then calculate the power for that test case.
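
To show why the vectors matter, here is a toy sketch of vector-based
switching power.  It is NOT how PowerTheater works internally (I have no
visibility into that), and it does not parse FSDB or VCD.  It assumes the
vectors are already a list of per-cycle {signal: 0/1} dicts and that each
signal has a known load capacitance; both are made-up inputs.  Each toggle
is charged the textbook 0.5 * C * Vdd^2 of energy.

```python
def toggle_counts(vectors):
    """Count value changes per signal across consecutive cycles.

    vectors: list of dicts, one per clock cycle, mapping signal name to
    0 or 1 (a hypothetical in-memory form of a dumped test case).
    """
    counts = {name: 0 for name in vectors[0]} if vectors else {}
    for prev, cur in zip(vectors, vectors[1:]):
        for name in counts:
            if prev[name] != cur[name]:
                counts[name] += 1
    return counts


def switching_power(vectors, cap_farads, vdd_volts, freq_hz):
    """Average switching power in watts for one test case.

    cap_farads: dict of signal name -> load capacitance (assumed known).
    Sums 0.5 * C * Vdd^2 * (toggles per cycle) * f over all signals.
    """
    cycles = len(vectors) - 1
    if cycles < 1:
        raise ValueError("need at least two cycles of vectors")
    total = 0.0
    for name, toggles in toggle_counts(vectors).items():
        toggles_per_cycle = toggles / cycles
        total += 0.5 * cap_farads[name] * vdd_volts ** 2 * toggles_per_cycle * freq_hz
    return total
```

Running the same function over the "low", "typical", and "high" vector
sets would give three different numbers for the same design, which is the
whole point of using real stimulus instead of a flat activity guess.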

We didn't clock gate all the time, but for the applications where we did use
it, we were able to reduce power consumption by 50%.

After running PowerTheater at the RTL-level and making the architectural
changes to reduce power, we ran synthesis, implemented our clock trees,
and then ran PowerTheater again at the gate-level.  When we compared
PowerTheater's results at the RTL level vs. the gate-level, we found
that the total power between them correlated within 5-10%.
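
A small sketch of the correlation check described above, assuming you
have one total-power number from the RTL run and one from the gate-level
run.  The function names and the 10% default tolerance are illustrative
choices, not anything built into the tool.

```python
def percent_difference(rtl_watts, gate_watts):
    """Signed difference of the RTL estimate vs. gate-level, in percent."""
    if gate_watts == 0:
        raise ValueError("gate-level power must be non-zero")
    return 100.0 * (rtl_watts - gate_watts) / gate_watts


def correlates(rtl_watts, gate_watts, tolerance_pct=10.0):
    """True if the RTL estimate is within tolerance of gate-level."""
    return abs(percent_difference(rtl_watts, gate_watts)) <= tolerance_pct
```

The gate-level number is used as the reference here because it is the
later, more detailed estimate; that choice is also an assumption.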

We liked this because we could do our power fixes at the RTL-level.

We found PowerTheater's results were pessimistic compared to real silicon
because our foundry fed it overly simplistic RAM models.  It wasn't anything
that PowerTheater did wrong, but it did teach us the importance of getting
accurate power models from our foundry.

Where the accuracy did vary, the cause was the foundry's library, whose
power models did NOT take certain information (e.g. data values, address
values, output enables) into account.  This affected the results for the
RAM elements.

The overall power result was still within acceptable limits.  Our real
silicon power ranged from 50% of what PowerTheater predicted to dead on,
but most of the time it was within 10%.  As long as Vega 2's max power
prediction was conservatively realistic, we were happy.

Hopefully some of the new library formats, such as ECSM and CCS, will allow
foundries to include more power information in their library models.

What we liked:

- PowerTheater can handle very large designs and their vectors.
  We ran it on blocks and on our cores.  A representative run time for
  PowerTheater was 10 minutes to run thousands of cycles on our core,
  which had 32K of cache and other logic.

- PowerTheater's GUI let us see the power distribution in the aggregate,
  then drill down on a per-block basis to see where the power issues
  were.  We could get the information in a text file or through
  Sequence's hierarchical display.  Nice.

What we didn't like:

- Running the input stimulus on our chip should be faster, so we can
  see the power cycle by cycle.  Sequence claims that the latest release,
  version 2006.4, runs FSDB vectors against the design 10x faster, but
  I have not yet verified this.

Either way, PowerTheater gave us what we needed to identify the architecture
changes to reduce our power.  It did this by making it easy to identify
where the power was being consumed in the design at the RTL-level and where
to make the necessary course corrections.

Vega 2 met our goal of 2x processors for less power.

    - Jack Choquette
      Azul Systems                               Mt View, CA