( ESNUG 465 Item 5 ) -------------------------------------------- [06/28/07]

Subject: Blaze DFM explains how Blaze DFM is now a power optimizer company

> Blaze DFM switched from being a DFM company into being a power optimizer
> company.  I didn't get to their booth until DAC Thursday and I honestly
> tried to understand how their tool worked but I was too fried to get it.
>
>     - from http://www.deepchip.com/wiretap/070618.html


From: Paul Donehue <paul=user domain=blaze-dfm bot calm>

Hi John,

Our first product, Blaze MO, is a leakage power product.  However, as a
company, Blaze is still an electrical DFM company dedicated to maximizing
parametric yield.

Blaze MO optimizes a chip to reduce subthreshold leakage power, the amount
of power that "leaks" through a transistor when it is off.

Leakage is a fairly recent problem. Above 130 nm, leakage was unimportant.
At 130 nm, it started to become noticeable, but it was still negligible.
Then, at 90 nm, leakage jumped up to 20-30% of a chip's total power
consumption and at 65 nm, leakage can add up to 50% of the total.

You can imagine the headaches this causes for chip designers when something
that didn't even exist just a few years ago now sucks up 50% of a chip's
total power.

Blaze MO takes advantage of the relationship between transistor gate length
and subthreshold leakage current.  As the length of a transistor's gate
increases, the leakage for that transistor drops exponentially.  Even a very
small increase in transistor gate length results in a relatively large
decrease in leakage.  For example, in a 65 nm process, lengthening a
transistor gate by 8-9% might result in a 50% or more reduction in leakage
for that device.
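
To make the exponential relationship concrete, here is a minimal sketch.
It assumes a pure exponential model in which leakage halves for every
fixed increment of gate length, and it assumes a 65 nm nominal gate length
so that the "8-9% for 50%" example corresponds to roughly 5.5 nm.  The
function name and constants are illustrative only, not Blaze MO's model.

```python
def leakage_ratio(delta_l_nm, l_half_nm):
    """Leakage relative to the unbiased device.

    Assumed model: every l_half_nm of added gate length halves the
    subthreshold leakage (a pure exponential).
    """
    return 0.5 ** (delta_l_nm / l_half_nm)


# Assumption: nominal gate length of 65 nm, and an 8.5% increase
# (about 5.5 nm) halves leakage, as in the example in the text.
L_NOMINAL_NM = 65.0
L_HALF_NM = 0.085 * L_NOMINAL_NM

if __name__ == "__main__":
    for bias_nm in (0.0, 2.0, 4.0, 6.0):
        ratio = leakage_ratio(bias_nm, L_HALF_NM)
        print(f"+{bias_nm:.0f} nm gate length -> {ratio:.2f}x leakage")
```

Under this assumed model, each extra nanometer buys the same percentage
reduction, which is why small biases are worth the trouble.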

At the same time, changing the transistor's gate length changes the
transistor's drive strength. Increasing the transistor gate length slows
down the transistor.  So you need to know the timing of the design in order
to increase the gate length only on non-timing-critical transistors.  You
will find that the vast majority of transistors are non-critical, even for
a high-performance design.

Although the leakage savings from a single transistor are very small, the
total becomes significant when you sum it over tens, or even hundreds, of
millions of transistors.

Now, John, you might ask, "Doesn't increasing gate lengths cause problems
in physical verification?"

The cool thing about Blaze MO is that it doesn't actually change the drawn
layout of the transistors.  Instead, it "tags" the transistor gate with a
special annotation shape on a separate Blaze layer in the GDSII file.  This
tag tells the OPC tool how much to increase (or "bias") the transistor gate.
For example, if the transistor is on a setup critical path, then Blaze MO
will tell OPC not to bias the transistor at all.  However, if there is some
slack in the path, then Blaze MO might tell OPC to bias the transistor gate
length by as much as 3 nm per edge (for a total of 6 nm).  Since the drawn
layout is not changed, Blaze MO has no effect on physical verification.  In
fact, on a hierarchical design, Blaze MO can be run on blocks in parallel
with physical verification.
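
The slack-to-bias decision described above can be sketched as follows.
This is only an illustration: the linear delay cost per nanometer
(ps_per_nm_per_edge) and the function name are assumptions, and only the
3 nm per edge ceiling comes from the text.

```python
MAX_BIAS_PER_EDGE_NM = 3  # upper limit quoted in the text


def bias_per_edge_nm(slack_ps, ps_per_nm_per_edge):
    """Largest whole-nm per-edge bias whose delay cost fits in the slack.

    Assumption: each nm of per-edge bias costs a fixed, positive number
    of picoseconds (ps_per_nm_per_edge).  Real cells are not this linear.
    """
    if slack_ps <= 0:
        return 0  # setup-critical path: tell OPC not to bias at all
    affordable_nm = int(slack_ps // ps_per_nm_per_edge)
    return min(MAX_BIAS_PER_EDGE_NM, affordable_nm)


if __name__ == "__main__":
    for slack in (0, 5, 25, 100):
        per_edge = bias_per_edge_nm(slack, ps_per_nm_per_edge=10)
        print(f"slack {slack} ps -> {per_edge} nm per edge, "
              f"{2 * per_edge} nm total")
```

The value returned here stands in for what the annotation shape would
carry to OPC; the drawn gate itself is never edited.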

Now, John, you might think, "Wait a minute, by biasing a transistor gate,
you're changing the timing characteristics of the cell that contains that
transistor.  Doesn't that completely invalidate your methodology?  After
all, there's no way to close on timing if your timing models no longer
match the cells that you're using."

As part of the setup work for Blaze MO, we create what we call "virtual
variants" for selected cells in the library. The reason we call them
"virtual" is that the physical layout of the cells doesn't change.  Instead,
they are used strictly for the purpose of verifying timing.  Each variant
models the timing of a cell whose individual transistors have been biased
by a certain amount.

For example, there might be two virtual variants for a particular cell, say
an AND gate (a NAND plus an inverter) - one whose transistors are all biased
by 6 nm and a second one with a more intelligent permutation of 2 nm, 4 nm,
and 6 nm biases based on the optimal timing/leakage tradeoff.  In the first
variant, all of the PMOS and NMOS devices in both the NAND gate and the
inverter are biased 6 nm.  The second variant takes into account the drive
strength and leakage states of the cell, and biases the individual devices
trading off performance and leakage.  In this case, the NMOS devices on the
NAND gate and the PMOS device on the inverter have a larger impact on cell
leakage than the other devices and would therefore be biased more.

The first variant might take a 10% timing hit but produce a 50% leakage
reduction.  The second variant might take a 5% timing hit with a 40% leakage
reduction.  Blaze MO will choose the appropriate variant depending on how
much timing slack there is in the path and use its internal timing engine to
verify that it hasn't created any timing violations.
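
A minimal sketch of that variant choice, using the 10%/50% and 5%/40%
figures above.  The Variant class, the variant names, and the greedy
per-cell rule are assumptions for illustration; the actual optimizer is
not described in this much detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    delay_penalty: float      # fractional increase in cell delay
    leakage_reduction: float  # fractional decrease in cell leakage


NOMINAL = Variant("nominal", 0.00, 0.00)
VARIANTS = [
    NOMINAL,
    Variant("mixed_2_4_6nm", 0.05, 0.40),  # second variant in the text
    Variant("all_6nm", 0.10, 0.50),        # first variant in the text
]


def pick_variant(cell_delay_ps, slack_ps, variants=VARIANTS):
    """Pick the variant with the most leakage savings that fits the slack."""
    feasible = [v for v in variants
                if v.delay_penalty * cell_delay_ps <= slack_ps]
    if not feasible:
        return NOMINAL  # already violating: leave the cell unbiased
    return max(feasible, key=lambda v: v.leakage_reduction)


if __name__ == "__main__":
    for slack in (-1, 6, 20):
        print(slack, "ps slack ->", pick_variant(100, slack).name)
```

Note that this sketch treats each cell in isolation.  Cells on the same
path share slack, so a real optimizer has to re-time after each swap.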

During optimization, Blaze MO selects variants for each combinational cell
to reduce leakage as much as possible without breaking any timing paths.  It
then outputs a new Verilog netlist containing the virtual variant cells and
PrimeTime is run to verify that all timing constraints are still met.

Obviously, this may result in lots of variant cells, so Blaze works together
with the foundry to decide which cells to create variants for and how many.

You might wonder, "How do you guarantee timing closure?"

Blaze MO has a built-in timing engine.  It reads all of the standard
design files - Liberty, SDC, Verilog, etc. - so it knows which paths
are critical and which are not.  Then, as it biases the transistor gate
lengths, it also dynamically measures the effect on timing to make sure
that no timing paths are broken.

At first, we had some problems correlating our timing engine with PrimeTime,
but we came up with a novel approach that has helped a lot.  One of the
first things we do is force our timing engine to correlate with PrimeTime.
Our customers run both timing engines, and then whatever PrimeTime says the
timing is, we calibrate our timing engine to be the same.  I know it sounds
kind of kludgy, but it works really well.  Plus, we've put a lot of effort
into making our timing engine correlate to PrimeTime as closely as possible.
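
One simple way to picture this kind of calibration is a per-endpoint
offset: record the difference between the reference slack and the
internal slack once, then add it to later internal estimates.  This is
purely an assumed scheme for illustration; the text does not say how
Blaze MO actually calibrates, and all names below are hypothetical.

```python
def calibration_offsets(internal_slack_ps, reference_slack_ps):
    """Offset per endpoint so that internal + offset == reference."""
    return {
        endpoint: reference_slack_ps[endpoint] - internal_slack_ps[endpoint]
        for endpoint in internal_slack_ps
        if endpoint in reference_slack_ps
    }


def calibrated_slack(endpoint, raw_internal_slack_ps, offsets):
    """Apply the stored offset; endpoints never seen get no correction."""
    return raw_internal_slack_ps + offsets.get(endpoint, 0.0)


if __name__ == "__main__":
    internal = {"reg_a/D": 100.0, "reg_b/D": 50.0}   # hypothetical values
    reference = {"reg_a/D": 90.0, "reg_b/D": 55.0}   # e.g. from sign-off STA
    offsets = calibration_offsets(internal, reference)
    # After a variant swap the internal engine re-estimates reg_a/D at 80 ps.
    print(calibrated_slack("reg_a/D", 80.0, offsets))
```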

Obviously, anything that can go wrong inevitably does go wrong and Blaze MO
will break some timing paths.  So, we've created a "repair" feature to take
care of this.

After Blaze MO is run, it outputs a new Verilog netlist containing the
variant cells.  Then, PrimeTime is run on the optimized netlist.  If there
are problems, Blaze MO reads in the PrimeTime report and automatically
repairs any broken paths by backing out those variant cells that created
the problems.
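
The repair step can be sketched as below.  The data shapes (a dict of
instance name to variant name, and violating paths as lists of instance
names) are assumptions, and this version bluntly reverts every cell on a
failing path, whereas a real tool would likely back out only as many as
needed.

```python
def repair(assignment, violating_paths, nominal="nominal"):
    """Return a copy of assignment with cells on failing paths reverted.

    assignment:      {instance_name: variant_name}   (hypothetical shape)
    violating_paths: iterable of lists of instance names, as might be
                     parsed from a timing report
    """
    repaired = dict(assignment)
    for path in violating_paths:
        for instance in path:
            if instance in repaired:
                repaired[instance] = nominal
    return repaired


if __name__ == "__main__":
    assignment = {"u1": "all_6nm", "u2": "mixed_2_4_6nm", "u3": "all_6nm"}
    failing = [["u1", "u2"]]  # one path reported as violating
    print(repair(assignment, failing))
```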

Large IDMs such as Intel and IBM have internal tools that do this, but as
far as we know, there is no other commercially available tool that does
gate-length biasing.

Another question that often comes up is, "Don't the foundries object to your
fiddling with transistor gate lengths?"

Initially, we did get some resistance from foundries.  Manufacturing
engineers were leery that the gate biasing would eat into their spacing
margins and harm yield.  However, over time, we were able to convince
them that the amount of biasing, only 3 nm per edge or less, was well
within the tolerances supported by their processes and that it would not
harm functional yield.

Besides, Blaze MO will not work at all without the support of the foundry.
They have to update their OPC scripts to recognize the Blaze annotation
layer and to perform the biasing, then add that info in their design kits.

Another question that pops up a lot is, "How does gate-length biasing
affect my use of multiple-Vt libraries?"

Just about all of our customers are using Blaze MO together with multi-Vt
libraries.  In fact, Blaze MO can automatically swap functionally equivalent
cells with different Vt levels.  We call this process "Vt re-assignment".
You'd expect Design Compiler to choose the right Vt, but because Blaze MO
performs a much finer-grained analysis on the completed layout, it is
usually able to improve the Vt assignments over and above whatever comes
out of physical synthesis.

The initial release of Blaze MO was limited to digital cell-based logic.
We don't touch memories, hard IP blocks, analog, mixed-signal, I/Os, or
clock trees. On early designs, we only worked on combinational logic,
but more recently, we've started optimizing sequential cells, as well.

For those parts of the design that we are allowed to touch, we can usually
reduce subthreshold leakage current by 30-50% or more.  When you look at the
entire chip, including those parts that we're not allowed to touch, the
savings run in the 15-25% range.
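
The two ranges are related by simple arithmetic, sketched here.  It
assumes that leakage in the untouched parts stays exactly as it was, and
that the low and high ends of the two ranges correspond to each other;
neither assumption is stated in the text.

```python
def chip_level_savings(touchable_share, block_savings):
    """Whole-chip leakage savings when only part of the chip is optimized.

    touchable_share: fraction of total chip leakage in the optimized logic
    block_savings:   fractional leakage reduction achieved in that logic
    """
    return touchable_share * block_savings


def implied_touchable_share(block_savings, chip_savings):
    """Share of chip leakage that must sit in the optimized logic."""
    return chip_savings / block_savings


if __name__ == "__main__":
    for block, chip in ((0.30, 0.15), (0.50, 0.25)):
        share = implied_touchable_share(block, chip)
        print(f"{block:.0%} block savings, {chip:.0%} chip savings "
              f"-> {share:.0%} of leakage is in touchable logic")
```

Read this way, the quoted figures suggest that roughly half of a typical
chip's leakage sits in the logic that can be optimized.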

This is how Blaze DFM is a power optimization company, John.  :)

    - Paul Donehue
      Blaze DFM                                  Sunnyvale, CA


"Relax. This is a discussion. Anything said here is just one engineer's opinion. Email in your dissenting letter and it'll be published, too."
Copyright 1991-2024 John Cooley. All Rights Reserved.


   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)