( ESNUG 435 Item 3 ) -------------------------------------------- [12/08/04]
Subject: ( ESNUG 433 #1 ) Two Hands-On Users Report On Sierra Pinnacle
> What do you hear about Sierra Design Automation and their claims about
> Pinnacle handling far more gates overnight than Synopsys PhysOpt or
> Magma Blast Fusion? The numbers claimed are 10 M gates vs 2-3 M gates
> for Synopsys and Magma. What's the truth here?
>
> - Paul Wick
> Seligman Mutual Funds Palo Alto, CA
From: Kazuyuki Dei <kdei=user domain=fma.fujitsu spot calm>
Hi John,
We have been using Pinnacle for a few months now. We have run 9 designs
through Pinnacle, and are seeing very good results: high capacity, fast
runtime, and better quality results. Ours is a typical ASIC netlist
handoff model where we receive the netlist and SDC from our customers
and we take it through the physical implementation flow through P&R
(Cadence SoC Encounter and NanoRoute), extraction (Simplex Fire&Ice
and in-house), VoltageStorm for IR drop analysis, and PrimeTime for
final timing analysis with our internal delay calculator.
The following highlights our most recent design run through Pinnacle.
Design -- 1.4 M placeable instances run flat in Pinnacle, @ 200 MHz
Capacity -- 2.6 GB memory footprint in a 32-bit Linux machine to perform
placement, optimization, routing and timing.
We even loaded the entire routed database back into Pinnacle and did a full
post-route optimization on the design in a 32-bit Linux box as well. In
addition, during our initial trials we loaded and placed 2.3 M instances
(around 10 M equivalent gates) in less than 3.7 GB. Therefore, to answer
Paul's question, Pinnacle's capacity might be close to 10 M gates
in a 32-bit Linux box for our designs, which is pretty impressive.
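As a rough sanity check, the capacity numbers above can be turned into
per-instance figures. The sketch below uses only the values quoted in the
text; the derived ratios are a back-of-envelope illustration, not a
Sierra-published specification.

```python
# Back-of-envelope check of the capacity figures reported above.
# Inputs come from the text; the per-instance ratios are derived
# illustrations only.

instances = 2.3e6   # instances loaded and placed in the initial trial
eq_gates = 10e6     # stated equivalent gate count
mem_gb = 3.7        # peak memory footprint, GB

gates_per_instance = eq_gates / instances
bytes_per_instance = mem_gb * 2**30 / instances

print(f"{gates_per_instance:.1f} equivalent gates per instance")  # -> 4.3
print(f"{bytes_per_instance / 1024:.1f} KB of memory per instance")  # -> 1.7
```

At roughly 1.7 KB per placeable instance, a 32-bit box's ~3-4 GB of usable
address space is indeed consistent with a capacity in the 2-2.5 M instance
(about 10 M gate) range.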
Speed -- 18 hours from netlist to detailed, optimized placement, including
full global routing done on a 3.2 GHz, 32-bit Pentium processor. Also,
because of the way Pinnacle performs optimization, it can tell you after
about 5-6 hours roughly what the final quality of results will be. Time
to an answer
is very important to us as each iteration of the design in our existing
Cadence flow takes days.
Better quality results -- from the customer netlist handoff of this
design, we had very good timing results from Pinnacle within a few
iterations of the tool in a 2 day period. Our SoC Encounter flow did
not match the QoR of Pinnacle; it took weeks to get to its best
result and required a 64-bit machine to run the design.
Gotchas in Pinnacle:
One issue on this design was that the Clock Tree Synthesis engine for
Pinnacle was not yet released. This meant we had to go in and out of
the tool for this, which is kind of clunky. We have a couple of other
issues with Pinnacle which required workarounds:
a. Pinnacle doesn't support SDF, which is our traditional
customer deliverable.
b. Gate placement in Pinnacle doesn't check pin blockages for
pre-routes immediately above the pins, so we occasionally
have a few blocked pins.
c. Its power router is not full-featured; it does not support
conformal ring structures or elaborate power structures.
Currently we are using SoC Encounter for our power creation.
d. We initially had problems with Pinnacle modifying certain
logic that we wished to remain untouched. These are soft
modules that will later be swapped with hard IP, but for
which we did not yet have LEF/.lib models. We worked around
this by setting dont_modify properties on these hierarchies,
nets, and cells so that the Pinnacle-generated netlist would
go through the rest of our flow.
e. We also had a problem with placement of critical cells relative
to I/O timing. Our I/O timing constraints were not 100% complete,
so to achieve good I/O timing we had to write a custom flow to
pre-place flops for the interfaces, to make sure Pinnacle met timing.
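Gotcha (b) above -- pins occasionally blocked by pre-routes directly over
them -- is the kind of thing that can be screened for with a small script
between placement and routing. Below is a minimal sketch of such a check,
assuming shapes are simple axis-aligned rectangles; the DEF/LEF parsing
that would feed it is omitted and all names and coordinates are
hypothetical.

```python
# Simplified post-placement screen for gotcha (b): flag any cell pin
# whose access area is fully covered by a pre-route shape on the layer
# above it.  Rectangles are (x1, y1, x2, y2) tuples; all data here is
# hypothetical illustration, not real design data.

def covers(blockage, pin):
    """True if the blockage rectangle fully covers the pin rectangle."""
    bx1, by1, bx2, by2 = blockage
    px1, py1, px2, py2 = pin
    return bx1 <= px1 and by1 <= py1 and bx2 >= px2 and by2 >= py2

def blocked_pins(pins, preroutes):
    """pins: {name: rect}; preroutes: rects on the layer above the pins."""
    return [name for name, rect in pins.items()
            if any(covers(b, rect) for b in preroutes)]

# Toy example: one pin sits under a power strap, one is clear.
pins = {"u1/A": (10, 10, 12, 12), "u2/B": (40, 10, 42, 12)}
straps = [(8, 0, 14, 50)]          # vertical pre-route strap
print(blocked_pins(pins, straps))  # -> ['u1/A']
```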
Pinnacle fits well into our existing Cadence tool flow as it uses the
standard data formats (Verilog/LEF/DEF/.lib/SDC/Tcl). Also, Pinnacle has
a flexible
architecture; we are running both standard cell and Structured ASIC
designs in the tool. In fact, Sierra was able to support our Structured
ASIC designs within 24 hours of receiving their first look at design data
with our architecture.
- Kazuyuki Dei
Fujitsu Microelectronics Sunnyvale, CA
---- ---- ---- ---- ---- ---- ----
From: [ Chicken Little ]
Hi John,
I must be anon for political reasons.
We recently did an eval using Sierra Pinnacle. Tools used:
Floorplan: SOC Encounter
Placement: PhysOpt(10%)/Pinnacle(90%)
CTS: SOC Encounter(20%)/Pinnacle(80%)
Route: NanoRoute
SI Check: internal
Formal Verification: Synopsys Formality
STA: PrimeTime
IR-drop: Simplex VoltageStorm
DRC/LVS: Mentor Calibre
The design we used for this eval was a 0.15 um 4.0 M gate (about
950,000 instances) with a core frequency of 367 MHz. The main
purpose of this trial was to look at capacity and turnaround time;
it was also important not to sacrifice QoR for fast run times.
For the eval we ended up testing Pinnacle in several modes.
1. Rapid turn around time on small design.
We took one 80 k instance block from our design and ran it through
Pinnacle. It took Pinnacle less than 30 minutes. We ran this block
at 400 MHz, and Pinnacle had no trouble meeting this. This block had
a few long paths; Pinnacle had no trouble optimizing them. We gave
the same constraints and floorplan to another tool. Its run time was
1 hour, and it was unable to meet timing on these long paths; we were
left with over 300 ps of negative slack.
2. Large Flat Design (~950K instances)
We gave the whole design to Pinnacle and it placed and optimized
it in less than 4 hours. We ran it on a 2.8 GHz Linux box; the
memory usage was only 2.2 GB. The QoR was the same as the smaller
block's. During the eval we ran 3 different netlists through Pinnacle
that varied in size from 750 k, to 1.1 M, to the final 950 k
instances. Through the design cycle we also worked through
missing constraints and other floorplan experiments. We were
happy to see that the run times were always less than 4 hours to
get feedback on new design changes.
3. Top-level design with physical hierarchy
We read in a fully routed DEF and Verilog for five sub blocks and
placed and optimized the top level logic (about 30K cells). No
timing models needed to be created for these sub blocks. The
Pinnacle timer would use the routed data from the sub blocks to
calculate the timing between the top-level and the sub blocks.
One of the sub blocks was used more than once; Pinnacle doesn't
require you to uniquify in this mode even though there are multiple
instances of the same sub block. This run took about 30 minutes.
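A timer can skip uniquification if it stores the routed sub-block's
timing once per master and keys only the per-instance context (such as
boundary arrival times) on the full instance path. The sketch below is
purely illustrative of that idea, with made-up names and delays; it is
not a description of Pinnacle's actual data model.

```python
# Sketch of why uniquification can be unnecessary: timing arcs are
# stored once per master; each instance contributes only its own
# context, keyed by full hierarchical path.  Entirely illustrative.

master_arcs = {  # delay through each master, stored once (ns)
    "subblk": {("in", "out"): 1.20},
}

instances = {    # two instances of the same master, different contexts
    "top/u_sub0": {"master": "subblk", "in_arrival": 0.30},
    "top/u_sub1": {"master": "subblk", "in_arrival": 0.55},
}

def out_arrival(path):
    inst = instances[path]
    delay = master_arcs[inst["master"]][("in", "out")]
    return inst["in_arrival"] + delay

for path in instances:
    # each instance gets its own timing without duplicating the master
    print(path, round(out_arrival(path), 2))
```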
4. Power and CTS
We had missed our power budget on this design and found the problem
was with the clock tree. We had to redo the clock tree to reduce
the power. By this time Pinnacle's CTS tool had matured enough to give
it a test drive. We were able to reduce the buffer count and swap
out all registers for low power versions to save about 2 watts of
power. The skew was less than 150 ps, which is what we were seeing
with the Cadence and Magma CTS tools we had used previously on this
design. We found a problem late in our design flow, and this is where
the rapid turnaround time of Pinnacle really saved us time in the
schedule. We had our first test results from Pinnacle CTS in just
a few hours. Within 2 days we had resolved our power issue, which
included re-running initial placement and optimization to swap the
registers to lower power flops, inserting CTS, closing setup and hold
time violations, and getting back to our verification flows.
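The saving here comes from the standard dynamic-power relation
P = a * C * V^2 * f: fewer clock buffers and lower-power flops reduce the
switched capacitance C on a net that toggles every cycle. The numbers in
the sketch below are invented for illustration (the supply voltage and
capacitances are not from the report); only the formula and the 367 MHz
clock are taken as given.

```python
# Toy dynamic-power estimate for a clock network, P = a * C * V^2 * f,
# illustrating where a "fewer buffers + low-power flops" saving comes
# from.  All capacitance and voltage numbers are made up; real CTS
# power analysis is far more involved.

def dyn_power(alpha, cap_f, vdd, freq_hz):
    return alpha * cap_f * vdd**2 * freq_hz

f = 367e6   # core clock frequency from the eval design
vdd = 1.5   # hypothetical supply for a 0.15 um process

# Clock nets toggle every cycle, so activity factor is 1.0.
before = dyn_power(1.0, 4.0e-9, vdd, f)   # 4.0 nF switched cap (invented)
after  = dyn_power(1.0, 1.6e-9, vdd, f)   # cap after buffer/flop swaps

print(f"saving: {before - after:.2f} W")  # -> saving: 1.98 W
```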
5. Hold time fixing
For our 0.15 um technology we have 8 corners to run for hold fixing.
Pinnacle was able to run 6 of these. The other 2 are OCV corners
and require CRPR.
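For readers unfamiliar with CRPR (clock reconvergence pessimism removal):
under OCV, a hold check derates the launch clock path early and the
capture clock path late, but the segment of the clock tree shared by both
paths cannot be early and late at once, so its derate difference is
credited back to the slack. A minimal numeric sketch, with illustrative
delays only:

```python
# Minimal illustration of CRPR for a hold check under OCV.  The shared
# clock-tree segment is double-counted by the early/late derates, and
# the credit removes that pessimism.  All numbers are illustrative.

common = 1.00        # shared clock-tree segment delay (ns)
launch_only = 0.40   # launch-only clock segment
capture_only = 0.35  # capture-only clock segment
data = 0.25          # data path delay
hold = 0.05          # hold requirement at the capture flop
early, late = 0.90, 1.10  # OCV derate factors

# Worst-case hold: launch clock early (data arrives soon),
# capture clock late (hold window extends).
launch_arrival = (common + launch_only) * early + data
capture_required = (common + capture_only) * late + hold

slack_no_crpr = launch_arrival - capture_required  # false violation
crpr_credit = common * (late - early)              # shared-segment credit
slack_crpr = slack_no_crpr + crpr_credit           # real slack

print(round(slack_no_crpr, 3), round(slack_crpr, 3))  # -> -0.025 0.175
```

Without the credit this path reports a hold violation; with it, the path
passes, which is why corners needing CRPR must run in a timer that
supports it.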
We were already committed to doing a hierarchical design so that is the
version we taped out. In the end Pinnacle did about 90% of placement
and 80% of the clock tree. The parts just came out of the fab this week.
The parts are passing test except for an at-speed test that needs tweaking.
- [ Chicken Little ]