( ESNUG 410 Item 1 ) -------------------------------------------- [04/02/03]

From: David Parker <user=parkerd  domain=lsil jot palm>
Subject: A First Customer Look At Prolific's New ProTiming / PrimeTime ECOs

Hi John,

Here is a run-down of a recent tool evaluation.  Thought your readers might
be interested.

ProTiming is the new cell resizer point tool from Prolific.  ProTiming bolts
onto PrimeTime and uses your existing STA setup as a starting point for the
optimization (more on this later).  Its resizing optimizations fall into
two categories:

     1) a gate is resized to an existing cell in your library,
        e.g., a NR2X1 is resized to a NR2X2;

and

     2) a gate is resized to an in-between drive size,
        e.g., a NR2X1 is resized to a NR2X1.6.

Since the in-between cells do not exist in the lib prior to optimization,
the tool creates an estimated timing snapshot of the cell, enabling
ProTiming to update the timing as the optimizations occur.

In order to minimize the impact of this evaluation on my engineering staff,
I provided Prolific the necessary files and let Prolific drive their
ProTiming tool.  During the evaluation process, I focused my efforts on
validating ProTiming's resulting output and approach (more on this later).
Since I have not driven the tool, I cannot comment on specific software
issues such as usability, bugs, GUI, run times, etc.  

The Test Design & Methodology
-----------------------------

The test case we chose for this evaluation was an ARM processor implemented
in LSI Logic's GflxP 0.13 um standard cell library.  This design contains
~40k cell instances and was taken through LSI's standard timing closure flow
prior to entering the evaluation process.  Our timing closure flow looks
like the following:

   1. Synthesis with Design Compiler.
   2. Timing-driven placement with MPS.  (MPS is LSI's internal placer.)
   3. Clock tree insertion with the Synopsys (Avanti) GCTS clock tree
      synthesis tool.
   4. Timing-driven physical optimization with MRS.  (MRS is LSI's internal
      physical optimization tool, which performs gate resizing, buffer
      insertion, logic restructuring, etc.)
   5. Detail route with Synopsys (Avanti) Apollo.
   6. SPEF extraction with Synopsys (Avanti) Star-RCXT.
   7. Post-layout Verilog generation.
   8. Delay prediction with lsidelay (LSI's internal delay calculator).
   9. Static timing analysis in PrimeTime with lsidelay-generated SDF.

The post-layout version of the design was then used as the starting point
for the ProTiming optimization.


The Eval
--------

September 19, 2002

I provided Prolific a tarball containing the data files required to drive
their ProTiming tool.  The tarball contained:  post-layout Verilog, the
worst-case SDF file, Synopsys design constraints, the SPEF file, .lib and
.db files for the standard cell libraries, and the PrimeTime script for
this design.

October 4, 2002

I received their first set of results and started digging in.  The results
sent by Prolific consisted of two key pieces: a change.tcl file and an
estimated.db file.  The change file contained all of the resizing ECOs as
determined by ProTiming.  It is written to take advantage of the "size_cell"
command in PrimeTime, so one line of the change file may look like
"size_cell {instance_name} {library/cell}".  The estimated.db contains the
cell info (timing, area, etc., your standard .db stuff) for the in-between
cells.  Using these two files, the ECO changes can be evaluated in PrimeTime.

The thing to keep in mind here is that this ECO evaluation mixes the
interconnect context (SPEF file) from the original post-layout result with
the ProTiming ECOs (cell swaps).  To achieve the performance predicted by
PrimeTime, the physical ECO must maintain a similar interconnect context.

Using PrimeTime, I evaluated the first round of results and was able to
replicate the 13% speed increase Prolific reported.  Digging a little
deeper, I determined that a number of the critical/near-critical paths in
the design had not been properly optimized during the original timing
closure effort (the poorly optimized paths were associated with clock gating
structures).  By moving to a PrimeTime setup that used ideal clocks and an
uncertainty budget, I was able to take these easy pickings off the table.
Since ProTiming does not alter the clock structures, this seemed like a
reasonable approach for the remainder of the evaluation.
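
For readers who want to inspect a change file outside PrimeTime, here is a
minimal Python sketch.  It assumes every ECO line has exactly the
"size_cell {instance_name} {library/cell}" form quoted above; the instance
names and the library names ("gflxp", "estimated") in the sample are made
up for illustration, and a real change file may contain other commands
that this does not handle.

```python
import re
from typing import NamedTuple


class Eco(NamedTuple):
    instance: str
    library: str
    cell: str


# Matches: size_cell {instance_name} {library/cell}
_SIZE_CELL = re.compile(
    r"^\s*size_cell\s+\{([^}]+)\}\s+\{([^/}]+)/([^}]+)\}\s*$"
)


def parse_change_file(text: str) -> list[Eco]:
    """Return one Eco record per size_cell line; other lines are skipped."""
    ecos = []
    for line in text.splitlines():
        m = _SIZE_CELL.match(line)
        if m:
            ecos.append(Eco(*m.groups()))
    return ecos


# Hypothetical excerpt, not taken from the actual evaluation.
SAMPLE = """\
# resizing ECOs
size_cell {core/alu/U123} {gflxp/NR2X2}
size_cell {core/dec/U45} {estimated/NR2X1.6}
"""

ecos = parse_change_file(SAMPLE)
for e in ecos:
    print(e.instance, "->", e.library, e.cell)
```

Splitting the library from the cell name makes it easy to see which swaps
land on existing library cells and which need an in-between drive.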

October 7, 2002

I spoke with Prolific about moving from the PrimeTime setup that used
propagated clocks to a PrimeTime setup that used ideal clocks and an
uncertainty budget.  Additionally, we discussed relaxing several of the
input and output constraints to improve optimization results.  This was
particularly important for input and output paths with shallow logic
depths.  For these paths, the majority of the cycle time was tied up in
the input or output delay constraint.  I agreed to relax these delay
constraints by 150 psec.

October 9, 2002

I received a second set of results from Prolific.  The speed-up was now
10% as reported by Prolific (the optimizer had been set up to run with a
targeted goal of 10%).  As I reviewed the results, it became clear that
delay prediction differences between PrimeTime's timing engine and LSI's
timing engine were influencing the results.  At the time, the ProTiming
tool was being run using the lsidelay generated SDF and the provided SPEF
file.  For each gate in the design, the SDF is used to determine the
original gate delay.  Once a gate is resized, PrimeTime recalculates the
gate delay using the SPEF network for the net.  In my case, this led to
a situation where the gate delays in the optimized design were generated
using a mixed delay prediction: PrimeTime for the resized gates, lsidelay
for the gates that had not been resized.  I thought this would confuse the
comparison results, so I asked Prolific to re-run the optimization using only
the SPEF file (this request was made 10/28/2002).  All pre-optimization
to post-optimization comparisons from here forward would be made using the
PrimeTime delay prediction.

November 5, 2002

I received a third set of results from Prolific.  The speed-up as reported
by Prolific was 11.9% (the optimizer had been set up to run with no targeted
performance goal).  I reviewed the results in PrimeTime to verify the
speeds.  While I was able to replicate the number Prolific had used to
calculate the performance boost, I was not satisfied with the result.  The
speed increase had been calculated counting all types of paths: input to
register, register to register, and register to output.  Obviously, I wanted
the input and output paths to be optimized, but the performance of the
processor block had to be calculated considering only register to register
paths.  When I recalculated the performance benefit using this criterion, the
speed of the processor had improved by only 4.6%.  Back to the drawing board.
I contacted Prolific to discuss the discrepancy in performance calculations
(call was made 11/07/2002).  During the discussion, I agreed to take another
pass through ProTiming using the current optimization result as the starting
point.  Additionally, I agreed to zero out the input and output delays so
that the ProTiming tool could focus on the flop-to-flop paths.
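
To illustrate why the two performance numbers can differ so much, here is
a small Python sketch.  It rests on assumptions of mine, not on anything
Prolific or PrimeTime defines: speed-up is taken as the frequency gain
implied by the worst path delay in the chosen path groups, the group names
are invented labels, and the delays are made-up numbers rather than data
from this design.

```python
def worst_delay(paths, groups):
    """Longest path delay (ns) among paths whose group is in `groups`."""
    return max(delay for group, delay in paths if group in groups)


def speedup_pct(before, after, groups):
    """Frequency gain in percent, assuming the worst path sets the period."""
    return 100.0 * (worst_delay(before, groups) / worst_delay(after, groups) - 1.0)


# Hypothetical (path_group, delay_ns) pairs, not data from the evaluation.
before = [("in2reg", 5.60), ("reg2reg", 5.00), ("reg2out", 5.40)]
after = [("in2reg", 5.00), ("reg2reg", 4.80), ("reg2out", 4.90)]

ALL_GROUPS = {"in2reg", "reg2reg", "reg2out"}

print("all paths  :", round(speedup_pct(before, after, ALL_GROUPS), 1))
print("reg-to-reg :", round(speedup_pct(before, after, {"reg2reg"}), 1))
```

With these invented delays, the all-paths figure is dominated by an input
path and comes out well above the register-to-register figure, which is the
same kind of gap described above.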

November 19, 2002

I received a fourth set of results from Prolific.  The speed-up was reported
to be 9.6%.  Using PrimeTime, I reviewed the results and was able to verify
that the input/output path timing was nearly identical to the 11/05/02
result.  I also verified that the flop-to-flop performance had increased by
9.6%.   At this point, I was satisfied that the result warranted further
dissection.

In order to feel reasonably confident about the result, I wanted to answer
three questions:

  1) was the timing estimation for the in-between drives solid?
  2) was the resulting area reasonable?
  3) did the cell optimizations make sense?

To answer question one, I sent our internal library development team a copy
of an uncompiled version of the estimated.db file.  (I had asked for this
prior to receiving the 11/19/02 result.)  I asked the library team to review
the timing estimation for the in-between drives and compare it to what we
would get if we were to build and characterize the cell ourselves.  After
reviewing the data, it was determined that the timing estimation method was
reasonably accurate and tended to be slightly conservative for most (but not
all) cells.  To answer question two, the area penalty was determined to be
5.1% using the Synopsys reported area.  Finally, for question three, several
spreadsheets were generated with Excel to break down the cell swaps that
ProTiming was implementing.  One would expect that the critical paths in the
design would be composed primarily of simple logic gates, e.g., ND2, NR2,
INV, and BUF.  It would seem logical, then, that the majority of ECO
operations should be on these simple gate types.  The spreadsheets showed
that approximately 70% of the cell swaps were performed on simple logic
gates.  This was one of a few thought experiments that showed the cell swaps
performed by ProTiming to be reasonable.  All in all, the ProTiming results
seemed reasonable after further scrutiny.  This meant that the true speed-up
boiled down to one's ability to successfully work the changes in the P&R
flow (isn't this often the case?).
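
The cell-swap breakdown was done in Excel, but the same tally is easy to
script.  The Python sketch below is only an illustration: it assumes cell
names follow the <FAMILY>X<drive> pattern of the NR2X1 / NR2X1.6 examples,
it treats just the four gate types named above as "simple", and the list
of swapped cells is invented.

```python
import re
from collections import Counter

# The simple gates named in the text; extend to suit your library.
SIMPLE = {"ND2", "NR2", "INV", "BUF"}

# Assumes names look like <FAMILY>X<drive>, e.g. NR2X1 or NR2X1.6.
_CELL = re.compile(r"^(?P<family>.+)X(?P<drive>\d+(?:\.\d+)?)$")


def family(cell):
    """Strip the drive-strength suffix; return the name unchanged if no match."""
    m = _CELL.match(cell)
    return m.group("family") if m else cell


def simple_fraction(cells):
    """Return (per-family counts, fraction of swaps on simple gates)."""
    counts = Counter(family(c) for c in cells)
    simple = sum(n for fam, n in counts.items() if fam in SIMPLE)
    return counts, simple / sum(counts.values())


# Hypothetical target cells from a change file, not the real ECO list.
swapped_to = ["NR2X1.6", "NR2X2", "INVX3", "ND2X1.4", "BUFX4",
              "AOI22X2", "XOR2X1.5", "INVX2.5", "MUX2X2", "ND2X3"]

counts, frac = simple_fraction(swapped_to)
print(counts.most_common())
print(f"{frac:.0%} of swaps are on simple gates")
```

If the simple-gate share came out low, that would be a prompt to look at
which complex cells sit on the critical paths and why.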

January 5, 2003

I asked Prolific to re-run the optimization to reduce the area penalty.
They suggested an approach that would limit the maximum area (Synopsys
area attribute) of the in-between drives.  We decided that two runs of
ProTiming would be performed, one with a maximum area of 50 and one with
a maximum area of 100.

January 23, 2003

Prolific sent an email with the results from the two additional trials.
The first run (max. area 50) provided an 8.9% speed increase for a 3.6%
area penalty.  The second run (max. area 100) provided a 9.3% speed-up
for a 4.2% area penalty.
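
A quick way to compare these trade-offs is to put the three reported
speed/area pairs side by side.  The Python sketch below uses only the
percentages quoted in this write-up; the "speed-up per area" ratio is my
own rough yardstick, and the comparison assumes the runs are otherwise
comparable.

```python
# (label, speed-up %, area penalty %) as reported in the text.
runs = [
    ("max area 50", 8.9, 3.6),
    ("max area 100", 9.3, 4.2),
    ("no area cap", 9.6, 5.1),  # the 11/19/02 result
]


def gain_per_area(speed, area):
    """Percent speed-up bought per percent of area penalty."""
    return speed / area


for label, speed, area in runs:
    ratio = gain_per_area(speed, area)
    print(f"{label:13s} {speed:4.1f}% speed / {area:3.1f}% area = {ratio:.2f}")

# Marginal cost of each step up in area budget.
for (l0, s0, a0), (l1, s1, a1) in zip(runs, runs[1:]):
    print(f"{l0} -> {l1}: +{s1 - s0:.1f}% speed for +{a1 - a0:.1f}% area")
```

Each step up in area budget buys less speed than the one before it, so the
tighter area cap gives up little performance.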

February 20, 2003

Finally had enough time on my hands to dig into the results reported in
the 01/23/03 email.  Unpacked the tarball, fired up PrimeTime and analyzed
the results.  I was able to reproduce the reported timing improvements
and area increases.


My overall observations in the ProTiming eval:

  1. I chose to skip the physical ECO process for now since this would
     require a build of the cells to generate physical views.
  2. Since ProTiming does rely on an ECO process, it has the same
     pitfalls as any other ECO process.
  3. With an ECO tool like this, one must plan ahead to reserve enough
     room to facilitate a successful ECO.  This tool will have a
     difficult time improving timing on a very highly utilized design.
  4. I like the fact that ProTiming bolts onto PrimeTime and uses a
     standard STA setup as a starting point.  Those fluent in PrimeTime
     likely would come up to speed on ProTiming in short order.
  5. Since ProTiming only does resizing optimizations, it can provide
     only marginal benefit in some cases.  For example, it will have
     minimal impact on paths hindered by bad clock skews and/or poorly
     structured logic.
  6. If PrimeTime is not your delay prediction engine, the correlation
     issue must be addressed.
  7. Whether it be in-house capability or external capability, you need
     to have a means of building and characterizing the new cells
     (in-between drives).
  8. I don't have a good handle on ProTiming's capacity issues since
     the ARM processor block was on the small side.
  9. You can do a lot of exploratory work in PrimeTime before the physical
     database is touched.

John, obviously I can't comment on any of the software-specific issues since
I did not drive the tool, but I have spent a fair amount of time reviewing
ProTiming's optimization capabilities and am generally pleased with their
test case results.

    - David Parker
      LSI Logic                                  Bloomington, MN


Copyright 1991-2024 John Cooley.  All Rights Reserved.