( ESNUG 376 Item 3 ) -------------------------------------------- [08/29/01]

Subject: ( DAC 01 #27 ) The Cisco 80 MHz Cadence PKS/SE-PKS 0.25 Tapeout

> And finally now, let's look at the customer tape-outs using these tools:
>
>        Synopsys  ########################################### 170 tape-outs
>     Cadence PKS  ### 12
> Cadence non-PKS  ###### 23 
>           Magma  #### 13
>        Monterey  0
>
> So, from this analysis, I see only 1 non-Cadence company (a consulting
> house) with one PKS job, 12 PKS tape-outs, and hardly any PKS user
> discussion *anywhere*.  With PhysOpt, I'm seeing 17 non-Synopsys job
> offers, 170 tape-outs, and boatloads of PhysOpt user banter.  Hmmmm...
>
> Now I know why Ray Bingham has quarterly conference *phone* calls.  He's
> hiding one awful big ear-to-ear grin on his face when he intentionally
> uses the words 'proliferation', 'PKS', and 'SE-PKS' together in the call.


From: Geoff Smith <gjsmith@cisco.com>

John,

I'd just like to voice my thanks for the huge effort in putting the DAC Trip
Report together.  It's a veritable gold mine of information - and especially
valuable to those of us who were restricted from attending this year.

Also thanks for extending ESNUG to cover all the physical and non-Synopsys
issues, tools and vendors.  The expanding focus has greatly increased its
value to me.

Maybe you should rename to ENSUG (Email Not just Synopsys Users Group).

BTW -- you can add another tapeout for PKS to your stats.  We're ramping to
production now.  Here are the details of using PKS without a taxi in sight!
(We did it on our own.  We're in Australia -- it's hard to get Cadence
on-site support down here :-)

  Design:  ~360,000 placed standard cells (~1 million gates)

  RTL language: Verilog

  Digital Macros: 39 RAMs - actually more like register files => implemented
  as pre-placed standard cells (think of it like a datapath generator.)

  Analog Macros: 1 large analog macro containing high speed ADCs and DACs

  Process: TSMC 0.25 um generic 5 metal CMOS

  I/O: ~200 pads

  Clock Rate: 80MHz (12.5ns) - 7 gated clock domains - the speed is not
  huge but the designers keep on packing in plenty of gates per stage so
  it was still a challenge.

  Library: Artisan cell library

  Memories: implemented with standard cells via a programmed generator i.e.
  not real full-custom memories but placed by a program - similar to how a
  datapath generator might work.

  Timing format: TLF - provided by Artisan plus custom TLF for analog
  macros.
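
A quick sanity check on the figures in the summary above, as a minimal
sketch.  It only redoes the arithmetic for the quoted clock period and works
out the average gate count per placed cell; the constant and function names
are made up for illustration.

```python
# Back-of-envelope numbers taken from the design summary in the post.
CLOCK_MHZ = 80.0          # "80MHz"
PLACED_CELLS = 360_000    # "~360,000 placed standard cells"
GATE_COUNT = 1_000_000    # "~1 million gates"


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def gates_per_cell(gates: int, cells: int) -> float:
    """Average number of equivalent gates per placed standard cell."""
    return gates / cells


if __name__ == "__main__":
    print(f"clock period: {period_ns(CLOCK_MHZ):.1f} ns")
    ratio = gates_per_cell(GATE_COUNT, PLACED_CELLS)
    print(f"average gates per placed cell: {ratio:.2f}")
```

Both inputs to the ratio are approximate ("~") in the post, so the result is
only a rough average.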

We do one complete top-down run, i.e. the whole 360,000 cells are done
together - no divide and conquer.  This, I think, is actually one of PKS's
strengths.  Actually, I think there are two strengths:

    a) PKS handles large designs as a single block (makes constraint
       setting easier.)

    b) PKS has excellent correlation with silicon (from what I hear it's
       better than PhysOpt.)

To me the big down-side is whether it will survive against the Synopsys
machine.  I sat thru a Synopsys RTL->GDS presentation last week and the
story is looking good.  It'll still be a while before they oust Silicon
Ensemble but what troubles me is that they are going after it very
aggressively whereas Cadence don't seem to be going after the Synopsys
core business anywhere near as seriously.

We didn't use a floorplanner tool (e.g. Chip Architect, etc.)  We did our
floorplan pretty much manually using the Silicon Ensemble environment.
This came down mainly to where to place the 39 memories and analog parts
and then a power plan that fit that.  The standard cells were then placed
'flat' after feeding this DEF-based floorplan back to PKS.

  PKS Flow Overview:
  ------------------
  1. Ambit RTL Synthesis
  2. LogicVision membist
  3. Resynthesize in Ambit
  4. LogicVision icbist (scan) insertion
  5. LogicVision boundary scan, JTAG, icbist insertion (scan simulations)
  6. Resynthesize again in Ambit
  7. Early path check/fixes (fast models)
  8. Manual Floorplan using SE (macro placement, I/O placement, and power
     striping)
  9. PKS (import DEF from step 8, import netlist from step 7) => one pass
     timing closure (~18 hrs)
 10. Clock tree using CT-PKS (clock tree tool inside PKS - Ouch.) (~20 hrs)
 11. Early path check (no fixes required)
 12. Global & Detailed Routing (SE-Ultra - Wroute) (import DEF/netlist
     from step 10)
 13. Hyperextract (SE-Ultra) => creates DSPF, SDF parasitic data
 14. Pearl - timing analysis (import DEF, DSPF) - timing closure accuracy
     was extremely close to Pearl static timing results
 15. At speed gate simulations (import SDF, DSPF) => internal PLI based
     power tool
 16. SE-SI - signal integrity.  (Yeack!!!)
 17. Final Layout finished off with Silicon Ensemble and Virtuoso (import
     to virtuoso) fix power routing, analog routing, apply metal fill
 18. DRC checks
 19. Antenna checks
 20. LVS checks

Wins:
-----

Ambit/PKS can handle the whole design.  This means relatively little trouble
setting top-level constraints and having those constraints propagate through
the design hierarchy.  No bottom up crap.

PKS closed on timing.  Although the clock speed was relatively slow there
was a *lot* of logic between stages and a previous version of the design
done without PKS had the typical timing-closure problem.

The correlation between Pearl STA and the PKS timing engine was extremely
good (< 5%).  No timing surprises at the end.
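
The post does not say how the "< 5%" figure was measured.  One plausible
reading is a per-endpoint comparison of path delays between the two tools,
sketched below.  The endpoint names and delay values are invented purely for
illustration, and the function names are hypothetical; this is not output
from Pearl or PKS.

```python
from typing import Dict, Tuple


def percent_diff(reference_ns: float, estimate_ns: float) -> float:
    """Absolute difference as a percentage of the reference delay."""
    return abs(estimate_ns - reference_ns) / reference_ns * 100.0


def worst_mismatch(sta: Dict[str, float],
                   pks: Dict[str, float]) -> Tuple[str, float]:
    """Return the endpoint with the largest percentage disagreement.

    Only endpoints present in both reports are compared.
    """
    common = sta.keys() & pks.keys()
    if not common:
        raise ValueError("no endpoints in common between the two reports")
    return max(((ep, percent_diff(sta[ep], pks[ep])) for ep in common),
               key=lambda pair: pair[1])


# Made-up path delays in ns, keyed by made-up endpoint names.
STA_DELAYS = {"u_core/reg_a/D": 11.80,
              "u_core/reg_b/D": 9.40,
              "u_dma/reg_c/D": 12.10}
PKS_DELAYS = {"u_core/reg_a/D": 11.50,
              "u_core/reg_b/D": 9.70,
              "u_dma/reg_c/D": 12.40}

if __name__ == "__main__":
    endpoint, pct = worst_mismatch(STA_DELAYS, PKS_DELAYS)
    verdict = "within" if pct < 5.0 else "outside"
    print(f"worst endpoint: {endpoint} ({pct:.1f}% off, {verdict} 5%)")
```

Taking the sign-off STA number as the reference is a choice made here;
comparing slack instead of delay would be an equally reasonable variant.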

CT-PKS clock tree generation caused much initial heartache.  We taped out
with version SPR4.07 - a bug fix release.  SPR4.06 would core dump on the
clock tree.  Clock tree generation took around 20 hours - with zero updates
on status.  Clock tree did cope with clock gating in the tree - which is
why we chose to use it in the first place.

LogicVision ICbist works as advertised, but it does take non-trivial effort
to make it so.

Losses:
-------

SE-SI - signal integrity was extremely disappointing.   We are happy with
PKS performance, what we're unhappy with is the signal integrity/power
tools in SE-SI (aka SE-PKS).

These tools need a *lot* of work.  We've written our own power analysis
tool to compensate and we're actively looking for a better crosstalk
analysis and repair tool. If you know something that works for crosstalk,
I'd be happy to hear it.  SE-PKS/SE-SI will have to get much better before
we use it again.

Success Factors:
----------------

1. Experienced RTL designers that understand how to maintain a clean
   clock paradigm.

2. Running backend builds while the design is in progress.  We did 23
   builds from partial code, allowing us to fine-tune the flow and
   feed back RTL change requests.


My summary is PKS works.  It has good correlation with silicon and can
swallow large designs.  Interfacing with Silicon Ensemble is a no-brainer.
It's bleeding-edge but showing some signs of maturity: we taped out with
version SPR4.07 but couldn't have done it using SPR4.06.  Proof: the
silicon is in manufacturing ramp up.

    - Geoff Smith
      Cisco Systems                              Toowong, Australia


Copyright 1991-2024 John Cooley.  All Rights Reserved.