( ESNUG 494 Item 5 ) -------------------------------------------- [10/20/11]

From: [ The Green Lantern ]
Subject: Juniper Networks benchmarks SNPS CustomSim XA vs. Synopsys HSPICE

Hi John,

Please keep me anon.

I went to the Synopsys "SPICE Up Your Chip" AMS dinner event at DAC this
year, which I estimate about 500 people attended.  Nikhil Jayakumar from
Juniper Networks gave a presentation on using Synopsys CustomSim in the
construction of hybrid tree-mesh clock distribution networks, and compared
its performance and accuracy with Synopsys HSPICE.

Clock Tree vs. Clock Grid

Nikhil indicated that Juniper uses target skew to decide whether to use a
clock tree or clock grid structure when designing chips.  He said there are
two kinds of skew:

  1. Structural (layout) skew, caused by capacitive load mismatch and 
     wire length mismatch.  This can be handled by building balanced 
     clock trees (e.g. H-trees), which can achieve zero skew, but only in
     the absence of PVT variations.  It can also be measured using regular
     Static Timing Analysis (STA) tools.

  2. Dynamic skew due to PVT variations.  There are dynamic clock 
     deskewing schemes which Intel uses in their clock networks.  
     Alternatively, there are static schemes such as adding cross-links
     to clock mesh or hybrid tree-mesh structures.  Static schemes 
     require SPICE-based analysis; PrimeTime techniques don't work.

Juniper did a tape-out with a hybrid tree-mesh clock distribution network
that had a frequency of 700 MHz to 800 MHz.  The design was implemented in
a TSMC 40 nm process.

They used a hybrid tree-mesh structure (the tree drives a mesh), where the
clock is delivered from the PLL to a vertical clock spine, which drives 6
horizontal clock ribs, which in turn drive a clock mesh.  They added
cross-links to the hybrid tree-mesh structure at regular intervals to
reduce skew due to PVT variation.  They then carefully constructed a
vertical clock spine and horizontal clock ribs that were balanced, with low
latency to reduce jitter.

Nikhil listed the following factors as needing to be managed:

   - Optimal wire width.
   - Spacing.
   - Buffer drive strength.
   - Wire length between buffers chosen to reduce jitter.
   - Slew limitations.
   - IR and EM factors for determining buffer drive strength and 
     tolerable slews.
   - Routability and area constraints.
   - Adding cross-links (shorting wires) in the tree to cancel out PVT 
     variation.  (details below)

Cross-Links to Cancel PVT Variation Complicate Timing:

Adding cross-links at regular intervals in the clock tree to cancel out PVT
variation cannot be done indiscriminately, because in some cases adding a
cross-link worsens jitter due to the added load on the buffer.  So Juniper
only adds a cross-link if its skew reduction outweighs its jitter increase.
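
As a rough illustration of that trade-off, here is a minimal Python sketch.
It reads the rule as "keep a cross-link only when the skew it removes
exceeds the jitter it adds".  The link names and picosecond values are made
up, and the talk did not say how Juniper actually scores candidates.

```python
from dataclasses import dataclass


@dataclass
class CrossLink:
    name: str                   # hypothetical label for a candidate link
    skew_reduction_ps: float    # skew removed by the link (e.g. from SPICE)
    jitter_increase_ps: float   # jitter added by the extra buffer load


def links_to_keep(candidates):
    """Keep only links whose skew benefit exceeds their jitter cost."""
    return [c for c in candidates
            if c.skew_reduction_ps > c.jitter_increase_ps]


# Made-up numbers, for illustration only.
candidates = [
    CrossLink("rib2_rib3_west", 6.0, 1.5),
    CrossLink("rib4_rib5_east", 1.0, 2.5),
]
kept = [c.name for c in links_to_keep(candidates)]
print(kept)
```

In practice both numbers would come from simulating the network with and
without each candidate link, which is why simulator turnaround matters.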

Juniper needed SPICE simulation to estimate the delays, since PrimeTime
analysis could not handle reconvergence in non-linear circuits nor account
for the averaging effect of cross-links.  So they used Synopsys CustomSim XA
based on its:

  - SPICE accuracy.  Relative accuracy is important since you are only 
    concerned with relative differences between various points in the 
    network.

  - Speed.  The clock grid is tuned multiple times during design, so 
    Juniper needed quick turnaround for ECOs.

Synopsys CustomSim XA vs. Synopsys HSPICE:

Nikhil indicated that CustomSim XA is easy to use - there was no learning 
curve, because CustomSim used the same HSPICE netlist and measures.

He gave the following sample CustomSim XA command syntax:

     xa -hspice <netlist.sp> -o <output_file> -c <command_script_file>

where the command script file sets the level of accuracy:

     set_sim_level <level from 3 to 7>
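
To compare accuracy levels, you might script this along the following
lines.  This is a sketch in Python that only writes the command script and
builds the command line; it does not launch the simulator.  It uses only
the flags and the set_sim_level command quoted above.  The netlist name,
the ".cmd" extension and the output prefixes are placeholders of mine.

```python
import os
import tempfile


def xa_command(netlist, sim_level, workdir):
    """Write an XA command script and return the matching argv list."""
    if not 3 <= sim_level <= 7:
        raise ValueError("the post gives levels 3 to 7")
    # Command script containing the single accuracy setting from the post.
    script = os.path.join(workdir, f"level{sim_level}.cmd")
    with open(script, "w") as f:
        f.write(f"set_sim_level {sim_level}\n")
    out = os.path.join(workdir, f"clock_l{sim_level}")  # placeholder name
    return ["xa", "-hspice", netlist, "-o", out, "-c", script]


workdir = tempfile.mkdtemp()
# "clock_mesh.sp" is a hypothetical netlist file name.
commands = [xa_command("clock_mesh.sp", lvl, workdir) for lvl in (3, 6)]
for argv in commands:
    print(" ".join(argv))
```

Each argv list could then be passed to a job scheduler or to
subprocess.run() on a machine where xa is installed.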

Nikhil then showed his test case:

             Element                # of Elements
             diodes                         2,275
             NMOS                         186,564
             PMOS                         186,564
             capacitors                 7,335,052
             resistors                  4,306,327
             voltage sources                    3
             ----------------          ----------
             Total                     12,016,785

The test case was an extracted netlist of the clock tree and mesh.  Juniper
used Synopsys Star-RC ver 2009.12 with "reduction" enabled for extraction.
Nikhil compared CustomSim XA (64-bit, ver 2010.03) against HSPICE (64-bit,
ver 2009.09):

                                          CustomSim XA        CustomSim XA
                         HSPICE              Level 6             Level 3

  Latency (ps)        1317.2-1346.9     1313.03-1342.76     1326.19-1353.48
  Skew of delay (ps)      29.7               29.73               29.27
  Slew (ps)           109.62-117.77     109.952-118.083     106.797-117.454
  Skew of slews (ps)      8.15               8.131               10.657
  Runtime (cpu_clock)     3.5 hrs            1.5 hrs              1 hr
  Memory                  8 G            4 G (physical)+     4 G (physical)+
                                         9 G (virtual)       9 G (virtual)

CustomSim XA's:

  - Latency at Level 6 accuracy generally had between 0.4% and 0.5% 
    error vs. HSPICE.  Its latency at Level 3 accuracy generally had
    between 0.5% and 1.8% error.
  - Skew was quite accurate, even at the Level 3 setting, and its slew was
    also fairly accurate.
  - Simulation performance was a 2x to 4x improvement over HSPICE (3.5 hrs
    for HSPICE vs. 1.5 hrs at Level 6 and 1 hr at Level 3).
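
For readers who want to redo the arithmetic, the Python sketch below
computes the relative latency error and the runtime ratio from the table.
It uses only the two endpoints of each latency range.  The percentages
quoted in the talk presumably cover more measurement points, so the two
sets of numbers need not match exactly.

```python
def pct_error(measured, reference):
    """Relative error of measured vs. reference, in percent."""
    return abs(measured - reference) / reference * 100.0


def speedup(reference_hours, measured_hours):
    """How many times faster the measured run is than the reference."""
    return reference_hours / measured_hours


# Latency range endpoints (ps) and runtimes (hrs) copied from the table.
hspice = {"latency": (1317.2, 1346.9), "hours": 3.5}
xa = {
    "Level 6": {"latency": (1313.03, 1342.76), "hours": 1.5},
    "Level 3": {"latency": (1326.19, 1353.48), "hours": 1.0},
}

summary = {}
for level, run in xa.items():
    errors = [pct_error(m, r)
              for m, r in zip(run["latency"], hspice["latency"])]
    ratio = speedup(hspice["hours"], run["hours"])
    summary[level] = (min(errors), max(errors), ratio)
    print(f"{level}: latency error {min(errors):.2f}% to "
          f"{max(errors):.2f}%, speedup {ratio:.1f}x")
```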

HSPICE was *unable* to run a high-capacity test of 101,501,922 elements.  In
contrast, the same high-capacity test ran successfully in CustomSim XA, as
it didn't have the memory limitations of HSPICE.

Nikhil's concerns and wishlist were:

  1. Electrical current measurements were not accurate in CustomSim XA ver
     2010.03.  The problem was fixed in ver 2010.12, so accuracy is now
     within 3%.

  2. Would like CustomSim XA to support Monte Carlo simulations.  Juniper
     uses cross-links to reduce PVT skew, but they need to do Monte Carlo
     simulations on a large netlist fast to know for sure.  Synopsys claims
     Monte Carlo support will come in 2011.09.

  3. Nikhil also said that a faster simulation time is always welcome.

I hope these reports help your readers, John.

     - [ The Green Lantern ]