( DAC'16 Item 3 ) ----------------------------------------------- [12/16/16]

Subject: IC Manage PeerCache and CDNS Rocketick get #3 as Best of 2016

ALL THINGS PARALLEL: If you had to name the one trend which characterized
the past 12 months in EDA and chip design, I'd say 2016 was "The Year That
True Parallelization Became Real".

        ----    ----    ----    ----    ----    ----    ----

For general EDA tool use, true parallelization was invented (or should I say
re-invented) this year when the ever busy R&D guys at IC Manage came up
with PeerCache, a tool that does peer-to-peer data transfers/updates in
fully parallel workflows of both source and generated project data.
     
It's as if BitTorrent or Napster came to your chip design (or verification)
workspace.  You have dozens of engineers working on the same project at
the same time -- tweaking this and modifying that -- and this can easily
involve say 2 terabytes of data if you include both your design's source
files and its related generated files (timing reports, waveforms, etc.).

In the olde days, to populate your 1 terabyte of design into your workspace
readily took 2-3 hours.  With PeerCache, 1 TB takes less than 60 seconds!
And for capacity it benchmarked populating 10 TB in 10 minutes.
     
To work this magic, PeerCache snarfs all the right bits and fragments of
your project from your fellow engineers' workspaces.  Shiv Sikand worked on
this in detail at IC Manage and announced it at DAC'16.  (See ESNUG 561 #2)

WAIT! IT DOES MORE: In addition, PeerCache also accelerates your EDA tool's
monster big data "reads" and "writes" by 4X to 20X, with 2000 MB/sec
transfers.  Plus, through clever data redundancy reduction (and by only
storing "deltas") it can take 47 TB of design & its related generated data,
and squeeze that down to 200 GB on your hard drive.  (Again ESNUG 561 #2)

Get that?  20X faster EDA tool reads/writes, 150X faster workspace loading,
and using 1/200th the hard drive storage -- all because of parallelization.
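
For anyone who wants to check those ratios, here is a rough back-of-envelope
sketch in Python.  It only re-derives the figures quoted above; the variable
names are made up for illustration, and it assumes decimal units
(1 TB = 1000 GB), which the original numbers do not specify.

```python
# Sanity check of the PeerCache numbers quoted above.
# Assumption: decimal units (1 TB = 1000 GB); the article does not specify.
GB_PER_TB = 1000

# Workspace populate: 2-3 hours the old way vs. (at most) 60 seconds.
old_populate_s = (2 * 3600, 3 * 3600)
new_populate_s = 60
populate_speedup = tuple(t / new_populate_s for t in old_populate_s)
populate_midpoint = sum(populate_speedup) / 2

# Storage: 47 TB of design + generated data squeezed down to 200 GB.
storage_ratio = 47 * GB_PER_TB / 200

# Capacity benchmark: 10 TB populated in 10 minutes.
capacity_rate_gb_s = 10 * GB_PER_TB / (10 * 60)

print(f"populate speed-up : {populate_speedup[0]:.0f}X to "
      f"{populate_speedup[1]:.0f}X (midpoint {populate_midpoint:.0f}X)")
print(f"storage reduction : {storage_ratio:.0f}X")
print(f"capacity benchmark: {capacity_rate_gb_s:.1f} GB/sec effective")
```

So "150X" is the midpoint of a 120X-180X range, and "1/200th" is on the
conservative side of 47 TB / 200 GB = 235X.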

        ----    ----    ----    ----    ----    ----    ----

For Verilog simulation, true parallelization first appeared on the
engineering public's radar screen when Rocketick -- a project that Uri Tal
had worked on in Israel for 4 years -- had its first (tiny) booth at DAC'2011.
          
And even by 2013, RocketSim was just an Nvidia-GPU-only gate-level Verilog
simulator -- but it still benchmarked 23X faster against VCS.  (ESNUG 523 #4)

                  Captures    Time (hours)    Time/capture    Speed-up
                  --------    ------------    ------------    --------
    VCS              7           8.2            1.17 hrs          1X
    RocketSim        7           0.64           0.09 hrs         13X
    RocketSim      102           5.31           0.05 hrs         23X
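
The last two columns of that table follow from the first two.  A minimal
Python sketch of the arithmetic, using only the capture counts and hours
listed above (the variable names are mine, not from the benchmark):

```python
# Recompute Time/capture and Speed-up from the Captures and Time columns.
runs = [
    ("VCS",       7,   8.2),
    ("RocketSim", 7,   0.64),
    ("RocketSim", 102, 5.31),
]

# Baseline is the VCS run: hours per capture.
baseline = runs[0][2] / runs[0][1]

rows = []
for tool, captures, hours in runs:
    per_capture = hours / captures
    rows.append((tool, captures, per_capture, baseline / per_capture))

for tool, captures, per_capture, speedup in rows:
    print(f"{tool:<10} {captures:>4} captures  "
          f"{per_capture:5.2f} hrs/capture  {speedup:5.1f}X")
```

Unrounded, the two RocketSim ratios come out at about 12.8X and 22.5X, which
round to the 13X and 23X shown.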

Jump forward to 2016; Cadence acquires a Rocketick that now does both gate-
and RTL-level simulations -- plus it runs on Intel XEON CPU server cores.
When CDNS R&D natively compiled the RocketSim source C together with their
Incisive source C into one GNU C++ object called "Xcelium", they saw:

    design        type       size      # of CPU   speed vs. Incisive
    -----------   ----     ---------   --------   ------------------
    Little Boy     RTL     50M gates    8 cores       4X speed-up
    Fat Man        RTL    400M gates    6 cores     9.3X speed-up
    Fat Man       gates   400M gates    6 cores      30X speed-up

SNPS ON THE DEFENSIVE: Keep in mind that all of this is actually a tech war
between Aart and Anirudh over the next big boosts in Verilog simulation,
and this Synopsys Cheetah VCS is Aart playing catch-up with RocketSim.  Even
according to the Synopsys press release, and in Aart's own SNUG'16 keynote,
SNPS Cheetah is still 2 years out (hints of ICC2's lateness?) -- and it's
still only using Nvidia GPUs instead of the Intel x86 CPUs.

    "I couldn't find anyone at the Synopsys booth to discuss their
     Cheetah VCS equivalent, but that didn't surprise me because
     it's still 2 years out."

         - Cliff Cummings, father of SystemVerilog (ESNUG 561 #7)

    "We plan to roll out Cheetah technology over the next two years
     as part of VCS."

         - Manoj Gandhi, Synopsys EVP/GM (press release 03/24/2016)


      QUESTION ASKED:

        Q: "What were the 3 or 4 most INTERESTING specific EDA tools
            you've seen this year?  WHY did they interest you?"

         ----    ----    ----    ----    ----    ----    ----

CADENCE ROCKETICK ROCKETSIM

    We did an evaluation of (Cadence) RocketSim, a multicore CPU-based
    simulation accelerator for RTL simulation.

    We evaluated it on a 40M gate sub-system with Cadence NC-Sim and
    SystemVerilog (we have a mix of Verilog and SystemVerilog RTL).
    Here is what we found:

    1. RTL Speed up.  This is the relative speed up we saw:

       - For  4 CPUs, our speed up was ~ 3X faster
       - For 16 CPUs, our speed up was  10X faster

    RocketSim seems to peak at about 10x speed up; that's the nature
    of programming.

    2. Compilation time is about the same.

    3. The debug was very similar to using NC-Sim; we used it with
       Cadence's SHM waveforms.

    4. RocketSim supports four-state 1/0/X/Z logic.

    5. Like NC-Sim, RocketSim communicates interactively with the
       testbench.  This is better than a lot of older generation
       simulation accelerators that you had to use in batch mode.
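
    One way to read the two RTL data points in item 1 is through Amdahl's
    law.  That is an assumption made here for illustration; the evaluation
    above does not say why the speed up levels off.  The sketch below
    inverts Amdahl's law to get the parallel fraction each measurement
    would imply, and the ceiling that fraction would set.

```python
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), p = parallel fraction.
# Applying it to RocketSim is an assumption made for illustration only.

def amdahl_speedup(p, n):
    """Predicted speed-up on n cores for parallel fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

def implied_parallel_fraction(speedup, n):
    """Invert Amdahl's law for p, given a measured speed-up on n cores."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n)

# cores -> measured speed-up, taken from item 1 of the evaluation above
measured = {4: 3.0, 16: 10.0}

fits = {}
for cores, speedup in measured.items():
    p = implied_parallel_fraction(speedup, cores)
    ceiling = 1.0 / (1.0 - p)          # limit as the core count grows
    fits[cores] = (p, ceiling)
    print(f"{cores:>2} cores, {speedup:.0f}X: implied p = {p:.3f}, "
          f"ceiling = {ceiling:.1f}X")
```

    The two points imply different parallel fractions (roughly 0.89 and
    0.96), so a single Amdahl curve does not fit both.  Either way, the
    model says the return on extra cores shrinks as the core count grows.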

    A negative is that RocketSim is priced more than proportionally to its
    speed up.  For example, it costs 6X the price for a 3X speed up, so
    financially we could just get 3 more NC-Sim licenses to run on our
    servers instead.

    If RocketSim were priced proportionally, we could use it for all of our
    regressions, rather than as a point tool for debugging.  And it cannot
    save a snapshot, which matters a lot for debugging.  i.e. You can't run
    RocketSim for an hour, save it and then run it many times, as you can
    with NC-Sim alone.  Cadence says the snapshot feature is in their
    pipeline and they are working on it, but it is not yet available.

    Technology-wise, overall RocketSim is a great tool.

         ----    ----    ----    ----    ----    ----    ----

    Cadence RocketSim

    More speed without losing the convenience and capability of a
    simulator.  To quote the Cadence marketing: "debuggability,
    seamless testbench integration, fast turn-around, and
    availability".

    Emulators/accelerators are all very well but there are never
    enough seats and it's always extra effort.

         ----    ----    ----    ----    ----    ----    ----

    Cadence RocketSim enables true parallel processing to greatly speed
    up logic simulation.

    We have been asking Cadence to implement true parallel processing
    for many years.

    They finally decided to obtain it thru acquisition.

         ----    ----    ----    ----    ----    ----    ----

    RocketSim

    Cadence discussed 10X Verilog simulation performance, a reduced
    memory footprint, and full debug visibility.

    Given that all of the support is now in place, RocketSim's high capacity
    might be most significant.

    My confidence in this success starts with Anirudh Devgan.  Anirudh
    made significant contributions at IBM in transistor level analysis,
    which are still in use at IBM 10+ years after he left.  At Magma,
    Anirudh founded a small team with a SPICE engine which he enhanced to
    become the industry's best SPICE tool, FineSim (acquired by Synopsys).

    So with Anirudh involved I predict great success for RocketSim.  Before
    buying it, he would have confirmed the features and user experience,
    as well as the inherent capabilities of the development team.

    It looks much easier than starting with a small team of SPICE developers
    10 years ago.

         ----    ----    ----    ----    ----    ----    ----

    I had heard about RocketSim before DAC.  We wanted a simulation
    accelerator, but at the time we looked it didn't support VHDL, so
    we couldn't use it.

         ----    ----    ----    ----    ----    ----    ----

    I liked Cadence RocketSim.  I am lazy.  I cut & paste for my report.

    To cut & paste Cadence marketing:

      - RocketSim solves the simulator's bottleneck challenge by
        offloading most time-consuming calculations to an ultra-fast
        multithreaded engine. Unlike hardware based accelerators,
        RocketSim works from within the familiar simulator environment
        and runs alongside the existing test bench, eliminating ramp-up
        time while providing 4-state bit-precise results.

    To cut & paste Cooley:

      - Splits Verilog simulation into multi-threads on 100's of regular
        multicore Intel x86 XEON servers.  What they got benchmarked 23X
        faster vs. Incisive.  Does gate and RTL sims.  Compiles 1 billion
        gates in 2 hours.  4-state-logic for X.  Full System Verilog and
        accelerates SVAs

      - Xcelium on an 8 core Linux box ran 4X faster than Incisive on a
        single core Linux machine.  For a 400 million gate design
        (Fat Man), Xcelium on 6 cores ran 9.3X faster.  That is, the
        larger the design and the more activity in the testbench stimulus,
        the better speed-up Xcelium got!  When 400 M gate Fat Man was
        doing high activity DFT gate-level simulation it was 30X faster.
        This 4X-9.3X-30X boost revitalizes the RTL SW market (or at
        least Incisive's share of it.)

    To cut & paste Cliff Cummings:

      - Limitations?  No SDF backannotated timing yet (working on it).
        RocketSim runs RTL simulations with non-accelerated UVM in
        parallel.

    Currently, RocketSim does not accelerate testbench primitives.

         ----    ----    ----    ----    ----    ----    ----

    RocketSim not supporting SDF is a deal breaker for us.

         ----    ----    ----    ----    ----    ----    ----

    Cadence RocketSim

    We were already evaluating Rocketick for potential time savings
    on our gate-level netlist verification.

    We had been waiting for it to have SDF annotation support, which
    Cadence announced at DAC, following the acquisition.

         ----    ----    ----    ----    ----    ----    ----

    I had a very positive impression of Rocketick.

    Am looking forward to the improvements in the future.

         ----    ----    ----    ----    ----    ----    ----

    The CDNS-Rocketick acquisition looked very interesting indeed.  I was
    aware of the Rocketick technology since I have friends connected to
    the company in Israel.  I had been tracking it for some time.  So when
    I heard of the acquisition I was most interested.  I hear that CDNS is
    planning to integrate Rocketick with Incisive further.  I heard from a
    source that they are calling it "Project Xcelium".  (* - strange name
    but if it produces what is possible then who cares!!)

      - Xcelium is supposed to deliver 10X speed-up for RTL sims and
        be massively parallel.

      - For gates, I believe they said it would be up to 30X speed-up.

      - Being massively parallel is something CDNS has done before with
        other technologies like STA and P&R.

      - They've figured out how to apply the same architecture to RTL
        simulation.

    One thing that did surprise me is that CDNS acquired this technology
    instead of developing something in-house, like they've done with
    their other digital P&R products.  So despite the excitement of the
    potential, I wonder if CDNS will be successful integrating these
    technologies or will it follow the failed path previous acquisitions
    have gone down.  Historically CDNS has struggled with integrating
    acquired companies and their technology.  It has instead replaced their
    old R&D team with the new R&D team.

         ----    ----    ----    ----    ----    ----    ----

    I've known the Rocketick guys for a while.  It was a smart move for
    Cadence to acquire them.

    If Cadence fully integrates RocketSim to accelerate their functional
    Verilog simulation by 10X, that will be great.  Also the fact that
    RocketSim supports Verilog, VHDL, System Verilog, OVM, VMM, and UVM
    can be a market killer against Aart's VCS and Wally's Questa tools.

         ----    ----    ----    ----    ----    ----    ----

    RocketSim

    I saw Cadence RocketSim at DAC.  My impression is that it's fast and
    for Incisive users it's easy to use.

    Our front end group did an evaluation and had a positive opinion of
    it.  (I haven't personally used it.)

         ----    ----    ----    ----    ----    ----    ----

    My Synopsys VCS account manager frowned at me at DAC.  He saw me
    talking to Uri in the Cadence RocketSim demo booth.

         ----    ----    ----    ----    ----    ----    ----

    Cadence Rocketick -

    I don't think it'll replace our Palladium sessions, but it could
    help with our early functional RTL development.

         ----    ----    ----    ----    ----    ----    ----

    We want to get some Rocketick licenses in our tool mix next year.

         ----    ----    ----    ----    ----    ----    ----

    Cadence RocketSim will grow with the money & resources that Lip-Bu
    can throw against it.

         ----    ----    ----    ----    ----    ----    ----

    RocketSim.  Faster is always gooder.

         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----

SYNOPSYS CHEETAH VCS

    Cheetah VCS will be interesting once it comes out.

         ----    ----    ----    ----    ----    ----    ----

    Synopsys Cheetah VCS

    Saw it at SNUG San Jose.  Looks like early Rocketick.

         ----    ----    ----    ----    ----    ----    ----

    We want to beta Cheetah if we can.

         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----

IC MANAGE PEERCACHE

    IC Manage PeerCache

    My team supports about 400 mask and schematic designers in our company.
    These designers VNC into servers to do their work.  The data that they
    access is located on NFS filers that are shared by up to 80 other users
    in the company.

    When rogue users get carried away and run a very large number of
    simulations and regressions they can kill the performance of these
    NFS filers.  

        - It can take hours for our IT department to track down the
          offending users.  

        - The poor filer performance results in poor end-user experience.
          
        - At those times our users notice long delays to bring up their
          layouts or run their verification and simulation jobs.

    IC Manage PeerCache is a software solution that may be able to help us
    deal with this problem. 
 
        - PeerCache will cache frequently accessed data from the
          NFS filers on an SSD drive that's local to a machine.  

        - PeerCache works on both managed and unmanaged data.  

    We think this software could reduce the latency of getting data to and
    from disk and could help the productivity of our mask and schematic 
    designers.

         ----    ----    ----    ----    ----    ----    ----

    The most interesting and useful to me is IC Manage PeerCache's
    super-fast data copying system.

    For most designers runtime is a huge deal and the speedups that
    could be achieved using IC Manage seemed significant.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache

    PeerCache's peer-to-peer networking tool was interesting and could
    give us a significant speedup.

    The tool has a lot of potential, especially for the design verification
    space when running a lot of simulations on a larger server farm.

        - When doing design verification, we do lots of simulations on one
          design, and may run 1000's of SystemVerilog simulations.

        - Since they use the same design database, all our servers are 
          pulling the same file sets from the same db.

    PeerCache would let us share files faster -- we could get the data
    quicker from a peer-to-peer network, versus everything hitting our
    NFS filer at once.

         ----    ----    ----    ----    ----    ----    ----

    The IC Manage PeerCache "peer-to-peer" tool was interesting.

        - We like the idea that PeerCache can speed up our systems and
          reduce the load on the filer servers. 

        - It keeps engineers from waiting for copies and for other
          engineers to be done.

        - Plus we get the speed up without having to upgrade our servers.
          PeerCache accelerates NetApp, Isilon, VMware -- we primarily use 
          NetApp, so that's a benefit for us.

    The fact that it is all software is also a plus as we expect this will
    reduce our costs.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage's Global Design Platform & PeerCache are impressive.

    PeerCache has peer-to-peer networking and virtual workspaces for
    parallel workflows with local caching and low storage.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache

    IC Manage announced PeerCache at DAC for parallel workflows.  IC Manage
    uses a P2P network to make it more efficient.  This is a big benefit for
    us, as we have multiple sites.

        - The speed improvement of being able to populate databases or 
          files faster would be huge for us.

        - We integrate files and libraries from project to project, for 
          new projects, and for revisions.  

          PeerCache should let us populate workspaces in perhaps only minutes
          compared with hours for large databases.

    PeerCache also massively reduces local storage needs.  A company with
    100's of users would benefit tremendously from that cost savings on
    storage; however, it is less needed for us, given we have fewer
    engineers.  

    Instead, we really need speed for our remote center usage.

    PeerCache also offers good data security, because much of the data is
    virtual vs. local.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache peer-to-peer workflow acceleration is very
    interesting to us as we look to both accelerate key workflows while also
    offloading expensive shared storage resources that are in high demand.

    I have questions about the loading it might place on the compute nodes
    that are already used for simulation and verification via LSF jobs that
    now will also become storage/IO peers for the rest of the compute farm.

    If the load does impact the simulations and verifications a bit, the
    overall gain from PeerCache could still make sense if aggregately the
    flows complete in a shorter amount of time while also offloading IO from
    the shared filers.

    To deploy IC Manage PeerCache, we would potentially have to change the
    local storage we currently have on compute nodes (smaller spinning disk
    and lower end RAID cards) and move to significant local SSD/flash
    capacities to host the Virtual Workspaces and participate in the
    peering.

    This of course comes at increased costs, which we need to compare with
    the potential benefits of reduced verification and simulation times
    coupled with reduced load on the shared filers.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache - peer-to-peer parallel workflows. 

    I've been a customer of IC Manage for many years.  

    Their new PeerCache seemed useful for companies with multiple sites,
    but we are a single office, with everyone on site.

    I want to investigate to see how data is checked into the filer and
    how it's validated inside the cache.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache.  It would be helpful for:

        - Large projects facing storage and compute limitations
          during design

        - Some areas of verification

    Our own projects are currently too small to justify using it.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache. PeerCache reduces the load on the file server
    through a P2P network.

    It's interesting and may have possible applications, but would
    need a lot of experimentation.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache: software solution to caching work areas. 

    1. Uses the disk on the client machine as cache.

    2. Only loads metadata.  

        - As users read data, it downloads to the client, then all clients
          on that host have instant cache access to that data.  

        - The cache then maintains the differences between the data.
          So it only takes the populate time hit once, and only for
          the files needed, and then only local disk access delay.

    3. It is a "bring-your-own-hardware" software solution that accelerates
       NetApp. 

    4. Speed Improvement. It speeds up all your project data - both managed
       and generated files - and speeds up all DM systems.

    5. Gives you filer storage savings.

    6. Allows parallel workflows.

        - Engineers no longer have to wait for someone else to finish 
          with their physical copy.

        - Any user can now clone any authorized workspace at any moment 
          in time -- even a terabyte-sized full chip workspace.  The
          clones occur in near-zero time, and include both managed
          and generated data. 

        - No additional storage is consumed until changes are made.
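
    The behavior described in items 1, 2 and 6 above (metadata-only
    populate, read-once local caching, near-instant clones that consume
    storage only on change) can be pictured with a toy model.  This is a
    hypothetical sketch: the Filer, HostCache and Workspace classes are
    invented for illustration and are not IC Manage's API or
    implementation.  It also leaves out the peer-to-peer transfers between
    hosts entirely.

```python
# Hypothetical sketch only: class and method names are invented for
# illustration and are NOT IC Manage's API or implementation.

class Filer:
    """Stand-in for the shared NFS filer; counts how often it gets hit."""
    def __init__(self, files):
        self._files = dict(files)
        self.reads = 0

    def names(self):
        return sorted(self._files)      # metadata only, no file contents

    def read(self, name):
        self.reads += 1
        return self._files[name]


class HostCache:
    """Local-disk cache shared by every workspace on one host."""
    def __init__(self, filer):
        self.filer = filer
        self._data = {}

    def fetch(self, name):
        if name not in self._data:      # first read pays the filer cost once
            self._data[name] = self.filer.read(name)
        return self._data[name]


class Workspace:
    """Holds file names plus this workspace's own changes (deltas)."""
    def __init__(self, cache, names, deltas=None):
        self.cache = cache
        self.names = list(names)
        self.deltas = dict(deltas or {})

    def read(self, name):
        if name in self.deltas:
            return self.deltas[name]
        return self.cache.fetch(name)

    def write(self, name, data):
        self.deltas[name] = data        # storage is consumed only on change
        if name not in self.names:
            self.names.append(name)

    def clone(self):
        # Copies names and deltas only; no file contents are duplicated.
        return Workspace(self.cache, self.names, self.deltas)


filer = Filer({"top.v": "module top; endmodule", "timing.rpt": "slack 0.12"})
cache = HostCache(filer)

ws_a = Workspace(cache, filer.names())  # populate = metadata only
ws_a.read("top.v")                      # first read goes to the filer
ws_b = ws_a.clone()                     # clone copies no file contents
ws_b.read("top.v")                      # served from the host cache
ws_b.write("top.v", "module top; wire w; endmodule")

print("filer reads:", filer.reads)
print("ws_a sees  :", ws_a.read("top.v"))
print("ws_b sees  :", ws_b.read("top.v"))
```

    In this toy model the filer is read once, the clone starts with no
    deltas of its own, and the edit in the clone does not disturb the
    original workspace.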

    We use Subversion, and IC Manage decided not to have PeerCache support 
    Subversion until 2017 so it's no longer a solution for us.

         ----    ----    ----    ----    ----    ----    ----

    IC Manage PeerCache P2P workflow is very interesting.

    It addresses a need we are currently facing, and that is IO bottlenecks
    on large overloaded NetApp filers.  We are seeing slower design data
    access times and building workspaces can be time consuming.

    It's interesting to me how widespread this problem is.

    There is a need for meta-data to be maintained and accessible away from
    the filer itself so that finds, snapshots, disk usage checks, time 
    stamp checks, etc. can all be done without filer access.

    IC Manage's PeerCache solution is a clever way to speed up data  
    access when you have multiple people on a project.  Their solution does
    this using software.

    IC Manage helps users who build workspaces. (e.g. DDM needs)  

    EMC pushed their hardware that is more scalable than NetApp to avoid 
    the IO bottlenecks.  Ellexus pushed their software for identifying IO
    bottlenecks and providing IO throttling and load balancing to help
    ease the problem.  Methodics provides a hardware solution.  IBM spoke
    about their new object storage solutions.  

         ----    ----    ----    ----    ----    ----    ----

    IC Manage had a great car at their booth. 

         ----    ----    ----    ----    ----    ----    ----

Related Articles

    Real Intent and Blue Pearl get #2 overall for Best EDA of 2016
    IC Manage PeerCache and CDNS Rocketick get #3 as Best of 2016
    MENT Calypto Catapult single handedly gets #4 Best EDA of 2016
    BDA, Solido, MunEDA, and Silvaco get #5 for Best EDA of 2016

Copyright 1991-2024 John Cooley.  All Rights Reserved.

   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)