( DAC'20 Item 02b ) ----------------------------------------------- [02/19/21]
Subject: CDNS Xcelium-ML gets 3x faster regressions is Best of 2020 #2b
SMART REGRESSIONS: Ever since Anirudh talked up machine learning (ML) and
computational software on my DAC 2019 Troublemaker panel, I've been
watching to see how Cadence would deliver on it.
"Right now, you run verification for 6 months and still don't know
whether you are finished with verification or not. So, overall
verification closure is a great area for machine learning.
- Anirudh Devgan, CEO of Cadence (ESNUG 588 #2)
I had spotted ML showing up in Innovus and JasperGold, along with Aart's
R&D talking up ML in Fusion Compiler and VC Formal, and Joe Sawicki's
R&D adding ML to Calibre -- plus touting his long-established ML lead with
his Solido OCV tools (which won "Best of 2018" in DAC'18 01). But I had
yet to find *any* user comments about ML being applied to logic simulators
for regressions, or to anything else simulation related.
That is, until now.
In this year's report, two early ML users share that Xcelium-ML got them
a 2.5X to 3X speed-up in regression runtimes -- with coverage comparable
to their original constrained random approach.
"Xcelium-ML helped us generate a 3X smaller regression set while
retaining 99+% coverage."
"Xcelium-ML improved our regression runtimes by 2.5X vs. Xcelium."
A bonus is that it also requires fewer licenses for the same coverage. And
since Xcelium-ML is based on statistical modeling, you'll get similar
coverage results each time.
TECHNICAL NOTES: Both users described how Xcelium-ML works: it first
monitors a vanilla Xcelium regression run, then later intelligently
generates "condensed" regressions that are 2.5x to 3x faster with the same
coverage. (ML voodoo involved, of course.)
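Cadence hasn't documented publicly what that monitoring pass records, so
here is only a rough Python sketch of the kind of per-test data it would
plausibly need: test name, seed, randomization knob settings, coverage
bins hit, and runtime. All names and values below are made up for
illustration.

    # Hypothetical sketch only -- not Cadence's actual format or API.
    from dataclasses import dataclass

    @dataclass
    class RunRecord:
        test_name: str      # test that was run
        seed: int           # random seed used for the run
        knobs: dict         # randomization settings, e.g. {"burst_len": 4}
        bins_hit: set       # functional/code coverage bins the run reached
        runtime_s: float    # wall-clock runtime of the run

    # A monitored regression then boils down to a list of such records,
    # which is what any "learn, then condense" step would consume.
    monitored_run = [
        RunRecord("axi_rand_test", 1234, {"burst_len": 4, "mode": "wrap"},
                  {"cov.axi.burst4", "cov.axi.wrap"}, 310.0),
        RunRecord("axi_rand_test", 5678, {"burst_len": 8, "mode": "incr"},
                  {"cov.axi.burst8", "cov.axi.incr"}, 295.0),
    ]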
---- ---- ---- ---- ---- ---- ----
QUESTION ASKED:
Q: "What were the 3 or 4 most INTERESTING specific EDA tools
you've seen in 2020? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
Cadence Xcelium-ML
Cadence Xcelium has been our primary simulator for years and continues
to deliver great performance and efficiency.
We've now begun working with Xcelium-ML -- which uses machine learning.
So far, we've gotten strong results from running Xcelium-ML on three
SoC System Component IPs.
Xcelium-ML:
- Improves our regression runtimes by 2.5x vs. Xcelium.
- Requires fewer licenses than Xcelium for the same coverage.
- Consistently delivers high coverage of 99+%.
HOW IT WORKS
To use Xcelium-ML, we start with our vManager regression database. Our
engineer then prepares a simple configuration file for Xcelium-ML.
Xcelium-ML has a command that then automates the rest of the flow.
- Xcelium-ML's machine learning functionality learns from our
original Xcelium regressions, and generates an equivalent
machine learning model.
- Xcelium-ML then uses this machine learning model to generate
new, smaller regression test suites. The new test suites
reduce the overall regression runtimes while still delivering
coverage results comparable to our original Xcelium tests.
- In addition, Xcelium-ML generates analytic reports showing
random control knobs and coverage bins.
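Cadence hasn't published what model Xcelium-ML builds internally, so here
is only a minimal Python sketch of the general idea in the list above:
learn a model that correlates randomization knob settings with the
coverage bins they hit, then use it to assemble a smaller regression
predicted to preserve coverage. The knob names, bin names, fake
simulator, and the decision-tree choice are all assumptions made for
illustration -- none of this is Cadence's actual algorithm or API.

    # Toy sketch of "learn from the original regression, then generate a
    # condensed one."  Nothing here is Cadence's actual implementation.
    import random
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = random.Random(0)
    KNOB_SPACE = {"burst_len": [1, 2, 4, 8], "mode": [0, 1], "qos": [0, 1, 2, 3]}
    BINS = [f"bin_{i}" for i in range(40)]

    def fake_sim(knobs):
        """Stand-in for one simulation run: maps knob settings to bins hit."""
        base = knobs["burst_len"] * 7 + knobs["qos"] * 3
        return {f"bin_{(base + knobs['mode'] * k) % 40}" for k in range(6)}

    def sample_knobs():
        return {k: rng.choice(v) for k, v in KNOB_SPACE.items()}

    # 1. "Monitored" regression: many random runs, recording knobs + bins hit.
    runs = []
    for _ in range(400):
        knobs = sample_knobs()
        runs.append((knobs, fake_sim(knobs)))

    X = np.array([[k["burst_len"], k["mode"], k["qos"]] for k, _ in runs])
    Y = np.array([[1 if b in hit else 0 for b in BINS] for _, hit in runs])

    # 2. Learn a model that correlates knob settings with coverage bins.
    model = DecisionTreeClassifier(random_state=0).fit(X, Y)

    # 3. Generate a condensed regression: keep only candidate knob settings
    #    that the model predicts will add not-yet-covered bins.
    covered, condensed = set(), []
    for _ in range(2000):
        k = sample_knobs()
        pred = model.predict([[k["burst_len"], k["mode"], k["qos"]]])[0]
        new_bins = {b for b, p in zip(BINS, pred) if p} - covered
        if new_bins:
            condensed.append(k)
            covered |= new_bins

    print(f"{len(runs)} original runs -> {len(condensed)} condensed runs, "
          f"{len(covered)}/{len(BINS)} bins predicted covered")

In this toy, candidates are screened by querying the model rather than
re-running every simulation, which is where the runtime and license
savings described above would come from.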
HOW WE USE IT
Whenever a new RTL drops, we replace our normal Xcelium regression
test suites with the ones generated by Xcelium-ML (to take advantage of
the reduced regression time, fewer licenses, and high-quality coverage.)
For example, in one case where our original full Xcelium regression for
an IP took 3 days using full random simulation, the new Xcelium-ML
regression finished in about 1.5 days.
We then define a coverage space of interest in an Xcelium-ML configuration
file, and Xcelium-ML generates a targeted regression for that coverage
space (a small sketch of the idea follows the list below).
- If our original tests already reached 100% coverage, Xcelium-ML
can reduce the number of test suites to achieve the same level
of coverage.
- If our original tests have less than 100% coverage, the likelihood
of hitting uncovered coverage bins is the same as with traditional
random simulation.
Xcelium-ML's machine learning can be used for both functional coverage
and code coverage.
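The actual Xcelium-ML configuration syntax isn't reproduced here; the
snippet below is only a hypothetical Python illustration of what
restricting to a "coverage space of interest" means -- filtering the bin
universe down to one part of the coverage hierarchy before generating the
targeted regression. The bin names are invented.

    # Hypothetical illustration; bin names and hierarchy are made up.
    ALL_BINS = [
        "cov.axi.write.burst1", "cov.axi.write.burst8",
        "cov.axi.read.burst1",  "cov.pcie.link.l0s",
    ]

    def bins_of_interest(all_bins, prefix):
        """Keep only the bins under the requested coverage subtree."""
        return [b for b in all_bins if b.startswith(prefix)]

    target_bins = bins_of_interest(ALL_BINS, "cov.axi.write")
    print(target_bins)   # ['cov.axi.write.burst1', 'cov.axi.write.burst8']
    # The condensed-regression generation (see the earlier sketch) would
    # then target target_bins instead of the full bin list.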
SMOKE & SOAK TESTING, SIMULATION RANKING, DIRECTED TESTS
- Smoke and soak testing. We've used Xcelium-ML for smoke and
soak testing to check our code health after any RTL changes.
- Directed tests. If we included directed tests in our original
regression tests, Xcelium-ML will also include them in its
generated tests.
- Compared to simulation ranking. Xcelium-ML shows better coverage
than simulation ranking does. With simulation ranking, you only
use a subset of your original tests and regressions, and the
ranked tests all need to keep their original seeds. (A simplified
sketch of ranking follows this list.)
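For contrast, here is a rough Python sketch of what simulation ranking
amounts to: it can only reorder and subset runs you have already recorded
(same tests, same seeds), keeping a run only if it adds new coverage. The
test names, seeds, and bins are made up.

    # Toy ranking sketch; not any tool's actual implementation.
    recorded_runs = [
        ("test_a", 111, {"b1", "b2", "b3"}),
        ("test_b", 222, {"b2", "b3"}),       # adds nothing after test_a
        ("test_c", 333, {"b4"}),
    ]

    covered, ranked = set(), []
    remaining = list(recorded_runs)
    while remaining:
        # Greedy: take the recorded run that adds the most new bins.
        best = max(remaining, key=lambda r: len(r[2] - covered))
        if not (new_bins := best[2] - covered):
            break                            # no remaining run adds coverage
        ranked.append(best)
        covered |= new_bins
        remaining.remove(best)

    print([name for name, _, _ in ranked], sorted(covered))
    # -> ['test_a', 'test_c'] ['b1', 'b2', 'b3', 'b4']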
Knowing this, I would recommend Xcelium-ML to substantially improve your
regression efficiency -- especially if you're already an Xcelium user, the
upgrade is worth it.
It's significantly reduced our regression runtimes.
---- ---- ---- ---- ---- ---- ----
Cadence Xcelium-ML
We've been using Cadence Xcelium for simulation and have built a testing
infrastructure around it.
- Xcelium runs our regressions, collects our various metrics,
enables coverage instrumentation and coverage collection, and
prints out our reports.
- We deploy a coverage-driven verification methodology to close
functional coverage and code coverage.
- It takes a lot of time for us to ensure the design is good to
go for production; we run a massive number of random seed
regressions that take a long time to close.
Cadence's new Xcelium-ML uses machine learning (ML) to observe the
different randomization points and create/train a learning model that
correlates the randomization and coverage. Then, during simulation, the
model directs the simulator to hit the coverage points.
- We evaluated Cadence Xcelium-ML against our prior Xcelium-based
methods.
- Xcelium-ML helped us generate a 3X smaller regression set
while retaining 99+% coverage.
Below I discuss our Xcelium-ML evaluation process, and our results.
CONSTRAINED RANDOM ISSUES
Some of the expectations built into our verification approach:
- No bug escapes.
- Efficiently address changes in our product specifications
during active development.
- Contribute to accelerating verification closure <-- shrinks
our time to market.
- Optimize resources <-- Our simulation runs consume a lot of
resources, e.g., engineering, software licenses, compute servers.
As we are primarily using a constrained random approach, we have a lot
of challenges for regression profile selection, such as:
- Which randomization knob will give me which covered point?
This is difficult to predict.
- How many randomized testcases should we run?
- What scenarios will cover our deep logic? We need scenarios to
cover logic deep inside the design, but it's not obvious what
those scenarios should be. Not all our engineers are experts
at analyzing and creating the scenarios -- it takes both design
familiarity and skill.
- Are there redundant test cases we can avoid?
As a result, we end up in a lot of trial-and-error iterations, which
consume a lot of time.
We've shifted our focus from *what* to cover to *how* to cover it.
We've tried methods like directed tests and regression result ranking to
be more efficient.
In general, with random regressions we were getting diminishing benefits
with the additional regressions over time.
HOW XCELIUM-ML WORKS
Cadence Xcelium-ML is a machine-learning-based extension/utility that
works with Xcelium. Here is how we use it.
1. We run our Xcelium regressions normally, but with the Xcelium-ML
interface enabled.
2. This starts the ML learning process, which collects the
regression data, the coverage data, and which settings were
used on the "control knobs".
The learning process can run in parallel with new regressions,
or we can use it to collect data from prior regressions.
3. We invoke the generation process, and Xcelium-ML uses the models
to generate condensed regressions with scenarios that reach
comparable coverage but with faster closure.
It works transparently with Xcelium -- we don't have to tweak a lot of
knobs and/or put in a lot of extra effort to use it.
TARGET DESIGN FOR THE EVALUATION
To do our evaluation quickly, and to easily quantify results:
- We took a smaller IP that was already completely verified.
- We used a coverage-driven approach with a lot of randomization
because that's the focus of the Xcelium-ML tool.
- Our target regression was about 10 hours. We used Cadence VIP
to facilitate this (so that we didn't have any new variables).
Our design IP had:
- 3,300 functional coverage points
- 10,000 code coverage points
- 272 individual tests
EVAL: XCELIUM-ML VS. CONSTRAINED RANDOM
We ran randomized test cases, with vanilla Xcelium. For our original
results, we had around 2800 test cases and cumulatively they picked up
around 7800 bins (=100%). It took us about 70 hours.
Because the tool was not yet in production release during our eval, the
Cadence team came to our office, set up their ML models, and ran the
evaluation on-site.
Xcelium-ML produced three regression suites, which I've compared to our
original results.
1. ML-1 had 126 seeds, resulting in ~5% of the original
testcases. It only took ~6% of the original runtime and
achieved ~97% of the original coverage.
2. ML-2 had 683 seeds, which translates to 25% of the test
cases. It only took ~30% of the original runtime.
It got us 99.1% of the original coverage.
3. ML-3 had 1,326 seeds. This third test suite had 49% of the
original test cases, and only took 60% of the runtime. It
achieved 99.4% of the original coverage.
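As a quick sanity check on those ratios -- using the "around 2,800 test
cases" and "about 70 hours" figures reported above, so the percentages
come out slightly rounded versus the quoted ones -- a few lines of Python:

    # Back-of-the-envelope check of the ML-1/ML-2/ML-3 ratios quoted above.
    original_tests, original_hours = 2800, 70

    for name, seeds, runtime_pct, coverage_pct in [
        ("ML-1",  126,  6, 97.0),
        ("ML-2",  683, 30, 99.1),
        ("ML-3", 1326, 60, 99.4),
    ]:
        share = 100 * seeds / original_tests        # % of original test cases
        hours = original_hours * runtime_pct / 100  # absolute runtime
        print(f"{name}: {seeds} seeds = {share:.1f}% of tests, "
              f"~{hours:.0f} h runtime, {coverage_pct}% of original coverage")

    # ML-1: 126 seeds = 4.5% of tests, ~4 h runtime, 97.0% of original coverage
    # ML-2: 683 seeds = 24.4% of tests, ~21 h runtime, 99.1% of original coverage
    # ML-3: 1326 seeds = 47.4% of tests, ~42 h runtime, 99.4% of original coverage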
NOTE: Xcelium-ML produces different run set options automatically
and indicates what coverage level to expect for each one. As part of
our eval, we intentionally ran all three sets. However, when we
actually deploy the tool, we would just choose the coverage level we
want to achieve, and then only run that set. For example, when the
runtime increases a lot but the coverage gain is negligible, we might
choose the suite with slightly lower coverage -- e.g., 99.1% instead of
99.4% -- and then figure out how to cover the remaining fraction of a
percent.
Our findings: Xcelium-ML was very promising. It got us good coverage
closure in significantly reduced runtime.
EVAL: XCELIUM-ML VS. REGRESSION RANKING
We also compared Xcelium-ML with regression ranking. With ranking, the
simulation tool takes the complete original set, and then determines
which testcases are the most meaningful ones.
So, the ranking regressions set is a subset of the original regressions
with the same seeds.
- The vanilla Xcelium ranking operation gave us 156 regressions
that resulted in only 91% of coverage.
- The Xcelium-ML ML-1 set, with 126 seeds, gave 97% coverage, even
though its number of seeds and its runtime were very comparable
to the vanilla Xcelium ranking.
Xcelium-ML also got us 97% functional coverage, compared with only 84%
for the vanilla Xcelium ranking.
Another difference is that with a ranking approach, if you change
anything, such as replacing your seeds with randomly generated seeds,
it's not guaranteed that you'll get the same numbers again.
In contrast, since Xcelium-ML is based on statistical modeling, it's
pretty much guaranteed that you'll get very similar results each time.
Our findings: Xcelium-ML works better than the ranking approach in
terms of coverage results, and it significantly cuts down regression time
compared with our original approach.
CADENCE'S UMBRELLA SWITCHES
In general, when Cadence enhances Xcelium with additional optimizations,
they add umbrella switches to turn the features "on" or "off".
This is because:
- Not all engineers want all the optimizations.
- Some engineers want time to fully test out the new features
and get comfortable with them.
There are a lot of switches that come with Xcelium-ML, including some
specifically related to performance. So, if we run into any issues, we
can just disable a switch and return the tool to its default behavior.
XCELIUM-ML -- 3X FASTER USING FEWER RESOURCES
Xcelium-ML helped us:
- Stay focused on randomization to unveil hidden bugs. (vs.
too directed)
- Eliminate regression redundancy. (3X smaller regressions)
- Optimize our resources and human effort. Reducing the number
of regressions saved us compute resources as well as 4 to 5
days of engineering time manually writing testcases to reach
99%+ coverage.
Plus, we should be able to spend less time debugging our testbenches,
e.g., simulations hit by illegal scenarios.
Xcelium-ML looks very promising, and over the next few months, we will
look at Cadence's production version on bigger designs. If everything
goes well, we expect to adopt it as a part of our flow.
---- ---- ---- ---- ---- ---- ----
CADENCE XCELIUM
My team has used Cadence Xcelium for three years now; it's our primary
simulator for daily simulation work. Our design is 400M+ instances;
we do daily regressions and are happy with the TAT.
- We are very happy with Xcelium's overall speed and capacity
improvements over time. Its performance has improved quite a
bit simply with its out of box settings.
- We also love HAL, which makes it easy to catch some early RTL
issues that would otherwise require non-default switches in
Synopsys SpyGlass.
- Xcelium's random engine gives a better distribution, which
improves our constraint-driven verification efficiency.
- Incisive Metric Center + Unreachability Analysis help quite a
bit with our coverage closure.
- Xcelium multicore is now the only simulator we use for our
ATPG simulations.
- Xcelium (previously IES) has good compliance with language
standards. It is one of the reasons we use it as our main
simulator.
I'm happy to see Cadence's year-to-year tool improvements. Its
interoperability is key, as the frontend design flow nowadays is no
longer a single tool; Xcelium is doing very well in this area with
multiple built-in offerings. My only suggestion for improvement is the
waveform viewer. (Verdi users will know what I mean.)
---- ---- ---- ---- ---- ---- ----
We'd love to try Cadence's new machine-learning-enabled Xcelium-ML.
We've seen quite a bit of improvement in backend tools with ML,
so it will be very interesting to see how it works on the frontend.
---- ---- ---- ---- ---- ---- ----
Related Articles
Cadence vManager saves 1 hour/day/engineer is Best of 2020 #2a
CDNS Xcelium-ML gets 3x faster regressions is Best of 2020 #2b