( DAC'20 Item 02b ) ----------------------------------------------- [02/19/21]
Subject: CDNS Xcelium-ML gets 3x faster regressions is Best of 2020 #2b
SMART REGRESSIONS: Ever since Anirudh talked up machine learning (ML) and
computational software on my DAC 2019 Troublemaker panel, I've been
watching to see how Cadence would deliver on it.
"Right now, you run verification for 6 months and still don't know
whether you are finished with verification or not. So, overall
verification closure is a great area for machine learning.
- Anirudh Devgan, CEO of Cadence (ESNUG 588 #2)
I had spotted ML showing up in Innovus and JasperGold, along with Aart's
R&D talking up ML in Fusion Compiler and VC Formal, and Joe Sawicki's
R&D adding ML to Calibre -- plus touting his long-established ML lead with
his Solido OCV tools (which won "Best of 2018" in DAC'18 01). But I had
yet to find *any* user comments about ML being applied to logic simulators
for regressions, or to anything else simulation related.
That is, until now.
In this year's report, two early ML users share that Xcelium-ML got them
a 2.5X to 3X speed-up in regression runtimes -- with coverage comparable
to their original constrained random approach.
"Xcelium-ML helped us generate a 3X smaller regression set while
retaining 99+% coverage."
"Xcelium-ML improved our regression runtimes by 2.5X vs. Xcelium."
A bonus is that it also requires fewer licenses for the same coverage. And
since Xcelium-ML is based on statistical modeling, you'll get similar
coverage results each time.
TECHNICAL NOTES: Both users described how Xcelium-ML works: it first
monitors a vanilla Xcelium regression run, then later intelligently
generates "condensed" regressions that are 2.5x to 3x faster with the same
coverage. (ML voodoo involved, of course.)
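Cadence hasn't documented publicly what that monitoring pass records, so
here is only a rough Python sketch of the kind of per-test data it would
plausibly need: test name, seed, randomization knob settings, coverage
bins hit, and runtime. All names and values below are made up for
illustration.

    # Hypothetical sketch only -- not Cadence's actual format or API.
    from dataclasses import dataclass

    @dataclass
    class RunRecord:
        test_name: str      # test that was run
        seed: int           # random seed used for the run
        knobs: dict         # randomization settings, e.g. {"burst_len": 4}
        bins_hit: set       # functional/code coverage bins the run reached
        runtime_s: float    # wall-clock runtime of the run

    # A monitored regression then boils down to a list of such records,
    # which is what any "learn, then condense" step would consume.
    monitored_run = [
        RunRecord("axi_rand_test", 1234, {"burst_len": 4, "mode": "wrap"},
                  {"cov.axi.burst4", "cov.axi.wrap"}, 310.0),
        RunRecord("axi_rand_test", 5678, {"burst_len": 8, "mode": "incr"},
                  {"cov.axi.burst8", "cov.axi.incr"}, 295.0),
    ]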
---- ---- ---- ---- ---- ---- ----
QUESTION ASKED:
Q: "What were the 3 or 4 most INTERESTING specific EDA tools
you've seen in 2020? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
Cadence Xcelium-ML
Cadence Xcelium has been our primary simulator for years and continues
to deliver great performance and efficiency.
We've now begun working with Xcelium-ML -- which uses machine learning.
So far, we've gotten strong results from running Xcelium-ML on three
SoC System Component IPs.
Xcelium-ML:
- Improves our regression runtimes by 2.5x vs. Xcelium.
- Requires fewer licenses than Xcelium for the same coverage.
- Consistently delivers high coverage of 99+%.
HOW IT WORKS
To use Xcelium-ML, we start with our vManager regression database. Our
engineer then prepares a simple configuration file for Xcelium-ML.
Xcelium-ML has a command that then automates the rest of the flow.
- Xcelium-ML's machine learning functionality learns from our
original Xcelium regressions, and generates an equivalent
machine learning model.
- Xcelium-ML then uses this machine learning model to generate
new, smaller regression test suites. The new test suites
reduce the overall regression runtimes while still delivering
coverage results comparable to our original Xcelium tests.
- In addition, Xcelium-ML generates analytic reports showing
random control knobs and coverage bins.
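Cadence hasn't published what model Xcelium-ML builds internally, so here
is only a minimal Python sketch of the general idea in the list above:
learn a model that correlates randomization knob settings with the
coverage bins they hit, then use it to assemble a smaller regression
predicted to preserve coverage. The knob names, bin names, fake
simulator, and the decision-tree choice are all assumptions made for
illustration -- none of this is Cadence's actual algorithm or API.

    # Toy sketch of "learn from the original regression, then generate a
    # condensed one."  Nothing here is Cadence's actual implementation.
    import random
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = random.Random(0)
    KNOB_SPACE = {"burst_len": [1, 2, 4, 8], "mode": [0, 1], "qos": [0, 1, 2, 3]}
    BINS = [f"bin_{i}" for i in range(40)]

    def fake_sim(knobs):
        """Stand-in for one simulation run: maps knob settings to bins hit."""
        base = knobs["burst_len"] * 7 + knobs["qos"] * 3
        return {f"bin_{(base + knobs['mode'] * k) % 40}" for k in range(6)}

    def sample_knobs():
        return {k: rng.choice(v) for k, v in KNOB_SPACE.items()}

    # 1. "Monitored" regression: many random runs, recording knobs + bins hit.
    runs = []
    for _ in range(400):
        knobs = sample_knobs()
        runs.append((knobs, fake_sim(knobs)))

    X = np.array([[k["burst_len"], k["mode"], k["qos"]] for k, _ in runs])
    Y = np.array([[1 if b in hit else 0 for b in BINS] for _, hit in runs])

    # 2. Learn a model that correlates knob settings with coverage bins.
    model = DecisionTreeClassifier(random_state=0).fit(X, Y)

    # 3. Generate a condensed regression: keep only candidate knob settings
    #    that the model predicts will add not-yet-covered bins.
    covered, condensed = set(), []
    for _ in range(2000):
        k = sample_knobs()
        pred = model.predict([[k["burst_len"], k["mode"], k["qos"]]])[0]
        new_bins = {b for b, p in zip(BINS, pred) if p} - covered
        if new_bins:
            condensed.append(k)
            covered |= new_bins

    print(f"{len(runs)} original runs -> {len(condensed)} condensed runs, "
          f"{len(covered)}/{len(BINS)} bins predicted covered")

In this toy, candidates are screened by querying the model rather than
re-running every simulation, which is where the runtime and license
savings described above would come from.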
HOW WE USE IT
Whenever a new RTL drops, we replace our normal Xcelium regression
test suites with the ones generated by Xcelium-ML (to take advantage of
the reduced regression time, fewer licenses, and high-quality coverage.)
For example, in one case where our original full Xcelium regression for
an IP took 3 days using full random simulation, the new Xcelium-ML
regression finished in about 1.5 days.
We then define a coverage space of interest in an Xcelium-ML configuration
file, and Xcelium-ML generates a targeted regression for that coverage
space (a small sketch of the idea follows the list below).
- If our original tests already reached 100% coverage, Xcelium-ML
can reduce the number of test suites to achieve the same level
of coverage.
- If our original tests have less than 100% coverage, the likelihood
of hitting uncovered coverage bins is the same as with traditional
random simulation.
Xcelium-ML's machine learning can be used for both functional coverage
and code coverage.
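The actual Xcelium-ML configuration syntax isn't reproduced here; the
snippet below is only a hypothetical Python illustration of what
restricting to a "coverage space of interest" means -- filtering the bin
universe down to one part of the coverage hierarchy before generating the
targeted regression. The bin names are invented.

    # Hypothetical illustration; bin names and hierarchy are made up.
    ALL_BINS = [
        "cov.axi.write.burst1", "cov.axi.write.burst8",
        "cov.axi.read.burst1",  "cov.pcie.link.l0s",
    ]

    def bins_of_interest(all_bins, prefix):
        """Keep only the bins under the requested coverage subtree."""
        return [b for b in all_bins if b.startswith(prefix)]

    target_bins = bins_of_interest(ALL_BINS, "cov.axi.write")
    print(target_bins)   # ['cov.axi.write.burst1', 'cov.axi.write.burst8']
    # The condensed-regression generation (see the earlier sketch) would
    # then target target_bins instead of the full bin list.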
SMOKE & SOAK TESTING, SIMULATION RANKING, DIRECTED TESTS
- Smoke and soak testing. We've used Xcelium-ML for smoke and
soak testing to check our code health after any RTL changes.
- Directed tests. If we included directed tests in our original
regression tests, Xcelium-ML will also include them in its
generated tests.
- Compared to simulation ranking. Xcelium-ML shows better coverage
than simulation ranking does. With simulation ranking, you only
use a subset of your original tests and regressions, and the
ranked tests all need to keep their original seeds. (A simplified
sketch of ranking follows this list.)
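For contrast, here is a rough Python sketch of what simulation ranking
amounts to: it can only reorder and subset runs you have already recorded
(same tests, same seeds), keeping a run only if it adds new coverage. The
test names, seeds, and bins are made up.

    # Toy ranking sketch; not any tool's actual implementation.
    recorded_runs = [
        ("test_a", 111, {"b1", "b2", "b3"}),
        ("test_b", 222, {"b2", "b3"}),       # adds nothing after test_a
        ("test_c", 333, {"b4"}),
    ]

    covered, ranked = set(), []
    remaining = list(recorded_runs)
    while remaining:
        # Greedy: take the recorded run that adds the most new bins.
        best = max(remaining, key=lambda r: len(r[2] - covered))
        if not (new_bins := best[2] - covered):
            break                            # no remaining run adds coverage
        ranked.append(best)
        covered |= new_bins
        remaining.remove(best)

    print([name for name, _, _ in ranked], sorted(covered))
    # -> ['test_a', 'test_c'] ['b1', 'b2', 'b3', 'b4']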
Knowing this, I would recommend Xcelium-ML to substantially improve your
regression efficiency -- especially if you're already an Xcelium user, the
upgrade is worth it.
It's significantly reduced our regression runtimes.
---- ---- ---- ---- ---- ---- ----
Cadence Xcelium-ML
We've been using Cadence Xcelium for simulation and have built a testing
infrastructure around it.
- Xcelium runs our regressions, collects our various metrics,
enables coverage instrumentation and coverage collection, and
prints out our reports.
- We deploy a coverage-driven verification methodology to close
functional coverage and code coverage.
- It takes a lot of time for us to ensure the design is good to
go for production; we run a massive number of random seed
regressions that take a long time to close.
Cadence's new Xcelium-ML uses machine learning (ML) to observe the
different randomization points and create/train a learning model that
correlates the randomization and coverage. Then, during simulation, the
model directs the simulator to hit the coverage points.
- We evaluated Cadence Xcelium-ML against our prior Xcelium-based
methods.
- Xcelium-ML helped us generate a 3X smaller regression set
while retaining 99+% coverage.
Below I discuss our Xcelium-ML evaluation process, and our results.
CONSTRAINED RANDOM ISSUES
Some of the expectations built into our verification approach:
- No bug escapes.
- Efficiently address changes in our product specifications
during active development.
- Contribute to accelerating verification closure <-- shrinks
our time to market.
- Optimize resources <-- Our simulation runs consume a lot of
resources, e.g., engineering, software licenses, compute servers.
As we are primarily using a constrained random approach, we have a lot
of challenges for regression profile selection, such as:
- Which randomization knob will give me which covered point?
This is difficult to predict.
- How many randomized testcases should we run?
- What scenarios will cover our deep logic? We need scenarios to
cover logic deep inside the design, but it's not obvious what
those scenarios should be. Not all our engineers are experts
at analyzing and creating the scenarios -- it takes both design
familiarity and skill.
- Are there redundant test cases we can avoid?
As a result, we end up in a lot of trial-and-error iterations, which
consume a lot of time.
We've shifted our focus from *what* to cover to *how* to cover it.
We've tried methods like directed tests and regression result ranking to
be more efficient.
In general, with random regressions we were getting diminishing benefits
with the additional regressions over time.
HOW XCELIUM-ML WORKS
Cadence Xcelium-ML is a machine-learning-based extension/utility that
works with Xcelium. Here is how we use it.
1. We run our Xcelium regressions normally, but with the Xcelium-ML
interface enabled.
2. This starts the ML learning process, which collects the
regression data, the coverage data, and which settings were
used on the "control knobs".
The learning process can run in parallel with new regressions,
or we can use it to collect data from prior regressions.
3. We invoke the generation process, and Xcelium-ML uses the models
to generate condensed regressions with scenarios that reach
comparable coverage but with faster closure.
It works transparently with Xcelium -- we don't have to tweak a lot of
knobs and/or put in a lot of extra effort to use it.
TARGET DESIGN FOR THE EVALUATION
To do our evaluation quickly, and to easily quantify results:
- We took a smaller IP that was already completely verified.
- We used a coverage-driven approach with a lot of randomization
because that's the focus of the Xcelium-ML tool.
- Our target regression was about 10 hours. We used Cadence VIP
to facilitate this (so that we didn't have any new variables).
Our design IP had:
- 3,300 functional coverage points
- 10,000 code coverage points
- 272 individual tests
EVAL: XCELIUM-ML VS. CONSTRAINED RANDOM
We ran randomized test cases, with vanilla Xcelium. For our original
results, we had around 2800 test cases and cumulatively they picked up
around 7800 bins (=100%). It took us about 70 hours.
Because the tool was not yet in production release during our eval, the
Cadence team came to our office, set up their ML models, and ran the
evaluation on-site.
Xcelium-ML produced three regression suites, which I've compared to our
original results.
1. ML-1 had 126 seeds, resulting in ~5% of the original
testcases. It only took ~6% of the original runtime and
achieved ~97% of the original coverage.
2. ML-2 had 683 seeds, which translates to 25% of the test
cases. It only took ~30% of the original runtime.
It got us 99.1% of the original coverage.
3. ML-3 had 1,326 seeds. This third test suite had 49% of the
original test cases, and only took 60% of the runtime. It
achieved 99.4% of the original coverage.
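As a quick sanity check on those ratios -- using the "around 2,800 test
cases" and "about 70 hours" figures reported above, so the percentages
come out slightly rounded versus the quoted ones -- a few lines of Python:

    # Back-of-the-envelope check of the ML-1/ML-2/ML-3 ratios quoted above.
    original_tests, original_hours = 2800, 70

    for name, seeds, runtime_pct, coverage_pct in [
        ("ML-1",  126,  6, 97.0),
        ("ML-2",  683, 30, 99.1),
        ("ML-3", 1326, 60, 99.4),
    ]:
        share = 100 * seeds / original_tests        # % of original test cases
        hours = original_hours * runtime_pct / 100  # absolute runtime
        print(f"{name}: {seeds} seeds = {share:.1f}% of tests, "
              f"~{hours:.0f} h runtime, {coverage_pct}% of original coverage")

    # ML-1: 126 seeds = 4.5% of tests, ~4 h runtime, 97.0% of original coverage
    # ML-2: 683 seeds = 24.4% of tests, ~21 h runtime, 99.1% of original coverage
    # ML-3: 1326 seeds = 47.4% of tests, ~42 h runtime, 99.4% of original coverage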
NOTE: Xcelium-ML produces different run set options automatically
and indicates what coverage level to expect for each one. As part of
our eval, we intentionally ran all three sets. However, when we
actually deploy the tool, we would just choose the coverage level we
want to achieve, and then only run that set. For example, when the
runtime increases a lot but the coverage gain is negligible, we might
choose the suite with slightly lower coverage -- e.g., 99.1% instead of
99.4% -- and then figure out how to cover the remaining fraction of a
percent.
Our findings: Xcelium-ML was very promising. It got us good coverage
closure in significantly reduced runtime.
EVAL: XCELIUM-ML VS. REGRESSION RANKING
We also compared Xcelium-ML with regression ranking. With ranking, the
simulation tool takes the complete original set, and then determines
which testcases are the most meaningful ones.
So, the ranking regressions set is a subset of the original regressions
with the same seeds.
- The vanilla Xcelium ranking operation gave us 156 regressions
that resulted in only 91% of coverage.
- The Xcelium-ML ML-1 set, with 126 seeds, gave 97% coverage, even
though its number of seeds and its runtime were very comparable
to the vanilla Xcelium ranking.
Xcelium-ML also got us 97% functional coverage, compared with only 84%
for the vanilla Xcelium ranking.
Another difference is that with a ranking approach, if you change
anything, such as replacing your seeds with randomly generated seeds,
it's not guaranteed that you'll get the same numbers again.
In contrast, since Xcelium-ML is based on statistical modeling, it's
pretty much guaranteed that you'll get very similar results each time.
Our findings: Xcelium-ML works better than the ranking approach in
terms of coverage results, and it significantly cuts down regression time
compared with our original approach.
CADENCE'S UMBRELLA SWITCHES
In general, when Cadence enhances Xcelium with additional optimizations,
they add umbrella switches to turn the features "on" or "off".
This is because:
- Not all engineers want all the optimizations.
- Some engineers want time to fully test out the new features
and get comfortable with them.
There are a lot of switches that come with Xcelium-ML, including some
specifically related to performance. So, if we run into any issues, we
can just disable a switch and return the tool to its default behavior.
XCELIUM-ML -- 3X FASTER USING FEWER RESOURCES
Xcelium-ML helped us:
- Stay focused on randomization to unveil hidden bugs. (vs.
too directed)
- Eliminate regression redundancy. (3X smaller regressions)
- Optimize our resources and human effort. Reducing the number
of regressions saved us compute resources as well as 4 to 5
days of engineering time manually writing testcases to reach
99%+ coverage.
Plus, we should be able to spend less time debugging our testbenches,
e.g., simulations hit by illegal scenarios.
Xcelium-ML looks very promising, and over the next few months, we will
look at Cadence's production version on bigger designs. If everything
goes well, we expect to adopt it as a part of our flow.
---- ---- ---- ---- ---- ---- ----
CADENCE XCELIUM
My team has used Cadence Xcelium for three years now; it's our primary
simulator for daily simulation work. Our design is 400M+ instances;
we do daily regressions and are happy with the TAT.
- We are very happy with Xcelium's overall speed and capacity
improvements over time. Its performance has improved quite a
bit simply with its out of box settings.
- We also love HAL, which makes it easy to catch some early RTL
issues that would otherwise require non-default switches in
Synopsys SpyGlass.
- Xcelium's random engine gives a better distribution, which
improves our constraint-driven verification efficiency.
- Incisive Metric Center + Unreachability Analysis help quite a
bit with our coverage closure.
- Xcelium multicore is now the only simulator we use for our
ATPG simulations.
- Xcelium (previously IES) has good compliance with language
standards. It is one of the reasons we use it as our main
simulator.
I'm happy to see Cadence's year-to-year tool improvements. Its
interoperability is key, as the frontend design flow nowadays is no
longer a single tool; Xcelium is doing very well in this area with
multiple built-in offerings. My only suggestion for improvement is the
waveform viewer. (Verdi users will know what I mean.)
---- ---- ---- ---- ---- ---- ----
We'd love to try Cadence's new machine-learning-enabled Xcelium-ML.
We've seen quite a bit of improvement in backend tools with ML,
so it will be very interesting to see how it works on the frontend.
---- ---- ---- ---- ---- ---- ----
Related Articles
Cadence vManager saves 1 hour/day/engineer is Best of 2020 #2a
CDNS Xcelium-ML gets 3x faster regressions is Best of 2020 #2b