( ESNUG 417 Item 8 ) -------------------------------------------- [09/08/03]
Subject: ( ESNUG 410 #10 ) Synopsys Caught Spreading Vera/NC-Sim Speed FUD
> Our Synopsys AC is saying that we can improve our Vera simulation run
> times by a factor of between 1.8X and 6X by using Vera/VCS instead of
> Vera/NC-Sim. The reason is that VCS implements a direct kernel interface
> whereas Cadence NC-Sim uses the slower PLI interface. However, when we
> run our testcases, we find that the two approaches are actually quite
> competitive. What's going on? Are we missing out on some performance
> increases that everyone else is getting?
>
> - David Sawey
> Vitesse Semiconductor Corp. Richardson, TX
From: Chris Spear <chris.spear=user domain=synopsys hot calm>
To: David Sawey <dsawey=user domain=vitesse got mom>
David & John,
Pardon my ignorance, but if your Synopsys AC made a claim, and your
experiments could not reproduce it, why write to Cooley? Why not call the
AC and ask him what went wrong? Nothing against John, but he is not going
to speed up your simulation.
I can't send you a fruit basket to make things better, but I can give you
the same tips that your AC would.
1) Make sure you are using the -vera switch and not the old style
-P vera_pli.tab and libSysSciTask.a method to interface VCS and Vera.
The -vera switch is key for using the DKI; so always use it with VCS.
2) Still no speedup? Is one possible? Do a profile with "vcs +prof"
as Rajesh Bawankule explains in his DVCON paper, or just look in the
VCS User Manual. If your Vera simulation and the PLI are only using
10-30% of simulation time, no Vera tricks can give you 2x.
3) Okay, you are using the DKI, Vera is taking a non-trivial amount of
time, but VCS/Vera is still taking about the same amount of time as
NCV/Vera? Maybe there are other bottlenecks. Are you dumping lots
of waveforms? Do you have other PLI applications which need access
into the entire design (read, write, force, callback)? Then VCS
cannot make its best optimizations.
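The arithmetic behind tip 2 can be sketched in a few lines of Python. This is only an illustration of Amdahl's Law, assuming the Vera and PLI share of runtime could be driven all the way to zero; the function name is made up for this example and is not part of any Synopsys tool.

```python
def max_overall_speedup(accelerated_fraction: float) -> float:
    """Amdahl's Law upper bound: the accelerated share drops to zero time.

    accelerated_fraction is the share of total runtime (0.0 to <1.0) spent
    in the part being optimized, here Vera plus the PLI.
    """
    if not 0.0 <= accelerated_fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / (1.0 - accelerated_fraction)


# The 10-30% range quoted in tip 2.
for fraction in (0.10, 0.30):
    bound = max_overall_speedup(fraction)
    print(f"Vera+PLI share {fraction:.0%}: at most {bound:.2f}x overall")
```

Even at the 30% end, the bound is roughly 1.43x, so a 2x gain is out of reach no matter how fast the Vera side becomes.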
There are other VCS optimizations and tricks. First, call your Synopsys
AC. We are always eager to help out customers, especially one who is
using the competitor's tools! You can also pick up ideas from Rajesh's
DVCON paper, VCS's documentation, and SolvNet.
- Chris Spear
Synopsys AC Marlboro, MA
---- ---- ---- ---- ---- ---- ----
From: David Sawey <dsawey=user domain=vitesse got mom>
To: Chris Spear <chris.spear=user domain=synopsys hot calm>
Hi Chris,
Thank you for your suggestions. Fortunately, our local Synopsys support is
excellent. We are currently working closely with them on this issue. We
just wanted to get feedback from other Synopsys users to see if anyone else
had experienced the speed up. We want to get it, too!
- David Sawey
Vitesse Semiconductor Corp. Richardson, TX
---- ---- ---- ---- ---- ---- ----
From: Sean Smith <sesmith=user company=cisco spot balm>
Hi, John,
This sounds like some marketing foo to me. We use Specman, not Vera, but
both are PLI-based tools in an NC environment. NC-Sim has a useful feature
for profiling code, so we spent some time looking at this a while back based
on the FUD being spread. The reality for us with the Specman/NC combination
is that PLI overhead accounted for between 0.5% and 2.5% of the CPU time used
by the sim. The rest is consumed by Verilog and Specman. You can do the math,
but totally eliminating that 2.5% worst-case overhead we measured would
result in negligible real-world performance gains.
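To put that measured range in wall-clock terms, here is a small sketch. The 10-hour regression length is purely a hypothetical figure chosen for illustration, as is the helper name; only the 0.5% and 2.5% overheads come from the measurements above.

```python
def minutes_saved(run_minutes: float, overhead_fraction: float) -> float:
    """Time recovered if the given share of the run disappeared entirely."""
    return run_minutes * overhead_fraction


RUN_MINUTES = 600.0  # hypothetical 10-hour regression, not a measured value

# Measured PLI overhead range: 0.5% best case, 2.5% worst case.
for overhead in (0.005, 0.025):
    saved = minutes_saved(RUN_MINUTES, overhead)
    print(f"PLI overhead {overhead:.1%}: about {saved:.1f} of "
          f"{RUN_MINUTES:.0f} minutes saved")
```

Under that assumption the saving is on the order of 3 to 15 minutes out of 10 hours, which is why removing the PLI layer alone cannot explain a multi-X speedup.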
Now, on the other hand, the real reason Vera users are now getting some real
performance gains is that Vera finally has a compiled mode vs. interpreted
mode. Now whether Synopsys is hiding their new Vera compiler for VCS users
only, I have no idea, but don't buy the PLI overhead argument.
I certainly can't measure it on a # of real world designs here at Cisco.
One argument Synopsys makes that has potential is that when they compile
Verilog/Vera into one image (vs., say, two for the compiled NC-Verilog/Specman
approach we use), they can do some optimizations across the Vera/Verilog
boundary for even more performance. They have quoted no #'s here, so I
can't guess how large the benefit would be, but it seems reasonable.
- Sean Smith
Cisco Systems
---- ---- ---- ---- ---- ---- ----
From: Chris Gori <cgori=user company=sanera.net>
John -
We are just starting to go the other way (we use VCS and want to bring up
NC-Verilog as an alternative platform). I suspect that you will benefit
from running "vcs +prof" when you compile (as we did). When you run the
simv it will write out a vcs.prof file showing %PLI, %SIM, etc., which can
tell you exactly what is happening (and/or why it is slow). I don't know
the corresponding switch in NC-Verilog yet.
In many cases it is possible that you will be spending 80-90% of your time
in Vera (if you have good "cycle-sim-type" RTL that VCS/NCV can optimize
heavily). If you don't have a ton of events crossing the Verilog/Vera
boundary (either via DKI or PLI) then it won't matter what the interface
is, and you won't see much speedup (it's an Amdahl's Law type of thing).
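Turning Amdahl's Law around gives a quick sanity check on the 1.8X to 6X range quoted in the original question. The sketch below assumes the interface cost could be removed completely, which is the most favorable case; the function name is hypothetical.

```python
def required_fraction(target_speedup: float) -> float:
    """Minimum share of runtime that must vanish to reach the target.

    Derived from Amdahl's Law with the accelerated part reduced to zero:
    speedup = 1 / (1 - f), so f = 1 - 1 / speedup.
    """
    if target_speedup < 1.0:
        raise ValueError("target speedup must be at least 1.0")
    return 1.0 - 1.0 / target_speedup


# Speedup range claimed for Vera/VCS over Vera/NC-Sim in the original post.
for target in (1.8, 6.0):
    share = required_fraction(target)
    print(f"{target:.1f}x overall needs at least {share:.0%} of runtime "
          f"in the part being removed")
```

So a 1.8X gain would need roughly 44% of the run to be interface overhead, and 6X would need about 83%. A profile showing far less boundary traffic than that suggests the interface is not where the time is going.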
The other possibility is you are using a type of Vera binding (dynamic or
static) that does not allow DKI to accelerate it. I believe dynamic
binding is the problematic one, but I would have to check to be sure.
I believe some of our verification/testbench engineers told me that you
could profile Vera code as well to find the slow-running parts. A couple
versions ago associative arrays were dog-slow. We found a few places,
either by profiling or just brute-force selective commenting, where a
one-line change could give 2X speedup on some testcases. Profiling is
your friend... :-)
Can I ask about your readers' experiences with NC-Sim? We have been a VCS
shop for ~2 years (and I have been a VCS user for ~7 yrs), but I am doing
the eval on NC right now. So far it looks OK to me but I feel like some of
the techniques that NC uses might be leaving performance "on the table" so
to speak (i.e. for waveform tools it seems to always compile in the
$recordvars, etc PLIs, whereas with VCS you get big-time speedup when
you `ifdef those out). Another annoying point is that installing
multi-platform trees seems not-well-thought-out by Cadence (i.e. I want
to have Linux and Solaris installed under the same CDS directory,
unfortunately a _lot_ of programs seem to depend on a symlink from tools
-> tools.sun4v or tools.lnx86 -- there can only be one symlink at any time
though?!). Lastly the NCV setup seems to be heavily dependent on
LD_LIBRARY_PATH settings, which I think could be a debugging nightmare,
especially on Linux. If your readers have any other gotchas, I'd love to
hear about them.
- Chris Gori
Sanera Systems Sunnyvale, CA