( ESNUG 454 Item 17 ) ------------------------------------------- [04/28/06]
Subject: ( ESNUG 450 #11 ) Two users on the Cadence Verisity/Axis emulator
> One approach I have found helpful with Palladium is to create a memory
> that saves statistics as we run our simulations. We can then write the
> memory to a file at the end of the simulation, and analyze the results.
>
> Palladium continues to outperform all other simulators with our smaller
> designs running at 1.2 MHz. It improves our capability to verify ASICs.
>
> - Tom Paulson
> QLogic Corporation Eden Prairie, MN
From: Minh Bao Tran Nguyen <tran.nguyen=user domain=st spot gone>
Hi, John,
We started our Verisity (now Cadence) Axis/Xtreme Server evaluation 3 years
ago. At that time we were big users of acceleration technology, but IKOS,
whose Nsim accelerators we had been using, had been acquired by Mentor
Graphics and the product had no further roadmap. I began looking for a
replacement acceleration technology and selected Xtreme Server based on:
- Fast ramp-up time from RTL. We wanted the initial acceleration setup to
  feel like a simple 'accelerated simulator' to our verification engineers.
- Debug facilities.
- Easy third-party tool integration for our 'co-acceleration' platform.
  For this more complex verification environment, we needed the accelerator
  to be as flexible as a simulator, integrating Specman e, C, SystemC,
  coverage, assertions, etc. into one single environment.
- Customer support.
- An interesting roadmap.
Depending on the type of design and verification environment, Cadence Xtreme
Server's performance ranges from 10 kHz to 140 kHz in a transactional SCEMI
environment, and around 200 kHz for in-circuit emulation. We consider Xtreme
Server to be more of an accelerator (close to a simulator) than a pure
emulator, so we mostly use it for simulation acceleration, even though
Xtreme Server can be used in both of the following modes:
1 - simulation-acceleration/co-acceleration
2 - targetless/in-circuit emulation
The setup time for Xtreme Server mostly depends on the 'maturity' of the
design to be accelerated. Today, we are able to map a design onto Xtreme
Server as soon as the first simulation is working on a plain software
simulator (e.g. NCSim, ModelSim, etc.). In general, this takes from one day
to a few days.
Regarding debug (finding and fixing bugs), turnaround depends on many
factors, but usually we work at the RTL level and use Xtreme as a simulator,
so we have full visibility of all the design signals. We combine it with
Verdi from Novas to close the debug loop.
What I like about Xtreme Server:
- Fast and easy ramp-up, and very flexible for complex environment
  integration. We have automated all the Xtreme mapping flows into scripts
  so our verification engineers can use it by themselves. For example, we
  were able to build and use a single verification platform on Xtreme,
  integrating SystemC, code coverage, PSL assertions, and transactional
  SCEMI for our IP verification -- we can get all of this information while
  running test cases on Xtreme.
- Multi-partitioning capability: hardware accelerator machines are very
  costly and we need to get the best possible return on investment. With
  multi-partitioning, we are able to run several small designs in parallel;
  we can mix large and small designs to always fill the Xtreme Mgate
  capacity we have. Xtreme Server's partitions are then used by projects
  like a compute farm.
But Cadence could improve the Axis/Xtreme Server as follows:
- Reduce the compilation/mapping time (we use a Linux PC farm)
- Improve performance
- Add 4-state propagation
- Add back-annotation capability
- Keep pace with emerging standards/languages (ESL, SCEMI 2.0, ...)
Overall, I would say that Xtreme Server is a good and flexible acceleration
machine to speed up your verification process. But it is not clear today
how Cadence will support Xtreme since the Verisity merger.
We use both Palladium and Xtreme Server in a complementary way -- Xtreme for
simulation acceleration plus Palladium for emulation -- so hopefully Cadence
will keep the same strong support for Xtreme Server that Verisity provided.
It is definitely clear that Palladium cannot do the same good job that
Axis/Xtreme does in simulation-acceleration mode.
Furthermore, exciting things are now coming to market with the new
generation of emulators from Mentor and new accelerators from Tharas
Systems. This will definitely open the door to competition, which is always
good for us (at least in terms of technical improvement and cost-per-Mgate
reduction).
- Tran Nguyen
STmicroelectronics Grenoble, France
---- ---- ---- ---- ---- ---- ----
From: Anup Kumar Raghavan <anup.raghavan=user domain=freescale spot gone>
Hi, John,
We have used Verisity's (now Cadence's) Axis Server emulator in-house at
Freescale for 2 years for our pre-silicon verification efforts. Our recent
emulation project was on our MPC8548 chip, which is a derivative of our
PowerQUICC III family of network processors. The MPC8548 is one of the first
chips from Freescale fabricated in 90 nm CMOS technology. Among other
things, on the MPC8548 we have added new high-speed interfaces like PCI
Express and SRIO, and enhanced the existing triple-speed Ethernet controller
from PowerQUICC III.
For the first time, we used In-Circuit Emulation techniques to enhance
pre-silicon hardware and software co-verification. We used external,
industry-conformant test equipment to generate stimulus and test responses
to the new IP blocks on the MPC8548 even during the pre-silicon period. By
using external testers, we gained confidence in interoperability with
third-party devices and chips. For example, we used SmartBits, a very
popular industry-standard piece of test equipment, to generate Ethernet
traffic and test responses on the MPC8548. Then, if the design passes the
tester, we hook up the Design Under Test (DUT) to a live network to improve
test coverage and gain confidence in the DUT's ability to handle real-world
traffic.
Pre-silicon verification of the Enhanced Triple Speed Ethernet Controller
(eTSEC) IP on the MPC8548 -- There are 4 eTSECs on the MPC8548. Each eTSEC
is independent and supports 10/100/1000 Mbps speeds. New features of this
IP include TCP/IP Offload Engine capabilities, support for several types of
PHY interfaces like GMII, MII and TBI and reduced PHY interfaces like
RGMII, RMII and RTBI, as well as a new FIFO protocol for chip-to-chip
communication. The programming model is backward compatible with its
predecessor's, with minor modifications. This IP is verified at block
level, chip level and system level using conventional RTL simulation
techniques, as well as emulation for system-level verification. We share
our In-Circuit Emulation experiences on eTSEC pre-silicon verification in
this article.
Ethernet Protocol background
To give you some background on the Ethernet protocol: Ethernet is a
packet-based protocol over point-to-point links. The protocol supports
10/100/1000 Mbps and 10G network speeds, with half duplex and full duplex
in 10/100 modes, and full duplex in 1000 mode. It requires a minimum
interframe gap between consecutive packets and provides a flow control
mechanism to handle network congestion and hardware latencies.
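To put rough numbers on that description, here is a small C sketch (our own
illustration, not Freescale test code) of the basic frame-size and
flow-control constants; the bit-time arithmetic shows how the interframe gap
and pause quanta scale with link speed:

    /* Illustrative sketch (not Freescale test code): the basic Ethernet
       numbers used in this article, expressed as C constants. */
    #include <stdio.h>

    #define ETH_MIN_FRAME      64     /* minimum frame size, bytes        */
    #define ETH_MAX_FRAME      1518   /* standard max (untagged), bytes   */
    #define ETH_JUMBO_FRAME    9000   /* typical jumbo frame size, bytes  */
    #define ETH_IFG_BITS       96     /* minimum interframe gap, in bits  */
    #define ETH_PAUSE_QUANTUM  512    /* one pause quantum = 512 bit times*/
    #define ETH_P_PAUSE        0x8808 /* MAC control (pause) EtherType    */

    int main(void)
    {
        /* bit time in nanoseconds at 10/100/1000 Mbps */
        const double bit_ns[] = { 100.0, 10.0, 1.0 };
        const char  *name[]   = { "10M", "100M", "1000M" };
        int i;

        for (i = 0; i < 3; i++)
            printf("%-5s: min IFG = %6.1f ns, pause quantum = %8.1f ns\n",
                   name[i], ETH_IFG_BITS * bit_ns[i],
                   ETH_PAUSE_QUANTUM * bit_ns[i]);
        return 0;
    }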
Pre-Silicon Verification & Emulation Goal
We need to apply as much stress testing as possible, including testing for
corner cases. So we pump in different types of Ethernet frames, with sizes
ranging from 64 bytes up to jumbo frames (9000+ bytes), to verify the
TCP/IP offload engine capabilities as well as identify performance
latencies on the DUT. We apply back pressure and test the flow control
capabilities of the DUT at varying traffic rates.
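As a rough illustration of what randomized frame sizes and varying traffic
rates mean in practice, here is a hypothetical C sketch of frame-length and
interframe-gap randomization; the actual stimulus came from the external
tester, not from code like this:

    /* Hypothetical sketch: randomized frame lengths (64 bytes up to jumbo
       frames) and randomized interframe gaps for stress-type stimulus.
       The real stimulus came from the external tester, not this code. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static int pick_frame_len(void)
    {
        if (rand() % 10 == 0)                    /* ~10% jumbo frames      */
            return 1519 + rand() % (9000 - 1519 + 1);
        return 64 + rand() % (1518 - 64 + 1);    /* normal 64..1518 bytes  */
    }

    static int pick_gap_bits(void)
    {
        return 96 + rand() % 1000;               /* >= the minimum 96 bits */
    }

    int main(void)
    {
        int i;
        srand((unsigned)time(NULL));
        for (i = 0; i < 10; i++)
            printf("frame %2d: %4d bytes, gap %4d bit times\n",
                   i, pick_frame_len(), pick_gap_bits());
        return 0;
    }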
It was hopeless to try to simulate the entire MPC8548 due to its size --
performing RTL simulations of a design that large is incredibly compute
intensive. In our current chip-level verification test environment, we are
constrained to test the DUT with a maximum of a few hundred Ethernet
frames. RTL simulation of the DUT with the PowerPC core takes several hours
just to get out of reset; after reset, the software overhead is also very
high, which slows simulation down even further. This dramatically slows the
testing process and limits our test coverage. Nor is there time to create
the testing infrastructure Ethernet simulation would need: extensive
Ethernet packet generator/analyzer software that can hook up to the user's
testbench and handle real-world-type applications to verify the DUT. It
takes time to develop such a tool. The only useful thing we can do with a
simulator is internal loopback-type testing.
In-Circuit Emulation (ICE)
In-Circuit Emulation (ICE) is the hardware emulation setup between external
test equipment and the DUT in the emulator. By using ICE techniques in
pre-silicon verification, we take traditional loopback-type verification to
the next level by hooking the DUT up to third-party, industry-conformant
test equipment. We are able to stress the DUT exhaustively, as in a
post-silicon verification setup, and we generate and analyze the stimulus
going to and coming from the DUT with external testers. Our verification
method is as follows:
- Test the DUT with millions of Ethernet packets using the tester.
- Expand beyond internal loopback-type tests to external loopback on a
  cable or via the tester, and provide symmetric as well as asymmetric
  traffic to the DUT.
- Reuse pre-silicon tests post-silicon: when the chip comes back, we are
  able to run the same tests we ran on the emulator.
- Run our application-level software on the MPC8548 to handle real-world
  network-based applications like TELNET, FTP, etc.
SpeedBridge is key
Since we use Xtreme Server for In-Circuit Emulation (ICE) on eTSEC, we
needed a SpeedBridge. The SpeedBridge is the critical element that makes
ICE possible: it is the link between real-world speeds and emulation
speeds. Real-world Ethernet traffic runs at 10/100/1000 Mbps, which
corresponds to 2.5/25/125 MHz interface clock frequencies respectively.
However, the best we get from emulating the DUT is close to a few hundred
kilohertz. We need to bridge this speed difference without violating the
Ethernet protocol specifications. The SpeedBridge also provides the
connectivity from a standard Ethernet PHY, which networking devices connect
to using RJ45 connectors, to the emulation world through MAC-PHY digital
interfaces like GMII.
The success of a SpeedBridge design hinges on its flow control mechanism.
If the flow control is not designed properly, we can end up flooding the
network with pause frames, which causes heavy network congestion.
Additionally, the burst traffic the setup can sustain depends heavily on
the SpeedBridge design and its memory limitations. A poor SpeedBridge
design can result in lost or corrupted Ethernet packets.
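To make that flow-control trade-off concrete, here is a simplified
behavioral model in C (our own sketch with made-up buffer sizes and
thresholds, not the real SpeedBridge logic): frames arrive at wire speed
into a buffer that the emulated MAC drains far more slowly, and pause
frames are asserted and released at hysteresis thresholds so the buffer
neither overflows (lost packets) nor floods the link with pause frames:

    /* Simplified behavioral model of SpeedBridge-style rate adaptation.
       Buffer sizes and thresholds are made-up illustrative values, not
       the real SpeedBridge design. */
    #include <stdio.h>

    #define BUF_FRAMES 256   /* frames the bridge can buffer              */
    #define PAUSE_ON   192   /* assert PAUSE when occupancy reaches this  */
    #define PAUSE_OFF   64   /* release PAUSE when it drains below this   */

    int main(void)
    {
        long tick;
        int  occupancy = 0, dropped = 0, pauses = 0, paused = 0;

        /* Toy time loop: the wire side is ~1000x faster than the emulated
           MAC, so many frames can arrive for each frame that is drained. */
        for (tick = 0; tick < 100000; tick++) {
            if (!paused)
                occupancy += 3;              /* wire-speed arrivals        */
            if (tick % 100 == 0 && occupancy > 0)
                occupancy--;                 /* slow emulation-side drain  */

            if (occupancy > BUF_FRAMES) {    /* overflow = lost packets    */
                dropped  += occupancy - BUF_FRAMES;
                occupancy = BUF_FRAMES;
            }
            if (!paused && occupancy >= PAUSE_ON)  { paused = 1; pauses++; }
            if ( paused && occupancy <= PAUSE_OFF) { paused = 0; }
        }
        printf("pauses sent: %d, frames dropped: %d\n", pauses, dropped);
        return 0;
    }

With the thresholds spaced apart as in this toy model, only a handful of
pause frames are sent and nothing is dropped; pushing them too close
together is what produces the pause-frame flooding described above.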
These were our requirements for the SpeedBridge:
- Support all types of Ethernet traffic and payload sizes (from 64 bytes
  to 9000+ bytes).
- Support CRC error injection and other types of negative/error testing.
- Be completely transparent to incoming and outgoing Ethernet traffic.
- Have its reset signal controlled by the DUT on the emulator, to keep the
  two devices in sync.
We used the Xtreme Server SpeedBridge. It has:
- Four independent ports, each supporting Ethernet (10 Mbps), Fast
  Ethernet (100 Mbps) and Gigabit Ethernet (1 Gbps)
- Four RJ45 connectors
- Auto-negotiation
- PHY management through the MDIO interface
- Full-duplex operation
- An MII/GMII interface on the emulator side
- Packets pass through unmodified
- All frame sizes from 64 bytes to 9 Kbytes (jumbo frames)
Xtreme Server ICE Pros
1. Can use the emulator with the tester without a lot of work.
2. Can stress test the DUT with real traffic.
3. Can reuse the work when the chip comes back, to run the same tests.
4. The VCD-on-Demand debug feature is quite helpful for viewing ICE
   hardware signals as well as all internal signals and pins of the DUT.
Xtreme Server ICE Cons
1. Hardware is expensive -- thus it is a shared resource, and not
   accessible to everyone on demand if it is busy.
2. Harder to debug signal integrity issues.
3. We need to be located onsite near the hardware -- for our Ethernet
   work, we must sit by the emulator (at least during the prototype stage).
Freescale's ICE results with Xtreme Server
Using ICE with Xtreme Server, we discovered killer silicon bugs -- weird,
corner-case bugs on the DUT -- in pre-silicon verification that had escaped
standard chip-level verification. Some bugs we found that would have been
difficult to catch without Xtreme Server/ICE were: a varying inter-frame
gap between packets causing the DUT to hang, and incorrect reporting of
receive packet drop counter values while the RX engine is busy.
In addition to finding hardware bugs, ICE let us verify that our software
drivers worked rather than waiting for the chip to come back on a board,
and it helped with driver debug even before the silicon was available.
We also profiled aspects of the chip's performance. For example, we
validated the CPU offload provided by the Ethernet controller for TCP/IP,
and we profiled TCP/IP performance vs. CPU clock frequency.
The following was generated and tested on the eTSEC IP using ICE:
- Thousands of randomized Ethernet frames (which is impractical in
  simulation mode)
- Back-to-back jumbo frames (9600+ bytes), TCP/IP offload engine
  capabilities, etc.
- Collection of performance statistics, such as memory bandwidth
  utilization and eTSEC performance.
The PowerQUICC III processor features several new IP blocks that had not
seen silicon before, and emulation techniques revealed several critical
bugs. In addition to verifying the eTSEC IP, we also verified other
peripherals, such as PCI Express, JTAG and Serial RapidIO, using ICE.
- Anup Raghavan
Freescale Adelaide, Australia