Pt 4 - Lauro errs on channel latency, sim acceleration, and ICE

( ESNUG 553 Item 4 ) -------------------------------------------- [11/13/15]

Subject: Pt 4 - Lauro errs on channel latency, sim acceleration, and ICE

... CONTINUED FROM Pt. 3 ...

         ----    ----    ----    ----    ----    ----    ----

LAURO ERRS ON CHANNEL LATENCY AND SIMULATION ACCELERATION

Palladium-XP2 supports transaction-based verification TBV, but it is rumored that its throughput is significantly lower than that of the Mentor Veloce 2 and Synopsys ZeBu 3.

- Lauro Rizzatti, emulation consultant http://www.deepchip.com/items/0547-09.html

Don't rely on Lauro's rumors that Cadence has latency issues.

Traditional emulator communication across the workstation-to-hardware
channel can use standard SCE-MI or IEEE SystemVerilog with blocking,
alternating & synchronous communication mechanisms.  However, Palladium
also has communication channels for streaming TBA transactions, as well
as non-blocking, concurrent, and asynchronous import and export functions,
at high speeds.  It is all about pipelining and maximizing the overall
throughput, as AMD illustrated at CDNLive in Boston this year.  They doubled
their speeds over conventional DPI or SCE-MI Pipe transfers by using
asynchronous data streaming with "non-blocking" DPI calls for their tests.
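
To make the pipelining argument concrete, here is a back-of-the-envelope
sketch in Python.  It is not AMD's data and not Cadence's implementation;
the function names and the numbers used below (10 us of channel latency,
10 us of transfer time per transaction) are assumptions chosen purely for
illustration.  The idea is that a blocking call pays the channel latency
on every transaction, while a streamed, non-blocking transfer pays it
roughly once and then keeps the channel busy.

```python
def blocking_time(n, latency_s, transfer_s):
    """Total time when every transaction waits for its own round trip
    before the next one may start (blocking, synchronous model)."""
    return n * (latency_s + transfer_s)


def streaming_time(n, latency_s, transfer_s):
    """Total time when transactions are pipelined: the channel latency
    is paid once, then transfers go back to back (idealized model)."""
    return latency_s + n * transfer_s


def speedup(n, latency_s, transfer_s):
    """Ratio of blocking time to streaming time for n transactions."""
    return blocking_time(n, latency_s, transfer_s) / streaming_time(
        n, latency_s, transfer_s
    )


# Hypothetical numbers, for illustration only.
N_TRANSACTIONS = 1000
LATENCY_S = 10e-6   # assumed channel latency per round trip
TRANSFER_S = 10e-6  # assumed payload transfer time per transaction

ratio = speedup(N_TRANSACTIONS, LATENCY_S, TRANSFER_S)
```

With these made-up numbers, where latency equals transfer time, the ratio
approaches 2x for long runs.  The real gain depends on the actual ratio of
channel latency to payload time on a given setup.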

One of the use models is to stream data from monitors in Palladium XP at
ICE speeds to the host.

               Error Count:  0    Total Error Count:  7
                Miss Count:  1     Total Miss Count: 17

Regardless of the name, this verification mode is the emerging trend in the industry. It does not require human manned supervision to plug/unplug speed adapters when you switch from one design to the next. As such, TBV is the mandatory choice for remote access at large emulation datacenters accessible 24/7 from anywhere in the world.

- Lauro Rizzatti, emulation consultant http://www.deepchip.com/items/0547-09.html

It's entertaining to see the competition's attempts to portray Palladium
as viable only in the In Circuit Emulation (ICE) space, and cables as
inherently bad.  The reality is radically different.  For starters, most of
the Palladium installations we have are literally working like private
clouds with remote access, even if they are using ICE mode.

               Error Count:  1    Total Error Count:  8
                Miss Count:  0     Total Miss Count: 17

ICE is still the biggest market segment in emulation and customers tell us
it is not going away.  While virtualization is important and has been
supported by Palladium for years, ICE will continue to have its place
because users need to run real test cases with real-world data.  They use
ICE to:

  - Validate HW/SW system interactively with live traffic including
    long sequences with multiple ports and increased bandwidth
  - Interface to testers & target systems for system level validation
  - Stress test with real applications & full protocol stacks

Virtualization works well in specific cases like USB, PCI, Ethernet and
multimedia protocols when users want to use specific debug capabilities on
the host.  In addition, it is used to:

  - Push traffic through protocols at sub-system & system-level on
    an acceleration platform
  - Validate HW/SW behavior early in development cycle
  - Check the flow of commands and data across the chip at the SoC level

This part of Lauro's ESNUG 547-09 table literally made me laugh:

                      Cadence             Mentor                  Synopsys EVE
                      Palladium-XP2       Veloce 2                Zebu Server 3
                      (GXL)
  Best Deployment     excellent in ICE    excellent in ICE and    excellent in TBV
                                          in TBX/VirtuaLAB

Yes, Palladium is excellent in ICE.  Mentor's Eric Selosse and Lauro himself
surrendered this market publicly to Cadence during a DAC DeepChip video many
years ago.  But Lauro completely erred on Palladium's applicability for
acceleration -- the number of acceleration projects we see done with
Palladium has grown 47% year over year in 2014 to well above 100 projects
at more than 50 customer locations.  The majority of our top 30 customers
are using it.

I am tempted to count three errors here, but fine.

               Error Count:  1    Total Error Count:  9
                Miss Count:  0     Total Miss Count: 17

         ----    ----    ----    ----    ----    ----    ----

LAURO MISSES THAT HYBRID IS SUPPORTED BY ALL THREE EMULATORS

Another Veloce-2 approach replaces the RTL processing cores in your SoC design with QEMU-based cores -- and then moves them into the host connected to Veloce2 by way of transactors. The emulator continues to execute the remaining synthesizable portion of your SoC; pushing performance from 1 to 3 MIPS up to an upper limit of 100 MIPS when your entire SoC is mapped inside Veloce-2. With this some users are booting an Android RTOS, and then running applications like Antutu for performance characterization prior to silicon.

- Lauro Rizzatti, emulation consultant http://www.deepchip.com/items/0547-09.html

This "hybrid" use model is only mentioned in the Veloce section even though
Mentor has not announced customers using this functionality. 

In contrast, we at Cadence have various examples out there from ARM, Nvidia,
CSR, and Broadcom with hard data of a 200x speedup for an OS boot in the
case of CSR.

And in terms of technology, Synopsys was actually first to announce this
capability in 2012 with HAPS, and later disclosed Ricoh as a customer in
their book on virtual prototyping in combination with Zebu, but without
hard speed data.

Needless to say, "hybrid" is a thriving use model, and we are on it as
our customers attest, including ARM in your DAC'14 writeup.  Lauro missed
that both Palladium and Zebu run in "hybrid" modes -- not just Veloce.

               Error Count:  0    Total Error Count:  9
                Miss Count:  2     Total Miss Count: 19

Also, be aware that QEMU models and ARM Fast Models are very different.
QEMU models -- as supported by Veloce 2 Quattro -- do not have caches,
TLBs, or MMUs, and are really only instruction set simulators.  In contrast,
the ARM Fast Models connected to a Palladium are the same models ARM uses
for development and validation internally; they are validated against the
real processor implementations of ARM.

               Error Count:  0    Total Error Count:  9
                Miss Count:  1     Total Miss Count: 20

         ----    ----    ----    ----    ----    ----    ----

LAURO MISSES PALLADIUM OFFLINE HW/SW DEBUG AND HOT SWAP ABILITIES

And this final part of Lauro's ESNUG 547-09 table also gave me a chuckle:

                      Cadence             Mentor                  Synopsys EVE
                      Palladium-XP2       Veloce 2                Zebu Server 3
                      (GXL)
  SW Debug            physical or         physical or virtual     physical or
                      virtual JTAG        JTAG [intrusive] or     virtual JTAG
                      [intrusive]         CodeLink                [intrusive]
                                          [non-intrusive]
  Save and Restore?   Yes                 Yes                     Yes

Well beyond an everyday simple save & restore -- which is limited to
the same engine -- Palladium also supports hot-swap between Incisive
Simulation and Palladium Emulation.  No other vendor can do that.

               Error Count:  0    Total Error Count:  9
                Miss Count:  1     Total Miss Count: 21

On debug - Palladium has had offline HW debug for years, as we agree that
the non-intrusiveness of offline debug is useful, especially when fanning
out to many users, both HW and SW.  Our CodeLink-equivalent functionality
is the Indago Embedded Software Debug, which enables synchronized HW/SW
debug for traces from both simulation and emulation, reading trace formats
like ARM's TARMAC, and giving the user a view of embedded SW and HW signals
as well as transactions.

               Error Count:  0    Total Error Count:  9
                Miss Count:  1     Total Miss Count: 22

OK, OK, I roll this one back.  While this has been released for a while, we
announced this as part of the Indago Debug Platform launch on April 28th.

               Error Count:  0    Total Error Count:  9
                Miss Count: -1     Total Miss Count: 21

         ----    ----    ----    ----    ----    ----    ----

LAURO MISSED PALLADIUM'S ON-DEMAND WAVEFORM STREAMING

While some Veloce-2 debug is similar to Palladium-XP2's, Mentor devised a faster debugging scheme based on the on-demand waveform streaming of a few selected signals without requiring compilation.

- Lauro Rizzatti, emulation consultant http://www.deepchip.com/items/0547-09.html

Yes, users can stream out Veloce Quattro signals in TBX mode.  But in TBX
mode only.

Palladium can do this using its "Transaction Based Acceleration" (TBA)
mode, as well as in a use mode called "In Circuit Acceleration" (ICA),
which we introduced back in 2012.  In this mode, users run ICE at full speed,
snoop data over to a different area of Palladium that runs in TBA mode
and stream those data out to the host for analysis.  Continuously,
without perturbing performance.  Voila!
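
As a rough mental model of the snoop-and-stream flow described above, the
toy Python loop below may help.  It says nothing about how Palladium
implements ICA internally; `run_ica`, its parameters, and the bounded
buffer are all hypothetical.  The only point it tries to make is that the
ICE side keeps advancing on every cycle, and that nothing is lost as long
as the streaming side drains samples at least as fast as the monitor
produces them.

```python
from collections import deque


def run_ica(cycles, snoop_every, drain_per_cycle, buffer_depth):
    """Toy model of snooping from a free-running side into a streamed side.

    The 'ICE' side never stalls.  A monitor copies one sample every
    `snoop_every` cycles into a bounded buffer, and the 'TBA' side moves
    up to `drain_per_cycle` samples per cycle out to the host.
    Returns (samples seen by host, samples lost, samples still buffered).
    """
    buffer = deque()
    host = []
    lost = 0
    for cycle in range(cycles):
        # ICE side: advances every cycle regardless of buffer state.
        if cycle % snoop_every == 0:
            if len(buffer) < buffer_depth:
                buffer.append(cycle)  # sample tagged with its cycle number
            else:
                lost += 1  # buffer full: the sample is lost, ICE keeps going
        # TBA side: stream buffered samples out to the host.
        for _ in range(min(drain_per_cycle, len(buffer))):
            host.append(buffer.popleft())
    return host, lost, len(buffer)


# Hypothetical settings: one sample every 2 cycles, drained 1 per cycle.
host_samples, lost_samples, still_buffered = run_ica(
    cycles=100, snoop_every=2, drain_per_cycle=1, buffer_depth=4
)
```

In this sketch the drain rate exceeds the snoop rate, so every sample
reaches the host in order.  Setting `drain_per_cycle` to 0 shows the
opposite extreme, where the buffer fills and later samples are lost.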

Lauro saw this for Veloce, yet not for Palladium?

               Error Count:  0    Total Error Count:  9
                Miss Count:  1     Total Miss Count: 22

         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----
         ----    ----    ----    ----    ----    ----    ----

THE BOTTOM LINE

Being German, I could probably find 6 or 7 more errors and omissions beyond
this 9 and 22 count that I've listed here in what Lauro Rizzatti wrote on
behalf of the Mentor Veloce marketing team, but I'll stop.

Beyond my advice to always question those who omit key facts, I want to
remind those looking to buy an emulator to look carefully at utilization,
power, performance, and verification throughput holistically, rather than
at each intrinsic parameter individually.

    - Frank Schirrmeister
      Cadence Design Systems, Inc.               San Jose, CA

         ----    ----    ----    ----    ----    ----    ----

  Pt 1 - Lauro missed Veloce2 and Zebu have lame gate utilization
  Pt 2 - Lauro missed Palladium job throughput is 3X faster vs. Zebu
  Pt 3 - Lauro missed energy costs is intrinsic power use over time
  Pt 4 - Lauro errs on channel latency, sim acceleration, and ICE

         ----    ----    ----    ----    ----    ----    ----

RELATED ARTICLES

  Hogan follows up on emulation user replies plus market share data
  Hogan warns Lauro missed emulation's TOTAL power use footprint
  The 14 metrics - plus their gotchas - used to select an emulator
