( ESNUG 481 Item 7 ) -------------------------------------------- [04/29/09]
Subject: ( ESNUG 477 #3 ) A second US-based C synthesis design reports in
> We do a lot of architectural exploration and optimization with CatapultC,
> typically about 3 months worth. Once our C/C++ coding is done and we are
> happy with the architectural exploration, it takes us 2 hours to go from
> C/C++ to RTL for a 300K gate design with Catapult C.
>
> For us to do a manual Verilog RTL implementation of those same 300K gates
> would take at least 6 months.
>
> - [ Captain America ]
From: [ Slumdog Billionaire ]
Hi, John,
Keep me and my company anon, please. It's OK if you tell people that I'm
from a US co., though. (I know you've been tracking that.)
I have been using Mentor's CatapultC for 3 years now. I used it primarily
for wireless parts like correlators, demodulators and decoders. I have
done 5 designs with Catapult so far.
One recent example: I implemented a 3GPP-like turbo decoder with CatC. The
size of my design was approx 5000 LEs on an Altera Stratix and slightly
under 4000 LEs on Stratix II. A lot of my designs had a Catapult runtime
of between 30 and 60 minutes, all the way through extraction of RTL code.
It took me about 2.5 months to create my first fully verified decoder. In
contrast, hand coding a decoder in VHDL would have taken 6 months, roughly
a 60% savings in TTM.
The size of the design Catapult produced was comparable to the hand-coded
VHDL for similar performance decoders. (Like the Xilinx Turbo Decoder
from their SysGen flow.) Our decoder supports multiple codeword sizes.
I later modified our decoder to add lookup table-based depuncturing logic
without significantly adding to my LE count.
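To give a feel for what lookup table-based depuncturing means, here is a minimal C++ sketch. It is not the code from the design above: the pattern table `kPattern`, the function name `depuncture`, and the choice of 0 as the "no information" soft value are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical puncturing pattern: 1 = value was transmitted, 0 = punctured.
// Not taken from 3GPP or from the design described in this report.
static const std::uint8_t kPattern[] = {1, 1, 0, 1, 0, 1};
static const std::size_t kPatternLen = sizeof(kPattern) / sizeof(kPattern[0]);

// Rebuild the full-rate soft stream: copy a received value wherever the
// table says "kept", and insert 0 (no information) wherever it says
// "punctured" or the received stream has run out.
std::vector<std::int8_t> depuncture(const std::vector<std::int8_t>& received,
                                    std::size_t full_length) {
    std::vector<std::int8_t> out;
    out.reserve(full_length);
    std::size_t rx = 0;  // index of the next unused received value
    for (std::size_t i = 0; i < full_length; ++i) {
        if (kPattern[i % kPatternLen] != 0 && rx < received.size()) {
            out.push_back(received[rx++]);
        } else {
            out.push_back(0);
        }
    }
    return out;
}
```

Because the pattern lives in a small constant table, supporting another code rate in a sketch like this would mean swapping the table rather than rewriting the loop.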
VERIFICATION
With CatC, we don't have to write an RTL testbench because it automatically
generates a testbench from our top level C code using SCVerify and it
automatically creates the SystemC transactors to wrap my RTL code.
                    C testbench (e.g. main())
                       /                  \
                      /                    \
                     /               --------------------
   Turbo decoder function           | SystemC transactor |
   call (e.g. Turbo_Decoder_inst)   |   Generated VHDL   |
                     \              | SystemC transactor |
                      \              --------------------
                       \                    /
                      (gcc)           (modelsim)
                          \              /
                           \            /
                        Comparator(pass/fail)
CatapultC then runs both the C simulation (e.g. GCC) and the RTL simulation
(e.g. MTI) using the same stimulus and gives a pass/fail message. Here is
a code sample:
    // instrument main to become testbench
    #ifdef CCS_SCVERIFY
    #include "mc_testbench.h"
    void testbench::main()
    #else
    int main()
    #endif

    // instrument function call for Turbo Decoder
    #ifdef CCS_SCVERIFY
        testbench::exec_Turbo_Decoder_inst(DataIn, DataOut);
    #else
        Turbo_Decoder_inst(DataIn, DataOut);
    #endif
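For readers who have not seen this style of flow, here is a rough sketch of what the C side of a self-checking testbench can look like in plain C++. The names `toy_decoder` and `run_testbench` and the stimulus values are hypothetical stand-ins, not Catapult code and not the turbo decoder above. The SCVerify macros are left out so the sketch compiles with any C++11 compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the synthesizable design function (the real
// one would be something like Turbo_Decoder_inst). It hard-slices soft
// values: negative input -> bit 1, otherwise bit 0.
void toy_decoder(const std::vector<std::int8_t>& soft_in,
                 std::vector<std::uint8_t>& bits_out) {
    bits_out.resize(soft_in.size());
    for (std::size_t i = 0; i < soft_in.size(); ++i) {
        bits_out[i] = (soft_in[i] < 0) ? 1 : 0;
    }
}

// Testbench body: drive the design with a fixed stimulus, compare the
// result against the expected bits, and return the number of mismatches
// (or -1 if the output length is wrong).
int run_testbench() {
    const std::vector<std::int8_t> stimulus = {5, -3, 0, -128, 127, -1};
    const std::vector<std::uint8_t> expected = {0, 1, 0, 1, 0, 1};

    std::vector<std::uint8_t> actual;
    toy_decoder(stimulus, actual);

    if (actual.size() != expected.size()) {
        return -1;
    }
    int mismatches = 0;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (actual[i] != expected[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

The point of the pattern is that the stimulus and the expected results live in one place, so the same testbench body can be reused when the design call is redirected to an RTL simulation.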
I really like this functionality as it frees up my time to do actual design.
Doing unit level verification is really as simple as coding C++... once I
got the design working, the unit level verification only took 2 days.
Cat's SCVerify helped me find memory leaks in my C code. These errors were
in the test bench, not the design, and it made me go back and clean up my
code (always a good thing). I usually have at least one out-of-bound array
access in my testbench, as I am usually not very careful with my test bench.
If you don't initialize a variable or free an array after using it, you can
end up with this kind of memory error. Usually we would run Purify or
Valgrind to find these types of errors. MSVC++ with strict checks "on" also
catches them.
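As an illustration of the out-of-bound case, here is a small sketch of how testbench code can make such an access fail loudly instead of silently reading past the array. This is ordinary C++ for the testbench side only, not synthesizable code, and it is not how SCVerify, Purify, or Valgrind detect the problem. The helper names `checked_sum` and `stays_in_bounds` are made up for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical testbench helper: sums the first `count` samples.
// Using at() instead of operator[] turns an out-of-bound read into a
// std::out_of_range exception.
int checked_sum(const std::vector<int>& samples, std::size_t count) {
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += samples.at(i);  // throws if i >= samples.size()
    }
    return sum;
}

// Returns true if reading `count` samples stays inside the vector.
bool stays_in_bounds(const std::vector<int>& samples, std::size_t count) {
    try {
        checked_sum(samples, count);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```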
I haven't run into any problems with the functional correctness of Cat's C-to-RTL
translation, which is exactly how a high level synthesis tool should work.
HIERARCHY
I used Catapult hierarchy in one of my designs, primarily to model 2 blocks
running concurrently. I used its ac_channels for data transfer between the
blocks. ac_channels let one block know when data is available from
another block, since a block's outputs may not be present on every clock cycle.
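The idea of two blocks exchanging data through a channel can be sketched in plain C++ as follows. The `simple_channel` class here is a hypothetical stand-in built on `std::deque`; it is not the real ac_channel class and makes no claim about its interface. `PairSummer` and `Accumulator` are likewise invented for the example.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a streaming channel (NOT the real ac_channel).
template <typename T>
class simple_channel {
public:
    void write(const T& v) { fifo_.push_back(v); }
    bool available(std::size_t n) const { return fifo_.size() >= n; }
    T read() {
        T v = fifo_.front();
        fifo_.pop_front();
        return v;
    }
private:
    std::deque<T> fifo_;
};

// Producer block: writes the sum of each pair of inputs, so it only
// produces an output on every second call.
struct PairSummer {
    bool have_first = false;
    int first = 0;
    void step(int sample, simple_channel<int>& out) {
        if (!have_first) {
            first = sample;
            have_first = true;
        } else {
            out.write(first + sample);
            have_first = false;
        }
    }
};

// Consumer block: only acts when the channel actually holds data.
struct Accumulator {
    int total = 0;
    void step(simple_channel<int>& in) {
        if (in.available(1)) {
            total += in.read();
        }
    }
};
```

The consumer never assumes data arrives on every call; it checks the channel first, which is the behavior the paragraph above describes.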
CatapultC also allows you to use hierarchy to optimize and debug. In this
bottom up approach, you can restrict the scope of the design while Catapult
optimizes/balances your area and throughput requirements. This helps to
isolate the blocks so you can adjust latency to first output and area
requirements on a block-by-block basis. That helps us in integrating these
cores with top level design.
Our turbo decoder employs a MAP decoder which is iterated 16 times. In
order to keep the latency of the turbo decoder reasonable, it was very
important that the throughput of the MAP decoder be no more than 1 cycle.
Every extra cycle in the MAP decoder adds 16 cycles to the turbo decoder
latency. (In this design, sharing resources was not as important as meeting
the throughput requirement.)
In other designs, I started out with a minimal latency and throughput
requirement and successively modified my code/architecture to get better
performance. As I iterate the design through Catapult, I find out how to
best use Catapult for my design. It's a learning process.
INTERFACE SYNTHESIS
With CatapultC, you can select different interfaces without changing your C
code. The block's I/Os can be
- single or dual-port RAM, or
- wired interfaces with or without handshake (e.g. RDY, ACK).
We can experiment with different interfaces, and CatapultC optimizes our
design for each interface based on each I/O's throughput. I like being able
to keep my C code independent of the interface protocol, as I can:
1. Focus on the algorithm rather than the tedious job of integration
2. Retarget the design more easily because it is independent of interface
3. Debug complex interfaces more easily. I can get the algorithm right
and debugged, get the IO right and debugged, then have Catapult connect
the complex algorithm with the complex IO rather than trying to debug
a design problem at the same time as I debug the complex IO.
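To make "C code independent of the interface protocol" concrete, here is a minimal sketch of an algorithm written against plain array arguments. The function `correlate` and the size `kTaps` are hypothetical. Nothing in it is Catapult-specific, and how the arguments would map to a RAM or a wired port is deliberately absent from the code.

```cpp
#include <cstddef>

const std::size_t kTaps = 4;  // hypothetical size chosen for the sketch

// Pure algorithm: no RAM ports, handshake signals, or bus protocol
// appear here. It multiplies matching entries and sums the products.
int correlate(const int samples[kTaps], const int reference[kTaps]) {
    int acc = 0;
    for (std::size_t i = 0; i < kTaps; ++i) {
        acc += samples[i] * reference[i];
    }
    return acc;
}
```

Because the function only sees arrays, it can be debugged with a normal C++ compiler before any interface decision is made.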
CatC also has examples using the AMBA APB, Philips I2S, Altera Avalon-MM,
Xilinx FSL, and custom RAM interfaces. For sub-blocks, CatapultC will
automatically generate FIFO or ping-pong mem interfaces for data transfer.
Once the block was done in CatapultC, it took our VHDL engineers about 3
weeks to integrate that block into our core. That is, integration takes
the same amount of time as it does for hand-coded blocks.
Catapult C's main strengths are the initial design and verification time
savings, plus reuse. I have been able to reuse the cores I have previously
generated with Catapult C to make ongoing improvements in performance for
derivative designs. My team and I can modify our C++ and the changes are
automatically implemented. The other advantage is that I can tweak the
architecture without having to rewrite everything from scratch.
GOTCHAS
My primary issue with Catapult C is that it isn't always obvious how we can
get around some problems. The feedback provided by CatapultC is sometimes
cryptic and takes some effort or help to interpret. For example, I'll
get a chained feedback data dependency error and it will not be obvious to
me how to get around this problem. Sure, Catapult C provides a Gantt chart
to help with these kinds of issues, but for larger designs Gantt charts
become a little too unwieldy for me.
The tool does require some learning on the part of the designer. It takes a
while to build trust in the tool -- that it will produce a design that meets
your constraints instead of designing your own VHDL where you have more sense
of control.
For a prior datapath design, I liked that I could hand instantiate datapath
elements like the Mentor Altera Alt_4mult_add accelerated library component
in my C source code. We wanted to use Catapult C to design our datapath,
but at the last stage, we used hand instantiation to override any datapath
issues at the upper level. (I liked being able to override things when I
needed to.) This hand instantiation allowed us to control the exact number
of registers in the pipeline, for example inserting extra input or output
registers, or additional registers between the multiplier and the adder that
mapped directly to the Altera DSP block.
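To show what "controlling the exact number of registers in the pipeline" means behaviorally, here is a small C++ model of a sum-of-four-products datapath followed by a configurable number of output registers. The class `MultAdd4` is invented for this sketch. It is not the interface of the Alt_4mult_add library component and does not model the Altera DSP block itself.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical behavioral model: four multipliers feeding an adder,
// followed by `pipeline_regs` output registers (all reset to 0).
class MultAdd4 {
public:
    explicit MultAdd4(std::size_t pipeline_regs) : regs_(pipeline_regs, 0) {}

    // One clock cycle: compute the new sum of products, shift it into
    // the register chain, and return the value leaving the chain.
    int clock(const int a[4], const int b[4]) {
        int sum = 0;
        for (int i = 0; i < 4; ++i) {
            sum += a[i] * b[i];
        }
        if (regs_.empty()) {
            return sum;  // no registers: purely combinational
        }
        regs_.push_back(sum);
        int out = regs_.front();
        regs_.pop_front();
        return out;
    }

private:
    std::deque<int> regs_;  // front = oldest value in the pipeline
};
```

With two registers, a result computed on one call comes out two calls later, which is the kind of fixed latency that hand instantiation lets a designer pin down.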
Before I first used Catapult, my primary reservation was that I might not be
able to hit my design goals using it -- and then I wouldn't have time left
to switch back to our traditional design flow to make my deadlines. Mentor
support helps a lot there; they were critical to our success with CatC. I
plan to continue to use CatapultC. It has helped me immensely to finish
some of our key designs in short order.
- [ Slumdog Billionaire ]