( ESNUG 459 Item 5 ) -------------------------------------------- [12/14/06]

Subject: ( ESNUG 458 #2 ) Bluespec issues a chip design challenge to Forte

> So, John, I hope this letter illustrates some of the reasons why people
> are picking our Cynthesizer for real hardware & system design and not
> George's, seemingly appropriately named, "BS Compiler".
>
>     - Brett Cline
>       Forte Design Systems                       Acton, MA


From: George Harper <gharper=user domain=bluespec spot mom>

Hey John,

Did you notice Brett didn't respond to the central point of my original
posting -- that Bluespec is the only ESL-level solution for control logic
and everything else is just RTL?  Brett even ignored his own previous
characterization of SystemC-based control logic as RTL-like:

  "When you start writing a design that is 90% control based, where you
   are writing 'if-then-else', the code you are writing looks a lot like
   Verilog code.  For those customers advancing immediately to a SystemC
   based design flow or a C based design flow may not provide the same
   ROI level as the customer who has a mostly algorithmic description."

In fact, the only place he mentions "control" at all comes at the very end
where he talks about Forte customers doing designs that contain control
logic in them.  As was clear in my original posting, I never questioned
whether Forte can "support" control logic (despite Brett's implications
otherwise) -- my point was not "whether" but "how".

But instead of theoretical discussions, why don't we look at some real
design examples?  Let's get away from abstract claims and move on to some
concrete code.

Here are 2 design descriptions -- chosen to provide, on a manageable scale,
an illustration of control-related issues encountered in real life.  And
the rules are simple:

  1. For each design, 4 things should be included:

      a. The source for the design (which must be synthesizable),
         including any library elements used
      b. The RTL generated from your tool's synthesis (for the
         complete design, including any elements)
      c. The testbench source
      d. RTL synthesis results (speed/area) based on your RTL code
         using TSMC 180 GP Artisan libraries.  A gatelevel netlist
         in Verilog in plain text (not .db) is expected.

  2. If you use any library or IP elements at all, you need to publish
     the source for these elements.  The purpose of this exercise is
     solely to illustrate the semantics of control logic design; not the
     availability of IP components.

  3. Each design should include a small testbench that illustrates,
     at a minimum, the required test cases and core functionality.


Design Challenge #1: a Basic 2x2 Interconnect

Suppose we design a crossbar switch for an SoC that connects initiators,
such as processors and DMA engines, to targets.  Example targets are
memories, I/O blocks, and the DMA configuration port.  The specs are:

 - For uniformity, all busses are 32 bits wide.

 - Two initiators and two targets.

 - Requests (initiators to targets) are completely decoupled from
   responses (targets to initiators).

 - Both requests and responses can be pipelined.

 - The switch should preserve request order from a particular initiator
   to a particular target, and the response order from a particular
   target to a particular initiator.

 - For simultaneous requests from the 2 initiators towards the same
   target, and for simultaneous responses from the 2 targets to the
   same initiator, there should be round-robin arbitration.

 - In the best case (i.e., if allowed by arbitration and absence of
   back-pressure), requests and responses should make it across the
   switch in one clock cycle, i.e., they should be buffered for just
   one cycle in the switch.
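
As a rough illustration of the round-robin rule above, here is a small
behavioral sketch in Python.  It is not synthesizable and is not taken from
either vendor's flow; the class name RoundRobinArbiter and its grant()
method are hypothetical, and the choice that requester 0 wins the very
first tie is an assumption the spec does not state.

```python
class RoundRobinArbiter:
    """Behavioral model of an N-way round-robin arbiter (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        # Start as if the last requester was just served, so that
        # requester 0 wins the first tie (an assumption, not in the spec).
        self.last = n - 1

    def grant(self, requests):
        """requests: list of bools, one per requester.
        Returns the index of the granted requester, or None if idle."""
        # Search starting from the requester after the most recent winner.
        for offset in range(1, self.n + 1):
            idx = (self.last + offset) % self.n
            if requests[idx]:
                self.last = idx
                return idx
        return None


# Both initiators target the same port on four consecutive cycles.
arb = RoundRobinArbiter(2)
grants = [arb.grant([True, True]) for _ in range(4)]
print(grants)  # [0, 1, 0, 1]
```

The same model would apply on the response path, with the 2 targets as
requesters competing for one initiator.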

Each connection between an initiator and the switch and between a target and
the switch (also called a socket) has the following structure and protocol,
similar to the OCP-IP protocol:

 - Each initiator has a master interface, connecting to a slave interface
   on the switch.

 - Each target has a slave interface, connecting to a master interface on
   the switch.

 - A master must send a request on every clock cycle (it sends a NOP request
   if it does not have a real request to send).  It can advance to the next
   request whenever it sees an accept signal from the slave.  An accept
   refers to the request on the current cycle, and so the master can send
   the next request on the very next cycle, i.e., requests can be pipelined
   at full bandwidth of one request per clock.

 - Symmetrically, a slave must send a response to a master on every clock
   cycle (it sends a NOP response if it does not have a real response to
   send).  It can advance to the next response whenever it sees an accept
   signal from the master.
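
The request side of this socket protocol can be sketched as a cycle-by-cycle
Python model.  This is only a sketch under assumptions: the function name
run_master, the use of None as the NOP encoding, and the decision to ignore
an accept that arrives during a NOP are all hypothetical and not part of the
spec above.

```python
from collections import deque

NOP = None  # stand-in for the NOP request; the real encoding is unspecified


def run_master(requests, accept_pattern):
    """Simulate one master driving a socket for len(accept_pattern) cycles.

    requests:       real requests to send, in order.
    accept_pattern: one bool per cycle, the slave's accept signal.
    Returns (wire, accepted): the value driven on each cycle, and the
    requests the slave accepted, in order.
    """
    pending = deque(requests)
    wire = []
    accepted = []
    for accept in accept_pattern:
        # The master drives something on every cycle: the current request
        # if it has one, otherwise a NOP.
        current = pending[0] if pending else NOP
        wire.append(current)
        # An accept refers to the request on the current cycle, so the
        # master advances and can present the next request next cycle.
        if accept and current is not NOP:
            accepted.append(pending.popleft())
    return wire, accepted


# The slave applies back-pressure on cycle 1, so "B" is held for two cycles.
wire, accepted = run_master(["A", "B", "C"], [True, False, True, True, True])
print(wire)      # ['A', 'B', 'B', 'C', None]
print(accepted)  # ['A', 'B', 'C']
```

The response side is symmetric: swap the roles so the slave drives a
response or NOP every cycle and the master supplies the accept.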

Testbench Requirements:

 - Must ensure that simultaneously sent traffic from 2 different initiators
   to 2 different targets occurs at full bandwidth.

 - Must ensure that simultaneously sent traffic from 2 different initiators
   to the same target occurs in properly arbitrated form (round-robin).


Design Challenge #2: a Simple 4-Channel DMA Controller

Let's consider a DMA module which is, of course, configurable, and supports
multiple concurrent transactions (multi-channel).  At its interface, the
following 3 groupings can be considered:

  1. Configuration port is a target interface (similar to the OCP-like
     socket interface described in the 2x2 interconnect).

  2. Memory port is an initiator port (OCP-like socket interface) on which
     both read and write DMA transfers operate.

  3. Third group contains interrupt lines, which begin transfers, and status
     lines, which mark the end of transfers, on a per-channel basis.  There
     is a pair of interrupt/status lines per channel: once a channel has
     been configured, an interrupt request tells the DMA controller to
     begin an operation, and status indicates completion.

With regard to features:

 - For uniformity, all busses are 32 bits wide.

 - 4 channel DMA, where all channels can have pending read or write
   operations; channel number dictates priority for read/write operations.
   Channel 0 takes priority over channel 1, etc.

 - Memory requests can be sent every cycle; can be delayed due to
   back-pressure from the memory port.

 - Memory responses are in-order for each channel, but may be out
   of order between channels.  Response latency from the memory is
   completely arbitrary.

 - Memory requests and responses should be tagged with a 2-bit thread
   ID to identify the request/response channel.

 - Configuration allows setting of source address, destination address
   and number of words transferred, plus the enabling of the channel
   via software configuration or hardware interrupt.

 - Write operations from the DMA controller take precedence over read
   operations.

 - The ports should be fully utilized when possible.
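
One possible reading of the two priority rules above (writes before reads,
then lower channel number first) can be sketched in Python as follows.  The
function name select_request and its arguments are hypothetical, and the
spec leaves open whether a write on a low-priority channel really beats a
read on channel 0; this sketch assumes that it does.

```python
NUM_CHANNELS = 4


def select_request(pending_writes, pending_reads):
    """Pick the next memory request to issue this cycle.

    pending_writes, pending_reads: lists of NUM_CHANNELS bools.
    Returns (kind, channel), where channel doubles as the 2-bit thread ID,
    or None if nothing is pending.
    """
    # Writes take precedence over reads (assumed to hold across channels).
    for channel in range(NUM_CHANNELS):
        if pending_writes[channel]:
            return ("write", channel)
    # Among reads, the lowest channel number has the highest priority.
    for channel in range(NUM_CHANNELS):
        if pending_reads[channel]:
            return ("read", channel)
    return None


# Channel 2 has a write pending while every channel has a read pending.
choice = select_request([False, False, True, False], [True, True, True, True])
print(choice)  # ('write', 2)
```

If the intended rule is instead "channel first, then write over read", the
two loops would merge into a single loop over channels.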

Assume no more than 4 outstanding memory read requests per channel; the
largest memory transaction is 64 bytes (the DMA transactions may be
larger).
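
Since a DMA transaction may exceed the 64-byte memory transaction limit, the
controller has to break it up.  Here is a minimal Python sketch of that
arithmetic, assuming byte addressing, 4-byte words (from the 32-bit busses),
and no alignment constraints; the name split_transfer is hypothetical.

```python
BYTES_PER_WORD = 4                                     # 32-bit busses
MAX_BURST_BYTES = 64                                   # largest memory transaction
MAX_BURST_WORDS = MAX_BURST_BYTES // BYTES_PER_WORD    # 16 words


def split_transfer(start_addr, num_words):
    """Split a DMA transfer into memory transactions of at most 64 bytes.

    Returns a list of (byte_address, word_count) tuples, in order.
    """
    bursts = []
    addr = start_addr
    remaining = num_words
    while remaining > 0:
        count = min(remaining, MAX_BURST_WORDS)
        bursts.append((addr, count))
        addr += count * BYTES_PER_WORD
        remaining -= count
    return bursts


# A 40-word (160-byte) transfer becomes two full bursts and one partial one.
bursts = split_transfer(0x1000, 40)
print([(hex(a), n) for a, n in bursts])
# [('0x1000', 16), ('0x1040', 16), ('0x1080', 8)]
```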

Testbench Requirements:

We'll leave this pretty open ended.  Must demonstrate the core functionality
and behavior including interleaved operations.


John, as a measure of flexibility, and to more closely measure real life
chip design conditions, I'd like to suggest that you later add 2 mystery
features after the basic features have been implemented.  Ideas for this
include: pre-emption, reservation, additional channels, additional ports,
additional addressing mode, etc.


And one final point:

Brett makes lots of claims about industry-standard SystemC.  I'm sure
that's true for simulation, but what about synthesis?  With Verilog & VHDL,
there is consensus about synthesis subsets -- and a single design can be
synthesized by Synopsys, Cadence or Magma.  Where is this for SystemC?  I'd
love for Brett to point to one vendor (other than Forte) that can synthesize
Forte's interface library/designs into RTL.  There's a huge difference
between claiming "standard" and having true "portability" across tools.


I'm very much looking forward to comparing designs after the New Year's
break; of course, this assumes that Brett doesn't change the subject again
like he did with his last email!

    - George Harper
      Bluespec, Inc.                             Waltham, MA