( ESNUG 274 Item 5 ) --------------------------------------------- [12/10/97]

Subject: ( ESNUG 261 #7 ) Creating A Single Cycle Write With An Asynch Memory

> We see a problem with doing asynchronous memory accesses in a single clock
> cycle.  The question is: is it safe to use both clock edges to generate
> the write enable (gating it with the clock) for the memory?
>  
>      0      1       2          3
>    --       ---------          ----------           
>      \_____/         \________/          \_______      Clock
> 
>    ------------------           ----------
>                      \_________/                        Wr_Enbl
> 
>    -------  ---------------------  ----------
>    _______X _____VALID___________X __________          ADDR/Data
> 
> 
> The problem is at edge 3 where hold time on addr/data will be entirely
> dependent on buffers/routing delays.  Another problem is that when we use
> both clock-edges, there's a restriction on the duty cycle of the clock.
> These problems could be dealt with by adding delay lines/buffers to meet
> the timing, but this is not a reliable solution.
> 
> One solution we can think of is to double the memory width so that data
> from 2 clocks can be written at a time, allowing synchronous write enable.
> 
> Any ideas, (other than increasing the memory width), are greatly
> appreciated.
> 
>   - N. Chandra
>     Synopsys


From: Steve Masteller <masteller@crmail.indy.tce.com>

Hello, John,

Try using the other side of the clock for your write pulse for single cycle
asynchronous writes as shown below.

        0      1       2          3
      --       ---------          ----------
        \_____/         \________/          \_______      Clock

      ---------           --------------------------
               \_________/                                Wr_Enbl

      -------  ---------------------  --------------
      _______X _____VALID___________X ______________      ADDR/Data

The write enable can be generated glitch-free with an active-low latch on
the enable followed by a NAND gate to generate the Wr_Enbl.  Hold time is
generally not a problem since almost half a clock cycle should be available.
An ideal clock can be used to calculate setup times since any delays in the
gated clock will help rather than hinder setup time.  Of course, the duty
cycle restriction remains.  Finally, if you wish to avoid gated clocks, I
believe your only options are to double the memory width, as you mentioned,
or double the clock frequency.
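
A minimal behavioural sketch of the latch-plus-NAND arrangement described
above, written in Python rather than an HDL.  The function name
gated_write_enable and the sample waveforms are illustrative assumptions,
not part of the original post, and all gate delays are ignored.

```python
def gated_write_enable(clock, enable):
    """Model an active-low latch on `enable` followed by a NAND with `clock`.

    clock, enable: equal-length sequences of 0/1 samples.
    Returns the active-low Wr_Enbl waveform as a list of 0/1 samples.
    """
    latched = 0  # latch output, assumed to start at 0
    wr_enbl = []
    for clk, en in zip(clock, enable):
        if clk == 0:
            latched = en  # latch is transparent while the clock is low
        # While the clock is high the latch holds its value, so changes
        # on `enable` cannot reach the NAND gate and cause a glitch.
        wr_enbl.append(1 - (latched & clk))  # NAND(latched, clk)
    return wr_enbl


clock  = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
enable = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]  # toggles in the middle of each high phase
print(gated_write_enable(clock, enable))
```

In this sketch the first clock-high phase produces a full-width low pulse
even though enable drops partway through it, and the second produces no
pulse at all even though enable rises partway through it.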

   - Steve Masteller
     Thomson Multi-Media

         ----    ----    ----    ----    ----    ----   ----

From: Michael Ericson <ericson@nexen.com>

John,

Using the falling edge of the clock to generate the asserting edge of the
write pulse may cause pulsewidth violations depending on the speed of your
clock and memories.  You may suffer pulsewidth shrinkage due to variations
of the duty cycle and differences between the rise and fall times of your
output buffers.

We just finished several chips with single-cycle memory accesses; we had
two solutions depending on what was available on the chip.  These memory
interfaces proved to be the most challenging areas to verify timing in
the designs, but I guess this would be expected considering the nature
of the problem.


Solution #1:

Create a delayed version of the clock and gate it with the chip clock
to build the pulse:

                               1
   --       ---------          ----------           
     \_____/         \________/          \_______      Clock


               2
   ------       ---------          ----------           
         \_____/         \________/          \___      Delayed_Clock



              ------------------
   __________/                  \________________      Write_Enable


                 3              4
   --------------               -----------------
                 \_____________/                       Wr_Enbl_Pulse


The logic for the pulse is:

	~ ((~ Clock | Delayed_Clock) & Write_Enable)

Edge 2 of Delayed_Clock creates the asserting edge (3) of the pulse; edge 1
of Clock creates the deasserting edge (4) of the pulse.  The critical path
is from Clock to Wr_Enbl_Pulse; the shorter you make this path, the better
your address and data hold margins will be.
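
The pulse equation above can be checked sample by sample with a small
Python sketch.  The helper names (wr_enbl_pulse, rotate) and the choice of
8 samples per clock period with a 2-sample delay are assumptions made for
illustration only; real gate and routing delays are not modelled.

```python
def wr_enbl_pulse(clock, delayed_clock, write_enable):
    """Evaluate ~((~Clock | Delayed_Clock) & Write_Enable) for each sample."""
    return [1 - (((1 - c) | d) & w)
            for c, d, w in zip(clock, delayed_clock, write_enable)]


def rotate(signal, n):
    """Delay a periodic sampled waveform by n samples (wraps around)."""
    return signal[-n:] + signal[:-n]


# Two clock periods of 8 samples each: 4 samples high, then 4 samples low.
clock = [1, 1, 1, 1, 0, 0, 0, 0] * 2
delayed_clock = rotate(clock, 2)   # Delayed_Clock lags Clock by 2 samples
write_enable = [1] * len(clock)    # held high: every cycle is a write

print(wr_enbl_pulse(clock, delayed_clock, write_enable))
```

With these inputs the pulse goes low two samples into each period (the
rising edge of Delayed_Clock) and returns high at the start of the next
period (the rising edge of Clock), which matches edges 3 and 4 in the
diagram.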

The advantages of this method are easy implementation and the use of only
the rising edge of the clock to create both edges of the pulse.  The
disadvantage is that the delay on the clock scales with process,
temperature, and voltage; because of this, you may find that shortening
the delay causes address setup problems while lengthening it causes
pulsewidth problems, leaving no delay value that works across all corners.
If your clock is 50MHz or slower, your process is 0.6 micron or smaller,
and your SRAM is 12ns or faster, you should be able to avoid this
situation.
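
The squeeze between address setup and pulsewidth can be made concrete with
a rough Python sketch.  Everything here is an assumption for illustration:
the function names, the simplified model (pulse width = clock period minus
delay, symmetric delay spread), and the numbers, which are not taken from
any datasheet or from the original post.

```python
def delay_window(period_ns, addr_valid_ns, addr_setup_ns, min_pulse_ns):
    """Return (lowest, highest) allowed clock delay in ns, or None if empty.

    Simplified model: the pulse asserts `delay` after the rising clock edge
    and deasserts at the next rising edge, so its width is period - delay.
    """
    lowest = addr_valid_ns + addr_setup_ns  # address must be set up first
    highest = period_ns - min_pulse_ns      # pulse must stay wide enough
    return (lowest, highest) if lowest <= highest else None


def delay_cell_fits(window, nominal_ns, spread):
    """True if nominal*(1-spread) .. nominal*(1+spread) lies inside window."""
    if window is None:
        return False
    lowest, highest = window
    return (nominal_ns * (1 - spread) >= lowest
            and nominal_ns * (1 + spread) <= highest)


# Illustrative numbers only: 50MHz clock (20ns period), address valid 4ns
# after the clock edge, 0ns address setup, 8ns minimum write pulse.
window = delay_window(period_ns=20.0, addr_valid_ns=4.0,
                      addr_setup_ns=0.0, min_pulse_ns=8.0)
print(window, delay_cell_fits(window, nominal_ns=8.0, spread=0.4))
```

Shrinking the period in this model narrows the window from the top while
the bottom stays put, so a delay cell with a wide spread eventually has
nowhere to sit.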

When doing static timing verification, you must ensure that Write_Enable
always changes after the rising edge of Clock and before the rising edge
of Delayed_Clock to avoid glitches.  Also, realize that similar
logic will be required for the data output enable pulses and that the
asserting edges of these pulses may come fairly close to the beginning
of the clock cycle in the min case.  Because of this, you may want to make
sure that a dead cycle is inserted when going from a read cycle to a write
cycle to avoid contention on the data bus (write->read is easier to
accomplish since it takes longer to turn on the SRAM data bus than to turn
off the ASIC data bus).  We avoided this dead cycle for one of our chips
because the interface was shared among three engines and we needed the
bandwidth, but the timing verification was time-consuming.

If you're using VTI, they created a programmable delay cell for us and are
familiar with this implementation in their Boston-area technical center.
The programmable delay cell made it convenient for us to tweak the timing
late in the backend process with minimal schedule impact.  Also, the
b1->zn path through their fn03d2 macrocell may be used for very fast
timing between Clock and the input of the output pad for Wr_Enbl_Pulse
(the fn03d2 cell needs to be placed next to the pad, though).


Solution #2:

If you are using a PLL, you can create a 2x version of the clock and
use its falling edge to create a delayed version of the clock (looks
much like the one above).  To create a 2x version of the clock, send
the output clock of the PLL through a divide-by-two circuit and use
the divided clock as the feedback clock.  The PLL will speed up its
output clock until the feedback clock matches the frequency of the
reference clock; the result will be a clock on the output of the
PLL with a frequency 2 times that of the reference clock.
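
The feedback argument above can be illustrated with a toy Python loop.
This is only a frequency-domain caricature under stated assumptions: a real
PLL compares phase rather than frequency, and the function name
lock_frequency, the gain, and the step count are all made up for the
example.

```python
def lock_frequency(f_ref_mhz, divide_by=2, gain=0.5, steps=200):
    """Iterate a toy control loop until the divided output matches f_ref.

    Each step nudges the output frequency in proportion to the difference
    between the reference and the divided-down feedback clock.
    """
    f_out = f_ref_mhz  # arbitrary starting point
    for _ in range(steps):
        f_feedback = f_out / divide_by
        f_out += gain * (f_ref_mhz - f_feedback)
    return f_out


print(round(lock_frequency(50.0), 3))
```

The loop only stops moving when f_out / divide_by equals the reference, so
with a divide-by-two in the feedback path it settles at twice the reference
frequency.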

The advantage of this method is tighter control over the skew of the
delayed clock through variations in process, temperature, and voltage.
The disadvantage is the complexity of the clock circuit and resolving
the skew between the two clock domains.  If you don't need the 2x clock
for anything else in your design, it may not be worth the effort; in
any case, I would try solution #1 first.

The good news is that both solutions are working in our labs right now.

  - Mike Ericson
    Nexion, Inc.


