( ESNUG 225 Item 3 ) ---------------------------------------------- [8/24/95]
[ Editor's Note: I *really* liked this contribution Bob sent! - John ]
From: exabyte!loki!bobsug@uunet.uu.net (Bob Sugar)
Subject: My Surprising Answer To A Personal Metastability Quest
Dear John,
A few weeks ago, I sent out a memo to a bunch of designers asking about
metastability. After going through a bunch more articles, papers and
responses, I finally have a good answer. Since this is a problem that all
digital designers face at some time, I decided to write up a brief refresher
on metastability calculations including the answer to my previous question.
- Bob Sugar
Exabyte Corporation
Here's my original question:
> From bobsug Fri Jul 7 11:54:36 1995
> Subject: A Modest Question
>
> I have what I thought was a simple question, but I'm having a devil of
> a time finding a good answer to it. The question involves metastability
> and synchronizer circuit design. Specifically, within a given ASIC
> technology is a dual-stage synchronizer running at 40 MHz more reliable
> than a single-stage synchronizer running at 20MHz [all other delays
> being equal]?
>
> Pictorially, here's the problem:
>
> Dual-Stage Design:
>                                     _____          ------
>                                    /     \---------|D  Q|
>           ------    ------  SYNC  |       |        |    |
> ASYNC ----|D  Q|----|D  Q|--------| Blob  |        |    |
>           |    |    |    |       /   of    \    ---|>   |
>           |    |    |    |      |  logic   |    |  ------
>         --|>   |  --|>   |       \_        |    |  ------
>         | ------  | ------         \       |-------|D  Q|
>         |         |                 \_____/     |  |    |
>         |         |                             |  |    |
> 40MHZ ---------------------------------------------|>   |
>                                                    ------
>
>
> Single-Stage Design:
>                           _____          ------
>                          /     \---------|D  Q|
>           ------  SYNC  |       |        |    |
> ASYNC ----|D  Q|--------| Blob  |        |    |
>           |    |       /   of    \    ---|>   |
>           |    |      |  logic   |    |  ------
>         --|>   |       \_        |    |  ------
>         | ------         \       |-------|D  Q|
>         |                 \_____/     |  |    |
>         |                             |  |    |
> 20MHZ -----------------------------------|>   |
>                                          ------
>
> Which is more reliable?
>
> I've looked at a bunch of references and found a surprisingly large
> number of typos and mistakes in equations and illustrations. None of
> the scholarly papers I have discuss multi-stage synchronizer designs.
> Depending upon what values I use for the metastability constants To and
> t [tau] along with which set of equations I use, I get conflicting
> results. Some say that the first design is better, some say that
> the second is better. Do any of you know the answer to this problem [and
> have good references to back it up]?
Now, after a quick review of simple metastability calculations, we'll get
to multi-stage calculations and finally, The Answer. In general, when
you clock a flip-flop, the output goes to the proper state a propagation
delay later. If the input just happens to be changing at the critical
time, the output may go metastable and take a little longer to resolve
itself to a stable state.
Experimentally, metastability is measured by clocking an evenly distributed
random input into a flip-flop and measuring the time until the output has
stabilized. If the results are plotted as number of occurrences versus delay
time, an exponential curve will result [simplified model]:

            |
            |*
    Number  | *
      of    |  *
    Events  |   *
            |    **
            |      **
            |        ***
            |           *****
            |                **********
    --------+------------------------------------
           Tpd
                    Delay from clock

Just as a straight line can be represented in the form

    Y = MX + B

an exponential curve can be represented in the form

                  -(T-Tpd)/tau
         Y = K * e

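As a rough sketch of how tau could be read off such a histogram: taking the
log of both sides turns the exponential into a straight line whose slope is
-1/tau, so an ordinary least-squares line fit recovers it. The function name
and the histogram values below are made up for illustration (synthetic,
noise-free data, not measurements).

```python
import math

def fit_exponential(delays_ns, counts):
    """Fit counts = K * exp(-delay/tau) by least squares on ln(counts).

    delays_ns are measured from Tpd, i.e. delay = T - Tpd.
    Returns (K, tau_ns).
    """
    logs = [math.log(c) for c in counts]
    n = len(delays_ns)
    mean_x = sum(delays_ns) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in delays_ns)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delays_ns, logs))
    slope = sxy / sxx                      # slope of the log plot is -1/tau
    intercept = mean_y - slope * mean_x    # intercept is ln(K)
    return math.exp(intercept), -1.0 / slope

# Synthetic histogram: assumed K = 1e6 events and tau = 0.2 ns.
delays_ns = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
counts = [1.0e6 * math.exp(-d / 0.2) for d in delays_ns]

k_fit, tau_fit = fit_exponential(delays_ns, counts)
print(f"K = {k_fit:.3g}, tau = {tau_fit:.3g} ns")
```

Real measurements are noisy and the tail bins hold very few events, so a
weighted fit would probably be a better choice in practice.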
After rearranging this, accounting for the nominal propagation delay (Tpd)
and adding clock and data rate scaling factors, we can come up with an
equation for failure rates. A "failure" is when the output has not
resolved itself by Tr seconds after the nominal propagation delay. A
common equation for a single flip-flop synchronizer is:

                      (Tr/tau)
                     e
    MTBF = ------------------------
            2 * fdata * fclock * To

    Tr       is the time available for the metastability to resolve itself
             (generally the clock period minus setup and propagation
             delays of any intermediate logic)
    tau      is the resolution time constant (experimentally measured)
    2*fdata  is the incoming event (edge) rate
    fclock   is the clock frequency
    To       is the metastability aperture (experimentally measured)
             (this is a constant related to the width of the time window
             during which an input transition will cause a metastability
             event)

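A minimal sketch of the single-stage equation in Python, in case it helps to
plug in numbers. The function and parameter names are mine, and the example
values (Tr, fdata, To) are assumptions picked only to exercise the formula;
use the figures from your own vendor's data.

```python
import math

def mtbf_single_stage(tr_s, tau_s, f_data_hz, f_clock_hz, t0_s):
    """MTBF in seconds: e^(Tr/tau) / (2 * fdata * fclock * To).

    All times are in seconds and all frequencies in Hz.
    """
    return math.exp(tr_s / tau_s) / (2.0 * f_data_hz * f_clock_hz * t0_s)

# Assumed example values, not from any data sheet.
mtbf_s = mtbf_single_stage(
    tr_s=10e-9,        # 10 ns left for the output to resolve
    tau_s=0.2e-9,      # tau = .2 ns
    f_data_hz=1e6,     # 1 MHz data, so 2e6 edges per second
    f_clock_hz=20e6,   # 20 MHz clock
    t0_s=1e-9,         # To = 1 ns
)
print(f"MTBF = {mtbf_s:.3g} seconds")
```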
For a multi-stage design, the equation becomes: [Many articles have incorrect
multi-stage equations -- I'm fairly sure that this one is correct.]

                      (T1/tau)
                     e
    MTBF = ------------------------
            2 * fdata * fclock * To

    T1       is the total resolution time available from N stages
             = N*Tclock - (N-1)*(Tpd+Tsu)
    Tclock   is the clock period = 1/fclock
    Tpd      is the clock to Q propagation delay
    Tsu      is the data to clock setup time

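The multi-stage equation can be sketched the same way. The function names
are hypothetical, and T1 is computed exactly as written above, so any delay
through logic after the last stage is left out (the "all other delays being
equal" assumption from the original question).

```python
import math

def total_resolution_time(n_stages, t_clock_s, t_pd_s, t_su_s):
    """T1 = N*Tclock - (N-1)*(Tpd+Tsu), in seconds."""
    return n_stages * t_clock_s - (n_stages - 1) * (t_pd_s + t_su_s)

def mtbf_multi_stage(n_stages, f_clock_hz, f_data_hz, tau_s, t0_s,
                     t_pd_s, t_su_s):
    """MTBF in seconds: e^(T1/tau) / (2 * fdata * fclock * To)."""
    t1_s = total_resolution_time(n_stages, 1.0 / f_clock_hz, t_pd_s, t_su_s)
    return math.exp(t1_s / tau_s) / (2.0 * f_data_hz * f_clock_hz * t0_s)

# Two stages at 40 MHz with Tpd = 2 ns and Tsu = 1 ns: T1 = 2*25 - 3 = 47 ns.
t1_dual_s = total_resolution_time(2, 25e-9, 2e-9, 1e-9)
# One stage at 20 MHz: T1 is simply the 50 ns clock period.
t1_single_s = total_resolution_time(1, 50e-9, 2e-9, 1e-9)
print(f"dual: {t1_dual_s * 1e9:.0f} ns, single: {t1_single_s * 1e9:.0f} ns")
```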
Overall, the longer the TOTAL resolution time across all stages, the better.
THE ANSWER:
For the earlier question of a two-stage design versus a single-stage
design at half the clock rate, THE SINGLE-STAGE DESIGN IS BETTER! The
single-stage design is better because it has an entire extra 25ns
(the difference between the 50ns period at 20MHz and the 25ns period at
40MHz) for the metastability event to resolve, whereas the dual-stage
design has only an extra 25ns - Tsu - Tpd.
In terms of MTBF, the single-stage design is better by a factor of:

                    (Tsu+Tpd)/tau
    delta MTBF = e

For example, using Tsu=1ns Tpd=2ns tau=.2ns gives:

    delta MTBF = e^15 = 3.3*10^6 times better!!!

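The example figure is easy to check with a couple of lines of Python. This
is only a sketch of the exponent term; the function name is mine.

```python
import math

def single_vs_dual_factor(t_su_ns, t_pd_ns, tau_ns):
    """Ratio of single-stage to dual-stage MTBF from the exponent term:
    e^((Tsu+Tpd)/tau). All three arguments are in nanoseconds."""
    return math.exp((t_su_ns + t_pd_ns) / tau_ns)

factor = single_vs_dual_factor(t_su_ns=1.0, t_pd_ns=2.0, tau_ns=0.2)
print(f"{factor:.2e}")  # 3.27e+06
```

Going by the MTBF equation as written, the halved fclock in the denominator
would add a further factor of 2 in the single-stage design's favor, which
does not change the conclusion.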
Note: the published values I've seen for modern technologies (<2um) are:
    tau = .1ns to .5ns   (the smaller the better)
    To  = .1ps to 100ns  (the smaller the better)
In general, small changes in resolution time cause very large changes in
MTBF. For best MTBF when using Synopsys, split up the modules so that
the delays in the "Blob of logic" from the above example can be made
as small as possible (using a large input delay and "dont_touch"ing the
resulting logic).
Again, remember that the longer the TOTAL resolution time across all
stages, the better. For example, inserting a falling-edge clocked flip-
flop in the middle of a 2-stage synchronizer is a bad idea. Likewise,
while inserting a Schmitt-trigger buffer between stages may clean up a
marginal voltage level, it ends up reducing the MTBF by much more than
the additional hysteresis can save. A change of only a couple nanoseconds
in resolution time can change the MTBF by a factor of over a million!
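To get a feel for this sensitivity, here is a small sketch that tabulates
the MTBF change factor e^(delta/tau) for a few changes in resolution time
across the published range of tau quoted above. The function name and the
chosen delta values are mine.

```python
import math

def mtbf_change_factor(delta_tr_ns, tau_ns):
    """Factor by which MTBF changes when the resolution time changes by
    delta_tr_ns, everything else held fixed: e^(delta/tau)."""
    return math.exp(delta_tr_ns / tau_ns)

for tau_ns in (0.1, 0.2, 0.5):
    for delta_ns in (1.0, 2.0, 3.0):
        factor = mtbf_change_factor(delta_ns, tau_ns)
        print(f"tau={tau_ns}ns  delta={delta_ns}ns  factor={factor:.3g}")
```

How many nanoseconds it takes to reach a factor of a million depends
strongly on tau: at tau = .1ns a 2ns change is worth about 4.9*10^8, at
tau = .2ns it takes about 3ns to pass 10^6, and at tau = .5ns a 2ns change
is worth only about 55.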
- Bob Sugar
Advanced Technology Group
Exabyte Corporation
For a couple of good articles on metastability, see:

    D. Grosse, "Keep metastability from killing your digital design,"
    EDN, June 23, 1994, pp. 109-116.
      [An easy to read article, but 2-stage equations are wrong]

    L. Kleeman and A. Cantoni, "Metastable Behavior in Digital Systems,"
    IEEE Design and Test of Computers, December 1987, pp. 4-19.
      [A much chewier paper including multi-stage equations and an
      extensive list of additional references]