( ESNUG 495 Item 6 ) -------------------------------------------- [11/17/11]
From: Uri Tal <uri=user domain=rocketick got calm>
Subject: Rocketick warns of Design Compiler 'X' vs. VCS/NC-Sim/Questa 'X'
Hi, John,
While working on RocketSim, our GPU-based Verilog accelerator, I've run into
some dramatically differing simulation behavior depending on whether you're
simulating RTL in VCS/NC-Sim/Questa or simulating the gate-level netlist that
Design Compiler produces (RTL 'X' vs. gate-level 'X'). Thought your DeepChip
readers might be interested in what I've found.
'X' IN A CONDITIONAL:
---------------------
Consider the following RTL code:
    if (cond)
        res=2'b00;
    else
        res=2'b11;
And let cond be 1'bx.
The Verilog standard (IEEE-1364, 2001) deals with this case explicitly:
"If the expression evaluates to true (that is, has a nonzero known
value), the first statement shall be executed.
If it evaluates to false (has a zero value or the value is x or z),
the first statement shall not execute.
If there is an else statement and expression is false, the else
statement shall be executed."
Therefore, the result would be res=2'b11. But is that what you really
expected? If cond='x', its value is unknown, meaning that it could be
either '0' or '1', and therefore the output 'res' could be either
'11' or '00' respectively, so it should really be assigned 2'bxx.
In this case, VCS/NC-Sim/Questa is 'x'-optimistic, i.e. it outputs a non-'x'
value where there should be 'x'. A simulation of the Design Compiler netlist
will behave as we would expect, and the output would be 'x'.
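A minimal testbench sketch of the RTL side of this comparison follows. The module name tb_if_x is made up for illustration, and the delays are only there to keep the assignments away from time zero. On a simulator that follows the standard, it should report res=11 once cond goes to 'x'.

```verilog
// Sketch only: RTL simulation of an if/else whose condition is unknown.
// tb_if_x is a hypothetical testbench name, not part of the original post.
module tb_if_x;
  reg       cond;
  reg [1:0] res;

  // Same if/else as in the text, wrapped in a combinational always block.
  always @(cond) begin
    if (cond)
      res = 2'b00;
    else
      res = 2'b11;
  end

  initial begin
    #1 cond = 1'b1;   // known value first: res becomes 2'b00
    #1 cond = 1'bx;   // unknown condition: the else branch runs
    #1 $display("cond=%b res=%b", cond, res);
    if (res === 2'b11)
      $display("x-optimistic: else branch executed for cond=x");
    $finish;
  end
endmodule
```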
What about the 'case' construct?
    case (cond)
        1'b0: res=2'b00;
        1'b1: res=2'b11;
    endcase
Surprisingly, if cond='x', VCS/NC-Sim/Questa would not execute any of the
lines, because the above code is actually equivalent to:
    if (cond===1'b0)
        res=2'b00;
    else if (cond===1'b1)
        res=2'b11;
Note the case-equality operator used above. Neither 'if' condition is
met, so res is not assigned and keeps its previous value. We could solve
this by adding a 'default' case:
    case (cond)
        1'b0: res=2'b00;
        1'b1: res=2'b11;
        default: res=2'bxx;
    endcase
Now, if cond is 'x', the output will be 2'bxx.
Design Compiler will disregard the default line, but VCS/NC-Sim/Questa will
behave correctly and put 'x' in the result when cond==='x'.
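The two 'case' variants can be compared side by side with a sketch like the one below. The names tb_case_x, res_nodef and res_def are invented for this illustration. With cond driven to 0 and then to 'x', the version without a default should hold its old value (00) while the version with a default should go to xx.

```verilog
// Sketch only: 'case' with and without a default when the selector is x.
// tb_case_x, res_nodef and res_def are hypothetical names.
module tb_case_x;
  reg       cond;
  reg [1:0] res_nodef, res_def;

  // No default: nothing matches cond=x, so res_nodef is left unchanged.
  always @(cond) begin
    case (cond)
      1'b0: res_nodef = 2'b00;
      1'b1: res_nodef = 2'b11;
    endcase
  end

  // With default: cond=x falls through to the default branch.
  always @(cond) begin
    case (cond)
      1'b0:    res_def = 2'b00;
      1'b1:    res_def = 2'b11;
      default: res_def = 2'bxx;
    endcase
  end

  initial begin
    #1 cond = 1'b0;   // both outputs become 2'b00
    #1 cond = 1'bx;   // selector goes unknown
    #1 $display("no default: %b   with default: %b", res_nodef, res_def);
    $finish;
  end
endmodule
```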
MUX OPERANDS WHEN COND=='X':
----------------------------
Consider the more general case of a multiplexer.
    case (cond)
        1'b0: res=a;
        1'b1: res=b;
        default: res=2'bxx;
    endcase
When cond='x', we would like VCS/NC-Sim/Questa to combine the values of 'a'
and 'b' in the following way: if a==b, res should be equal to a, but if a!=b,
res should be 'x'. The 'case' above always yields 2'bxx instead. The
conditional operator (aka the ternary operator)
    res = cond ? a : b;
combines the values of 'a' and 'b' correctly when cond===1'bx.
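A small sketch of the ternary merge is shown below; tb_ternary_x is a made-up testbench name. Note that the standard merges the two operands bit by bit, so when a and b differ in only some bits, only those bits become 'x'. With the values used here, the first line should show res=10 and the second res=1x.

```verilog
// Sketch only: the conditional operator with an unknown condition.
// tb_ternary_x is a hypothetical testbench name.
module tb_ternary_x;
  reg        cond;
  reg  [1:0] a, b;
  wire [1:0] res = cond ? a : b;

  initial begin
    cond = 1'bx;
    a = 2'b10; b = 2'b10;
    #1 $display("a==b: res=%b", res);   // all bits agree, so res follows a
    a = 2'b10; b = 2'b11;
    #1 $display("a!=b: res=%b", res);   // only the differing bit becomes x
    $finish;
  end
endmodule
```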
DESIGN COMPILER VS. VCS/NC-SIM/QUESTA:
--------------------------------------
Consider the following example:
  RTL:
      c = cond ? a : b;
  Gate Level:
      c = (cond & a) | (~cond & b);
With cond='x', the gate-level expression can evaluate to 'x' even if a==b
(for example, when a==b==1). In this case the gate-level simulation is
'x'-pessimistic compared to the RTL simulation.
Consider the following opposite example:
  RTL:
      c = ~a & a;
  Gate Level:
      c = 0;
In this case, Design Compiler optimized the logic expression assigned to the
variable 'c' and realized that 'c' is the constant '0'.
Now, if 'a===1'bx', VCS/NC-Sim/Questa will evaluate 'c=1'bx' in RTL, so here
it is the RTL simulation that is 'x'-pessimistic.
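Both mismatches can be seen in one sketch by writing the RTL and gate-level forms as parallel continuous assignments. The module and signal names (tb_rtl_vs_gate, mux_rtl, mux_gate, and_rtl, and_gate) are invented for illustration, and the "gate" expressions are hand-written stand-ins for a synthesized netlist, not actual Design Compiler output.

```verilog
// Sketch only: RTL vs. hand-written gate-level forms of the two examples.
// All names here are hypothetical; no real netlist is involved.
module tb_rtl_vs_gate;
  reg  cond, a, b;
  wire mux_rtl  = cond ? a : b;
  wire mux_gate = (cond & a) | (~cond & b);
  wire and_rtl  = ~a & a;
  wire and_gate = 1'b0;               // what the optimizer reduces ~a & a to

  initial begin
    // Mux with unknown select and equal data: gate form is pessimistic.
    cond = 1'bx; a = 1'b1; b = 1'b1;
    #1 $display("mux : rtl=%b gate=%b", mux_rtl, mux_gate);
    // ~a & a with unknown a: RTL form is pessimistic.
    a = 1'bx;
    #1 $display("~a&a: rtl=%b gate=%b", and_rtl, and_gate);
    $finish;
  end
endmodule
```

The first line should show rtl=1 against gate=x, and the second rtl=x against gate=0.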
Generally, logic manipulations that preserve the logical functionality of
the design can still change the simulation behavior when some of the inputs
are unknown ('x'). Note that equivalence-checking tools (like Conformal or
Formality) will not find these issues because the logic cones are still
equivalent from a 0/1 point of view. The Verilog standard specifies exactly
how 'x' should be propagated through logic/arithmetic operations. This
spec, however, is not equivalent to stressing the design with both 0 and 1
for each 'x' value (which is the more accurate though computationally
intensive approach).
ARITHMETIC & PART-SELECT PESSIMISM:
-----------------------------------
In most arithmetic & part-select operations, the Verilog standard requires
the simulator to output 'x' whenever one of the bits in the input operands
is 'x'. This is usually over-pessimistic.
Consider the following variable-part-select example:
    a=35'b000x0000x0000x0000x0000x0000x0000x0;
    out = a[a[5:0]];
According to the Verilog standard, 'out' will get the value 'x'. However,
the expression a[5:0] can be either 0 or 2, and in both indices
a[0]==a[2]==0,
therefore out should be '0'. In most gate-level implementations, where the
above RTL is synthesized into MUXes, the result would be '0' and not 'x'.
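The "try both 0 and 1 for each 'x'" reasoning for this example can be sketched in plain Verilog as below. This is only an illustration under simplifying assumptions: it enumerates the candidate index values but treats the selected bit of 'a' as independent of the index bits, and all names (tb_partsel_x, out_std, out_enum, cand, seen, match) are made up. It should report 'x' for the standard evaluation and '0' for the enumerated one.

```verilog
// Sketch only: standard x-propagation vs. enumerating the unknown index.
// All names are hypothetical; this is not how any simulator works inside.
module tb_partsel_x;
  reg [34:0] a;
  reg [5:0]  idx, cand;
  reg        out_std, out_enum, seen, match;
  integer    k, i;

  initial begin
    a       = 35'b000x0000x0000x0000x0000x0000x0000x0;
    idx     = a[5:0];     // 6'b0000x0: index is 0 or 2
    out_std = a[idx];     // unknown index: the standard yields x

    seen     = 1'b0;
    out_enum = 1'bx;
    for (k = 0; k < 35; k = k + 1) begin
      cand  = k;
      match = 1'b1;
      // k is a candidate if it agrees with every known bit of idx
      for (i = 0; i < 6; i = i + 1)
        if (idx[i] !== 1'bx && idx[i] !== cand[i])
          match = 1'b0;
      if (match) begin
        if (!seen)
          out_enum = a[k];            // first candidate sets the value
        else if (out_enum !== a[k])
          out_enum = 1'bx;            // candidates disagree: really unknown
        seen = 1'b1;
      end
    end
    $display("standard: %b   enumerated: %b", out_std, out_enum);
    $finish;
  end
endmodule
```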
Barrel shifter:
    va35 = 35'b00110x01x00000000000000000000000001;
    vb35 = 35'bx000000000x000000000000000000000110;
    out = va35 << vb35;
According to the Verilog standard:
    out = 35'bxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
This is because at least one bit in the vb35 vector is unknown. However,
this is much more pessimistic than the actual uncertainty in the result.
The shift amount is either 6 or so large that every bit is shifted out,
and merging those two outcomes gives:
    out = 35'b0xx0000000000000000000000000x000000
Here's another example with the equality operator:
    out = ({cnt[0],cnt[0]} == 2'b10);
According to the Verilog standard, if cnt[0]===1'bx, out=1'bx, but we
actually know that for every value of cnt[0], out will be '0'.
A similar example is:
    always @(cnt)
        cnt_copy = cnt;
    assign out = (cnt[15:0] == cnt_copy[15:0]);
In this case, if some of the bits of cnt[15:0] are 'x', the result will
be 'x', but the gate-level simulation might calculate the more 'correct'
value, '1'.
Another example:
    cnt[2:0]=3'bxxx;
    out = ($signed(cnt[2:0]) == 5);
According to the Verilog standard, out='x', but actually the range of
$signed(cnt[2:0]) is [-4,3], so it can never be 5.
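The two comparisons against constants can be checked with a sketch like this one; tb_eq_x, out_dup and out_signed are hypothetical names. With cnt fully unknown, a standard-conforming simulator should report 'x' for both, even though neither comparison can ever be true.

```verilog
// Sketch only: equality comparisons that can never be true, but return x.
// tb_eq_x, out_dup and out_signed are hypothetical names.
module tb_eq_x;
  reg  [2:0] cnt;
  wire out_dup    = ({cnt[0], cnt[0]} == 2'b10);   // 00 or 11, never 10
  wire out_signed = ($signed(cnt[2:0]) == 5);      // range is [-4,3]

  initial begin
    cnt = 3'bxxx;
    #1 $display("dup=%b signed=%b", out_dup, out_signed);
    $finish;
  end
endmodule
```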
GATE-LEVEL BEHAVIORAL MODELING ISSUES:
--------------------------------------
The following examples are of behavioral models of gate-level library cells.
Consider the following implementation of a simple buffer:
    primitive xbuf (o, i);
        output o;
        input i;
        table
        // i : o
           1 : 1;
           0 : 0;
           x : 1;
        endtable
    endprimitive
This buffer specifically masks occurrences of 'x' (an 'x' input produces a
'1' output), which makes the gate-level simulation 'x'-optimistic.
Now consider the following simplified flip-flop implementation:
    primitive udp_dff (q, d, cp);
        output q;
        input d, cp;
        reg q;
        table
        // d   cp   : q : qn;
           0  (01)  : ? : 0;   // latch 0
           1  (01)  : ? : 1;   // latch 1
           0   *    : 0 : 0;   // keep 0
           1   *    : 1 : 1;   // keep 1
           ?  (1?)  : ? : -;   // ignore negative edge of clk
           ?  (?0)  : ? : -;   // ignore negative edge of clk
           *   ?    : ? : -;   // ignore data change on steady clk
        endtable
    endprimitive
Now, let's focus on the case where cp transitions from 0-->'x'.
If, in the original RTL, the syntax was the usual 'always @(posedge cp)',
then according to the Verilog standard, a transition of 0-->'x' is
considered a positive edge, so the RTL flop samples 'd' unconditionally.
According to the UDP above, the result in this case depends on the 'd'
input.
If the 'd' input matches 'q', then we keep the current q.
If, on the other hand, 'd!=q', there is no line that matches this and,
according to the Verilog standard, the output will be 'x'.
Which behavior is more logical?
If the clock transitions from 0->'x', then there are 2 possible transitions
of the clock: 0-->1 and 0-->0 (no change).
Let's separate two cases: d==q, and d!=q.
d==q: in both transitions 0-->0 and 0-->1, q remains the same.
d!=q: the output 'q' will be different in each transition, therefore the
value 'x' should be output.
Therefore, the UDP behavior above matches the best logical conclusion if
'x' is taken to mean "unknown value".
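For comparison, the RTL side of the 0-->'x' clock case can be reproduced with a sketch like the following. The testbench name tb_posedge_x is made up, and the flop is the plain 'always @(posedge cp)' form mentioned above. With d!=q at the moment cp goes from 0 to 'x', the RTL flop should simply load d (q=1 here), where the UDP would produce 'x'.

```verilog
// Sketch only: an RTL flop treats a 0 -> x clock transition as a posedge.
// tb_posedge_x is a hypothetical testbench name.
module tb_posedge_x;
  reg d, cp, q;

  always @(posedge cp)
    q <= d;

  initial begin
    cp = 1'b0; d = 1'b0;
    #1 cp = 1'b1;     // real rising edge: q becomes 0
    #1 cp = 1'b0;
    #1 d  = 1'b1;     // now d != q
    #1 cp = 1'bx;     // 0 -> x: counted as a positive edge
    #1 $display("q=%b", q);
    $finish;
  end
endmodule
```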
CONCLUSIONS:
------------
I have given several examples that yield different behaviors of RTL and
gate-level simulations.
We've seen some non-intuitive behaviors specified by the Verilog standard
that, in some cases, contradict one's understanding of the 'x' semantics.
As a result of these mismatched behaviors, many users will spend long hours
debugging 'x' problems in gate-level simulations that run extremely slowly.
It would be a nice feature if VCS/NC-Sim/Questa had a command-line flag that
causes them to simulate RTL 'x' scenarios in a way that better resembles a
gate-level situation (Design Compiler output), or better yet, equivalent
results to stressing '0' and '1' for every 'x'.
This, of course, would cause them to run more slowly in some cases, but
hopefully not as slow as in gate-level.
- Uri Tal
Rocketick Ramat Gan, Israel