( Post 49 Item # 1 ) ----------------------------------------------------------------
[ From: jaa@mongoose.ess.harris.com (John Auer) ]
I've found what seems to be a real problem with Synopsys and 2-phase
clock regimes. Synopsys recommends that "set_clock -mult 0 phase_b"
be used for the phase_b clock. However, if you do this, timing paths
from phase_b to phase_b are not optimized.

On the other hand, if you use the "set_clock -mult 1 phase_b" command,
phase_b to phase_b paths ARE optimized, but now phase_a to phase_b
paths AREN'T.

I took a look at the formula Synopsys uses to generate the setup constraints
for sequential elements, and it makes sense that things behave as they do.
It appears to be a fundamental flaw in the constraint method Synopsys uses:
with 2-phase clocks, one set of constraints (A -> B or B -> B) is always
missed. Unfortunately, Synopsys takes advantage of this and pushes a lot
of the chip delay into this unconstrained "space", which lets it
artificially improve the reported performance.

Has anyone out there optimized 2-phase circuits (especially those
containing latches and flops, with communication between the regimes) and
run into the same problem? Any workarounds? (Incremental mapping hasn't
helped much, BTW.)
John Auer
Harris GASD
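For readers who want to reproduce the question above, here is a minimal sketch of the kind of circuit it describes: a latch in the phase_a regime feeding a flop in the phase_b regime, so that both an A -> B path and a B -> B path end at the same register. The module name, signal names, and the XOR next-state logic are assumptions made for illustration; only the clock names phase_a and phase_b come from the post.

```verilog
// Hypothetical two-phase example; every name except phase_a and
// phase_b is an assumption made for illustration.
module two_phase_example(phase_a, phase_b, din, dout);
  input  phase_a, phase_b;   // assumed to be non-overlapping phases
  input  din;
  output dout;

  reg la;   // transparent latch, open while phase_a is high
  reg fb;   // flip-flop clocked on the rising edge of phase_b

  // phase_a regime: level-sensitive latch
  always @(phase_a or din)
    if (phase_a)
      la <= din;

  // phase_b regime: the next-state logic takes one input from the
  // phase_a latch (an A -> B path) and one from fb itself (a B -> B path)
  always @(posedge phase_b)
    fb <= la ^ fb;

  assign dout = fb;
endmodule
```

With a design like this, comparing the timing reports for the two paths into fb under each set_clock setting should show which of the two is left unconstrained.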
( Post 49 Item # 2 ) ----------------------------------------------------------------
[ From: parkin@ultrasparc.Eng.Sun.COM (Michael Parkin) ]
Subject: ESNUG posting - FSM compiler, arrays
1. ref post 47 item 1
>> Some ASIC suppliers do not care for or allow feed-throughs.
There is a compile variable called compile_fix_multiple_port_nets.
When it is true, compile inserts extra logic to prevent feedthroughs and to
ensure that no two output ports are connected to the same net.
However, multiple outputs connected to PWR or GND will not be
fixed by this compile option.
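The three port situations mentioned above can be sketched in a few lines of Verilog. This is only an illustration; the module and port names are hypothetical, and what logic compile actually inserts depends on the target library.

```verilog
// Illustrative only; module and port names are hypothetical.
module port_net_cases(a, b, y_thru, y1, y2, z1, z2);
  input  a, b;
  output y_thru, y1, y2, z1, z2;

  // Feedthrough: an input port wired straight to an output port.
  assign y_thru = a;

  // Two output ports driven from the same internal net.
  wire n = a & b;
  assign y1 = n;
  assign y2 = n;

  // Two output ports tied to the same constant (GND). Per the note
  // above, this case is not fixed by compile_fix_multiple_port_nets.
  assign z1 = 1'b0;
  assign z2 = 1'b0;
endmodule
```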
2. ref post 47 item 2
>> d) set_flatten true -effort high -minimize single_output -phase false
On page 9-35 of the Design Compiler reference manual it states:
"When minimization is being performed, since a state table representation
is already flat, the flattening step later in compile is redundant and
is automatically disabled." Likewise, Design Compiler reports a similar
message during the actual compile. So I am curious as to how statement
d) produces a better design. Nevertheless, it is worth trying.
3. state machine quality of results
I have had poor success using the state machine compiler. In most cases I
have tried, a normal compile produces better results. I suspect that if
there were a lot of unused or dont_care states, the state machine compiler
would do much better. A normal compile cannot take advantage of dont_care
conditions unless the design is flattened, which may be very time-consuming
and in some cases not possible. The state machine compiler can also
merge redundant states if they exist. In addition, the dont_care states
are automatically extracted. I have obtained the best results by using
a PLA description as input. This can also be obtained by extracting the
state table and replacing the state names with their corresponding values.
I do not understand why a PLA, which is equivalent to the state table,
should produce significantly better results.
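To make the "unused or dont_care states" case concrete, here is a small sketch of a state machine with three states encoded in two bits, leaving one code unused. The module, state names, and signals are all hypothetical, and whether a given compile flow actually exploits the x assignment as a don't-care is tool-dependent.

```verilog
// Hypothetical 3-state FSM; the state code 2'b11 is never used.
module fsm3(clk, rst, go, done, busy);
  input  clk, rst, go, done;
  output busy;

  parameter IDLE = 2'b00, RUN = 2'b01, DRAIN = 2'b10;

  reg [1:0] state, next;

  // Next-state logic. The default branch covers the unused code
  // and marks its successor as a don't-care.
  always @(state or go or done)
    case (state)
      IDLE:    next = go   ? RUN   : IDLE;
      RUN:     next = done ? DRAIN : RUN;
      DRAIN:   next = IDLE;
      default: next = 2'bxx;
    endcase

  // State register with synchronous reset
  always @(posedge clk)
    if (rst) state <= IDLE;
    else     state <= next;

  assign busy = (state != IDLE);
endmodule
```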
4. use of arrays
The use of arrays to implement a register file can give very different
results depending on how the Verilog is coded. The first version below
produces an extremely inefficient implementation. A much better
implementation is produced from the second specification.
// version 1 - not very efficient
module regfile_4pt(
    dia, dib,
    sa, sb, da, db, clk, wea, web,
    doa, dob);

  input        dia, dib;
  input  [1:0] sa, sb, da, db;
  input        clk, wea, web;
  output       doa, dob;

  reg rf [0:3];
  integer i;

  assign doa = rf[sa];
  assign dob = rf[sb];

  always @(posedge clk) begin
    if (wea)
      rf[da] <= dia;
    if (web)
      rf[db] <= dib;
  end
endmodule
// version 2 - much better
module regfile_4pt(
    dia, dib,
    sa, sb, da, db, clk, wea, web,
    doa, dob);

  input        dia, dib;
  input  [1:0] sa, sb, da, db;
  input        clk, wea, web;
  output       doa, dob;

  reg rf [0:3];
  integer i;

  assign doa = rf[sa];
  assign dob = rf[sb];

  // simple way to specify a decoder with enable
  wire [3:0] sela = wea << da;
  wire [3:0] selb = web << db;
  wire [3:0] we   = sela | selb;

  // generated design exactly matches the specification below
  always @(posedge clk)
    for (i = 0; i <= 3; i = i + 1)
      rf[i] <= we[i] ? (selb[i] ? dib : dia) : rf[i];
endmodule
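Since both versions use the same module name, a simulation check needs them renamed. The sketch below copies the two bodies under the hypothetical names regfile_v1 and regfile_v2 and drives them with the same random stimulus; the testbench and its names are assumptions for illustration, not part of the original post. Note that when both ports write the same address, port B wins in both versions.

```verilog
// Version 1 logic under a hypothetical name so both can coexist.
module regfile_v1(dia, dib, sa, sb, da, db, clk, wea, web, doa, dob);
  input        dia, dib;
  input  [1:0] sa, sb, da, db;
  input        clk, wea, web;
  output       doa, dob;
  reg rf [0:3];
  assign doa = rf[sa];
  assign dob = rf[sb];
  always @(posedge clk) begin
    if (wea) rf[da] <= dia;
    if (web) rf[db] <= dib;
  end
endmodule

// Version 2 logic under a hypothetical name.
module regfile_v2(dia, dib, sa, sb, da, db, clk, wea, web, doa, dob);
  input        dia, dib;
  input  [1:0] sa, sb, da, db;
  input        clk, wea, web;
  output       doa, dob;
  reg rf [0:3];
  integer i;
  assign doa = rf[sa];
  assign dob = rf[sb];
  wire [3:0] sela = wea << da;
  wire [3:0] selb = web << db;
  wire [3:0] we   = sela | selb;
  always @(posedge clk)
    for (i = 0; i <= 3; i = i + 1)
      rf[i] <= we[i] ? (selb[i] ? dib : dia) : rf[i];
endmodule

// Self-checking testbench: same stimulus to both, compare read ports.
module tb;
  reg        dia, dib, clk, wea, web;
  reg  [1:0] sa, sb, da, db;
  wire       doa1, dob1, doa2, dob2;
  integer    n, errors;

  regfile_v1 u1(dia, dib, sa, sb, da, db, clk, wea, web, doa1, dob1);
  regfile_v2 u2(dia, dib, sa, sb, da, db, clk, wea, web, doa2, dob2);

  initial begin
    clk = 0; errors = 0;
    sa = 0; sb = 0; db = 0; dib = 0; web = 0;
    // Write 0 to every entry through port A so no location starts as X.
    for (n = 0; n < 4; n = n + 1) begin
      da = n; dia = 0; wea = 1;
      #5 clk = 1;
      #5 clk = 0;
    end
    // Random reads and writes, including same-address double writes.
    for (n = 0; n < 200; n = n + 1) begin
      {dia, dib, wea, web} = $random;
      {sa, sb, da, db}     = $random;
      #5 clk = 1;
      #5 clk = 0;
      if (doa1 !== doa2 || dob1 !== dob2)
        errors = errors + 1;
    end
    if (errors == 0) $display("PASS");
    else             $display("FAIL: %0d mismatches", errors);
    $finish;
  end
endmodule
```

A check like this only confirms that the two descriptions simulate identically; the difference the post describes is in the quality of the synthesized logic, which has to be judged from the compile reports.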