( ESNUG 487 Item 1 ) -------------------------------------------- [02/01/11]
From: [ The Man in the Iron Mask ]
Subject: An Oasys RealTime Designer vs. SNPS DC-Topo/DC-Graphical benchmark
Hi, John,
Please keep me anonymous for fear of Synopsys retribution.
We've been using the Oasys RealTime Designer physical RTL synthesis tool in
production mode since October of 2010.
We evaluated RealTime Designer versus DC-Topo/DC-Graphical. We also looked
at Cadence RTL Compiler briefly, but since RTL Compiler benchmarked only
comparable to DC, we stopped looking at it. Our evaluation criteria were as
follows:
1. Runtimes. We were seeing extremely long synthesis run times for
Synopsys Design Compiler, so we wanted a substantial speed-up.
2. We ran into some very poor correlation between DC and ATopTech P&R.
Anecdotally, the differences we saw were 200% off on RC values for
long nets, and 100's of picoseconds of difference in static timing
on 500 picosecond paths, i.e. ~20% off. So we wanted a synth tool
that not only gave us reasonable QoR but also allowed us to base RTL
timing fixes off of the synthesis results alone, with fairly high
confidence it would have an impact after place and route.
3. We needed a way to integrate the tool easily into our existing flows.
4. As long as we had workarounds to make sure we had scan stitching and
SDC's we didn't include these as must-haves for the synth tool.
5. Our final criterion was to compare the timing, area, route-ability/
congestion and power of the final synthesized blocks across a broad
set of representative designs.
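
The correlation figure in item 2 is simple arithmetic, and a minimal Python
sketch makes it explicit. The helper name `relative_error` and the 400 ps
estimate are assumptions made up for illustration (one way of being 100 ps
off on a 500 ps path); they are not taken from any tool report.

```python
def relative_error(estimated, reference):
    """Return the signed mismatch of an estimate as a fraction of the reference."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (estimated - reference) / reference


# Illustrative numbers only: synthesis reports 400 ps on a path that the
# P&R tool later times at 500 ps, i.e. a 100 ps miss on a 500 ps path.
path_miss = relative_error(estimated=400.0, reference=500.0)
print(f"path timing mismatch: {path_miss:+.0%}")
```

A 100 ps miss on a 500 ps path is the ~20% quoted above; the same helper can
be pointed at per-net RC values to track the long-net mismatch.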
REALTIME DESIGNER (RTD) INPUTS AND OUTPUTS
Inputs:
- RTL (Verilog/SystemVerilog)
- Standard cell library views (lib, LEF)
- DEF for input pin location and bounding block size and internal
hardened macro placements
- SDC formatted constraints (like clock definitions)
Outputs:
- Gate level synthesized netlist
- DEF output for handoff to the P&R tool
---- ---- ---- ---- ---- ---- ----
REALTIME DESIGNER vs. SYNOPSYS DC-GRAPHICAL/DC-TOPO COMPARISONS
RealTime Designer: Un-buffered min-sized final netlist
DC-Graphical/Topo: Fully sized and buffered netlist
----
RealTime Designer: DEF output with seed placement for stronger correlated
results with P&R. Instead of having ATopTech place all the cells from
scratch using its own placement algorithms, RTD passes a DEF which gives
a good guideline for all the initial placements based on how it optimizes
the netlist.
DC-Graphical/Topo: No similar function; IC Compiler required for P&R.
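
To make the seed-placement hand-off concrete, here is a rough Python sketch
of pulling per-instance locations out of a DEF COMPONENTS section. It is
only an illustration of what such a DEF carries, not how RTD or ATopTech
process it. The instance and cell names are invented, and the regex assumes
each component record sits on one line, which real DEF files do not
guarantee.

```python
import re

# Matches simplified single-line DEF component records such as:
#   - u_alu/u1 NAND2X1 + PLACED ( 1000 2000 ) N ;
COMPONENT_RE = re.compile(
    r"^\s*-\s+(?P<inst>\S+)\s+(?P<cell>\S+)\s+.*?"
    r"\+\s+(?:PLACED|FIXED)\s+"
    r"\(\s*(?P<x>-?\d+)\s+(?P<y>-?\d+)\s*\)\s+(?P<orient>[A-Z]+)"
)


def read_seed_placement(def_text):
    """Return {instance: (cell, x, y, orientation)} for placed components."""
    placements = {}
    in_components = False
    for line in def_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("COMPONENTS"):
            in_components = True
            continue
        if stripped.startswith("END COMPONENTS"):
            in_components = False
            continue
        if in_components:
            match = COMPONENT_RE.match(line)
            if match:
                placements[match.group("inst")] = (
                    match.group("cell"),
                    int(match.group("x")),
                    int(match.group("y")),
                    match.group("orient"),
                )
    return placements


# Hypothetical two-cell example; coordinates are in DEF database units.
SAMPLE_DEF = """\
COMPONENTS 2 ;
- u_alu/u1 NAND2X1 + PLACED ( 1000 2000 ) N ;
- u_alu/u2 INVX1 + PLACED ( 3000 2000 ) FS ;
END COMPONENTS
"""

for inst, (cell, x, y, orient) in read_seed_placement(SAMPLE_DEF).items():
    print(inst, cell, x, y, orient)
```

A dump like this is handy for spot-checking that the cells on a critical
path start out near each other before P&R refines the placement.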
----
RealTime Designer: No DFT functionality for scan stitching and compression.
We are testing their beta version and so far it looks promising. In the
interim, we still use DC to do this.
DC-Graphical/Topo: Has DFT for scan stitching and compression.
----
RealTime Designer: Incomplete SDC hierarchical outputs
DC-Graphical/Topo: Has complete SDC hierarchical outputs, so we use DC for
that purpose.
----
RealTime Designer: Placement-driven optimization approach. Can read in
wire technology files, physical placements of custom datapaths and macros,
as well as DEF shapes for route blockages, power grid blockages, and
boundary definitions.
DC-Graphical/Topo: Tries to address placement impact on synthesis through
placement integration, but it's inadequate. I think their fundamental
architecture being based on gate optimization rather than equation
optimization makes it an order of magnitude slower. To absorb more information
about placements and sizing and physical impact, Synopsys has to consume
massively more information which makes its runtimes unbearable. To band-aid
the problem, they do things like heuristics for wire delay estimation that
can be very wrong for certain types of paths and gate delay vs. sizing
tradeoffs that also don't work for all cases. This allows them to reduce
how much data they process but with widely variable and unpredictable
results.
----
RealTime Designer: Runtimes ~10x faster than DC. Details below.
DC-Graphical/Topo: Slow runtimes for large designs/blocks. Details below.
---- ---- ---- ---- ---- ---- ----
RUNTIME AND RESULTS BENCHMARKS
BLOCK 1 - (40 nm, 1.2M gates)
                                   Oasys RTD       Synopsys DC-Graphical/Topo
Tool Runtime                       1.2 hrs         9 hrs
QoR vs. specification
  (total neg. slack)               -5.3 ns         -24461 ns
Timing correlation vs. PrimeTime   +/-100 ps       +/-500 ps
  (Worst Negative Slack)
Overall time spent (duration)      37.28 hrs       63 hrs
Timing correlation. We correlated ATopTech with PrimeTime, so each tool's
correlation with ATopTech is about the same as for PrimeTime.
BLOCK 2 - (40 nm, 3.6 M gates)
                                   Oasys RTD       Synopsys DC-Graphical/Topo
Tool Runtime                       2 hrs           > 3 days
QoR vs. specifications             -742 ps         -2 ns
  (Worst Negative Slack)
Capacity                           > 3.6 M gates   Unknown - *
* - We were unable to get the DC run for Block 2 to complete. We didn't run
it flat for DC, so we can't give a comparison on overall time. For this
exercise, we didn't take Block 2 all the way through our ATopTech flow,
because we were just interested in whether RTD would actually be able to
generate a complete netlist. This is also why we don't have timing
correlation information for this block.
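
For readers who want the ratios behind these tables, a small Python sketch
follows. The runtimes are copied from the two blocks above; the `speedup`
helper is just an illustrative name. Treating "> 3 days" as exactly 72 hours
is an assumption that yields only a lower bound, since that DC run never
finished.

```python
def speedup(baseline_hours, new_hours):
    """Return how many times faster the new runtime is than the baseline."""
    if new_hours <= 0:
        raise ValueError("runtime must be positive")
    return baseline_hours / new_hours


# Block 1: tool runtime and overall time spent, DC vs. RTD.
block1_tool = speedup(9.0, 1.2)
block1_overall = speedup(63.0, 37.28)

# Block 2: DC had not completed after 3 days, so this is a lower bound.
block2_tool_floor = speedup(3 * 24.0, 2.0)

print(f"Block 1 tool runtime:  {block1_tool:.1f}x")
print(f"Block 1 overall time:  {block1_overall:.1f}x")
print(f"Block 2 tool runtime: >{block2_tool_floor:.0f}x")
```

Block 1 works out to about 7.5x on tool runtime and Block 2 to at least 36x,
which brackets the ~10x figure quoted earlier. The overall-time ratio on
Block 1 is much smaller because it includes everything around the synthesis
run itself.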
Duration. Duration is very interesting. Do you use the faster runtime to
do more optimizations, or to reduce your total project time, or both? We
use the faster runtime and correlation to shorten the time to results seen
by RTL folks to close a block.
Total project time. Since we are still in the process of taping out, it's
hard to gauge RTD's impact on total project time. Definitely the faster
runtime allows us to spin a block significantly more times, and in our case
iteration is the key to achieving more aggressive targets, so I guess
indirectly that the faster runtime allows us to hit more aggressive timing
targets.
---- ---- ---- ---- ---- ---- ----
OTHER OASYS USABILITY FACTORS
- Language coverage/support. We have not encountered any coverage issues
with RealTime Designer on Verilog syntax. It seems very solid there.
- Set up time. RTD's initial set up took about 2 days of concerted
effort. After this, the setup time was basically integrating the
tool into our flow. The tool is still in active development to support
a number of features we've requested, so we've had to rewrite portions
of our flow integration as Oasys made improvements. I'd say it was
production ready in about 1 month. On a comparative basis this wasn't
too bad considering many other EDA tools take substantially more effort
to integrate.
- Compatibility with Synopsys scripts. Oasys RTD Tcl isn't entirely
compatible with Synopsys Tcl syntax for the script files. There is
some overlap, but there is a bit of pain in script porting that needs
to be overcome. Porting was part of our 2-day setup effort, and Oasys
continues to work on improving compatibility. Oasys has generally
been very responsive in adding Tcl support for the features we have
requested.
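
One way to size up a script-porting job like the one in the last bullet is
to inventory which commands an existing DC script actually uses. The Python
sketch below is a deliberately naive illustration: it only looks at the
first word of each line and ignores line continuations, `;` separators, and
commands nested in brackets. The sample script is a made-up fragment, and
nothing here reflects which commands RTD does or does not accept.

```python
from collections import Counter


def command_inventory(tcl_text):
    """Count the leading command word of each non-blank, non-comment line."""
    counts = Counter()
    for line in tcl_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        counts[stripped.split()[0]] += 1
    return counts


# Hypothetical DC-style fragment; file names and values are placeholders.
SAMPLE_SCRIPT = """\
# read design and constrain it
read_verilog top.v
create_clock -name clk -period 0.5 [get_ports clk]
set_dont_touch [get_cells u_pad*]
compile_ultra
"""

for command, count in sorted(command_inventory(SAMPLE_SCRIPT).items()):
    print(f"{command}: {count}")
```

The resulting list can then be checked by hand against whatever the target
tool documents as supported, which gives a rough idea of how much of a flow
needs rewriting.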
---- ---- ---- ---- ---- ---- ----
SYNTHESIS USE MODEL
We use Oasys RTD for all our designs - from our multiple GHz CPU block, to
our 250 MHz IO controller. Our blocks range in size from 50 K to 4 million
instances. We used to partition some of them into separate blocks, but with
RealTime Designer we can run larger clusters of blocks as one design, which
lets us partition and optimize across partitions much better.
We currently use Oasys RTD similarly to Synopsys DC but with the augmented
features of physical DEF hand-off in addition to the netlist.
For our hybrid approach, we still depend on DFT compiler in DC to do scan
stitching, as the Oasys RTD scan stitch features are in beta. We also use
DC to read the hierarchical netlist from RTD and input SDC to generate
sub SDC's for each partitioned block in our divide and conquer strategy.
We are starting 2011 with a hybrid synthesis approach using both Oasys RTD
and Synopsys DC. Once we reach 28 nm, we plan to phase out using DC for
optimization and use only Oasys to do full chip synthesis. This is planned
for the second half of 2011 - once RealTime Designer is fully tested for
scan stitching.
Q1: design derivatives in TSMC 45 nm. Use Oasys RTD and Synopsys DC
hybrid approach.
Q2: 1st 28 nm core tapeout. Use Oasys RTD and Synopsys DC hybrid
approach.
Q3: 2nd 28 nm tapeout, low-power variant. Use RTD only.
Q4: 3rd 28 nm tapeout, ultra-high-end variant. Use RTD only.
In general, we will use RTD to do the bulk of the optimization and synthesis
work for all our high-speed timing critical 28 nm blocks. For less critical
blocks that don't require significant respins or optimization (more or less
the dumb-port blocks from 45 nm), we will likely still use DC.
---- ---- ---- ---- ---- ---- ----
RTD STRENGTHS AND WEAKNESSES
RTD's Strengths:
- Capacity and runtime. In an age of data explosion and design
explosion, for design optimization, if we can get rapid feedback on
designs with the same or better QoR than DC, we can effectively eliminate
a significant bottleneck. Then we can focus on the other bottlenecks
and tackle more challenging and larger designs.
- The AE support for the tool has been pretty outstanding.
RTD's Weaknesses:
- Needs more API hooks into the database.
- UI could be a bit cleaner.
- Scan/Test features for DFT need to be fully implemented.
- Needs quicker fixes for QoR bugs and feature enhancements, e.g. fully
buffered and physically sized netlist output, effort settings on
optimization, etc. The tool is still new, and we've been fortunate to
have pretty responsive fixes to most issues we've seen.
- Oasys Tcl isn't 100% compatible with DC Tcl.
It is hard to really gauge the ripple effect of rapid physical synthesis
with strong correlation to backend timing tools, but I believe it represents
a fundamental paradigm shift in the way design is done.
With Synopsys DC, we would have to break up the design to run it through
synthesis in pieces. We would then run into problems at place and route
when we stitched the blocks together, and the methodology would require
many iterations between the front-end and back-end teams.
With Oasys RTD, RTL designers can now have a rapid prototyping feedback
mechanism. Because RTD gives us placement views with synthesis, our
front-end designers can fix problems at the source. This allows for dramatically
shorter implementation. (It can definitely shave off weeks to months of
time from starting RTL to tapeout, but I have no means to scientifically
quantify it.) All this means we get better RTL sooner which could mean
more rapid design closure, faster time-to-market, and all sorts of less
easily quantifiable benefits.
In our next project we will be shifting away from Synopsys DC and adding
additional support features for Oasys RealTime Designer.
- [ The Man in the Iron Mask ]