( ESNUG 335 Item 1 ) ---------------------------------------------- [11/3/99]
From: Bob Prevett <prevett@nvidia.com>
Subject: A Design Engineer's Impressions Of The New Synopsys "PhysOpt" Tool
Hi, John,
I know you like user reviews of Synopsys products, so I thought I'd send you
a review of PhysOpt, their new physical synthesis tool. At my company,
NVIDIA, we must create large, high-speed designs as soon as possible. Time
to market is *everything* in the graphics business, so shrinking the time it
takes to place and route a design while achieving timing convergence is a
critical business interest for us. For us, time really is money.
Our Old DC Reoptimize Design Flow
---------------------------------
To understand PhysOpt, you first have to understand what we used to do
before PhysOpt. Here's our old design flow.
  1. Write, simulate, synthesize Verilog into gates using DC.
  2. Partition the gate-level netlist into 250K to 300K gate blocks
     for P&R in Avanti. Partitioning to smaller blocks is a
     file management headache; partitioning to larger blocks
     causes Avanti Apollo to choke. Around 300K is optimal.
  3. Floorplan each partition with Apollo and some internal tools.
  4. Placement in Avanti Apollo.
  5. Routing in Avanti.
  6. Extraction of loading and RC data in Apollo.
  7. Generate annotated .db file in Design Compiler. Use PrimeTime
     to explore the timing of the annotated .db file.
  8. Use -reoptimize_design in DC to tweak the design.
  9. Generate new netlist and incremental PDEF.
 10. Go back to step 4 until everything converges.
Using this flow, layout partitions typically took 6 to 10 passes to achieve
timing. Each pass could take 2 to 3 days. Our main headache was that
Reoptimize Design would make timing by disturbing a large percentage of
the netlist. Then we'd get caught up in a chicken & egg loop where the
incremental P&R required to fix the reoptimized design would cause enough
P&R disturbance to require another major pass through DC reoptimize design.
We often discovered that going through a large number of reoptimize design
passes would result in an unroutable layout. Reoptimize design, by running
outside of the layout environment, just did not have enough information to
make good IPO decisions.
Our New PhysOpt Design Flow
---------------------------
PhysOpt accepts 2 basic types of input: RTL or gates. When we started with
PhysOpt, we already had a working netlist, so we naturally used PhysOpt at
the gate level. In this new PhysOpt flow, steps 1 through 3 (above)
remained the same. What changed for us was at step 4.
  4. Synthesis/Placement using PhysOpt
  5. Routing in Avanti.
  6. Extraction of loading and RC data using Avanti Apollo.
  7. Generate annotated .db file. Use PrimeTime to explore the timing
     of the annotated .db file.
  8. Go back to PhysOpt in step 4 using annotated db from step 7; repeat
     this loop until everything converges and is routable.
Using our PhysOpt flow, 300k gate layout partitions typically took 2 to 3
passes to achieve both timing and routing convergence. Each iteration
(doing steps 4 through 8 above) took 2 days for the first pass and 1 day
for each incremental pass. Since PhysOpt tweaks placement for timing
fixes while simultaneously assessing routing congestion, we found it made
better optimizations. We also found first pass placement quality from
PhysOpt was better in both timing and routability than first pass
placements from the Avanti Apollo timing-driven placer.
We were able to knock off about 3 to 4 weeks in our layout process by using
the new PhysOpt flow. In addition, our flow became more streamlined. That
is, we already had a flow in place using Design Compiler and PrimeTime to
specify timing constraints and to run back annotation. Our annotated db
could then be directly fed to PhysOpt without having to translate everything
back into the Avanti database for each iteration like we did with our old
design flow.
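As a rough check on those numbers, the pass counts and per-pass times quoted
above can be multiplied out. This is only a back-of-envelope sketch in Python,
not anything from our flow: the function names are made up for illustration,
and it assumes the quoted ranges apply per layout partition and that passes
run strictly one after another.

```python
# Back-of-envelope turnaround per layout partition, using only the
# ranges quoted in this review. Function names are illustrative.

def old_flow_days(passes, days_per_pass):
    """Old DC reoptimize flow: every pass costs about the same."""
    return passes * days_per_pass

def physopt_flow_days(passes, first_pass_days=2, incremental_days=1):
    """PhysOpt flow: a longer first pass, then shorter incremental passes."""
    return first_pass_days + (passes - 1) * incremental_days

old_best = old_flow_days(6, 2)      # 6 passes at 2 days each
old_worst = old_flow_days(10, 3)    # 10 passes at 3 days each
new_best = physopt_flow_days(2)     # 2 days + 1 incremental day
new_worst = physopt_flow_days(3)    # 2 days + 2 incremental days

print(f"old flow:     {old_best} to {old_worst} days per partition")
print(f"PhysOpt flow: {new_best} to {new_worst} days per partition")
print(f"saved:        {old_best - new_worst} to {old_worst - new_best} days")
```

Under those assumptions the old flow comes to 12 to 30 days per partition and
the PhysOpt flow to 3 to 4 days, which is broadly in line with the 3 to 4
weeks quoted above.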
Using PhysOpt was very similar to using Design Compiler with TCL. A sample
PhysOpt script that compiles a mythical block called "george" looks like:
psyn_shell> set physical_library lsi25.pdb
psyn_shell> set target_library lsi25.db
psyn_shell> read_db george.db
psyn_shell> read_pdef george.pdef
psyn_shell> set_ideal_net scan_enable
psyn_shell> set compile_delete_unloaded_sequential_cells false
psyn_shell> set_dont_touch_network [ list [ all_clocks ]]
psyn_shell> physopt -effort medium -congestion -congestion_effort medium
psyn_shell> write_pdef -v3.0 -output george_out.pdef
psyn_shell> write -f db -o george_out.db
psyn_shell> report_timing -nets -input_pins -physical > george_report
psyn_shell> report_qor >> george_report
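With several layout partitions, the same commands get repeated per block with
only the file names changing. The Python sketch below stamps out that command
list for a given block name. It is an illustration only: the helper name
physopt_script, the block name "martha", and the <block>.db / <block>.pdef /
<block>_out naming pattern are assumptions extrapolated from the "george"
example, and the commands themselves are copied from the sample above without
the psyn_shell> prompt.

```python
# Hypothetical helper: builds the text of a per-block PhysOpt script.
# Only commands shown in the sample above are emitted; file naming
# follows the "george" example and is an assumption.

def physopt_script(block, physical_lib="lsi25.pdb", target_lib="lsi25.db",
                   congestion_effort="medium"):
    lines = [
        f"set physical_library {physical_lib}",
        f"set target_library {target_lib}",
        f"read_db {block}.db",
        f"read_pdef {block}.pdef",
        "set_ideal_net scan_enable",
        "set compile_delete_unloaded_sequential_cells false",
        "set_dont_touch_network [ list [ all_clocks ]]",
        f"physopt -effort medium -congestion -congestion_effort {congestion_effort}",
        f"write_pdef -v3.0 -output {block}_out.pdef",
        f"write -f db -o {block}_out.db",
        f"report_timing -nets -input_pins -physical > {block}_report",
        f"report_qor >> {block}_report",
    ]
    return "\n".join(lines) + "\n"

# Example: one script per partition, with a made-up second block name.
for name in ["george", "martha"]:
    print(f"# ---- {name} ----")
    print(physopt_script(name, congestion_effort="high"))
```

Passing congestion_effort="high" only changes the value given to the
-congestion_effort option already shown in the sample.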
Overall it was a very easy tool to use. We were an alpha code site for
PhysOpt. Synopsys has improved its run time considerably since then. Initially,
first pass PhysOpt compiles took 72 hours -- but Synopsys R&D quickly
got that down to 20 hours. The incremental compiles are now down to 4
hours. Also, when we first used PhysOpt, about 1 in 4 of the design blocks
we fed it came out unroutable. Now, all our design blocks are
fully routable coming out of PhysOpt. ( Using "-congestion_effort high"
helped a lot with this. ) It's production quality code now.
While PhysOpt wasn't the Holy Grail of synthesis tools, it represented for
us an important step toward that goal. It knocked about 4 weeks
off our design schedule. In the PC graphics chip world, this has a very
significant impact on our bottom line.
- Bob Prevett, Design Engineer
NVIDIA Santa Clara, CA