( ESNUG 340 Item 7 ) --------------------------------------------- [1/19/00]
Subject: ( ESNUG 338 #4 ) Smart Flat P&R Designs Are Faster & Lower Power
> I know it's very common for some companies to do layout as a process on
> one big flat design. We considered flat, but these five "hells" came up:
>
> - big flat designs are run-time hell
From: [ Intel Inside ]
John, A few comments (please keep me anon on this)...
Yep, big flat designs are run-time hell. This is the biggest challenge,
and ultimately may be the showstopper for some people...
> - big flat designs are extraction hell
Naw. If you run 2.5D P&R extraction, runtimes are overnight. There are
plenty of industry tools available where a multi-threaded approach can be
used to hit these throughput requirements on 500+K instance designs. The
major limitations we usually encounter relate to memory issues on 32-bit
operating systems. Once 64-bit code becomes more available, it should be
less of an issue.
> - big flat designs are back-annotation hell
If you have to live with what is available in the marketplace, this may be
true. It doesn't take a rocket scientist to write code to take "big flat"
delay calculation results and back annotate individual unit results back to
each unit owner (provided synthesis is still being done hierarchically).
This allows a hierarchical synthesis loop with flat P&R.
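
A minimal sketch of that kind of splitting script, in Python. It is an
illustration only: it assumes the flat delay results are available as
(hierarchical instance path, delay) records, that "/" separates hierarchy
levels, and that the first path component names the unit. The record
format, the function name split_by_unit and the "TOP" bucket are all
hypothetical and not taken from any particular tool.

```python
from collections import defaultdict

def split_by_unit(flat_delays, sep="/"):
    """Group flat delay records by owning unit (first path component).

    flat_delays: dict mapping a flat hierarchical instance path to a delay.
    Returns a dict: unit name -> {unit-local instance path: delay}.
    """
    per_unit = defaultdict(dict)
    for path, delay_ns in flat_delays.items():
        unit, _, local = path.partition(sep)
        if not local:
            # No separator: the instance sits at top level, outside any unit.
            per_unit["TOP"][unit] = delay_ns
        else:
            # Strip the unit prefix so names match the unit owner's netlist.
            per_unit[unit][local] = delay_ns
    return dict(per_unit)

# Hypothetical flat results (delays in ns) for a design with two units.
flat = {
    "alu/add0/U12": 0.41,
    "alu/U3": 0.12,
    "fpu/mul/U7": 0.88,
    "U_pad1": 0.05,
}
units = split_by_unit(flat)
for name in sorted(units):
    print(name, units[name])
```

Each per-unit dictionary could then be written out in whatever format the
unit owner's synthesis run reads; that step depends on the tools in use
and is left out here.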
> - big flat designs are clock tree hell
It is more difficult to handle clocks flat, but in the long run, the global
clock network is smaller and hence consumes less power. I have not
encountered any industry tools where you can get away with pushing a button,
and getting results anywhere near where you need them; however, with
creative solutions and manual tweaks, 50 psec clock skews can be achieved
on 400K instance flat designs on a quarter micron process with the "help"
of industry tools and a day or two of effort. Whoever is responsible for
clock treeing needs to understand the architecture of the design...
> - big flat designs are timing closure hell
Huh? All designs are timing closure hell - I think it all boils down to
when you solve your unit to unit timings. I believe design teams focus more
on top level timing issues much earlier in the design flow on hierarchical
P&R designs than they do on flat P&R designs; hence, flat designs can be
timing closure hell if you focus on unit level timing first and save top
level timing for last.
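
One way to keep unit-to-unit timing visible early is to pull the paths
that cross unit boundaries out of the flat timing report and track them
separately. A rough Python sketch follows, assuming each path is a
(startpoint, endpoint, slack) tuple and that the first "/"-separated
component of a pin name is its unit. The tuple layout and the helper names
unit_of and cross_unit_paths are hypothetical.

```python
def unit_of(pin, sep="/"):
    """Return the unit a pin belongs to, or "TOP" if it has no hierarchy."""
    head, found, _ = pin.partition(sep)
    return head if found else "TOP"

def cross_unit_paths(paths):
    """Keep only paths whose start and end points sit in different units."""
    return [p for p in paths if unit_of(p[0]) != unit_of(p[1])]

# Hypothetical timing paths: (startpoint, endpoint, slack in ns).
paths = [
    ("alu/r1/Q", "alu/r2/D", 0.30),   # inside alu
    ("alu/r2/Q", "fpu/r0/D", -0.15),  # alu to fpu: a top-level path
    ("fpu/r0/Q", "fpu/r9/D", 0.05),   # inside fpu
]
top_level = cross_unit_paths(paths)
worst = min(top_level, key=lambda p: p[2])
print("worst unit-to-unit path:", worst)
```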
> In practical terms, with engineers here running around tweaking & pumping
> netlists out of Design Compiler every day, some way to compartmentalize
> their work is a MUST. So, we chose the hierarchical approach.
If the same people who do the synthesis also do the P&R, the controversial
flat vs. hierarchical argument is usually a non-issue because the
trade-offs are understood. Most large companies have one cluster of people
do synthesis and a different crew do the P&R.
That's when the decision process gets muddled. (My personal observation is
that the less influence the P&R team has over design methodology, the more
likely it is to be hierarchical, because that is what design engineers
understand most.)
I don't like the real estate penalty you pay with hierarchical designs.
Also, a hierarchical design is only as good as the planning and partitioning
that is done up front - the further you get down the design cycle, the more
difficult it becomes to make any changes to partitions.
To summarize the flat vs. hierarchy argument, I think methodology and
discipline are the biggest factors that should determine which direction a
design team should go...
- [ Intel Inside ]