( ESNUG 429 Item 1 ) --------------------------------------------- [06/03/04]

From: Venkata Simhadri <venkata=user  domain=time2mkt spot calm>
Subject: A User Review Of The Unannounced Monterey AFP Block Floorplanner

Hi, John,

We got involved recently in a project that Monterey was doing with one of
our customers.  The customer was using a Magma flow and was integrating
a new tool from Monterey that they called "AFP".  (Not sure what AFP stands
for.)  It's not released yet, but apparently some customers have an early
version of it.  We had heard that Monterey was working on something that
reduces die size as a result of applying a different approach to hard
macro placement.

Because our customer was in a holding pattern waiting for some data,
Monterey allowed us to install AFP at our site and play with it for about
10 days.  We had a couple of blocks that we were in the process of
re-implementing, and although we had not pushed the block sizes in our
initial implementation because of schedule considerations, we decided to
run them through the new software to see if it could help.

The blocks we were working on seemed to fit with what Monterey claimed
would be ideal.  They say that AFP works well on designs that have over 20
hard macros.  One of our blocks was a marginal fit because it contained
only 11 hard macros while the other had 35.  Both blocks were to be
implemented in an 8-layer TSMC 0.13 um process.

Installation of the AFP software was pretty simple from the tarball Monterey
provided.  There were no template scripts (this is still unreleased code)
but with the help of one of Monterey's AEs, we were able to set up a basic
design planning flow in a couple of hours.


Data Formats
------------

Data requirements for AFP are pretty standard.  LEFs are required for
technology and physical cell (macro and standard cell) information, and
a hierarchical Verilog netlist (gate level) is needed for the design.  (I'm
not sure it would work with a flat netlist, but Monterey claims that it
does -- provided that the hierarchy separators are still in the netlist.)
We did not specify any port location constraints for this eval; these would
have been read from DEF.  Data input went surprisingly smoothly for a new
tool.  The only thing that we did manually was to define the power structure
using TCL commands.

AFP does not do the detailed power routing (i.e. create metal1 rails, drop
vias or do pin-tapping).  Just basic meshes and rings are supported.
Monterey says to use their Calypso for that for the moment, but any P&R tool
can probably take it from there.


First Block
-----------

Once the setup was complete, we ran the first block through the basic
steps of AFP that consist of some heuristic hierarchy manipulations and
cluster generation followed by block placement.  AFP places both the hard
macros and the logic clusters.  For this block, which contained 11 hard
macros and about 180K standard cells, we got just under 4,000 clusters.

The number of clusters can be set by the user or calculated automatically;
we used the automatic mode.  (I'm not sure in which cases it makes sense for
the user to set it.)  Placement of these clusters and hard macros took about
10 minutes.

We examined the degree of connectivity between blocks using the "bundle
net" analysis in the GUI.  About 6 long bundle nets were visible, meaning
6 groups of nets that are long and could cause trouble.  The
Monterey AE told us that the floorplan looked sufficiently good to start
the real prototyping and implementation process.  (BTW, we did not run the
global router built into AFP because we already have an environment with a
fast prototype router in place.)  AFP chose the block area by itself.
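
As a rough illustration of the bundle-net idea mentioned above: nets are
grouped by the pair of blocks they connect, and the groups that span a long
distance are the ones worth looking at.  The sketch below is not AFP's
algorithm or API.  The function name, the data layout, the Manhattan
distance measure and both thresholds are assumptions made for illustration.

```python
from collections import defaultdict

def long_bundles(nets, centers, min_nets=2, min_length=500.0):
    """Group two-pin nets by the block pair they connect; return long bundles.

    nets: iterable of (block_a, block_b) pairs, one per net (hypothetical layout).
    centers: dict mapping block name -> (x, y) center in um.
    Returns a list of (block_a, block_b, net_count, manhattan_length),
    longest first.
    """
    counts = defaultdict(int)
    for a, b in nets:
        if a != b:  # nets inside one block do not form a bundle
            counts[tuple(sorted((a, b)))] += 1

    result = []
    for (a, b), n in counts.items():
        (xa, ya), (xb, yb) = centers[a], centers[b]
        length = abs(xa - xb) + abs(ya - yb)
        # Thresholds are assumed values, not anything AFP documents.
        if n >= min_nets and length >= min_length:
            result.append((a, b, n, length))
    return sorted(result, key=lambda r: r[3], reverse=True)

# Made-up example data: one macro and two logic clusters.
centers = {"ram0": (0.0, 0.0), "c1": (100.0, 50.0), "c2": (900.0, 400.0)}
nets = [("ram0", "c1"), ("ram0", "c2"), ("c2", "ram0"), ("c1", "c1")]
bundles = long_bundles(nets, centers)
print(bundles)
```

In this made-up example only the ram0/c2 pair is reported, since it is the
only pair with at least two nets and a long span.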

The block area AFP chose was about 20% smaller than in our previous
implementation, so we were keen to see what would happen in the back-end.

Basically this meant writing a standard DEF out of AFP and reading this
into our standard SoC Encounter environment.  We had minor issues reading
the DEF generated by AFP but we worked around them and ran trial routing.
The issue was that we should have specified a 5 um minimum distance from
the block boundary to the first macro in AFP.  Without that margin, macros
fall outside the core boundary and have to be snapped back inside the core
in SoC Encounter.
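
The snapping step amounts to clamping each macro's origin so the macro sits
at least the margin inside the core.  A minimal sketch follows, assuming
axis-aligned rectangles in um.  The function name and the tuple layout are
hypothetical, and this is neither an AFP nor an SoC Encounter command.  It
also does not check for overlaps between macros after they move.

```python
def snap_inside(macro, core, margin=5.0):
    """Return the macro's lower-left (x, y) moved to sit >= margin inside core.

    macro: (x, y, width, height); core: (x_lo, y_lo, x_hi, y_hi); all in um.
    """
    x, y, w, h = macro
    x_lo, y_lo, x_hi, y_hi = core
    if w + 2 * margin > x_hi - x_lo or h + 2 * margin > y_hi - y_lo:
        raise ValueError("macro does not fit inside the core with this margin")
    # Clamp each coordinate into the legal range for the lower-left corner.
    new_x = min(max(x, x_lo + margin), x_hi - margin - w)
    new_y = min(max(y, y_lo + margin), y_hi - margin - h)
    return new_x, new_y

# Made-up core and macro: the macro pokes out on the left and at the top.
core = (0.0, 0.0, 1000.0, 800.0)
snapped = snap_inside((-2.0, 790.0, 200.0, 100.0), core)
print(snapped)
```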

Surprisingly, the routing showed no congestion, which was bizarre given
that we had some congestion on our previous implementation.  Our initial
timing analysis showed large negative slack but we had not fixed
transition violations.  We corrected this by running IPO to buffer high
fan-out nets.  This brought the timing slack down to less than -200 ps and
trial routing mode in FE still showed no congestion.  Just for grins, we
ran Nanoroute to see if there were going to be any routing issues.  Sure
enough, Nanoroute showed over 9,000 DRCs, but we examined these in the
GUI and quickly determined that the violations were caused by tech file
related issues and by not leaving enough blockage around the macros.
Both were quickly corrected.  With fairly high confidence that the design
could be closed, we then ran the full back-end implementation flow
including PKS and successfully closed the block with zero DRC violations
and meeting the timing requirements.  The final utilization was close
to 90%.
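
To see how a smaller block area relates to utilization, here is a short
sketch using one common definition: standard-cell area divided by the block
area left over after the hard macros.  The definition is an assumption about
how utilization was measured here, and the areas below are hypothetical
numbers chosen only to show the arithmetic, not data from these blocks.

```python
def utilization(cell_area, block_area, macro_area):
    """Standard-cell area divided by the area left after hard macros."""
    free = block_area - macro_area
    if free <= 0:
        raise ValueError("macros fill or exceed the block")
    return cell_area / free

# Hypothetical areas in mm^2, not measurements from the review.
cell_area, macro_area, old_block = 1.8, 2.0, 5.0
new_block = 0.8 * old_block  # a block that is 20% smaller
before = utilization(cell_area, old_block, macro_area)
after = utilization(cell_area, new_block, macro_area)
print(round(before, 2), round(after, 2))
```

Because the macro area is fixed, a 20% cut in block area removes a much
larger fraction of the standard-cell area, so utilization rises faster than
the block shrinks.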


Second Block
------------

Our experience on the second block was similar.  This block contained 35
hard macros and had 4 main clock domains.  We repeated the experiment but
needed less time to find the work-arounds to the DEF reading problems.
This block was larger, but hard macro placement still ran in under 1 hour.
We did notice that the block placement from AFP could create routing
congestion; if you get congestion, you are better off iterating inside
AFP between design planning and global route, and exporting only when
you have high confidence.  Anyway, AFP's hard macro placement again
produced a result that was just under 20% smaller than our previous
implementation (remember, AFP decides how big the block should be).  We
closed this block using the same interfaces and basically the same flow.
Then, to be thorough, we also ran SI-aware routing with Nanoroute in just
over 1 day and achieved close to 90% utilization.


Overall
-------

AFP still has its rough edges, but all said, I think that it can certainly
add significantly to any prototyping and implementation flow, by reducing
the size and/or by producing a routable floorplan faster.  We plan on
using AFP in some new design service engagements to complement our
well-proven SoC Encounter-based flow.  The combination of the two certainly
seems to give an edge in finding better macro placement for smaller die
size while still having a fast turn-around.

    - Venkata Simhadri
      TTM, Inc.                                  San Jose, CA


Copyright 1991-2024 John Cooley.  All Rights Reserved.