Conference Paper

A network-flow approach to timing-driven incremental placement for ASICs.

DOI: 10.1109/ICCAD.2006.320061 Conference: 2006 International Conference on Computer-Aided Design (ICCAD'06), November 5-9, 2006, San Jose, CA, USA
Source: DBLP

ABSTRACT We present a novel incremental placement methodology called FlowPlace for significantly reducing critical path delays of placed standard-cell circuits. FlowPlace includes: a) a timing-driven (TD) analytical global placer TAN that uses accurate delay functions and minimizes a combination of linear and quadratic objective functions; b) a network-flow-based detailed placer TIF that has new and effective techniques for performing TD incremental placement and satisfying row-length (white space) constraints. We have obtained results on three sets of benchmarks: i) TD versions of the ibm benchmark suite that we have constructed; ii) benchmarks used in TD-Dragon; iii) the Faraday benchmarks. Results show that starting with Dragon-placed circuits, we are able to obtain up to 34% and an average of 18% improvement in critical path delays, at an average of 17.5% of the run-time of the Dragon placer. Starting with a state-of-the-art TD placer TD-Dragon, for the TD-Dragon benchmarks we obtain up to about 10% and an average of 4.3% delay improvement with 12% of TD-Dragon's run times; this is significant as we are extracting performance improvements from a performance-optimized layout. Wire length deterioration, averaged over all benchmark suites, is less than 8%.

  • ABSTRACT: We present a novel incremental placement methodology called FlowPlace for significantly reducing critical path delays of placed standard-cell circuits without appreciable increase in wire length (WL). FlowPlace includes: 1) a timing-driven (TD) analytical global placer TAN that uses accurate pre-route delay functions and minimizes a combination of linear and quadratic objective functions; 2) a discretized network-flow-based detailed placer DFP that has new and effective techniques for performing TD/WL-driven incremental placement while satisfying row-width (white space) constraints; 3) new and accurate unrouted net delay models that are suitable for an analytical placer; and 4) an effective probability-based WL-cost function in detailed placement for reducing WL deterioration while performing TD-incremental placement. We ran FlowPlace on three sets of benchmarks with up to 210 K cells. Starting from WL-optimized placements done by Dragon 2.23, and using purely timing-driven incremental placement, we are able to obtain up to 33.4% and an average of 17.3% improvement in circuit delays at an average of 9.0% WL increase. When incorporating both timing and WL costs in the objective functions of global and detailed placement, the average WL increase reduces to 5.8%, a 35% relative reduction, while the average delay improvement is 15.7%, which is only 9% worse in relative terms. The run time of our incremental placement method is only about 10% of the run time of Dragon 2.23. Furthermore, starting from an already timing-optimized placement done by TD-Dragon, we still obtain up to 10% and an average of 6.5% delay improvement with a 6.1% WL deterioration; the run time is about 6% of TD-Dragon's.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 08/2011; · 1.22 Impact Factor
  • ABSTRACT: We propose a timing-driven discrete cell-sizing algorithm that can address total cell size and/or leakage power constraints. We model cell sizing as a “discretized” min-cost network flow problem, wherein available sizes of each cell are modeled as nodes. Flow passing through a node indicates the choice of the corresponding cell size, and the total flow cost reflects the timing objective function value corresponding to these choices. Compared to other discrete optimization methods for cell sizing, our method can obtain near-optimal solutions in a time-efficient manner. We tested our algorithm on ISCAS’85 benchmarks, and compared our results to those produced by an optimal dynamic-programming (DP) based method. The results show that compared to the optimal method, the improvements to an initial sizing solution obtained by our method are only 1% (3%) worse when using a 180 nm (90 nm) library, while being 40–60 times faster. We also obtained results for ISPD’12 cell-sizing benchmarks, under a leakage power constraint, and compared them to those of a state-of-the-art approximate DP method (optimal DP runs out of memory for the smallest of these circuits). Our results show that we are only 0.9% worse than the approximate DP method, while being more than twice as fast.
    VLSI Design 05/2013; 2013.
  • ABSTRACT: We present a physical-synthesis based power optimization technique that simultaneously explores the four transforms, multiple Vdd, multiple Vth, cell sizing, and placement, to find a minimum power solution under timing and other constraints. The optimal selection of the design options of all transforms for all cells in the circuit is solved using a new optimization technique called discretized network flow that we have recently developed. Among the constraints we consider, timing and the voltage-island constraints are the two most important and complex ones. The voltage-island constraint specifies the maximum allowed number of voltage islands in the layout, and the requirement that each island be rectangular. We develop an approach that along with the option selection process can simultaneously determine the voltage islands needed, as well as satisfy all given constraints. Experimental results on ISCAS'89 and Faraday benchmarks show that compared to an initial wire length (WL)-optimized placement with high supply and low threshold voltage levels, we obtain a power reduction of up to 42% and an average of 30% for the same delay as that of the initial design. These improvements are also 44-50% better, in relative terms, than the improvements yielded by sequentially applying the four power reduction transforms, which is the current standard method for applying multiple transforms. Finally, compared to the industry tool Synopsys IC Compiler (ICC), which also applies all four transforms, our method reduces power by an additional amount of up to 19%, and an average of 16%.
    IEEE Trans. on CAD of Integrated Circuits and Systems. 01/2011; 30:746-759.
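
The relative figures quoted in the FlowPlace journal abstract above (a 35% relative reduction in WL increase, and a delay improvement that is 9% worse in relative terms) can be reproduced from the absolute percentages given in the same abstract. The short Python check below is only an illustrative sketch; the helper name `relative_change` and the variable names are assumptions introduced here, not anything taken from the paper.

```python
def relative_change(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a fraction of `before`."""
    return (before - after) / before


# Absolute figures quoted in the abstract, in percent.
wl_increase_td_only = 9.0   # average WL increase, purely timing-driven
wl_increase_td_wl = 5.8     # average WL increase, timing + WL costs
delay_gain_td_only = 17.3   # average delay improvement, purely timing-driven
delay_gain_td_wl = 15.7     # average delay improvement, timing + WL costs

# Relative change of each figure when WL cost is added to the objective.
wl_relative_reduction = relative_change(wl_increase_td_only, wl_increase_td_wl)
delay_relative_loss = relative_change(delay_gain_td_only, delay_gain_td_wl)

print(f"WL increase shrinks by {wl_relative_reduction:.1%} in relative terms")
print(f"Delay improvement shrinks by {delay_relative_loss:.1%} in relative terms")
```

The computed values, about 35.6% and 9.2%, agree with the rounded 35% and 9% stated in the abstract.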

