Article

A parallel algorithm for k-way graph partitioning

Authors:

Abstract

With the recent development of semiconductor integration technology, the amount of data that must be handled in VLSI layout design is increasing rapidly. Even allowing for future improvements in computer processing speed, layout algorithms faster than the conventional methods are needed. This paper discusses the k-way (k > 2) graph partitioning problem, one of the most basic problems in layout design, and proposes a parallel algorithm for it. The usual approach to this problem has been to apply a two-way graph partitioning algorithm hierarchically. That approach is easy to execute in parallel by running a number of processors at each level of the hierarchy, but processor efficiency and computation time then become a problem. This paper proposes a new method, called nonhierarchical k-way graph partitioning, that aims to reduce the computation time through parallel processing. In general, it is considered difficult to obtain a sufficient speedup from parallel processing while maintaining the same solution accuracy as the sequential algorithm. The effectiveness of the proposed algorithm is shown by a simulation experiment on a sequential computer.
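The hierarchical approach mentioned in the abstract can be sketched as repeated bisection. The sketch below is only an illustration, not the authors' method: `bisect` is a hypothetical placeholder that splits a node list in half (a real tool would call a min-cut heuristic there), and k is assumed to be a power of two. It also records how many independent bisections exist at each level, which is the work that could be handed to separate processors.

```python
def bisect(nodes):
    """Hypothetical placeholder two-way partitioner: split a list in half.

    A real layout tool would run a min-cut heuristic here instead.
    """
    mid = len(nodes) // 2
    return nodes[:mid], nodes[mid:]


def hierarchical_partition(nodes, k):
    """Split nodes into k parts (k a power of two) by repeated bisection.

    Returns (parts, tasks_per_level), where tasks_per_level[i] is the
    number of independent bisections available at hierarchy level i.
    """
    if k < 1 or k & (k - 1):
        raise ValueError("k must be a power of two in this sketch")
    parts = [list(nodes)]
    tasks_per_level = []
    while len(parts) < k:
        tasks_per_level.append(len(parts))
        next_parts = []
        for part in parts:  # each bisection here is independent of the others
            left, right = bisect(part)
            next_parts.extend([left, right])
        parts = next_parts
    return parts, tasks_per_level


parts, tasks = hierarchical_partition(range(8), 4)
print(parts)  # four parts of two nodes each
print(tasks)  # number of independent bisections per level
```

In this sketch the first level offers a single bisection and the second offers two, so a machine with more processors than that has idle processors at the early levels. That is one plausible reading of the processor-efficiency concern raised in the abstract.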


... Though simple to define, such "grouping" problems often pose significant challenges in practical settings because it is not always easy to judge whether the imposed constraints can be satisfied. Indeed many such problems, including those concerning graph partitioning (Hertz et al., 2008; Isomoto et al., 1993; Jensen and Toft, 1994; Nakano et al., 1995), school and university timetabling (Lewis, 2008; McCollum et al., 2010), sports fixture scheduling (de Werra, 1988; Kendall et al., 2010; Rasmussen and Trick, 2008), load balancing (Falkenauer, 1998), and frequency assignment (Aardel et al., 2002; Valenzuela, 2001), are known to be NP-hard (Garey and Johnson, 1979; Karp, 1972), implying that we cannot hope to establish polynomially bounded algorithms for solving them in the general sense. ...
Article
This paper considers variants of the one-dimensional bin packing (and stock cutting) problem in which both the ordering and orientation of items in a container influence the validity and quality of a solution. Two new real-world problems of this type are introduced: the first involves the creation of wooden trapezoidal-shaped trusses for use in the roofing industry, and the second requires the cutting and scoring of rectangular pieces of cardboard in the construction of boxes. To tackle these problems, two variants of a local search-based approximation algorithm are proposed: the first attempts to determine item ordering and orientation via simple heuristics, and the second employs more accurate but costly branch-and-bound procedures. We investigate the inevitable trade-off between speed and accuracy that occurs with these variants and highlight the circumstances under which each scheme is advantageous.
Article
In this thesis, we deal with the following k-way graph partitioning (GP) problem: given an undirected weighted graph G(V; E), partition the nodes of G into k parts of almost equal size such that the partition-cost (the sum of the weights on edges with nodes in different parts) is minimized. We propose some simple and fast algorithms for this problem for both sequential and distributed computing environments. We give three main algorithms for graph partitioning: the direct algorithm Auction, and the iterative algorithms GreedyPass and GreedyCycle. In the algorithm Auction, we introduce the idea of using auctions and bids for the GP problem. This is an inherently distributed algorithm. To the best of our knowledge this is the first distributed algorithm for this problem. The algorithm GreedyPass is a greedy iterative algorithm. In each iteration we send a node from the current part to another part in order to get the maximum decrease in partition-cost, and iteration can continue taking the destin...
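The partition-cost defined above, and the idea of moving one node to get the largest decrease in it, can be illustrated with a short sketch. This is not the thesis's GreedyPass implementation: the function names, the edge-list representation, and the example graph are assumptions made for illustration, and the "almost equal size" balance constraint is deliberately ignored.

```python
def partition_cost(edges, part_of):
    """Sum of the weights of edges whose endpoints lie in different parts."""
    return sum(w for u, v, w in edges if part_of[u] != part_of[v])


def best_single_move(edges, part_of, k):
    """Return (gain, node, target) for the single move that lowers cost most.

    Returns (0, None, None) if no move lowers the cost.
    Assumption: the balance constraint on part sizes is ignored here.
    """
    base = partition_cost(edges, part_of)
    best = (0, None, None)
    for node in part_of:
        original = part_of[node]
        for target in range(k):
            if target == original:
                continue
            part_of[node] = target          # try the move
            gain = base - partition_cost(edges, part_of)
            if gain > best[0]:
                best = (gain, node, target)
        part_of[node] = original            # undo the trial move
    return best


# Hypothetical 4-node example: (u, v, weight) triples and a 2-way partition.
edges = [("a", "b", 3), ("b", "c", 1), ("c", "d", 2), ("a", "d", 1)]
part_of = {"a": 0, "b": 1, "c": 1, "d": 0}

cost = partition_cost(edges, part_of)
move = best_single_move(edges, part_of, 2)
print(cost, move)
```

Recomputing the full cost for every trial move keeps the sketch short but is quadratic in practice; a real implementation would update gains incrementally and would reject moves that unbalance the parts.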
Book
1. The Linear Sum Assignment Problem.- 2. The Linear Bottleneck Assignment Problem.- 3. The Cardinality Matching Problem.- 4. The Sum Matching Problem.- 5. The Bottleneck Matching Problem.- 6. The Chinese Postman Problem.- 7. Quadratic Assignment Problems.- 8. QAP Heuristic 1: The method of increasing degree of freedom.- 9. QAP Heuristic 2: Cutting plane and exchange method.- 10. General Subroutines.
Article
We consider the problem of partitioning the nodes of a graph with costs on its edges into subsets of given sizes so as to minimize the sum of the costs on all edges cut. This problem arises in several physical situations—for example, in assigning the components of electronic circuits to circuit boards to minimize the number of connections between boards. This paper presents a heuristic method for partitioning arbitrary graphs which is both effective in finding optimal partitions, and fast enough to be practical in solving large problems.
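The heuristic described in this abstract (Kernighan-Lin) evaluates pairwise exchanges between the two sides of a partition. A minimal sketch of the standard gain formula for one exchange, g = D_a + D_b - 2*c_ab, where D is a node's external cost minus its internal cost, is given below. The dictionary-based representation, the function names, and the example graph are assumptions for illustration, not the paper's code.

```python
def cut_cost(side, cost):
    """Total cost of edges whose endpoints are on different sides."""
    return sum(c for (u, v), c in cost.items() if side[u] != side[v])


def d_value(node, side, cost):
    """External cost minus internal cost of the edges touching node."""
    external = internal = 0
    for (u, v), c in cost.items():
        if node in (u, v):
            if side[u] != side[v]:
                external += c
            else:
                internal += c
    return external - internal


def swap_gain(a, b, side, cost):
    """Reduction in cut cost if a and b (on opposite sides) are exchanged."""
    c_ab = cost.get((a, b), cost.get((b, a), 0))
    return d_value(a, side, cost) + d_value(b, side, cost) - 2 * c_ab


# Hypothetical example: undirected edge costs and a two-way partition.
cost = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 3, ("c", "d"): 1}
side = {"a": 0, "b": 0, "c": 1, "d": 1}

gain_bc = swap_gain("b", "c", side, cost)
gain_ac = swap_gain("a", "c", side, cost)
print(gain_bc, gain_ac)
```

A positive gain means the exchange reduces the cut; the full heuristic chains such exchanges into passes, which this sketch does not attempt.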
Conference Paper
An iterative mincut heuristic for partitioning networks is presented whose worst case computation time, per pass, grows linearly with the size of the network. In practice, only a very small number of passes are typically needed, leading to a fast approximation algorithm for mincut partitioning. To deal with cells of various sizes, the algorithm progresses by moving one cell at a time between the blocks of the partition while maintaining a desired balance based on the size of the blocks rather than the number of cells per block. Efficient data structures are used to avoid unnecessary searching for the best cell to move and to minimize unnecessary updating of cells affected by each move.
Conference Paper
A new parallel hierarchical algorithm for global routing (PHIGURE) is presented. The router is based on the work of M. Burstein and R. Pelavin (IEEE Trans. CAD, vol. CAD-2, no. 4, pp. 223-34, Oct. 1983) but has many extensions for general global routing and parallel execution. Main features of the algorithm include structured hierarchical decomposition into separate independent tasks which are suitable for parallel execution and adaptive simplex solution for adding feedthroughs and adjusting channel heights for row-based layout. The algorithm is described and results are presented for a shared-memory multiprocessor implementation.
Conference Paper
The k-way graph partitioning problem can be transformed into the maximum k-cut problem using a proposed technique of graph modification. It is possible to transform the graph partitioning problem into the max-cut problem by incorporating node size information into the edge weight. After transformation, a very simple cost function can be devised which makes the proposed algorithm more efficient than the Kernighan-Lin (K-L) algorithm (1970). The computing time per iteration of the algorithm is O(k × N^2), where N is the number of nodes in the given graph. Experimental results show that the proposed algorithm outperforms the K-L algorithm both in the quality of solutions and in the elapsed time. Also, as the difference between the sizes of the nodes increases, the performance gap between the proposed algorithm and the K-L algorithm becomes larger.
Article
This paper describes a method of automatic placement for standard cells (polycells) that yields areas within 10-20 percent of careful hand placements. The method is based on graph partitioning to identify groups of modules that ought to be close to each other, and a technique for properly accounting for external connections at each level of partitioning. The placement procedure is in production use as part of an automated design system; it has been used in the design of more than 40 chips, in CMOS, NMOS, and bipolar technologies.
Article
The potential speedup of a standard cell global router using a general-purpose multiprocessor is investigated. LocusRoute, a global routing algorithm for standard cells, and its parallel implementation are presented. The uniprocessor speed and quality of LocusRoute are comparable to those of modern global routers. LocusRoute compares favorably with the TimberWolf 5.0 global router and a maze router that searches the same space more completely. Two successful methods of parallel decomposition of the router are presented. The first, in which multiple wires are routed in parallel, uses the notion of chaotic parallelism to achieve significant performance gains by relaxing data dependencies, at the cost of a minor loss in quality. Using iteration and careful assignment of wires to processors, this degradation is reduced. The approach achieves measured speedups from 5 to 14 using 15 processors. The second parallel decomposition technique is the evaluation of different routes for each wire on separate processors. It achieves speedups of up to 6 using 10 processors. It is demonstrated that when these two approaches are combined, the aggregate speedup is the product of the individual approaches' speedups, and, using an improved scheduling approach, it can be even greater. With a simple model based on these results, speedups of more than 75 using 150 processors are predicted.
Article
A VLSI design procedure is presented for finding netlist partitions. The method consists of approximating a netlist, which can be represented as a hypergraph, by an undirected graph with weighted edges. A Gomory-Hu cut tree allows one to generate netlist partitions for every pair of modules and estimate how far from optimality a netlist cut is. Experimental results indicate that the netlist partitions are optimal or near-optimal.
Article
Recently, a fast (linear) heuristic for improving min-cut partitions of VLSI networks was suggested by Fiduccia and Mattheyses [6]. In this paper we generalize their ideas and suggest a class of increasingly sophisticated heuristics. We then show, by exploiting the data structures originally suggested by them, that the computational complexity of any specific heuristic in the suggested class remains linear in the size of the network.
Assignment and Matching Problems: Solution Methods with FORTRAN-Program. Lecture Notes in Economics and Mathematical Systems
  • R. E. Burkard
  • U. Derigs
An empirical study on parallel k-way graph partitioning algorithms
  • K S Isomoto
  • J Wakabayashi
  • Miyao
  • Yoshida