Efficient Power Network Analysis with Modeling of Inductive Effects

Shan ZENG, Nonmember, Wenjian YU, Member, Xianlong HONG, and Chung-Kuan CHENG, Nonmembers

SUMMARY In this paper, an efficient method is proposed to accurately analyze large-scale power/ground (P/G) networks, where inductive parasitics are modeled with the partial reluctance. The method is based on frequency-domain circuit analysis and the technique of vector fitting [14], and obtains the time-domain voltage response at given P/G nodes. The frequency-domain circuit equation including partial reluctances is derived, and then solved with the GMRES algorithm with rescaling, preconditioning and recycling techniques. With the merit of the sparsified reluctance matrix and the iterative solving techniques for the frequency-domain circuit equations, the proposed method is able to handle large-scale P/G networks with complete inductive modeling. Numerical results show that the proposed method is orders of magnitude faster than HSPICE and several times faster than INDUCTWISE [4].

1. Introduction

Due to the uncertainty of the current return path, the partial element equivalent circuit (PEEC) model with partial inductance is often used to model on-chip inductive effects. The concept of partial reluctance (or K-element) was proposed in [2] to overcome the inefficiency caused by the dense partial inductance matrix. The partial reluctance matrix is the inverse of the partial inductance matrix and can be easily sparsified. Therefore, circuit simulation based on partial reluctance is much more efficient than that based on the partial inductance matrix [2]–[5]. A recent work on inductance modeling of large-scale on-chip P/G networks is presented in [6], where the partial reluctances are extracted.

During the past few years, the main focus of power network analysis has been on how to trade off between the simulation accuracy and the speed. Many previous works focused on the efficient time-domain transient analysis of large-scale power networks. In some works, the circuit simulation is accelerated by fast linear equation solvers. They include the direct solver “KLU” [7], iterative solvers like the preconditioned conjugate gradient (PCG) method [8], and the generalized minimal residual (GMRES) method [9]. In others, the circuit size is reduced by using methods such as circuit partitioning [10] and hierarchical model reduction [11].

Recently, a frequency-domain based simulation method was proposed to obtain the time-domain voltage response in [12] and [13]. With the vector fitting technique [14], the frequency-domain voltage responses are transformed into time-domain voltage response. However, the existing works do not consider the on-chip inductive effects of P/G wires. A simple nodal analysis formulation was employed in [12], [13], which only considered the on-chip RC parasitics.

In this paper, efficient techniques for the frequency-domain based simulation method are proposed to consider the complete inductive effects within on-chip P/G grids. With the technique of modified nodal analysis (MNA), we first derive a frequency-domain circuit equation including parasitic inductances. Then, we replace the inductances with partial reluctances (by inverting the inductance matrix). The techniques of rescaling, preconditioning and recycling are proposed to accelerate the GMRES iterative solution [19] of the frequency-domain circuit equations. Numerical results show that the proposed techniques make the frequency-domain based simulation two to three orders of magnitude faster than HSPICE while preserving high accuracy. Compared with the INDUCTWISE algorithm in [4], the proposed method also gains a speedup of several times. Finally, it is demonstrated that the proposed method is able to handle a large-scale P/G structure with more than 100,000 wire segments, while the other methods fail. An early version of this paper was published in [16].

The rest of this paper is organized as follows. In Sect. 2, the background of the power network model, partial reluctance, and the frequency-domain based simulation method is introduced. The efficient techniques for power network analysis considering the reluctance parameters are proposed in Sect. 3. The numerical results are given in Sect. 4. Finally, conclusions are drawn in Sect. 5.

2. Background

2.1 Power Network Model with Complete Inductive Effects

An on-chip power network is usually routed in several metal layers to form a mesh structure. In each layer, the metal wires run along either the X-axis or the Y-axis, alternating from layer to layer, and the power wires are interlaced with the ground wires. Between two adjacent layers, the P/G wires are connected through vias, which cut the wires into small wire segments. Figure 1 shows the 3-D view of a portion of a two-layer power network. For complete electromagnetic modeling of the power network, the partial element equivalent circuit (PEEC) technique should be employed, which results in a circuit including resistance, capacitance and inductance elements for each wire segment. Time-varying current sources are connected to some bottom-layer circuit nodes, characterizing the supply current of active circuit modules. These current sources draw current from the power network and cause voltage fluctuations. The waveform of a current source is usually described as a piecewise linear (PWL) function.

In the PEEC model, the inductive effect is characterized with the partial inductance, including self and mutual inductances. The resulting inductance matrix is dense due to the coupling of partial inductance among all conductors. For a P/G grid with a large number of wire segments, the partial inductance matrix in the circuit equation becomes a large dense matrix that is hard to sparsify while preserving accuracy. This makes it prohibitive to extract and simulate the P/G grid with the partial inductance. The partial reluctance matrix $K$ is defined to be the inverse of the partial inductance matrix $L$:

$$K = L^{-1}$$

(1)

Related works showed that the partial reluctance has the locality property like capacitance, so that it could be easily sparsified. With the sparsified partial reluctance matrix, the circuit simulation is not only largely accelerated but also stable [3], [4]. Efficient techniques have been proposed to extract the partial reluctance for large interconnect structures [4], [6], [17], [18].
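The locality property can be illustrated on a toy example. The sketch below inverts a hypothetical 3×3 partial inductance matrix (the values are made up for illustration and are not extracted from any layout) and drops the reluctance entries that are small relative to the diagonal. The function names and the 10% threshold are assumptions of this sketch, not part of the extraction methods in [4], [6], [17], [18].

```python
def invert(matrix):
    """Invert a small dense matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] -> [I | A^-1].
    aug = [[float(a) for a in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [a / p for a in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def sparsify(matrix, rel_tol):
    """Zero the off-diagonal entries smaller than rel_tol times the row's diagonal."""
    return [[a if i == j or abs(a) >= rel_tol * abs(row[i]) else 0.0
             for j, a in enumerate(row)]
            for i, row in enumerate(matrix)]


# Hypothetical partial inductance matrix of three parallel segments (arbitrary units).
L_toy = [[1.0, 0.5, 0.3],
         [0.5, 1.0, 0.5],
         [0.3, 0.5, 1.0]]

K_toy = invert(L_toy)                     # partial reluctance matrix, K = L^-1
K_sparse = sparsify(K_toy, rel_tol=0.1)   # drop weak far-away couplings
```

In this toy case the far coupling falls from 30% of the diagonal in $L$ to under 10% of the diagonal in $K$, so the threshold removes it while the nearest-neighbor terms survive.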

As the clock frequency continues to increase, the high-frequency effect should be considered for wider global interconnects, such as the high-level P/G wires. The reluctance extraction techniques considering high-frequency effect were proposed in [17], [18]. In this work, they are utilized to extract the frequency-dependent reluctance and resistance parameters for upper-layer P/G wires.

2.2 Time-Domain Simulation Based on Frequency-Domain Analysis and Vector Fitting

Figure 2 describes the flow of the frequency-domain based simulation method [12]. Three main steps in the flow are as follows.

1. The time-domain waveform of current sources is converted to frequency-domain expression with Laplace transformation. Since each input current source $I(t)$ is described as a PWL function, its frequency-domain expression can be derived analytically.

2. The circuit equation for frequency-domain analysis is formulated, and solved for each specified frequency. With suitable frequency sampling, the vector fitting technique [14] is adopted to fit the frequency-domain voltage responses $V(s)$ with a partial fraction expression $\tilde{V}(s)$.

3. The result of vector fitting is converted to the time-domain voltage waveform $v(t)$ [12], [13].

For power network analysis, usually only the voltage responses at a small number of nodes are needed. So, this frequency-domain based simulation method has large advantages over the conventional time-domain transient simulation. The previous works showed that it is orders of magnitude faster than the conventional time-domain simulation methods, while preserving sufficient accuracy.
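Step 1 has a closed form for each linear segment of a PWL waveform, obtained by integrating by parts. The following sketch evaluates it, assuming the waveform is zero before the first breakpoint and after the last one. The function name `pwl_laplace` is hypothetical and is not taken from [12].

```python
import cmath


def pwl_laplace(times, values, s):
    """Laplace transform at complex frequency s of a PWL waveform.

    Assumption of this sketch: the waveform is zero outside [times[0], times[-1]].
    """
    segments = range(len(times) - 1)
    if s == 0:
        # DC limit: the transform is the area under the waveform.
        return sum(0.5 * (values[k] + values[k + 1]) * (times[k + 1] - times[k])
                   for k in segments)
    total = 0j
    for k in segments:
        a, b = times[k], times[k + 1]
        fa, fb = values[k], values[k + 1]
        slope = (fb - fa) / (b - a)
        ea, eb = cmath.exp(-s * a), cmath.exp(-s * b)
        # Integral of (fa + slope * (t - a)) * exp(-s t) over [a, b].
        total += (fa * ea - fb * eb) / s + slope * (ea - eb) / (s * s)
    return total
```

For very small but non-zero $|s|$ the two terms nearly cancel, so a production implementation would likely switch to a series expansion there.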

In [12] and [13], only the nodal analysis (NA) technique was used. So, it could not deal with the complete P/G parasitic circuit with mutual inductive elements. In this paper, we extend the work in [12] with the modified nodal analysis (MNA), and consider the complete parasitic effects with the reluctance model. The GMRES method [19] with rescaling and preconditioning techniques is adopted to efficiently solve the frequency-domain circuit equations.

3. Fast Power Network Analysis Considering Partial Reluctances

3.1 Basic Idea

To apply the frequency-domain based simulation method for the P/G grid, we need to derive the frequency-domain circuit equation with partial reluctances. Firstly, we derive the equation with inductance elements. Then, the inductance matrix is replaced with the reluctance matrix based on (1). The sparsified reluctance matrix extracted with the DRRE (direct reluctance and resistance extraction) algorithm [17], [18] is used here.

The GMRES algorithm is an efficient iterative solver for general non-symmetric sparse linear equation system [20]. With suitable preconditioning technique, GMRES converges very fast. Besides, it generally has robust behavior in various applications. In this work, the GMRES algorithm is adopted to solve the circuit equation at each frequency point. Along with efficient techniques of rescaling, preconditioning and recycling, the algorithm is very efficient for large-scale P/G structures. For the selection of frequency samples, we follow the technique in [12]. The highest frequency of the voltage response \( f_{\text{max}} \) is usually not larger than several tens of GHz, and the logarithmic scale sampling of frequency is adopted. So, the number of frequency points is of \( O(\log f_{\text{max}}) \), and is about several tens in our experiments.
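A logarithmic frequency grid of this kind can be generated as sketched below. The lower bound `f_min` and the density `points_per_decade` are hypothetical parameters chosen for illustration (the sampling rule of [12] is not reproduced here), and a DC point would have to be added separately.

```python
import math


def log_frequency_samples(f_min, f_max, points_per_decade):
    """Return logarithmically spaced frequencies from f_min to f_max (both included)."""
    decades = math.log10(f_max / f_min)
    count = max(2, int(round(decades * points_per_decade)) + 1)
    ratio = (f_max / f_min) ** (1.0 / (count - 1))
    return [f_min * ratio ** k for k in range(count)]


# Example: 100 Hz to 1 GHz spans 7 decades, so 5 points per decade give 36 samples.
samples = log_frequency_samples(100.0, 1e9, 5)
```

The sample count grows with the number of decades covered, that is, with \( \log f_{\text{max}} \), which is the dependence stated above.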

To account for the high-frequency effect, the resistance and reluctance elements can be generated with the techniques in [17], [18], and their values vary with frequency. So, the SPICE input file may differ from one frequency to another. This poses no difficulty for the frequency-domain based simulation method, whereas the conventional time-domain simulation is not able to handle these frequency-dependent parameters.

3.2 Frequency-Domain Circuit Equations

Figure 3 shows the PEEC model for a small portion of the P/G grid, which includes four wire segments. The mutual inductances between wire segments and the coupling capacitances between nodes are considered, but are not drawn in Fig. 3 for clarity. We note that the node connecting the inductor and the resistor of the same wire segment is a “spurious” node, whose voltage has no physical meaning (see Fig. 3). So, we can set the voltage variables only at the points connecting wire segments. Then, with the variables like those shown in Fig. 3, the following circuit equation can be derived using the MNA technique:

\[
\begin{bmatrix}
G & A_L^T \\
-A_L & R
\end{bmatrix}
\begin{bmatrix}
V_n(t) \\
I_L(t)
\end{bmatrix}
+ \begin{bmatrix}
C & 0 \\
0 & L
\end{bmatrix}
\begin{bmatrix}
\frac{dV_n(t)}{dt} \\
\frac{dI_L(t)}{dt}
\end{bmatrix}
= \begin{bmatrix}
-A_I^T I_n(t) \\
0
\end{bmatrix}
\]

(2)

where \( I_n(t) \), \( V_n(t) \) and \( I_L(t) \) are the vectors of independent current sources, unknown nodal voltages and unknown currents through the inductor branches, respectively. The \( G \) matrix includes the conductances of resistors not on inductor branches, while \( R \) includes the resistances on inductor branches. \( C \) and \( L \) are the capacitance and inductance matrices, respectively. Matrices \( A_I \) and \( A_L \) are the adjacency matrices for the current sources and the inductors, respectively. This formulation is actually different from the conventional MNA formulation [15]. Since the “spurious” node connecting the resistor and inductor is not involved, the resulting equation has fewer unknown variables than the conventional MNA equation.

For frequency-domain analysis, Eq. (2) is converted to:

\[
\begin{bmatrix}
G & A_L^T \\
-A_L & R
\end{bmatrix}
\begin{bmatrix}
V_n \\
I_L
\end{bmatrix}
+ s \begin{bmatrix}
C & 0 \\
0 & L
\end{bmatrix}
\begin{bmatrix}
V_n \\
I_L
\end{bmatrix}
= \begin{bmatrix}
-A_I^T I_n \\
0
\end{bmatrix}
\]

(3)

where \( s = j\omega \) and \( \omega \) is the angular frequency. Here \( I_n \) stands for the frequency-domain expression of the current sources, which is obtained with the Laplace transform of the time-domain waveforms. For a given frequency, the frequency-domain voltage response can be obtained by solving the complex-valued linear equation system (3). The number of unknowns in (3) equals the number of circuit nodes \( n_n \) plus the number of inductor branches \( n_L \).

The complex-valued vectors \( I_n, V_n \) and \( I_L \) in (3) can be decomposed into real and imaginary parts:

\[ I_n = I_{sre} + jI_{sim} \]
\[ V_n = V_{nre} + jV_{nim} \]
\[ I_L = I_{Lre} + jI_{Lim} \]

Then, (3) is transformed into a real-valued linear equation system:

\[
\begin{bmatrix}
  G & -\omega C & A_L^T & 0 \\
  \omega C & G & 0 & A_L^T \\
  -A_L & 0 & R & -\omega L \\
  0 & -A_L & \omega L & R
\end{bmatrix}
\begin{bmatrix}
  V_{nre} \\
  V_{nim} \\
  I_{Lre} \\
  I_{Lim}
\end{bmatrix}
= \begin{bmatrix}
  -A_I^T I_{sre} \\
  -A_I^T I_{sim} \\
  0 \\
  0
\end{bmatrix}
\]

(4)
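As a check on the last two block rows of (4), the step can be written out explicitly. The second block row of (2) reads \( -A_L V_n + (R + sL) I_L = 0 \) in the frequency domain. Substituting \( s = j\omega \) and the real and imaginary parts of \( V_n \) and \( I_L \) gives

\[
-A_L (V_{nre} + jV_{nim}) + (R + j\omega L)(I_{Lre} + jI_{Lim}) = 0
\]

and separating the real and imaginary parts yields

\[
-A_L V_{nre} + R I_{Lre} - \omega L I_{Lim} = 0, \qquad
-A_L V_{nim} + \omega L I_{Lre} + R I_{Lim} = 0
\]

which are the third and fourth block rows of (4). The first two block rows follow in the same way from the nodal equation.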

We left-multiply the last two block rows of (4) by the partial reluctance matrix \( K = L^{-1} \), so that \( KL \) becomes the identity matrix, and then obtain the circuit equation with reluctance elements:

\[
\begin{bmatrix}
  G & -\omega C & A_L^T & 0 \\
  \omega C & G & 0 & A_L^T \\
  -KA_L & 0 & KR & -\omega I \\
  0 & -KA_L & \omega I & KR
\end{bmatrix}
\begin{bmatrix}
  V_{nre} \\
  V_{nim} \\
  I_{Lre} \\
  I_{Lim}
\end{bmatrix}
= \begin{bmatrix}
  -A_I^T I_{sre} \\
  -A_I^T I_{sim} \\
  0 \\
  0
\end{bmatrix}
\]

(5)

where \( I \) stands for the identity matrix. After solving (5), we can get the frequency-domain voltage responses at the specified frequency point.

3.3 Efficient Techniques for Solving the Frequency-Domain Equations

The dimension of the linear equation system (5) can be very large for a large-scale power network, but the coefficient matrix is very sparse. Below we introduce efficient techniques for solving the frequency-domain equations at multiple frequency points.

3.3.1 Storing Scheme

The coefficient matrix of (5) is a \( 4 \times 4 \) block matrix, and every block is a sparse matrix or a zero matrix. We store these sparse matrix blocks separately, so that they can be reused for different frequency points. The products \( KA_L \) and \( KR \) are computed before the equation solution procedure. Due to the structure of matrix \( A_L \), \( KA_L \) is still a sparse matrix, with slightly more non-zero entries than \( K \). On the other hand, if \( R \) is a diagonal matrix, \( KR \) has the same sparsity pattern as \( K \). Each matrix block is stored in the compressed sparse row (CSR) format [20]. The CSR format and its variants are very efficient for storing general sparse matrices, and are suitable for iterative equation solvers like GMRES [20], [21]. With this storage scheme, the coefficient matrix is established only once, without the frequency-dependent factor \( \omega \). For each frequency point, we only multiply by \( \omega \) while performing the matrix-vector multiplication in the iterative solution procedure. This makes solving the equations for many frequency points very efficient.
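The storage idea can be sketched in a few lines. The code below is an illustrative pure-Python sketch, not the authors' C implementation: it keeps the frequency-independent blocks \( G \), \( C \), \( A_L^T \), \( KA_L \) and \( KR \) in CSR form and applies \( \omega \) only inside the matrix-vector product, with the identity blocks of (5) handled implicitly. All function names and the toy values are assumptions.

```python
def dense_to_csr(dense):
    """Convert a dense list-of-lists matrix to CSR arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0.0:
                values.append(float(a))
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(csr, x, scale=1.0):
    """Return scale * (A x) for a matrix A stored in CSR form."""
    values, col_idx, row_ptr = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[p] * x[col_idx[p]]
        y.append(scale * acc)
    return y


def add_vectors(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def apply_system(blocks, omega, v_re, v_im, i_re, i_im):
    """Multiply the block matrix of (5) by [v_re, v_im, i_re, i_im] at frequency omega."""
    G, C, ALT, KAL, KR = blocks
    row1 = add_vectors(csr_matvec(G, v_re), csr_matvec(C, v_im, -omega),
                       csr_matvec(ALT, i_re))
    row2 = add_vectors(csr_matvec(C, v_re, omega), csr_matvec(G, v_im),
                       csr_matvec(ALT, i_im))
    # The +/- omega * I blocks need no storage: they just scale the current vectors.
    row3 = add_vectors(csr_matvec(KAL, v_re, -1.0), csr_matvec(KR, i_re),
                       [-omega * a for a in i_im])
    row4 = add_vectors(csr_matvec(KAL, v_im, -1.0), [omega * a for a in i_re],
                       csr_matvec(KR, i_im))
    return row1, row2, row3, row4


# Toy blocks with made-up values: 2 nodes and 1 inductor branch.
toy_blocks = tuple(dense_to_csr(m) for m in (
    [[2.0, -1.0], [-1.0, 2.0]],   # G
    [[0.5, 0.0], [0.0, 0.25]],    # C
    [[1.0], [-1.0]],              # A_L^T
    [[4.0, -4.0]],                # K A_L
    [[2.0]],                      # K R
))
```

Because `apply_system` takes `omega` as an argument, the same `toy_blocks` tuple serves every frequency point without being rebuilt.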

3.3.2 Rescaling and Preconditioning

The convergence rate of an iterative equation solver is closely related to the condition number of the coefficient matrix. The rescaling technique, which balances the orders of magnitude of the coefficients in (5), is used here to reduce the matrix condition number. We find that the non-zero entries of \( KR \) are much smaller in magnitude than those of matrix \( G \). So, the last two block rows of (5) are multiplied by a rescaling factor \( \delta \), resulting in

\[
\begin{bmatrix}
  G & -\omega C & A_L^T & 0 \\
  \omega C & G & 0 & A_L^T \\
  -\delta KA_L & 0 & \delta KR & -\delta \omega I \\
  0 & -\delta KA_L & \delta \omega I & \delta KR
\end{bmatrix}
\begin{bmatrix}
  V_{nre} \\
  V_{nim} \\
  I_{Lre} \\
  I_{Lim}
\end{bmatrix}
= \begin{bmatrix}
  -A_I^T I_{sre} \\
  -A_I^T I_{sim} \\
  0 \\
  0
\end{bmatrix}
\]

(6)

The value of \( \delta \) is selected to be the ratio of a typical non-zero entry of \( G \) to a typical non-zero entry of \( KR \).
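The text leaves the notion of a typical entry open. One possible reading, used in the sketch below purely as an assumption, takes the median magnitude of the non-zero entries of each matrix. The function names are hypothetical.

```python
import statistics


def typical_magnitude(values):
    """Median magnitude of the non-zero entries (an assumed notion of 'typical')."""
    return statistics.median(abs(v) for v in values if v != 0.0)


def rescaling_factor(g_values, kr_values):
    """delta = typical non-zero entry of G / typical non-zero entry of KR."""
    return typical_magnitude(g_values) / typical_magnitude(kr_values)
```

With this choice, the scaled block \( \delta KR \) has entries of the same order of magnitude as \( G \), which is the balancing effect the rescaling aims for.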

After rescaling, the condition number of the coefficient matrix is largely reduced, and the convergence rate of GMRES algorithm is remarkably improved. In Table 1, the condition numbers for the coefficient matrices before and after the rescaling are listed. They correspond to two examples of P/G grid and two different frequencies. From the table, it is clear that the condition number is dramatically reduced.

The preconditioning technique is necessary to guarantee the fast convergence of the GMRES algorithm. The Jacobi preconditioner is the most economical one, with almost no extra computational cost, and is sometimes very efficient [20], [21]. However, since matrix \( G \) may have zero diagonal entries, the Jacobi preconditioner cannot be directly applied here. At the few positions where the diagonal entry of \( G \) is zero, we assign one to the corresponding entry of the preconditioner.
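The modified Jacobi preconditioner amounts to a guarded reciprocal of the diagonal. A minimal sketch, with hypothetical function names, could look as follows.

```python
def jacobi_preconditioner(diagonal):
    """Inverse of the diagonal, with one substituted wherever the diagonal is zero."""
    return [1.0 / d if d != 0.0 else 1.0 for d in diagonal]


def apply_preconditioner(inv_diag, residual):
    """Apply the diagonal preconditioner to a residual vector."""
    return [m * r for m, r in zip(inv_diag, residual)]
```

Applying it costs one multiplication per unknown, which is why the preconditioner adds almost nothing to the cost of an iteration.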

More sophisticated preconditioners can be constructed to improve the convergence rate. The ILU (incomplete LU factorization) is such a technique. With some dropping strategy, the insignificant entries are dropped during the LU factorization of the coefficient matrix. This results in two sparse triangular matrices \( L \) and \( U \) (here \( L \) denotes the lower triangular factor, not the inductance matrix), whose product approximates the coefficient matrix. Then, \( M = U^{-1}L^{-1} \) is used as the preconditioner matrix. Among different ILU strategies, the ILUTP (ILU with threshold and pivoting) technique uses threshold values to drop the small entries produced in the LU factorization, and column pivoting to prevent the ILU procedure from failing due to a zero pivot. So, the ILUTP preconditioner has the best adaptability, while guaranteeing the stability of the ILU factorization [20]. In this work, we employ the ILUTP preconditioner and demonstrate its high efficiency in solving the frequency-domain equations.

Table 1 The comparison of condition numbers before and after rescaling.

<table>
<thead>
<tr>
<th>Cases</th>
<th>Frequency</th>
<th>Before rescaling</th>
<th>After rescaling</th>
</tr>
</thead>
<tbody>
<tr>
<td>case1</td>
<td>100 Hz</td>
<td>\( 1.04 \times 10^{13} \)</td>
<td>\( 1.72 \times 10^{6} \)</td>
</tr>
<tr>
<td>case1</td>
<td>3.5 GHz</td>
<td>\( 1.48 \times 10^{13} \)</td>
<td>\( 3.82 \times 10^{6} \)</td>
</tr>
<tr>
<td>case2</td>
<td>100 Hz</td>
<td>\( 1.77 \times 10^{13} \)</td>
<td>\( 3.33 \times 10^{6} \)</td>
</tr>
<tr>
<td>case2</td>
<td>3.5 GHz</td>
<td>\( 2.69 \times 10^{13} \)</td>
<td>\( 8.31 \times 10^{6} \)</td>
</tr>
</tbody>
</table>

3.3.3 Recycling among Different Frequencies

As explained in Sect. 3.3.1, most of the computation for forming the coefficient matrix of (6) is frequency-independent and can be recycled among different frequencies. Besides, the solution of the previous frequency-domain equation may also be recycled. For lower frequencies, we observed that the coefficient matrix varies little between adjacent frequency points. The solution of the previous equation can then be used as the initial guess of the GMRES iteration for the next frequency point. Another kind of recycling can be performed on the generation of the ILUTP preconditioner, since the factors \( L \) and \( U \) vary little for similar coefficient matrices. So, the \( L \) and \( U \) produced for a previous frequency may be reused for several subsequent frequencies, without loss of effectiveness. Numerical results show that with these recycling techniques, the solution time of the equations for the lower frequencies can be further reduced. For high-frequency points, however, where the equation is not well conditioned, recycling the initial guess brings no benefit and the \( L \) and \( U \) matrices should not be recycled.
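The bookkeeping for recycling the initial guess can be sketched as below. To keep the example short and self-contained, a plain Jacobi iteration stands in for GMRES, and the 2×2 system with a parameter `w` is a made-up stand-in for the frequency-dependent coefficient matrix. Neither is part of the proposed method, and all names are hypothetical.

```python
def jacobi_solve(A, b, x0, tol=1e-10, max_iter=1000):
    """Stand-in iterative solver (Jacobi). Returns the solution and iteration count."""
    n = len(b)
    x = list(x0)
    for iteration in range(max_iter + 1):
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(r) for r in residual) < tol:
            return x, iteration
        x = [x[i] + residual[i] / A[i][i] for i in range(n)]
    raise RuntimeError("iteration did not converge")


def sweep(system_at, b, omegas, recycle):
    """Solve one system per frequency, optionally reusing the previous solution."""
    previous, counts = None, []
    for w in omegas:
        if recycle and previous is not None:
            x0 = previous                 # warm start from the previous frequency
        else:
            x0 = [0.0] * len(b)           # cold start
        previous, iterations = jacobi_solve(system_at(w), b, x0)
        counts.append(iterations)
    return counts


def toy_system(w):
    """Made-up coefficient matrix that changes slowly with the parameter w."""
    return [[4.0 + w, 1.0], [1.0, 3.0 + w]]


cold_counts = sweep(toy_system, [1.0, 2.0], [0.0, 0.1, 0.2], recycle=False)
warm_counts = sweep(toy_system, [1.0, 2.0], [0.0, 0.1, 0.2], recycle=True)
```

On this toy problem the warm-started solves need fewer iterations than the cold-started ones after the first frequency, because the initial residual is already small when the matrix has changed only slightly.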

3.4 Algorithm Flow and Discussion

The proposed method can be summarized as follows:

1. Convert the current sources from time-domain waveform to frequency-domain expression with Laplace transform.
2. Form the frequency-domain Eq. (6) with extracted reluctance, resistance and capacitance parameters.
3. Solve Eq. (6) using GMRES method with the proposed techniques to get the frequency-domain voltage responses.
4. Repeat steps 2 and 3 for each frequency point to get the frequency-domain voltage responses at all frequency points.
5. Obtain the time-domain voltage response using the vector fitting method.
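Step 5 relies on the fact that a partial fraction term \( r_k/(s - p_k) \) corresponds to \( r_k e^{p_k t} \) in the time domain. The sketch below evaluates such a sum, assuming the fitted expression is strictly proper (no constant or \( s \)-proportional term) and that complex poles come in conjugate pairs so that the result is real. The function name and the sample poles are hypothetical.

```python
import cmath


def time_response(poles, residues, t):
    """Evaluate v(t) = sum_k r_k * exp(p_k * t) for t >= 0, and 0 for t < 0."""
    if t < 0:
        return 0.0
    total = sum(r * cmath.exp(p * t) for p, r in zip(poles, residues))
    # With conjugate pole/residue pairs the imaginary parts cancel.
    return total.real


# A conjugate pair that corresponds to exp(-t) * sin(2 t).
example_poles = [-1 + 2j, -1 - 2j]
example_residues = [-0.5j, 0.5j]
```

Evaluating this closed form costs one exponential per pole and per time point, independent of the size of the P/G network.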

With the proposed techniques, the computational complexity of solving one equation is about \( O(N^\alpha) \), where \( N \) is the number of unknowns in (6), and \( \alpha \) is a quantity between 1 and 2. Since the logarithmic scale sampling of frequency is adopted, the number of frequency points is of \( O(\log f_{\text{max}}) \), where \( f_{\text{max}} \) is the upper bound of frequency. The time complexity of vector fitting is \( O(N_\alpha \cdot \log f_{\text{max}}) \), where \( N_\alpha \) is the order of approximation [12].

For large-scale power network, the time for solving frequency-domain equations dominates the total computational time, because the dimension of (6), \( N \), is much larger than \( N_\alpha \). So the time complexity for solving the frequency-domain linear equation system is about \( O(N^\alpha \cdot \log f_{\text{max}}) \). If the voltage responses of multiple nodes on power network are considered, the time for vector fitting will be multiplied by the number of output nodes \( N_{\text{out}} \). For analysis of maximum voltage variation, only the nodes at the lowest level of P/G grid are considered. Therefore, \( N_{\text{out}} \) is a small number.

From above analysis, it is clear that the memory usage and computational time for solving the frequency-domain equations are efficiently reduced by utilizing the partial reluctance and related techniques. Moreover, the total computational time is proportional to the number of frequency samples, not related to the number of time steps as in a conventional time-domain transient simulation. So, the proposed simulation method has advantages over the conventional time-domain transient simulations.

4. Numerical Results

The proposed simulation method is implemented in C and MATLAB, and is called FBS (Frequency-domain Based Simulator). A MATLAB program is written to take in the frequency-domain responses and convert them to the time-domain voltage waveform with the help of vector fitting [22]. We compare the proposed simulation method with the commercial simulator HSPICE, and with the simulator INDUCTWISE [4], [24], which can handle the partial reluctance. All experiments are carried out on a Linux server with an Intel(R) Xeon(R) CPU of 2.33 GHz, unless explicitly indicated otherwise.

4.1 Accuracy Validation

For mesh-structured power networks with the complete inductive model using partial reluctance, we get the frequency-domain responses at 38 frequency points spreading from DC to 3.5 GHz. Figure 4 shows the frequency-domain voltage responses and their fitting result with the vector fitting technique. In Fig. 5, the time-domain voltage waveform recovered from the partial fraction expression is compared with that obtained from the transient simulation of HSPICE. From Fig. 5, we see that the voltage waveforms obtained with the two methods show little discrepancy. The relative error is only 1.9% for the minimum voltage and 1.0% for the maximum voltage. This validates the accuracy of the proposed method.

4.2 Efficiency Validation

For five test cases of P/G networks with varied numbers of wire segments, the simulation times of the proposed method with two preconditioners, HSPICE, and INDUCTWISE are listed in Table 2. The ILUTP preconditioner is implemented through the MATLAB function ilu [23]. We use a C program to establish the frequency-domain equation and then dump the coefficient matrix and right-hand sides to a data file. A MATLAB program is written to read in the data file and solve the multiple frequency-domain equations with the ILUTP preconditioned GMRES algorithm. The data in the fifth column of Table 2 include these two parts of computational time, excluding the time for dumping and reading the data file. Because the INDUCTWISE program cannot run on the Linux server, we run it on a Windows machine with an Intel(R) Core(TM)2 CPU of 1.66 GHz and 2.5 GB of memory. Considering the running time ratio of the same problem on the two machines, the computational time of INDUCTWISE on the Linux server is estimated and listed in the seventh column of Table 2. The last two columns of Table 2 show the speedup ratios of our method with the ILUTP preconditioner over HSPICE and INDUCTWISE, respectively. For the last three test cases, HSPICE results are not available due to out-of-memory errors. For the last case, INDUCTWISE halts after about half an hour without any message or result. This may be caused by its internal sparse direct equation solver [24], which consumes a large amount of memory due to the “fill-in” phenomenon. Both FBS-Jacobi and FBS-ILUTP use the proposed rescaling and recycling techniques. The rescaling technique is used to guarantee the convergence of the GMRES algorithm. The time of the proposed method does not include that for converting the frequency-domain responses to the time-domain waveform through the vector fitting technique, which is only 0.2 second per output node.

From the table we can see that the proposed method is able to handle large-scale P/G structures that HSPICE and INDUCTWISE cannot. The largest case includes more than 100,000 wire segments, corresponding to an equation of order larger than 400,000. This advantage of our method arises because the iterative solver consumes less memory than the direct sparse solver. Compared with INDUCTWISE, the computational time of our method with the Jacobi preconditioner is of the same order. The method with the ILUTP preconditioner is several times faster than INDUCTWISE, and the speedup ratio increases with the problem size. The speedup ratio of our method over HSPICE exceeds one hundred (122 and 1300 for the two cases that HSPICE completes). With the computational times in Table 2, we can also validate the complexity of the proposed GMRES-based equation solver, which is about $O(N^{1.37})$ for the five test cases, where $N$ is the number of unknowns.
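The exponent can be checked against Table 2 with a least-squares fit in log-log scale. The sketch below is one way to do so and is not necessarily the fitting procedure used for the figure quoted above. It assumes \( N = 2(n_n + n_L) \), i.e. twice the number of nodes plus segments, which matches the 413368 unknowns of the largest case, and it uses the FBS-ILUTP times. The function name is hypothetical.

```python
import math


def fit_power_law(sizes, times):
    """Least-squares fit of times ~ coeff * sizes**alpha in log-log scale."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    alpha = sxy / sxx
    coeff = math.exp(y_mean - alpha * x_mean)
    return alpha, coeff


# Data copied from Table 2 (FBS-ILUTP column, in seconds).
nodes = [348, 808, 1460, 10832, 103344]
segments = [344, 804, 1456, 10828, 103340]
fbs_ilutp_seconds = [1.3, 3.1, 5.8, 82.6, 3221.0]

# Assumed problem size: real and imaginary parts of nodal voltages and branch currents.
unknowns = [2 * (n + m) for n, m in zip(nodes, segments)]
alpha, coeff = fit_power_law(unknowns, fbs_ilutp_seconds)
```

Under these assumptions the fitted exponent comes out close to the reported value, well inside the range between 1 and 2 stated in Sect. 3.4.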

4.3 More Details about the GMRES Solution

In this subsection, we present more details about the preconditioned GMRES solution, and compare it with the direct equation solver. The “\” operator in MATLAB is used as the state-of-the-art direct solver, which employs the UMFPACK algorithm for general sparse matrices [25].

For the fourth and fifth test cases, we compare the efficiency of the preconditioned GMRES algorithm with the MATLAB “\” operator at two frequency points. One is a middle frequency (6.5 MHz), and the other is the highest frequency. The GMRES algorithm is preconditioned with the ILUTP technique, where the setting parameters are droptol=0.2 and thresh=0.5. The former parameter makes the resulting $L$ and $U$ matrices very sparse, while the latter parameter balances the factorization well between stability and efficiency. In Table 3, the computational times of “\” and the GMRES solution are listed, along with the time for the ILU factorization and the number of GMRES iterations. For the largest case, with 413368 unknowns, “\” encounters an out-of-memory error. From the table we can see that the ILUTP preconditioned GMRES is several to several tens of times faster than “\”. Although the ILUTP produces very sparse $L$ and $U$ matrices, it improves the convergence rate of GMRES remarkably. We have also examined the error of the GMRES solution. Taking the results of “\” as the golden value, the GMRES solution has at most 2% error.

![Fig. 5 Comparison of the voltage responses obtained by HSPICE and the proposed method.](image)

Table 2 Computational time (in seconds) of FBS with different preconditioners, HSPICE and INDUCTWISE.

<table>
<thead>
<tr>
<th>Cases</th>
<th>#nodes</th>
<th>#segments</th>
<th>FBS-Jacobi</th>
<th>FBS-ILUTP</th>
<th>HSPICE</th>
<th>INDUCTWISE</th>
<th>Speedup to HSPICE</th>
<th>Speedup to INDUCTWISE</th>
</tr>
</thead>
<tbody>
<tr>
<td>case1</td>
<td>348</td>
<td>344</td>
<td>2.0</td>
<td>1.3</td>
<td>159.7</td>
<td>2.5</td>
<td>122</td>
<td>1.9</td>
</tr>
<tr>
<td>case2</td>
<td>808</td>
<td>804</td>
<td>8.2</td>
<td>3.1</td>
<td>4028.8</td>
<td>9.6</td>
<td>1300</td>
<td>3.1</td>
</tr>
<tr>
<td>case3</td>
<td>1460</td>
<td>1456</td>
<td>23.6</td>
<td>5.8</td>
<td>N.A.</td>
<td>23.7</td>
<td>N.A.</td>
<td>4.1</td>
</tr>
<tr>
<td>case4</td>
<td>10832</td>
<td>10828</td>
<td>948</td>
<td>82.6</td>
<td>N.A.</td>
<td>434.5</td>
<td>N.A.</td>
<td>5.3</td>
</tr>
<tr>
<td>case5</td>
<td>103344</td>
<td>103340</td>
<td>66726</td>
<td>3221</td>
<td>N.A.</td>
<td>N.A.</td>
<td>N.A.</td>
<td>N.A.</td>
</tr>
</tbody>
</table>

5. Conclusions

An efficient framework is proposed for the dynamic P/G network analysis with modeling of complete inductive effects. This work includes two main contributions:

1. A frequency-domain based simulation method is developed to take advantage of the sparsified reluctance matrix. The method can also accommodate frequency-dependent parameters to model the high-frequency effect.

2. The techniques of storing sparse matrices, rescaling, preconditioning, and recycling are proposed to enable fast GMRES solution of the frequency-domain circuit equations.

Numerical results show that the proposed simulation method has remarkable advantages over the conventional time-domain simulation tools HSPICE and INDUCTWISE, and is able to simulate the complete parasitic effects for large-scale P/G structures.

Because the frequency-domain based simulation method has large potential for parallel computation, the parallelization of the proposed method will be considered in future work.

Acknowledgments

This work was supported by the Tsinghua National Laboratory for Information Science and Technology (TNList) Cross-discipline Foundation, and in part by National High-Tech Research and Development (863) Program of China (No.2009AA01Z126).

References


Shan Zeng received the B.S. degree in computer science from University of Electronic Science and Technology, Chengdu, China in 2004. She received the Ph.D. degree in computer science at Tsinghua University, Beijing, China in 2009. Now she is with the School of Software, China University of Geosciences, Beijing 100083, China. Her main research interests include interconnect parasitic extraction and PG analysis.

Wenjian Yu received the B.S. and Ph.D. degrees in computer science from Tsinghua University, Beijing, China, in 1999 and 2003, respectively. In 2003, he joined Tsinghua University, where he is an Associate Professor with the Department of Computer Science and Technology. He visited the Computer Science and Engineering Department of the University of California, San Diego (UCSD), several times during the period from September 2005 to January 2008. His research interests include parasitic parameter extraction of interconnects in VLSI circuits, direct boundary-element analysis of electromagnetic fields, and modeling and simulation of VLSI interconnects. Dr. Yu was a Technical Program Subcommittee member of the ACM/IEEE Asia South-Pacific Design Automation Conference in 2005, 2007 and 2008. He was the recipient of the distinguished Ph.D. Award from Tsinghua University in 2003, and has served as a reviewer for the IEEE Transactions on Computer-Aided Design.

Xianlong Hong graduated from Tsinghua University, Beijing, China in 1964. He was a Visiting Scholar at the University of California, Berkeley, and worked in the group of Prof. E.S. Kuh from April 1991 to October 1992 and from June to September in 1993. Since 1988, he has been a professor in the Department of Computer Science and Technology, Tsinghua University. His research interests include VLSI layout algorithms and design automation systems. He is an IEEE Fellow and a Senior Member of the Chinese Institute of Electronics.

Chung-Kuan Cheng received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taipei, Taiwan, R.O.C., and the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, in 1984. From 1984 to 1986, he was a Senior Computer-Aided Design (CAD) Engineer at Advanced Micro Devices, Inc. In 1986, he joined the University of California, San Diego, where he is a Professor in the Computer Science and Engineering Department and an Adjunct Professor in the Electrical and Computer Engineering Department. In 1999, he served as a Chief Scientist with Mentor Graphics. His research interests include network optimization and design automation on microelectronic circuits. Dr. Cheng was an Associate Editor of IEEE Transactions on Computer-Aided Design from 1994 to 2003. He is a recipient of the Best Paper Awards, IEEE Transactions on Computer-Aided Design, in 1997 and 2002. He is the recipient of the NCR Excellence in Teaching Award, School of Engineering, University of California, San Diego, in 1991.