Duo Li

Synopsys, Mountain View, California, United States

Publications (20) · 2.91 Total impact

  • ABSTRACT: Efficient temperature estimation is vital for designing thermally efficient, low-power, and robust integrated circuits in the nanometer regime. Thermal simulation based on detailed thermal structures no longer meets the demands of efficient design space exploration. Compact and composable model-based simulation provides a viable solution to this difficult problem. However, building such thermal models from detailed thermal structures was not well addressed in the past. In this article, we propose a new compact thermal modeling technique, called ThermComp, standing for thermal modeling with composable modules. ThermComp can be used for fast thermal design space exploration for multicore microprocessors. The new approach builds the composable model from detailed structures for each basic module using the finite difference method and reduces the model complexity by a sampling-based model order reduction technique. These composable models are then used to assemble thermal models of different multicore architectures, which are realized as SPICE-like netlists. The resulting thermal models can be simulated by the general circuit simulator SPICE. ThermComp aims to combine the accuracy of fine-grained models with the speed of coarse-grained models. Experimental results on a number of multicore microprocessor architectures show that the new approach can easily build accurate thermal systems from compact composable models for fast architecture thermal analysis and optimization, and that it is much faster than the existing HotSpot method with similar accuracy.
    ACM Transactions on Design Automation of Electronic Systems (TODAES). 03/2013; 18(2).
  • Source
    ABSTRACT: This paper proposes a new parameterized dynamic thermal modeling algorithm for emerging thermal-aware design and optimization for high-performance microprocessor design at architecture and package levels. Compared with existing behavioral thermal modeling algorithms, the proposed method can build the compact models from more general transient power and temperature waveforms used as training data. Such an approach can make the modeling process much easier and less restrictive than before and, thus, more amenable for practical measured data. The new method, called ParThermSID, consists of two steps. First, the response surface method based on second-order polynomials is applied to build the parameterized models at each time point for all of the given sampling nodes in the parameter space. Second, an improved subspace system identification method, called ThermSID, is employed to build the discrete state space models, by construction of the Hankel matrix and state space realization, for each time-varying coefficient of the polynomials generated in the first step. To overcome the overfitting problems of the subspace method, the new method employs an overfitting mitigation technique to improve model accuracy and predictive ability. Experimental results on a practical quad-core microprocessor show that the generated parameterized thermal model matches the given data very well. The compact models generated by ParThermSID also offer two orders of magnitude speedup over the commercial thermal analysis tool FloTHERM on the given example. The results also show that ThermSID is more accurate than the existing ThermPOF method.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 01/2012; 20:211-224. · 1.22 Impact Factor
  • Source
    ABSTRACT: In this paper, we present a new voltage IR drop analysis approach for large on-chip power delivery networks. The new approach is based on a recently proposed sampling-based reduction technique that reduces the circuit matrices before the simulation. Due to the disruptive nature of tap current waveforms in typical industry power grid networks, input current sources typically have a wide frequency power spectrum. To avoid excessive sampling, the new approach introduces an error check mechanism and an on-the-fly error reduction scheme during the simulation of the reduced circuits to improve the accuracy of estimating the large IR drops. The proposed method presents a new way to combine model order reduction and simulation to improve overall simulation efficiency. The new method can also easily trade accuracy for speed in different applications. Experimental results show that the proposed IR drop analysis method can significantly reduce the errors of the existing ETBR method at a similar computing cost, while achieving a speedup of 10X or more over the commercial power grid simulator in UltraSim with about 1-2% error on a number of real industry benchmark circuits.
    Proceedings of the 15th Asia South Pacific Design Automation Conference, ASP-DAC 2010, Taipei, Taiwan, January 18-21, 2010; 01/2010
  • Source
    ABSTRACT: In this article, we propose a new architecture-level parameterized dynamic thermal behavioral modeling algorithm for emerging thermal-related design and optimization problems for high-performance multicore microprocessor design. We propose a new approach, called ParThermPOF, to build the parameterized thermal performance models from the given accurate architecture thermal and power information. The new method can include a number of variable parameters such as the locations of thermal sensors in a heat sink, different components (heat sink, heat spreader, core, cache, etc.), thermal conductivity of heat sink materials, etc. The method consists of two steps. First, a response surface method based on low-order polynomials is applied to build the parameterized models at each time point for all the given sampling nodes in the parameter space. Second, an improved Generalized Pencil-Of-Function (GPOF) method is employed to build the transfer-function-based behavioral models for each time-varying coefficient of the polynomials generated in the first step. Experimental results on a practical quad-core microprocessor show that the generated parameterized thermal model matches the given data very well. The compact models built by ParThermPOF offer a speedup of two orders of magnitude over the commercial thermal analysis tool FloTHERM on the given examples. ParThermPOF is very suitable for design space exploration and optimization where both time and system parameters need to be considered.
    ACM Trans. Design Autom. Electr. Syst. 01/2010; 15.
  • Source
    ABSTRACT: One of the most critical challenges in today's CMOS VLSI design is the lack of predictability in chip performance at design stage. One of the process variabilities comes from the voltage drop variations in on-chip power distribution networks. In this paper, we present a novel analysis approach for computing voltage drops of large power grid networks under process variations. The new algorithm is very efficient and scalable for huge networks with a large number of variational variables. This approach, called variational extended truncated balanced realization (varETBR), is based on model order reduction techniques to reduce the circuit matrices before the variational simulation. It performs the parameterized reduction on the original system using variation-bearing subspaces. After the reduction, Monte Carlo based statistical simulation is performed on the reduced system and the statistical responses of the original system are obtained thereafter. varETBR calculates variational response Grammians by Monte Carlo based numerical integration considering both system and input source variations in generating the projection subspace. varETBR is very scalable for the number of variables and flexible for different variational distributions and ranges as demonstrated in experimental results. Experimental results, on a number of IBM benchmark circuits up to 1.6 million nodes, show that the varETBR can be 1900X faster than the Monte Carlo method and is much more scalable than one of the recently proposed approaches.
    Integration, the VLSI Journal. 01/2010;
  • Source
    ABSTRACT: Efficient temperature estimation is critical for designing thermally efficient, low-power, and robust integrated circuits in the nanometer regime. Thermal simulation that starts from the detailed thermal structures and solves thermal diffusion equations no longer meets the demands of efficient design space exploration. Compact and composable model-based simulation provides a viable solution to this difficult problem. However, building such thermal models from detailed thermal structures was not well addressed in the past. In this paper, we propose a new compact thermal modeling technique for fast thermal analysis in the context of multi-core microprocessor design. The new approach builds the models from detailed structures for each core using the finite difference method and reduces the model complexity by sampling-based model order reduction and circuit realization techniques. To improve the reduction efficiency, the number of ports of the thermal models is first reduced by port merging, which effectively leads to coarse grids at the boundaries. The resulting thermal circuits can be simulated by the general circuit simulator SPICE. Experimental results on a quad-core microprocessor architecture show that the new approach can easily build accurate thermal systems from the composite compact models. The new thermal systems lead to an order-of-magnitude speedup over standard finite difference models in transient thermal simulation.
    01/2010;
  • Source
    ABSTRACT: This paper investigates a new architecture-level thermal characterization problem from a behavioral modeling perspective to address the emerging thermal related analysis and optimization problems for high-performance multicore microprocessor design. We propose a new approach, called ThermPOF, to build the thermal behavioral models from the measured or simulated thermal and power information at the architecture level. ThermPOF first builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method. Owing to the unique characteristics of transient temperature changes at the chip level, we propose two new schemes to improve the GPOF. First, we apply a logarithmic-scale sampling scheme instead of the traditional linear sampling to better capture the temperature changing behaviors. Second, we modify the extracted thermal impulse response such that the extracted poles from GPOF are guaranteed to be stable without accuracy loss. To further reduce the model size, a Krylov subspace-based reduction method is performed to reduce the order of the models in the state-space form. Experimental results on a real quad-core microprocessor show that generated thermal behavioral models match the given temperature very well.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 11/2009; · 1.22 Impact Factor
  • Source
    ABSTRACT: In this paper, we present a novel statistical analysis approach for large power grid network analysis under process variations. The new algorithm is very efficient and scalable for huge networks with a large number of variational variables. This approach, called varETBR for variational extended truncated balanced realization, is based on model order reduction techniques to reduce the circuit matrices before the variational simulation. It performs the parameterized reduction on the original system using variation-bearing subspaces. varETBR calculates variational response Gramians by Monte-Carlo based numerical integration considering both system and input source variations for generating the projection subspace. varETBR is very scalable for the number of variables and is flexible for different variational distributions and ranges as demonstrated in experimental results. After the reduction, Monte-Carlo based statistical simulation is performed on the reduced system and the statistical responses of the original system are obtained thereafter. Experimental results, on a number of IBM benchmark circuits up to 1.6 million nodes, show that the varETBR can be 4500X faster than the Monte-Carlo method and is much more scalable than one of the recently proposed approaches.
    Proceedings of the 14th Asia South Pacific Design Automation Conference, ASP-DAC 2009, Yokohama, Japan, January 19-22, 2009; 01/2009
  • Source
    ABSTRACT: This paper investigates a new architecture-level thermal characterization problem from a behavioral modeling perspective to address the emerging thermal related analysis and optimization problems for high-performance multicore microprocessor design. We propose a new approach, called ThermPOF, to build the thermal behavioral models from the measured or simulated thermal and power information at the architecture level. ThermPOF first builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method. Owing to the unique characteristics of transient temperature changes at the chip level, we propose two new schemes to improve the GPOF. First, we apply a logarithmic-scale sampling scheme instead of the traditional linear sampling to better capture the temperature changing behaviors. Second, we modify the extracted thermal impulse response such that the extracted poles from GPOF are guaranteed to be stable without accuracy loss. To further reduce the model size, a Krylov subspace-based reduction method is performed to reduce the order of the models in the state-space form. Experimental results on a real quad-core microprocessor show that generated thermal behavioral models match the given temperature very well.
    IEEE Trans. VLSI Syst. 01/2009; 17:1495-1507.
  • ABSTRACT: In this paper, we propose a new model order reduction approach for large interconnect circuits using hierarchical decomposition and Krylov subspace projection-based model order reduction methods. The new approach, called hiePrimor, first partitions a large interconnect circuit into a number of smaller subcircuits and then performs projection-based model order reduction on each of the subcircuits in isolation and on the top-level circuit thereafter. The new approach is well suited to multi-core parallel computing platforms, which can significantly speed up the reduction process. Theoretically, we show that hiePrimor can deliver the same accuracy as the flat reduction method given the same reduction order and that it also preserves the passivity of the reduced models. We also show that partitioning has a large impact on the performance of hierarchical reduction and that the minimum-span objective should be used to attain the best performance. The proposed method is suitable for reducing large global interconnects such as coupled buses, transmission lines, and large clock nets in the post-layout stage. Experimental results demonstrate that hiePrimor can be significantly faster and more scalable than flat projection methods like PRIMA, and an order of magnitude faster than PRIMA with parallel computing, without loss of accuracy. Interconnect circuits with up to 4 million nodes can be analyzed in a few minutes even in Matlab by the new method.
    Integration, the VLSI Journal. 01/2009;
  • ABSTRACT: In this paper, we present a novel analysis approach for large on-chip power grid circuit analysis. The new approach, called ETBR for extended truncated balanced realization, is based on model order reduction techniques to reduce the circuit matrices before the simulation. Unlike the (improved) extended Krylov subspace methods EKS/IEKS [2],[3], ETBR performs fast truncated balanced realization on the response Gramian to reduce the original system. ETBR also avoids the adverse explicit moment representation of the input signals. Instead, it uses a frequency-domain spectrum representation of the input signals obtained by fast Fourier transformation. The proposed method is well suited to threading-based parallel computing, as the response Gramian is computed in a Monte-Carlo-like sampling style and each sample can be computed in parallel. This contrasts with Krylov subspace based methods like the EKS method, where moments have to be computed in sequential order. ETBR is also more flexible for different types of input sources and can better capture high-frequency content than EKS, which leads to more accurate results, especially for fast-changing input signals. Experimental results on a number of large networks (up to one million nodes) show that, given the same order of the reduced model, ETBR is indeed more accurate than the EKS method, especially for input sources rich in high-frequency components. If parallel computing is exploited, ETBR can be an order of magnitude faster than the EKS/IEKS method.
    IEICE Transactions. 01/2009; 92-A:3061-3069.
  • Source
    ABSTRACT: In this paper, we investigate a new architecture-level thermal characterization problem from a behavioral modeling perspective to address the emerging thermal-related analysis and optimization problems for high-performance multi-core microprocessor design. We propose a new approach, called ThermPOF, to build the thermal behavioral models from the measured architecture thermal and power information. ThermPOF first builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method. Then, to effectively model transient temperature changes, we propose two new schemes to improve the GPOF. First, we apply logarithmic-scale sampling instead of traditional linear sampling to better capture the temperature changing characteristics. Second, we modify the extracted thermal impulse response such that the extracted poles from GPOF are guaranteed to be stable without accuracy loss. To further reduce the model size, Krylov subspace based model order reduction is performed to reduce the order of the models in the state-space form. Experimental results on a practical quad-core microprocessor show that the generated thermal behavioral models match the measured data very well.
    Design Automation Conference, 2008. ASPDAC 2008. Asia and South Pacific; 04/2008
  • Source
    ABSTRACT: In this paper, we present a novel simulation approach for power grid network analysis. The new approach, called ETBR for extended truncated balanced realization, is based on model order reduction techniques to reduce the circuit matrices before the simulation. Unlike the (improved) extended Krylov subspace methods EKS/IEKS [15, 2], ETBR performs fast truncated balanced realization on the response Gramian to reduce the original system, at a computation cost similar to that of EKS. ETBR also avoids the adverse explicit moment representation of the input signals. Instead, it uses a spectrum representation of the input signals obtained by fast Fourier transformation. As a result, ETBR is more flexible for different types of input sources and can better capture high-frequency content than EKS, which leads to more accurate results, especially for fast-changing input signals. Experimental results on a number of large networks (up to one million nodes) show that, given the same order of the reduced model, ETBR is indeed more accurate than the EKS method, especially for input sources rich in high-frequency components. ETBR also has a computation cost similar to that of EKS and consumes less memory.
    Design, Automation and Test in Europe, 2008. DATE '08; 04/2008
  • Source
    ABSTRACT: In this paper, we propose a new model order reduction approach for large interconnect circuits using hierarchical decomposition and Krylov subspace projection-based model order reduction. The new approach, called hiePrimor, first partitions a large interconnect circuit into a number of smaller subcircuits and then performs projection-based model order reduction on each of the subcircuits in isolation and on the top-level circuit thereafter. The new approach can exploit parallel computing to speed up the reduction process. Theoretically, we show that hiePrimor can have the same accuracy as the flat reduction method given the same reduction order and that it also preserves the passivity of the reduced models. We also show that partitioning is important for hierarchical projection-based reduction and that the minimum-span objective should be used to achieve the best performance. The proposed method is suitable for reducing large global interconnects such as coupled buses, transmission lines, and large clock nets in the post-layout stage. Experimental results demonstrate that hiePrimor can be significantly faster than flat projection methods like PRIMA, and an order of magnitude faster than PRIMA with parallel computing, without loss of accuracy.
    Proceedings of the 13th Asia South Pacific Design Automation Conference, ASP-DAC 2008, Seoul, Korea, January 21-24, 2008; 01/2008
  • Source
    ABSTRACT: In this paper, we investigate a new architecture-level thermal characterization problem from a behavioral modeling perspective to address the emerging thermal-related analysis and optimization problems for high-performance multi-core microprocessor design. We propose a new approach, called ThermPOF, to build the thermal behavioral models from the measured architecture thermal and power information. ThermPOF first builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method. Then, to effectively model transient temperature changes, we propose two new schemes to improve the GPOF. First, we apply logarithmic-scale sampling instead of traditional linear sampling to better capture the temperature changing characteristics. Second, we modify the extracted thermal impulse response such that the extracted poles from GPOF are guaranteed to be stable without accuracy loss. To further reduce the model size, Krylov subspace based model order reduction is performed to reduce the order of the models in the state-space form. Experimental results on a practical quad-core microprocessor show that the generated thermal behavioral models match the measured data very well.
    Proceedings of the 13th Asia South Pacific Design Automation Conference, ASP-DAC 2008, Seoul, Korea, January 21-24, 2008; 01/2008
  • Source
    ABSTRACT: In this paper, we propose a new architecture-level parameterized transient thermal behavioral modeling algorithm for emerging thermal-related design and optimization problems for high-performance chip-multiprocessor (CMP) design. We propose a new approach, called ParThermPOF, to build the parameterized thermal performance models from the given architecture thermal and power information. The new method can include a number of parameters such as the locations of thermal sensors in a heat sink, different components (heat sink, heat spreader, core, cache, etc.), thermal conductivity of heat sink materials, etc. The method consists of two steps. First, a response surface method based on low-order polynomials is applied to build the parameterized models at each time point for all the given sampling nodes in the parameter space. Second, an improved generalized pencil-of-function (GPOF) method is employed to build the transfer-function based behavioral models for each time-varying coefficient of the polynomials generated in the first step. Experimental results on a practical quad-core microprocessor show that the generated parameterized thermal model matches the given data very well. ParThermPOF is very suitable for design space exploration and optimization where both time and system parameters need to be considered.
    2008 International Conference on Computer-Aided Design (ICCAD'08), November 10-13, 2008, San Jose, CA, USA; 01/2008
  • Source
    ABSTRACT: In this paper, we study a new architecture-level thermal modeling problem from a behavioral modeling perspective to address the emerging thermal-related analysis and optimization problems for high-performance quad-core microprocessor designs. We propose a new approach to build the thermal behavioral models by using a transfer function matrix from the measured thermal and power information at the architecture level. The new method builds the behavioral thermal model using the generalized pencil-of-function (GPOF) method, which was developed in the communication community to build rational models from the measured data of real-time systems. To effectively model transient temperature changes, we propose two new schemes to improve the GPOF. First, we apply logarithmic-scale sampling instead of traditional linear sampling to better capture the temperature changing characteristics. Second, we modify the extracted thermal impulse response such that the extracted poles from GPOF are guaranteed to be stable without accuracy loss. Experimental results on a practical quad-core microprocessor show that the generated thermal behavioral models match the measured data very well.
    Behavioral Modeling and Simulation Workshop, 2007. BMAS 2007. IEEE International; 10/2007
  • Source
    ABSTRACT: As the EDA industry advances to smaller and smaller technology nodes, a tighter link between VLSI circuit manufacturing and physical design is becoming a necessity. This paper introduces several design for manufacturability (DFM) related problems such as critical area reduction, redundant via insertion, chemical-mechanical polishing (CMP), etc. The corresponding DFM-aware routing problems are then formulated and solved using the proposed routing algorithms. Experimental results show that substantial yield enhancement can be obtained with a small runtime overhead in routing, which demonstrates the feasibility and effectiveness of considering DFM issues during the routing stage.
    Circuits and Systems, 2006. APCCAS 2006. IEEE Asia Pacific Conference on; 01/2007
  • Source
    ABSTRACT: A new gridless router to improve the yield of IC layout is presented. The improvement of yield is achieved by reducing the critical areas where circuit failures are likely to happen. This gridless area router benefits from a novel cost function to compute critical areas during the routing process, and heuristically lays the patterns on the chip area where they are less likely to induce critical area. The router also takes other objectives into consideration, such as routing completion rate and net length. It takes advantage of gridless routing to gain more flexibility and a higher completion rate. The experimental results show that critical areas are effectively decreased by 21% on average while maintaining a routing completion rate over 99%.
    Journal of Computer Science and Technology 01/2007; 22(5):653-660. · 0.48 Impact Factor
  • ABSTRACT: A new multilayer gridless area router to improve the yield of IC layout is presented. The improvement of yield is achieved by reducing the critical areas where circuit faults are likely to happen. This gridless area router utilizes a cost function for critical area computation and heuristically lays the patterns in places where they are less likely to induce critical areas. The router also takes other objectives into consideration, such as routing completion rate and net length. It takes advantage of gridless routing to gain more flexibility and a higher completion rate. The results show that the critical area is effectively decreased by 13% on average while maintaining a routing completion rate over 99%.
    Communications, Circuits and Systems, 2005. Proceedings. 2005 International Conference on; 06/2005
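
The ETBR abstracts above describe reducing a power grid model by sampling its response at several frequencies and projecting onto the dominant directions of those samples. The following is only a minimal sketch of that general idea, not the authors' implementation: the RC ladder circuit, the flat input spectrum, the uniform sample weighting, the chosen frequencies, and all function names (`rc_ladder`, `sampled_basis`, `project`) are assumptions made for illustration.

```python
import numpy as np


def rc_ladder(n):
    """Hypothetical test circuit: n-node RC ladder grounded through node 0."""
    G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # unit conductances
    G[-1, -1] = 1.0            # last node connects only to its left neighbor
    C = 1e-3 * np.eye(n)       # node capacitances (assumed values)
    B = np.zeros((n, 1))
    B[-1, 0] = 1.0             # single current source at the far end
    return G, C, B


def sampled_basis(G, C, B, spectrum, omegas, q):
    """Orthonormal basis built from responses sampled at given frequencies."""
    cols = []
    for w in omegas:
        s = 1j * w
        # Solve (G + sC) z = B u(s) for one frequency sample. The samples
        # are independent of each other, so this loop could run in parallel.
        z = np.linalg.solve(G + s * C, B @ spectrum(s))
        cols.append(z.real)
        if w != 0.0:
            cols.append(z.imag)  # imaginary part is zero at DC, so skip it
    Z = np.column_stack(cols)
    # The left singular vectors of Z are the eigenvectors of Z Z^T, a
    # uniformly weighted (assumed) approximation of the response Gramian.
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :q]              # keep the q dominant directions


def project(G, C, B, V):
    """Congruence projection of the system matrices onto span(V)."""
    return V.T @ G @ V, V.T @ C @ V, V.T @ B


G, C, B = rc_ladder(50)
flat = lambda s: np.array([1.0])  # assumed flat input spectrum
V = sampled_basis(G, C, B, flat, omegas=[0.0, 1e2, 1e3, 1e4], q=6)
Gr, Cr, Br = project(G, C, B, V)
print(Gr.shape, Cr.shape, Br.shape)
```

In this sketch the 50-node system is reduced to order 6, and the reduced matrices can then be simulated in place of the originals. A real input spectrum would come from an FFT of the current waveforms, as the abstracts describe, rather than from the constant used here.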

Publication Stats

58 Citations
2.91 Total Impact Points

Institutions

  • 2013
    • Synopsys
      Mountain View, California, United States
  • 2007–2012
    • University of California, Riverside
      • Department of Electrical Engineering
      Riverside, California, United States
  • 2009
    • CSU Mentor
      Long Beach, California, United States
  • 2005–2007
    • Tsinghua University
      • Department of Computer Science and Technology
      Beijing, Beijing Shi, China