G.I. Stamoulis

University of Thessaly, Iolcus, Thessaly, Greece

Publications (27) · 5.08 Total impact

  • ABSTRACT: Efficient analysis of massive on-chip power delivery networks is among the most challenging problems facing the EDA industry today. Due to the Joule heating effect and the temperature dependence of resistivity, temperature is one of the most important factors that affect IR drop and must be taken into account in power grid analysis. However, the sheer size of modern power delivery networks (comprising several thousands or millions of nodes) usually forces designers to neglect thermal effects during IR drop analysis in order to simplify and accelerate simulation. As a result, the absence of accurate estimates of the Joule heating effect in IR drop analysis introduces significant uncertainty in the evaluation of circuit functionality. This work presents a new approach for fast electrical-thermal co-simulation of large-scale power grids found in contemporary nanometer-scale ICs. A state-of-the-art iterative method is combined with an efficient and extremely parallel preconditioning mechanism, which enables harnessing the computational resources of massively parallel architectures, such as graphics processing units (GPUs). Experimental results demonstrate that the proposed method achieves a speedup of 66.1X for a 3.1M-node design over a state-of-the-art direct method and a speedup of 22.2X for a 20.9M-node design over a state-of-the-art iterative method when GPUs are utilized.
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2013; 01/2013
  • ABSTRACT: Efficient analysis of massive on-chip power delivery networks is among the most challenging problems facing the EDA industry today. In this paper, we present a new preconditioned iterative method for fast DC and transient simulation of large-scale power grids found in contemporary nanometer-scale ICs. The emphasis is placed on the preconditioner, which reduces the number of iterations by a factor of 5X for a 2.6M-node industrial design and by 72.6X for a 6.2M-node synthetic benchmark, compared with incomplete factorization preconditioners. Moreover, owing to the preconditioner's special structure that allows utilizing a Fast Transform solver, the preconditioning system can be solved in a near-optimal number of operations, while it is extremely amenable to parallel computation on massively parallel architectures like graphics processing units (GPUs). Experimental results demonstrate that our method achieves a speed-up of 214.3X and 138.7X for a 2.6M-node industrial design, and a speed-up of 1610.5X and 438X for a 3.1M-node synthetic design, over state-of-the-art direct and iterative solvers, respectively, when GPUs are utilized. At the same time, its matrix-less formulation allows for reducing the memory footprint by up to 33% compared to the memory requirements of the best available iterative solver.
    Proceedings of the International Conference on Computer-Aided Design; 11/2012
  • G. Giannakas, F. Plessas, G. Stamoulis
    ABSTRACT: A novel 2.45 GHz RF power harvester has been implemented in a 90 nm standard CMOS process. The proposed architecture reduces the threshold voltage (Vth) by employing a new pseudo floating-gate (pseudo-FG) technique and achieves better performance compared with other conventional rectifiers at 90 nm and 2.45 GHz, without additional fabrication cost. The system is initially optimised via a matching-boosting circuit, which has a dominant dual role. Extremely low power (−15.43 dBm) RF signals can be rectified and converted to 1.25 V DC.
    Electronics Letters 04/2012; 48(9):522. DOI: http://dx.doi.org/10.1049/el.2011.3576 · 1.04 Impact Factor
  • G Stamoulis, P Kikiras
    Sensors & …. 01/2012;
  • G Stamoulis, P Kikiras
    ABSTRACT: Emergency management is an essential capability in modern society. As disasters can happen at any time and can differ from each other considerably, it is necessary to develop a supple framework which can comply with each situation. This article discusses the fire simulation component of i-Protect. i-Protect is an Emergency Management Framework aiming to support the emergency management process during the four phases of a crisis: mitigation, preparedness, response and recovery.
    Informatics (PCI), 2012 …. 01/2012;
  • S Karagiorgou, GI Stamoulis, PK Kikiras
    CTRQ 2012, The Fifth …. 01/2012;
  • ABSTRACT: The evolution of wireless sensor technology allows for the provision of enhanced services to miscellaneous application domains. In parallel, Quality of Service (QoS) support becomes necessary to satisfy the needs of these new service models. This paper presents QoS requirements from a service model perspective and describes challenges for QoS support in WSNs. We also provide a review of current efforts in Medium Access Control (MAC) QoS support in WSNs. Then, we investigate various performance metrics of the IEEE 802.15.4 standard in order to determine the technological issues that arise. From the outcome of the experiments conducted using ns-2, we identified that different schemes of services and application scenarios for different ways of deployment, scales of network and traffic load can satisfy diverse user needs and requirements.
    Informatics (PCI), 2011 15th Panhellenic Conference on; 11/2011
  • ABSTRACT: Emergency management is an essential capability in modern society. As disasters can happen at any time and can differ from each other considerably, it is necessary to develop a supple framework which can comply with each situation. The aim of this paper is to provide an overview of the i-Protect Emergency Management Framework and analyze in detail each of its components. As a proof of concept, we also present two case studies: an urban chemical explosion and a wildland fire simulation engine.
    Informatics (PCI), 2011 15th Panhellenic Conference on; 11/2011
  • ABSTRACT: In this work, the impact of across-chip temperature and power supply voltage variations on performance predictions in 3D ICs is investigated. To make this possible, a novel design flow is proposed to perform design exploration of 3D ICs. Power supply voltage and thermal variations are modeled, to allow accurate PPA (power, performance and area) predictions. Using the main parts of this design flow, in a system comprising hundreds of millions of gates, complicated mechanisms are shown to determine the performance of the system. With increasing number of dies, timing is shown to exhibit 4 distinct regions, where either temperature or voltage drop is the dominant limiting factor. Power consumption does not scale monotonically with increasing die number. As a consequence, optimum system performance is in no way achieved by minimizing temperature and voltage drop, as is assumed in the literature so far. The across-chip temperature and power supply voltage variations are finally shown to cause on average 40% increase in timing and 53% decrease in power consumption, compared to the assumption of nominal conditions.
    18th IEEE International Conference on Electronics, Circuits and Systems, ICECS 2011, Beirut, Lebanon, December 11-14, 2011; 01/2011
  • L Perlepes, G Stamoulis, P Kikiras
    … 2011, The Fifth …. 01/2011;
  • ABSTRACT: Validating the robustness of power distribution in modern IC design is a crucial but very difficult problem, due to the vast number of possible working modes and the high operating frequencies which necessitate the modeling of the power grid as a general RLC network. In this paper we provide a characterization of the worst-case current waveform excitations that produce the maximum voltage drop among all possible working modes of the IC. In addition, we give a practical methodology to estimate these worst-case excitations on the basis of a sample of the excitation space acquired via plain circuit simulation. In the course of characterizing the worst-case excitations we also establish that the voltage drop function for RLC grid models has nonnegative coefficients, which has been an open problem so far.
    Computer-Aided Design (ICCAD), 2010 IEEE/ACM International Conference on; 12/2010
  • ABSTRACT: In this work, an overview of the state of the art in design techniques for power harvesting (rectifying) circuits is presented. The evolution of each circuit, its advantages and its design constraints are investigated and compared. Furthermore, a novel 2.45 GHz power-harvesting circuit is implemented in 90 nm CMOS. Using voltage and power conversion efficiency as a FOM, the optimum rectifier topology is determined. When input power is -19.73 dBm, the proposed rectifier allows improving the Power and the Voltage Conversion Efficiency, achieving a PCE of 14.68% (for RL = 1 MΩ) and a VCE of 29.21%.
    Electronics, Circuits, and Systems (ICECS), 2010 17th IEEE International Conference on; 01/2010
  • ABSTRACT: Voltage drops are one of the most stringent problems in modern IC implementation, which is exacerbated by the ever decreasing transistor sizes and interconnect line widths. In order to find the true worst case voltage drop that a power net of a design might suffer, the designer would have to check the voltage drops that occur from the simulation of all possible input vector pairs of a design. This is a prohibitive number of simulations for modern ICs that have hundreds of inputs. Consequently, designers face two basic challenges: fast and accurate estimation of worst case voltage drop, and accurate modeling of the power distribution network. In this paper we present a voltage-drop aware tool for power grid analysis and verification based on a statistical engine, which can estimate the true worst case voltage drops on a design with a typical confidence level of 99%. The statistical engine is based on extensions to the Extreme Value Theory (EVT), which is a pertinent field of statistics for the estimation of the unknown maximum of a related population from one (or more) of its samples. The paper shows how the statistical engine can take input from gate-level simulation of digital logic, combined with transient simulation of the power and ground network with inductance-aware (RLCK) models. Using these techniques, a designer can estimate the true worst case voltage drop on each and every contact of the power and ground distribution network of a digital design, using a relatively small number of input vectors, thus greatly reducing the turnaround time for power integrity verification.
    17th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2010, Athens, Greece, 12-15 December, 2010; 01/2010
  • ABSTRACT: In this work, we present a hardware architecture prototype for the various types of transforms and the accompanying quantization, supported in the H.264 baseline profile video encoding standard. The proposed architecture achieves high performance and can satisfy quad full high definition (QFHD) (3840×2160@150Hz) coding. The transforms are implemented using only add and shift operations, which reduces the computation overhead. A modification in the quantization equations representation is suggested to remove the absolute value and resign operation stages overhead. Additionally, a post-scale Hadamard transform computation is presented. The architecture can achieve a reduction of about 20% in power consumption, compared to existing implementations.
    Multimedia and Expo, 2009. ICME 2009. IEEE International Conference on; 08/2009
  • D. Bountas, G. Stamoulis, N. Evmorfopoulos
    ABSTRACT: Accurate simulation of digital circuits is an essential part of the design process. High precision models are generally used to confirm logic behavior and estimate power dissipation, which has become an extremely important design parameter. Unfortunately high precision analysis is expensive in computer execution time, and there is always a trade-off between accuracy and speed. This work proposes a new circuit simulation approach by storing a set of pre-characterized transition configurations for each standard library cell in a lookup table. The lookup table contains information about the voltage and the current transient waveform produced by SPICE simulation. The method achieves good accuracy levels for yielding the total or partial current waveform of a circuit in significantly less time compared to SPICE or other commercial tools.
    Computer Design, 2008. ICCD 2008. IEEE International Conference on; 11/2008
  • ABSTRACT: The H.264 video coding standard can achieve considerably higher coding efficiency than previous video coding standards. The keys to this high coding efficiency are the two prediction modes (Intra & Inter) provided by H.264, which adopt many new features such as variable block size searching, motion vector prediction etc. However, these result in a considerably higher encoder complexity that adversely affects speed and power, which are both significant for the mobile multimedia applications targeted by the standard. Therefore, it is of high importance to design architectures that minimize the speed and power overhead of the prediction modes. In this paper we present a new algorithm, and the architecture that implements it, that can replace the standard sum of absolute differences (SAD) approach in the two main prediction modes, supports variable block size motion estimation (VBSME) as it is defined in the standard, and provides a power efficient hardware implementation without perceivable degradation in coding efficiency or video quality.
    Application -specific Systems, Architectures and Processors, 2007. ASAP. IEEE International Conf. on; 08/2007
  • ABSTRACT: Networks of sensors are continuously gaining ground in all types of industry applications and are believed to be evolving in a way similar to the evolution of the first interconnected computer systems into what we call today the Internet. A heterogeneous infrastructure is thus about to emerge as a dense web of rich information sources that will transform the World Wide Web into what has been called the "Real World Web" (RWW). The authors hereby assimilate the impact of this transformation process, the actors involved, and the operational and business models associated with it.
    Telecommunication Techno-Economics, 2007. CTTE 2007. 6th Conference on; 07/2007
  • ABSTRACT: The H.264 video coding standard can achieve considerably higher coding efficiency than previous video coding standards. The keys to this high coding efficiency are the two prediction modes (Intra & Inter) provided by H.264. Unfortunately, these result in a considerably higher encoder complexity that adversely affects speed and power, which are both significant for the mobile multimedia applications targeted by the standard. Therefore, it is of high importance to design architectures that minimize the speed and power overhead of the prediction modes. In this paper we present a new algorithm, and the logic transformations that enable it, that can replace the standard Sum of Absolute Differences (SAD) approach in the two main prediction modes, and provide a power efficient hardware implementation without perceivable degradation in coding efficiency or video quality.
    Low Power Electronics and Design, 2006. ISLPED'06. Proceedings of the 2006 International Symposium on; 11/2006
  • ABSTRACT: A switched-capacitor integrated system is presented in this work that attains sub-fF measurement resolution in integrated capacitive sensors, with 1.5-kHz bandwidth and 50-μW average power consumption in continuous function mode. The proposed design employs a pair of nonoverlapping clocks and an operational transconductance amplifier (OTA) that can be made as simple as a basic differential pair. The system exhibits 0.8% linearity error and 0.01 fF/°C temperature drift. It is appropriate for differential, absolute, and ratiometric capacitance measurements, and shows robustness against interconnection parasitics, transistor dimensional mismatch, and process variations, which is an important feature in the case of sensor-die CMOS postprocessing.
    IEEE Sensors Journal 07/2006; · 1.48 Impact Factor
  • ABSTRACT: The Jiles-Atherton (JA) theory of hysteresis is currently used in the majority of commercial CAD tools, mainly due to its implementation simplicity in fast and stable algorithms. The JA model provides precise results in the case of isotropic, polycrystalline, multidomain magnetic devices, where flux-reversal is governed by pinning mechanisms. Dynamic response of such devices, including Eddy-current loss and magnetic resonance, can also be accurately modeled. However, JA theory is not applied for three-dimensional (3-D) magnetization simulations and does not account for anisotropy, which severely affects the hysteresis curves of single-domain, thin-film devices that are usually incorporated in miniature inductive sensors and actuators. In that case, the Stoner-Wohlfarth (SW) theory can be applied, which, however, does not account for dynamic response and incremental energy loss. In this work, we employ a virtual 3-D anisotropy-field vector calculated with SW theory that introduces magnetic feedback to the classical equation of Paramagnetism, in order to derive a proper 3-D "input" for the JA algorithm. This way, a hybrid 3-D JA/SW model is developed, which incorporates both models into one single formulation, capable of modeling simultaneously: 1) temperature effects, 2) pinning and Eddy-current loss, 3) magnetic resonance, and 4) uniaxial anisotropy, the orientation of which can be simulated to vary with time. The model, which has a solid physical basis, has been implemented in a computationally efficient, stable algorithm capable of functioning with arbitrary excitation-field input. The algorithm has been successfully applied to model the behavior of a series of miniature Fluxgate magnetometers based on the Matteucci effect of thin glass-covered magnetic wires.
    IEEE Sensors Journal 07/2006; · 1.48 Impact Factor