N. Vijaykrishnan

Pennsylvania State University, University Park, PA, USA

Publications (243) · 35.2 Total impact

  • Article: New-Age: A Negative Bias Temperature Instability-Estimation Framework for Microarchitectural Components
    ABSTRACT: Degradation of device parameters over the lifetime of a system is emerging as a significant threat to system reliability. Among the aging mechanisms, wearout resulting from Negative Bias Temperature Instability (NBTI) is of particular concern in deep submicron technology generations. While there has been significant effort at the device and circuit level to model and characterize the impact of NBTI, the analysis of NBTI’s impact at the architectural level is still in its infancy. To facilitate architectural level aging analysis, a tool capable of evaluating NBTI vulnerabilities early in the design cycle has been developed that evaluates timing degradation due to NBTI. The tool includes workload-based temperature and performance degradation analysis across a variety of technologies and operating conditions, revealing a complex interplay between factors influencing NBTI timing degradation.
    International Journal of Parallel Programming 04/2012; 37(4):417-431. · 0.57 Impact Factor
  • Article: Reliability-aware Co-synthesis for Embedded Systems
    Y. Xie, L. Li, M. Kandemir, N. Vijaykrishnan, M. J. Irwin
    ABSTRACT: As technology scales, transient faults have emerged as a key challenge for reliable embedded system design. This paper proposes a design methodology that incorporates reliability into the hardware–software co-design paradigm for embedded systems. We introduce an allocation and scheduling algorithm that efficiently handles conditional execution in multi-rate embedded systems, and selectively duplicates critical tasks to detect or correct transient errors, such that the reliability of the system is improved. Two methods are proposed to insert duplicated tasks into the schedule. The improved reliability is achieved by utilizing the otherwise idle computation resources and taking advantage of the overlapping schedule for mutually exclusive tasks in the conditional task graph, such that it incurs no resource or performance penalty.
    Journal of VLSI Signal Processing 04/2012; 49(1):87-99. · 0.73 Impact Factor
  • Chapter: Characterization of Memory Energy Behavior
    H. S. Kim, M. Kandemir, N. Vijaykrishnan, M. J. Irwin
    ABSTRACT: There is a growing consensus that energy consumption is becoming an important issue for a wide variety of systems. Two factors fuel these trends: (1) increasing use of embedded and portable computing devices in everyday life, and (2) very high power dissipation levels reached by current state-of-the-art processors. Consequently, recent years have witnessed a host of studies that address the problem of reducing the energy consumption. Unfortunately, most of the energy-oriented studies are from the design automation and architecture domains, and only very recently have software-based approaches to the problem appeared in research papers. In this chapter, we take an early step in evaluating the energy consumption of a selected set of benchmarks from three different application domains. Setting off with the observation that memory energy constitutes a large percentage of the overall energy budget, we focus on memory energy behavior and observe the variations in cache and memory energies under different cache configurations.
    Keywords: energy budget, embedded systems, cache memory, block buffering, subbanking
    06/2011: pages 165-180;
  • Article: Variation-Aware Task and Communication Mapping for MPSoC Architecture
    Feng Wang, Yibo Chen, C. Nicopoulos, X. Wu, Y. Xie, N. Vijaykrishnan
    ABSTRACT: As technology scales, the delay uncertainty caused by process variations has become increasingly pronounced in deep submicrometer designs. As a result, a paradigm shift from deterministic to statistical design methodology at all levels of the design hierarchy is inevitable. In this paper, we propose a variation-aware task and communication mapping methodology for multiprocessor systems-on-chip that use a network-on-chip communication architecture so that the impact of parameter variations can be mitigated. Our mapping scheme accounts for variability in both the processing cores and the communication links to ensure a complete and accurate model of the entire system. A new design metric, called performance yield and defined as the probability of the assigned schedule meeting the predefined performance constraints, is used to guide both the task scheduling and the routing path allocation procedure. An efficient yield computation method for this mapping complements and significantly improves the effectiveness of the proposed variation-aware mapping algorithm. Experimental results show that our variation-aware mapper achieves significant yield improvements over worst-case and nominal-case deterministic mappers.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 03/2011; · 1.27 Impact Factor
  • Article: Predicting the performance and reliability of future field programmable gate arrays routing architectures with carbon nanotube bundle interconnect
    S. Eachempati, N. Vijaykrishnan, A. Nieuwoudt, Y. Massoud
    ABSTRACT: The authors investigate the performance and reliability of routing architectures in field programmable gate arrays (FPGA) that utilise bundles of single-walled carbon nanotubes (SWCNT) as wires in the FPGA interconnect fabric in future process technologies. To leverage the performance advantages of nanotube-based interconnect, we explore several important aspects of the FPGA routing architecture including the wire length segmentation distribution and the switch/connection block configurations. The authors also investigate the impact of statistical variations in interconnect properties on FPGA timing yield. The results demonstrate that FPGAs utilising SWCNT bundle interconnect can achieve up to a 54% improvement in area-delay product over the best performing architecture with standard copper interconnect in 22 nm process technology. Furthermore, FPGAs implemented using SWCNT-based interconnect can provide a superior performance-yield trade-off of up to 43% over FPGAs implemented using traditional copper interconnect in future process technologies.
    IET Circuits Devices & Systems 05/2009; · 0.55 Impact Factor
  • Conference Proceeding: Ultra low power signal processing architectures
    V. Saripalli, S. Mookerjea, S. Datta, N. Vijaykrishnan
    ABSTRACT: An increasing integration of nanoscale sensors is being observed in BioSensing and Biomimetic systems. Power consumption is deemed a major limiter as the complexity of integration increases. Supply voltage based scaling using CMOS is also a challenge due to increasing leakage currents. This work presents alternative devices, Interband Tunnel Field Effect Transistors and Split-Gate Quantum Nanodots, to achieve further scaling.
    Biomedical Circuits and Systems Conference, 2008. BioCAS 2008. IEEE; 12/2008
  • Conference Proceeding: Comparative analysis of NBTI effects on low power and high performance flip-flops
    K. Ramakrishnan, X. Wu, N. Vijaykrishnan, Y. Xie
    ABSTRACT: Mitigating the circuit aging effect in digital circuits has become a very important concern for current and future technology nodes. Negative Bias Temperature Instability (NBTI) is one of the most important circuit aging mechanisms, which can incur timing errors. Flip-flops play a vital role as storage elements in pipelined architectures and are prone to effects of aging. NBTI increases the transistor threshold voltage, affecting the performance of the chip. In this paper, we study the effects of NBTI on the timing characteristics of different types of low power and high performance flip-flops. Factors such as input data probability and temperature which affect the degradation rate are also analyzed.
    Computer Design, 2008. ICCD 2008. IEEE International Conference on; 11/2008
  • Article: Exploring architectural solutions for energy optimisations in bus-based system-on-chip
    ABSTRACT: System-on-chip (SoC) architectures have emerged as a ubiquitous option for computation-intensive applications because of the tremendous flexibility provided by them with respect to design reuse and extendibility. Single shared bus architectures have been popularly used as the communication channel in such on-chip architectures, by connecting various modules in such systems. However, increasing levels of integration and the number of components connected to the bus render such architectures infeasible. Consequently, such single shared bus architectures fail to scale well with both performance and power aspects. The authors demonstrate here a way to perform memory and bus partitioning and allocate variable frequencies to different bus segments to reduce the power consumption of the system without affecting the performance. The authors use an evolutionary algorithm followed by an iterative search-based frequency allocation algorithm to solve the problem. The effectiveness of the proposed results is validated on a SystemC-based cycle-accurate bus-based SoC simulator.
    IET Computers & Digital Techniques 10/2008; · 0.45 Impact Factor
  • Article: Case Study of Reliability-Aware and Low-Power Design
    ABSTRACT: Based on the proposed reliability characterization model, reliability-aware and low-power design is illustrated for the first time as a design methodology to balance reliability enhancement and power reduction. Low-power and reliable SRAM cell design, reliable dynamic voltage scaling (DVS) algorithm design, and voltage island partitioning and floorplanning for reliable system-on-a-chip (SOC) design are demonstrated as case studies of this new design methodology.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 08/2008; · 1.22 Impact Factor
  • Conference Proceeding: Analysis and solutions to issue queue process variation
    ABSTRACT: The last few years have witnessed an unprecedented explosion in transistor densities. Diminutive feature sizes have enabled microprocessor designers to break the billion-transistors-per-chip mark. However, various new reliability challenges such as process variation (PV) have emerged that can no longer be ignored by chip designers. In this paper, we provide a comprehensive analysis of the effects of PV on the microprocessor's Issue Queue. Variations can slow down issue queue entries and result in as much as 20.5% performance degradation. To counter this, we look at different solutions that include instruction steering, operand- and port-switching mechanisms. Given that PV is non-deterministic at design-time, our mechanisms allow the fast and slow issue-queue entries to co-exist, in turn enabling instruction dispatch, issue and forwarding to proceed with minimal stalls. Evaluation on a detailed simulation environment indicates that the proposed mechanisms can reduce performance degradation due to PV to a low 1.3%.
    Dependable Systems and Networks With FTCS and DCC, 2008. DSN 2008. IEEE International Conference on; 07/2008
  • Conference Proceeding: MIRA: A Multi-layered On-Chip Interconnect Router Architecture
    ABSTRACT: Recently, Network-on-Chip (NoC) architectures have gained popularity to address the interconnect delay problem for designing CMP/multi-core/SoC systems in deep sub-micron technology. However, almost all prior studies have focused on 2D NoC designs. Since three dimensional (3D) integration has emerged to mitigate the interconnect delay problem, exploring the NoC design space in 3D can provide ample opportunities to design high performance and energy-efficient NoC architectures. In this paper, we propose a 3D stacked NoC router architecture, called MIRA, which, unlike the 3D routers in previous works, is stacked into multiple layers and optimized to reduce the overall area requirements and power consumption. We discuss the design details of a four-layer 3D NoC and its enhanced version with additional express channels, and compare them against a 6×6 2D design and a baseline 3D design. All the designs are evaluated using a cycle-accurate 3D NoC simulator, and integrated with the Orion power model for performance and power analysis. The simulation results with synthetic and application traces demonstrate that the proposed multi-layered NoC routers can outperform the 2D and naive 3D designs in terms of performance and power. It can achieve up to 42% reduction in power consumption and up to 51% improvement in average latency with synthetic workloads. With real workloads, these benefits are around 67% and 38%, respectively.
    Computer Architecture, 2008. ISCA '08. 35th International Symposium on; 07/2008
  • Conference Proceeding: Reconfigurable BDD based quantum circuits
    S. Eachempati, V. Saripalli, N. Vijaykrishnan, S. Datta
    ABSTRACT: We propose a novel binary decision diagram (BDD) based reconfigurable logic architecture based on split-gate quantum nanodots using III-V compound semiconductor-based quantum wells. While BDD based quantum device architectures have already been demonstrated to be attractive for achieving ultra-low power operation, our design provides the ability to reconfigure the functionality of the logic architecture. This work proposes device and architectural innovations to support such reconfiguration. At the device level, a unique programmability feature is incorporated in our proposed nanodot devices, which can operate in 3 distinct operation modes, (a) active, (b) open and (c) short, based on the split gate bias voltages and enable functional reconfiguration. At the architectural level, we address programmability and design fabric issues involved with mapping BDDs into a reconfigurable architecture. By mapping a set of logic circuits, we demonstrate that our underlying device and architectural structure is flexible enough to support different functions.
    Nanoscale Architectures, 2008. NANOARCH 2008. IEEE International Symposium on; 07/2008
  • Article: Design Space Exploration for 3-D Cache
    ABSTRACT: As technology scales, interconnects have become a major performance bottleneck and a major source of power consumption for sub-micron integrated circuit (IC) chips. One promising option to mitigate the interconnect challenges is 3D ICs, in which a stack of multiple device layers is put together on the same chip. In this paper, we explore the architectural design of cache memories using 3D circuits. We present a delay and energy model, the 3D cache delay-energy estimation tool (3D-Cacti), to explore different 3D design options of partitioning a cache. The tool allows partitioning of a cache across different device layers at various levels of granularity. The tool has been validated by comparing its results with those obtained from circuit simulation of custom 3D layouts. We also explore the effects of various cache partitioning parameters and 3D technology parameters on delay and energy to demonstrate the utility of the tool.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 05/2008; · 1.22 Impact Factor
  • Chapter: A Holistic Approach to System Level Energy Optimization
    ABSTRACT: Over the past few years, the design automation community has expended a lot of effort in developing low power design methodologies. However, with the increasing software content in mobile environments and the proliferation of such devices in our day to day life, it is essential to take a fresh holistic look at power optimization from an integrated hardware and software perspective. This paper envisions the tools and methodologies that will become necessary for performing such optimizations. It also presents insights into the interaction and influence of hardware and software optimizations on system energy.
    01/2008: pages 88-107;
  • Conference Proceeding: Variation-Aware Low-Power Buffer Design
    ABSTRACT: Process variation (PV) is a consequence of manufacturing imperfections, which may lead to degraded performance or higher leakage power. In this paper, we focus on the design of an intelligent buffer that logically reorders the entries in a FIFO buffer to minimize overall leakage power consumption. The buffer architecture, called IntelliBuffer, has been designed and evaluated in 90 nm and 32 nm CMOS technology. Our synthesized results show that our proposed design is as fast as a conventional buffer structure, while providing the ability to reduce power consumption significantly. When our buffer was used in a network-on-chip (NoC) implementation, we obtained 24% leakage savings at 90 nm, and savings of 28% at 32 nm. To further validate the efficacy of our proposed design, we incorporated IntelliBuffer into ViChaR, a recently introduced dynamic buffer management system for NoC routers. Experimental results indicate a marked reduction in ViChaR's leakage power consumption (21% at 90 nm) when IntelliBuffer is employed.
    Signals, Systems and Computers, 2007. ACSSC 2007. Conference Record of the Forty-First Asilomar Conference on; 12/2007
  • Conference Proceeding: Variation-aware task allocation and scheduling for MPSoC
    ABSTRACT: As technology scales, the delay uncertainty caused by process variations has become increasingly pronounced in deep sub-micron designs. As a result, a paradigm shift from deterministic to statistical design methodology at all levels of the design hierarchy is inevitable [1]. In this paper, we propose a variation-aware task allocation and scheduling algorithm for Multiprocessor System-on-Chip (MPSoC) architectures to mitigate the impact of parameter variations. A new design metric, called performance yield and defined as the probability of the assigned schedule meeting the predefined performance constraints, is used to guide the task allocation and scheduling procedure. An efficient yield computation method for task scheduling complements and significantly improves the effectiveness of the proposed variation-aware scheduling algorithm. Experimental results show that our variation-aware scheduler achieves significant yield improvements. On average, 45% and 34% yield improvements over worst-case and nominal-case deterministic schedulers, respectively, can be obtained across the benchmarks by using the proposed variation-aware scheduler.
    Computer-Aided Design, 2007. ICCAD 2007. IEEE/ACM International Conference on; 12/2007
  • Article: On the Detection of Clones in Sensor Networks Using Random Key Predistribution
    ABSTRACT: Random key predistribution security schemes are well suited for use in sensor networks due to their low overhead. However, the security of a network using predistributed keys can be compromised by cloning attacks. In this attack, an adversary breaks into a sensor node, reprograms it, and inserts several copies of the node back into the sensor network. Cloning gives the adversary an easy way to build an army of malicious nodes that can cripple the sensor network. In this paper, we propose an algorithm that a sensor network can use to detect the presence of clones. Keys that are present on the cloned nodes are detected by looking at how often they are used to authenticate nodes in the network. Simulations verify that the proposed method accurately detects the presence of clones in the system and supports their removal. We quantify the extent of false positives and false negatives in the clone detection process.
    IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews) 12/2007; · 2.01 Impact Factor
  • Conference Proceeding: FPGA routing architecture analysis under variations
    S. Srinivasan, P. Mangalagiri, Yuan Xie, N. Vijaykrishnan
    ABSTRACT: Systems with the combined features of ASICs and field programmable gate arrays (FPGAs) are increasingly being considered as technology forerunners because of their extraordinary benefits. This drags FPGAs into the technology scaling race along with ASICs, exposing the FPGA industry to the problems associated with scaling. Extensive process variation is one such issue, which directly impacts the profit margins of hardware design beyond 65 nm gate length technology. Since the resources in FPGAs are primarily dominated by the interconnect fabric, variations in the interconnect impacting the critical path timing and leakage yield need rigorous analysis. In this work we provide a statistical modeling of individual routing components in an FPGA followed by a statistical methodology to analyze the timing and leakage distribution. This statistical model is incorporated into the routing algorithm to model a new statistically intelligent routing algorithm (SIRA), which simultaneously optimizes the leakage and timing yield of the FPGA device. We demonstrate an average leakage yield increase of 9% and a timing yield increase of 11% using our final algorithm.
    Computer Design, 2007. ICCD 2007. 25th International Conference on; 11/2007
  • Article: Hotspot Prevention Through Runtime Reconfiguration in Network-On-Chip
    G.M. Link, N. Vijaykrishnan
    ABSTRACT: Many existing thermal management techniques focus on reducing the overall power consumption of the chip, and do not address location-specific temperature problems referred to as hotspots. We propose the use of dynamic runtime reconfiguration to shift the hotspot-inducing computation periodically and make the thermal profile more uniform. Our analysis shows that dynamic reconfiguration is an effective technique in reducing hotspots for NoCs.
    10/2007;
  • Article: On-chip bus thermal analysis and optimisation
    F. Wang, M. De Bole, X. Wu, Y. Xie, N. Vijaykrishnan, M.J. Irwin
    ABSTRACT: As technology scales, increasing clock rates, decreasing interconnect pitch and the introduction of low-k dielectrics have made self-heating of the global interconnects an important issue in VLSI design. Further, high bus temperatures have had a negative impact on the delay and reliability of on-chip interconnects. Energy and thermal models are used to characterise the effects of self-heating on the temperature of on-chip interconnects. The results obtained show that self-heating of on-chip buses contributes significantly to the temperature of the bus, which increases as technology scales, motivating the need to find solutions to mitigate this effect. The theoretical analysis performed shows that spreading switching activities among all bus lines can effectively reduce the peak temperature of the on-chip bus. Based on this observation, a thermal spreading encoding scheme for on-chip buses is proposed to tackle the thermal issue. The results obtained show that this approach is very effective in reducing the transient peak temperature among bus lines, with much less overhead compared with other low-power encoding schemes. This technique can then be combined with low-power encoding schemes to further reduce the on-chip bus temperature.
    IET Computers & Digital Techniques 10/2007; · 0.45 Impact Factor

Institutions

  • 2–2012
    • Pennsylvania State University
      • Department of Computer Science and Engineering
      University Park, PA, USA
  • 2006
    • University of Cyprus
      Nicosia, Nicosia District, Cyprus
  • 2004–2006
    • Princeton University
      • Department of Electrical Engineering
      Princeton, NJ, USA
  • 2005
    • Southern Illinois University Carbondale
      Carbondale, IL, USA
    • Texas A&M University
      • Department of Computer Science and Engineering
      College Station, TX, USA
  • 2003
    • Stevens Institute of Technology
      • Department of Electrical & Computer Engineering
      Hoboken, NJ, USA
  • 2000
    • University of Texas at Austin
      • Department of Electrical & Computer Engineering
      Port Aransas, TX, USA
  • 1999
    • University of South Florida
      Tampa, FL, USA