N. Ranganathan

University of South Florida, Tampa, Florida, United States

Publications (227) · 84.38 Total Impact Points

  • H. Thapliyal · A. Bhatt · N. Ranganathan
    ABSTRACT: A conservative reversible logic gate is a reversible gate that additionally satisfies the property that the number of 1s in the outputs equals the number of 1s in the inputs. In this work, we present a new class of n × n (n inputs and n outputs) conservative reversible logic gate named the SCRL (Super Conservative Reversible Logic) gate for the design of reversible quantum circuits. The proposed SCRL gate has one control input, depending on the value of which it can swap any two of its n - 1 data inputs, and is hence superior to the existing Fredkin gate. In reversible circuits, the constant input bits used to realize different logic functions are referred to as ancilla inputs, while outputs that are not primary outputs and do not contribute to any useful computation are referred to as garbage outputs. Since ancilla inputs and garbage outputs are overhead bits in a reversible circuit, they need to be minimized. The barrel shifter is an integral component of many computing systems. As an example of using the proposed SCRL gate to design efficient reversible quantum circuits, the design of a reversible barrel shifter with zero ancilla inputs and zero garbage outputs is illustrated.
    No preview · Conference Paper · Jan 2013
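The conservative-gate property described above (an equal 1-count at inputs and outputs, plus a control-driven swap of two data lines) can be sketched in a few lines of Python. This is an illustrative behavioral model, not the paper's quantum-circuit construction; the function name and interface are assumptions.

```python
def scrl(control, data, i, j):
    """Behavioral sketch of an SCRL-style conservative gate: when the
    control bit is 1, swap data lines i and j; otherwise pass through.
    All bits are otherwise unchanged, so the mapping is one-to-one."""
    out = list(data)
    if control:
        out[i], out[j] = out[j], out[i]
    return control, tuple(out)

ctl, out = scrl(1, (1, 0, 1, 0), 0, 1)
# conservative: the number of 1s at the outputs equals that at the inputs
assert sum((ctl,) + out) == sum((1, 1, 0, 1, 0))
# reversible: applying the gate twice restores the original inputs
assert scrl(ctl, out, 0, 1) == (1, (1, 0, 1, 0))
```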
  • M. Morrison · N. Ranganathan
    ABSTRACT: Programmable reversible logic is emerging as a prospective logic design style for implementation in modern nanotechnology and quantum computing, with minimal impact on circuit heat generation. Recent advances in reversible logic and quantum computer algorithms allow for improved computer architecture and arithmetic logic unit designs. We present an optimization method for reversible logic synthesis based on the Integrated Qubit (IQ) library. This method works in conjunction with existing methods to further improve the quantum cost and delay of a synthesized reversible logic circuit. The algorithm runs in O(N) time and reduces the quantum cost of a synthesized circuit by up to 45 percent. In addition, the process of replacing the gates in the synthesized circuits with IQ gates uses a locally optimal technique whose major benefits include reduction of cost as well as delay.
    No preview · Conference Paper · Jan 2013
  • Conference Paper: [Seven tutorials]
    ABSTRACT: These tutorials discuss the following topics: Operational Amplifiers: Theory, Design and Applications; Reversible Logic: Basics, Prospects in Emerging Nanotechnologies and Challenges in Future; Digital Signal Processing for Communications; Recent Advances on Nyquist and Oversampled Analog-to-Digital Converters; Audio Steganography for Watermarking, Data Embedding and Covert Communication; Delta-Sigma Analog-to-Digital Converters - From System Architecture to Transistor-Level Design; and The Memristor: a Circuit Designer's Perspective.
    No preview · Conference Paper · Jan 2012
  • Source
    H. Thapliyal · N. Ranganathan
    ABSTRACT: Reversible logic is one of the emerging technologies with promising applications in quantum computing. In this work, we present a new design of the reversible BCD adder that has been primarily optimized for the number of ancilla input bits and the number of garbage outputs. These counts are taken as the primary optimization criteria because it is extremely difficult to realize a quantum computer with many qubits. Since optimizing the ancilla inputs and garbage outputs may degrade the design in terms of quantum cost and delay, these two parameters are also optimized, with the primary focus kept on minimizing the number of ancilla input bits and garbage outputs. First, we propose a new design of the reversible ripple carry adder having the input carry Co that requires no ancilla input bits; it has lower quantum cost and logic depth (delay) than its existing counterparts. The existing reversible Peres gate and a new reversible gate called the TR gate are efficiently utilized to improve the quantum cost and delay of the reversible ripple carry adder. An improved quantum design of the TR gate is also illustrated. Finally, the reversible design of the BCD adder is presented, which uses a 4-bit reversible binary adder to add the BCD digits, followed by conversion of the binary result to BCD format using a reversible binary-to-BCD converter.
    Preview · Conference Paper · Apr 2011
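The binary-add-then-correct structure of the BCD adder described above can be illustrated with a classical (irreversible) reference model: add two BCD digits in binary, then apply the standard add-6 correction when the 4-bit sum exceeds 9. The reversible circuit in the paper realizes these steps with reversible gates; this sketch only shows the arithmetic.

```python
def bcd_digit_add(a, b, cin=0):
    """One-digit BCD addition: plain binary sum, then the standard
    add-6 correction (equivalently, subtract 10 with a carry out)
    whenever the sum exceeds 9. Returns (carry_out, digit)."""
    s = a + b + cin            # binary sum in 0..19
    if s > 9:
        return 1, s - 10       # correction: +6 mod 16, carry out
    return 0, s

assert bcd_digit_add(7, 5) == (1, 2)   # 7 + 5 = 12 -> carry 1, digit 2
assert bcd_digit_add(4, 3) == (0, 7)   # no correction needed
```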
  • Koustav Bhattacharya · N. Ranganathan

    No preview · Article · Jan 2011
  • ABSTRACT: Optical flow computation in vision-based systems demands substantial computational power and storage area. Hence, to enable real-time processing at high resolution, the design of an application-specific system for optical flow becomes essential. In this paper, we propose an efficient VLSI architecture for the accurate computation of Lucas-Kanade (L-K)-based optical flow. The L-K algorithm is first converted to a scaled fixed-point version, with optimal bit widths, to improve the feasibility of high-speed hardware implementation without much loss in accuracy. The algorithm is mapped onto an efficient VLSI architecture, and the data flow exploits the principles of pipelining and parallelism. The optical flow estimation involves several tasks, namely Gaussian smoothing, gradient computation, least-squares matrix calculation, and velocity estimation, which are processed in a pipelined fashion. The proposed architecture was simulated and verified by synthesizing onto a Xilinx Field Programmable Gate Array, utilizing less than 40% of system resources while operating at a frequency of 55 MHz. Experimental results on benchmark sequences indicate a 42% improvement in accuracy and a speedup of five times compared to a recent hardware implementation of the L-K algorithm.
    No preview · Article · Jan 2010
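The per-pixel pipeline named in the abstract (gradient computation, least-squares matrix, velocity estimation) can be written as a small floating-point reference in Python. This is a simplified sketch of one Lucas-Kanade step, omitting the Gaussian smoothing stage and the paper's fixed-point scaling; the function names and window size are illustrative.

```python
import math

def lucas_kanade(f1, f2, y, x, w=7):
    """One L-K flow estimate at pixel (y, x): accumulate the 2x2
    least-squares (normal) matrix over a (2w+1)^2 window and solve it."""
    sxx = sxy = syy = sxt = syt = 0.0
    for j in range(y - w, y + w + 1):
        for i in range(x - w, x + w + 1):
            ix = (f1[j][i + 1] - f1[j][i - 1]) / 2.0   # x gradient (central diff)
            iy = (f1[j + 1][i] - f1[j - 1][i]) / 2.0   # y gradient
            it = f2[j][i] - f1[j][i]                   # temporal gradient
            sxx += ix * ix; sxy += ix * iy; syy += iy * iy
            sxt += ix * it; syt += iy * it
    det = sxx * syy - sxy * sxy                        # 2x2 system determinant
    vx = (-syy * sxt + sxy * syt) / det
    vy = (sxy * sxt - sxx * syt) / det
    return vx, vy

def blob(cx, cy, n=64):
    """A smooth Gaussian test pattern centred at (cx, cy)."""
    return [[math.exp(-((i - cx) ** 2 + (j - cy) ** 2) / 72.0)
             for i in range(n)] for j in range(n)]

# a half-pixel translation of the pattern should be recovered closely
vx, vy = lucas_kanade(blob(32.0, 32.0), blob(32.5, 32.5), 32, 32)
assert abs(vx - 0.5) < 0.1 and abs(vy - 0.5) < 0.1
```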
  • V. Mahalingam · N. Ranganathan · N. Ahmed · T. Haider
    ABSTRACT: In the nanometer era, process, voltage, and temperature variations are dominating circuit performance, power, and yield. Over the past few years, statistical optimization methods have been effective in improving yield in the presence of uncertainty due to process variations. However, statistical methods overconsume resources, even in the absence of variations. Hence, to facilitate a better performance-power-yield tradeoff, techniques that can dynamically enable variation compensation are becoming necessary. In this paper, we propose a dynamic technique that controls the instant of data capture in critical path flops by delaying the clock edge trigger. The methodology employs a dynamic delay detection circuit to identify the uncertainty in delay due to variations and stretches the clock in the destination flip-flops. The delay detection circuit uses a latch and a set of combinational gates to dynamically detect and create the slack needed to accommodate the delay due to variations. The clock stretching logic (CSL) is added only to paths that have a high probability of failure in the presence of variations. The proposed methodology improves the timing yield of the circuit without significant overcompensation. Experimental results based on Monte Carlo simulations on benchmark circuits indicate a significant improvement in timing yield with a small area overhead.
    No preview · Article · Jan 2010
  • R. Hyman · K. Bhattacharya · N. Ranganathan
    ABSTRACT: With the continuous decrease in minimum feature size and increase in chip density, modern processors are becoming increasingly susceptible to soft errors. In the past, the technique of lockstep execution with redundant threads on duplicated pipelines has been used for soft error rate reduction; it can achieve high error coverage, but at the cost of large overheads in area and performance. In this paper, we propose techniques for protection against soft errors in multi-core designs using (i) the properties of spatial and temporal redundancy and (ii) value-based detection. We utilize temporal redundancy through the "latency use slack" (LSC) of an instruction, which we define as the number of cycles before the computed result of the instruction becomes the source operand of a subsequent instruction, while spatial redundancy is exploited by duplicating the instruction to a nearby idle processor core. Further, the value-based detection technique is explored by exploiting the width of operands with small data values and the generation of residue code check bits for the source operands. When a soft error is detected, error correction is achieved by rolling back execution to a previous checkpoint state and re-executing the instructions. The proposed techniques have been implemented in the RSIM simulation framework and validated using the SPLASH benchmarks. Our results indicate that the soft error detection schemes proposed in this work can be implemented, on average, with less than a 10% increase in CPI on modern multi-core designs.
    No preview · Conference Paper · Jun 2009
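The residue-code check mentioned in (ii) can be illustrated with a mod-3 residue, a common choice because a single bit flip changes a value by ±2^k, and 2^k mod 3 is always 1 or 2, never 0, so the flip is always caught. This is a generic sketch of residue checking, not the paper's exact encoding.

```python
def residue3(x):
    """2-bit mod-3 residue check symbol carried alongside an operand."""
    return x % 3

def checked_add(a, b):
    """Sketch of residue-checked addition: compute the sum and verify it
    against the residues of the operands. A mismatch flags a soft error
    in the datapath (illustrative interface, not the paper's hardware)."""
    s = a + b
    ok = residue3(s) == (residue3(a) + residue3(b)) % 3
    return s, ok

s, ok = checked_add(25, 17)
assert (s, ok) == (42, True)
# a single-bit flip in the result is caught: 2**k mod 3 is never 0
corrupted = s ^ (1 << 4)   # flip bit 4: 42 -> 58
assert residue3(corrupted) != (residue3(25) + residue3(17)) % 3
```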
  • Koustav Bhattacharya · Mahalingam Venkataraman · Nagarajan Ranganathan
    ABSTRACT: The computation of optical flow in video sequences is a challenging task in most camera-based scene interpretation systems. In the past, most optical flow computation algorithms have either been implemented in software running on general purpose processors or designed as application-specific hardware. However, these implementations either cannot support real-time processing requirements or result in excessive inaccuracies in the computed velocity values. In this work, we propose an efficient VLSI system architecture for computing the optical flow in video sequences using the Lucas-Kanade (L-K) algorithm. The algorithm is converted into a high-speed RTL implementation by exploiting the inherent parallelism in the data flow graph. Clever pipelining strategies have been used throughout the design to further improve the speedup of velocity computation. We have mapped the RTL design onto a Xilinx Virtex II field programmable gate array (FPGA), supported with a Kingston DIMM DDR memory module and a pixel-plus 2.0-megapixel camera, on the XUPV2P FPGA board. Experimental results of our proposed design show significant improvements in accuracy, with a speedup of five times when compared with other recent hardware implementations.
    No preview · Conference Paper · Jun 2009
  • Source
    Himanshu Thapliyal · Nagarajan Ranganathan
    ABSTRACT: Reversible logic is attracting researchers' attention for fault-susceptible nanotechnologies, including molecular QCA. In this paper, we propose a concurrently testable FPGA design for molecular QCA using the conservative reversible Fredkin gate. The Fredkin gate is conservative and reversible in nature: there is an equal number of 1s in the outputs as in the inputs, in addition to one-to-one mapping. Fault patterns in the Fredkin gate due to a single missing/additional cell defect in molecular QCA are analyzed using the HDLQ tool. Exhaustive simulation shows that if there is a fault in the molecular QCA implementation of the Fredkin gate, there is a parity mismatch between the inputs and the outputs; otherwise, the input parity is the same as the output parity. Thus, any permanent or transient fault in molecular QCA that results in a parity mismatch can be concurrently detected. The logic block and the routing fabric (both programmable) are the two key components of an FPGA. Accordingly, we present Fredkin-gate-based concurrently testable designs of the configurable logic block (CLB) and the routing switch of a molecular QCA-based FPGA. An analysis of power dissipation in the proposed FPGA is also presented.
    Preview · Conference Paper · Jun 2009
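The parity-based concurrent test described above is easy to demonstrate: a fault-free Fredkin gate always preserves the 1-count, so any output pattern whose 1-count differs from the input's signals a fault. The fault model below (forcing one output to 1) is purely illustrative.

```python
def fredkin(c, a, b):
    """Fredkin (controlled-swap) gate: swap a and b when c is 1.
    Conservative and reversible: the 1-count is always preserved."""
    return (c, b, a) if c else (c, a, b)

def count_check(inputs, outputs):
    """Concurrent test: a fault shows up as a 1-count mismatch
    between the gate's inputs and its outputs."""
    return sum(inputs) == sum(outputs)

# fault-free: the invariant holds for all 8 input combinations
for bits in [(c, a, b) for c in (0, 1) for a in (0, 1) for b in (0, 1)]:
    assert count_check(bits, fredkin(*bits))

# an illustrative stuck-at-1 fault on one output breaks the invariant
faulty = (0, 1, fredkin(0, 1, 0)[2] | 1)   # force third output to 1
assert not count_check((0, 1, 0), faulty)
```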
  • Source
    Himanshu Thapliyal · Nagarajan Ranganathan
    ABSTRACT: Nanotechnologies, including molecular QCA, are susceptible to high error rates. In this paper, we present the design of testable reversible latches (D latch, T latch, JK latch, RS latch) based on reversible conservative logic for molecular QCA. Conservative reversible circuits are a specific type of reversible circuit in which there is an equal number of 1s in the outputs as in the inputs, in addition to one-to-one mapping. The proposed latches require only two test vectors, all 0s and all 1s, for detecting any unidirectional stuck-at faults. The design of QCA layouts and the verification of the latch designs performed using the QCADesigner tool are presented.
    Preview · Conference Paper · Sep 2008
  • Source
    V. Mahalingam · N. Ranganathan
    ABSTRACT: In the nanometer regime, the effects of process variations dominate the performance, power, and reliability of circuits. Hence, it is important to properly manage variation effects at the design stage to avoid excessive performance and power penalties. Buffer insertion and driver sizing (BIDS) techniques are widely used in design flows due to their simplicity and effectiveness in improving metrics such as power, performance, and noise. In this paper, we propose a fuzzy optimization technique to perform variation-aware buffer insertion and driver sizing at the network level. Previous variation-aware techniques for buffer insertion process individual nets in a criticality-first fashion, which can result in severe over-buffering. Hence, we formulate the variation-aware BIDS problem as a fuzzy piecewise-linear program to maximize variation resistance in the presence of delay, power, and noise constraints. The uncertainty due to process variations in circuit delay is modeled using fuzzy numbers, and the fuzzy approach performs pre-processing deterministic optimizations with the worst- and average-case values to convert the uncertain problem into a crisp problem. The proposed approach, evaluated on ITC'99 benchmarks, shows a 45% reduction in resource (buffer, driver) cost compared to the deterministic worst-case approach.
    Preview · Conference Paper · May 2008
  • Source
    Venkataraman Mahalingam · N. Ranganathan · J. E. Harlow

    Preview · Article · Jan 2008
  • N. Ranganathan · Upavan Gupta · Venkataraman Mahalingam
    ABSTRACT: Technology scaling has not only magnified the effects of device process variations, but it has also precipitated the need for simultaneous optimization of several performance metrics. In this paper, we propose a novel gate sizing approach for multi-metric optimization of delay, power, and crosstalk noise. The algorithm is based on the concepts of mathematical programming, and models the process variation uncertainty considering spatial correlations. The approach identifies leakage power, dynamic power, and crosstalk noise as the objectives, and the optimized gate delays are kept as constraints. Initially, the deterministic upper and lower bounds of the objectives are identified, and during the final step, a crisp non-linear programming problem is formulated using these boundary values. The problem is solved using KNITRO, an interior-point based optimization solver. The proposed model maximizes the variation resistance, thus providing higher yield. ITC'99 benchmarks were used to test the proposed approach, and the results indicate that our algorithm identifies the solution points that are closest to the nominal bounds, while maintaining high timing yield.
    No preview · Conference Paper · Jan 2008
  • U. Gupta · N. Ranganathan
    ABSTRACT: The aggregation of robots into teams is necessitated by the limited power and communication capabilities in emergency environments. The formation of robot teams significantly enhances the performance and efficiency of search and rescue missions in such environments. As opposed to classical partitioning application domains, robot aggregation requires multiple conflicting objectives to be optimized. We propose a novel microeconomic methodology for simultaneous multi-objective partitioning of robots. The method utilizes the strengths of the K-Means algorithm, game-theoretic modeling, and the Nash equilibrium methodology for fast and socially fair partitioning. In this work, partitions are created on the basis of compaction and equipartitioning objectives to identify decentralized robot teams, with each robot in a team closest to its communication gateway and each team equally represented in terms of strength. Rigorous simulations were performed to evaluate the performance of the method, and the results indicate that the proposed method performs significantly better than the K-Means methodology and identifies good solution points.
    No preview · Conference Paper · Dec 2007
  • Source
    K. Bhattacharya · S. Kim · N. Ranganathan
    ABSTRACT: The reliability of large on-chip L2 caches poses a significant challenge due to technology scaling trends. As the minimum feature size continues to decrease, L2 caches become more vulnerable to multi-bit soft errors. Traditionally, L2 caches have been protected from multi-bit soft errors using techniques such as error detection/correction codes or physical interleaving of cache bit lines to convert multi-bit errors into single-bit errors. These methods, however, incur large overheads in area and power. In this work, we investigate several new techniques for reducing multi-bit errors in large L2 caches, in which the multi-bit errors are detected using simple error detection codes and corrected using the data redundancy in the memory hierarchy. Further, we develop a reliability-aware replacement policy that dynamically trades performance for reliability whenever the soft-error budget is exceeded. To further improve reliability, we propose duplicating the data values in cache lines by exploiting their small data widths. The proposed techniques were implemented in the SimpleScalar framework and validated using the SPEC 2000 integer and floating point benchmarks. The proposed techniques improve the reliability of L2 caches by 40% and 32% on average, for integer and floating point applications respectively, with little impact on performance and area.
    Preview · Conference Paper · Nov 2007
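The small-data-width duplication idea in this abstract can be sketched as follows: when a value occupies only the lower half of a cache word, a copy is kept in the unused upper half, and a read cross-checks the two copies. The word width, helper names, and single-word granularity are illustrative assumptions, not the paper's implementation.

```python
WORD = 64
HALF = WORD // 2

def store(value):
    """If the value fits in the lower half of the word, replicate it
    into the upper half so a later read can cross-check the copies."""
    if value < (1 << HALF):
        return (value << HALF) | value, True   # duplicated
    return value, False                        # too wide to duplicate

def load(word, duplicated):
    """Return (value, ok); a copy mismatch indicates a soft error."""
    if duplicated:
        lo, hi = word & ((1 << HALF) - 1), word >> HALF
        return lo, lo == hi
    return word, True                          # no redundancy available

w, dup = store(0xBEEF)
assert load(w, dup) == (0xBEEF, True)          # clean read checks out
assert load(w ^ (1 << 3), dup)[1] is False     # flipped bit is detected
```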
  • Source
    S.P. Mohanty · E. Kougianos · N. Ranganathan
    ABSTRACT: Research in digital watermarking is mature. Several software implementations of watermarking algorithms are described in the literature, but few attempts have been made to describe hardware implementations. The ultimate objective of the research is to develop low-power, high-performance, real-time, reliable and secure watermarking systems, which can be achieved through hardware implementations. The development of a very-large-scale integration architecture for a high-performance watermarking chip is presented, which can perform both invisible robust and invisible fragile image watermarking in the spatial domain. The watermarking architecture is prototyped in two ways: (i) by using a Xilinx field-programmable gate array and (ii) by building a custom integrated circuit. This prototype is the first watermarking chip with both invisible robust and invisible fragile watermarking capabilities.
    Full-text · Article · Oct 2007 · IET Computers & Digital Techniques
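As a rough illustration of spatial-domain fragile watermarking, the sketch below embeds a mark in pixel least-significant bits, so any tampering that touches a marked pixel's LSB disturbs the extracted mark. This is a textbook LSB scheme chosen for illustration; the chip described above implements its own robust and fragile algorithms.

```python
def embed_fragile(pixels, bits):
    """Overwrite each pixel's least-significant bit with a watermark bit
    (spatial-domain fragile embedding; scheme is illustrative only)."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract_fragile(pixels):
    """Read the watermark back out of the LSBs."""
    return [p & 1 for p in pixels]

img = [120, 133, 98, 255]
wm = [1, 0, 1, 1]
marked = embed_fragile(img, wm)
assert extract_fragile(marked) == wm
# tampering with a marked pixel's LSB changes the extracted mark
assert extract_fragile([marked[0] ^ 1] + marked[1:]) != wm
```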
  • Source
    Venkataraman Mahalingam · N. Ranganathan
    ABSTRACT: In the nanometer regime, the effects of variations have an increasing impact on the delay and power characteristics of devices as well as the yield of the circuit. Statistical timing analysis and variation-aware optimization schemes are required to improve the performance and yield of fabricated circuits. In this paper, we propose a new timing-based incremental placement algorithm using fuzzy mathematical programming (FMP), in which the uncertainty due to process variations is modeled using fuzzy numbers. The objective is to minimize the worst negative slack of the circuit, with the variations in gate and interconnect delays modeled as triangular fuzzy numbers. In this approach, average- and worst-case deterministic optimizations are performed to determine the bounds of the problem, and these bounds are then used to convert the fuzzy problem into a crisp problem. The crisp model is targeted at maximizing the variation resistance, or robustness, of the circuit without compromising its achievable performance. The proposed approach, tested on ITC'99 benchmarks, shows an average 12% improvement when compared to the worst-case process variations setting, with a timing yield of 99-100%, as verified through Monte Carlo simulations.
    Preview · Conference Paper · Mar 2007
  • V. Mahalingam · N. Ranganathan · Justin E. Harlow
    ABSTRACT: The increasing dominance of process variations in nanometer designs is posing significant challenges for circuit design and optimization. Variations in parameters such as channel length and gate oxide thickness impact circuit delay and power. In this paper, we propose a new gate sizing algorithm using fuzzy mathematical programming (FMP), in which the uncertainty due to process variations is modeled using fuzzy numbers. The variations in gate delay, which is a function of the gate sizes and the fan-outs of the gate, are represented using triangular fuzzy numbers with linear membership functions. The variation-aware gate sizing problem is formulated as a fuzzy mathematical program to perform delay-constrained power minimization in the presence of variations. Initially, a deterministic optimization is performed by fixing the fuzzy parameters to the worst- and average-case values, and the results are used to convert the fuzzy optimization problem into a crisp non-linear problem, which is then solved using a non-linear optimization solver. The above model, with delay and power as constraints, maximizes the robustness, i.e., the variation resistance of the circuit, and thus the yield. The proposed approach was tested on ISCAS '85 benchmarks, and the results were validated for timing yield using Monte Carlo simulations. The fuzzy approach yields significantly better results than a stochastic programming based gate sizing approach, with a comparable runtime.
    No preview · Conference Paper · Nov 2006
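The fuzzy-to-crisp conversion recurring in these abstracts can be illustrated with triangular membership functions: a constraint's satisfaction degree is 1 at the average-case bound, 0 at the worst-case bound, and linear in between, and the crisp program then maximizes the minimum satisfaction degree across constraints. The functions below are a simplified sketch of that idea, not the papers' formulation.

```python
def tri_membership(x, a, m, b):
    """Membership of x in the triangular fuzzy number (a, m, b):
    0 outside (a, b), 1 at the modal value m, linear on each side."""
    if x <= a or x >= b:
        return 0.0
    return (x - a) / (m - a) if x <= m else (b - x) / (b - m)

def satisfaction(delay, best, worst):
    """Satisfaction degree of a delay constraint: 1 at the average-case
    bound, 0 at the worst-case bound, linear in between (a simplified
    form of the deterministic-bounds crisp conversion)."""
    if delay <= best:
        return 1.0
    if delay >= worst:
        return 0.0
    return (worst - delay) / (worst - best)

# overall robustness of a solution: the least-satisfied constraint
# (illustrative path list: (achieved delay, average bound, worst bound))
paths = [(12, 10, 20), (14, 10, 18)]
robustness = min(satisfaction(d, best, worst) for d, best, worst in paths)
assert robustness == 0.5
```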
  • Upavan Gupta · Nagarajan Ranganathan
    ABSTRACT: Resource management is a well-studied field. The existence of multiple emergencies in a locality, in a time-overlapped manner, demands an optimal allocation of the required resources to the emergencies. This is an intricate problem when the availability of resources is limited. The involvement of human lives in such situations imposes the important constraint of social fairness on the optimality criteria of allocation. Hence, these situations necessitate an allocation methodology that allocates the requested resources to the emergencies such that even a lower-criticality emergency is serviced in a socially optimal manner. In this research, an emergency management tool is developed that models the problem in a game-theoretic framework in which the crisis events are modeled as the players and the emergency response centers as the resource locations, with the possible emergency unit allocations as strategies. The pay-off is modeled as a function of the criticality of the event and the anticipated response times. A single-step, non-cooperative, normal form game is formulated, and a Nash equilibrium based solution methodology is implemented to provide fair allocation of resources to the emergencies. Experimental results are presented to illustrate the efficacy of the proposed methodology, and metrics are derived to quantify the fairness of the solution. A regression analysis is performed to establish the statistical significance of the results.
    No preview · Conference Paper · Jul 2006
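The normal-form game setup described above (players, strategies, payoffs, Nash equilibrium) can be illustrated with a brute-force pure-strategy equilibrium finder for two players. This generic solver is for illustration only; the paper's tool models crisis events as players and emergency-unit allocations as strategies, with its own payoff functions.

```python
import itertools

def pure_nash(payoff_a, payoff_b):
    """Enumerate all strategy pairs (i, j) of a two-player normal-form
    game and keep those where neither player can gain by deviating
    unilaterally, i.e. the pure-strategy Nash equilibria."""
    rows, cols = len(payoff_a), len(payoff_a[0])
    eq = []
    for i, j in itertools.product(range(rows), range(cols)):
        best_a = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(rows))
        best_b = all(payoff_b[i][j] >= payoff_b[i][l] for l in range(cols))
        if best_a and best_b:
            eq.append((i, j))
    return eq

# Prisoner's dilemma: mutual defection (1, 1) is the unique equilibrium
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]
assert pure_nash(A, B) == [(1, 1)]
```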

Publication Stats

2k Citations
84.38 Total Impact Points

Institutions

  • 1989-2013
    • University of South Florida
      • Department of Computer Science & Engineering
      Tampa, Florida, United States
  • 1998-2003
    • University of Texas at El Paso
      • Department of Electrical and Computer Engineering
      El Paso, TX, United States
    • University of South Florida St. Petersburg
      St. Petersburg, Florida, United States
    • The University of Tampa
      Tampa, Florida, United States
  • 1993
    • University of Zagreb
      Zagreb, Grad Zagreb, Croatia
  • 1988-1992
    • University of Central Florida
      • Department of Electrical Engineering & Computer Science
      Orlando, FL, United States
  • 1991
    • University of Kentucky
      • Department of Computer Science
      Lexington, KY, United States