K. Bazargan

University of Minnesota Duluth, Duluth, Minnesota, United States

Publications (74) · 18.81 Total Impact

  • ABSTRACT: Maintaining the reliability of integrated circuits as transistor sizes continue to shrink to nanoscale dimensions is a significant looming challenge for the industry. Computation on stochastic bit streams, which could replace conventional deterministic computation based on a binary radix, allows similar computation to be performed more reliably and often with less hardware area. Prior work discussed a variety of specific stochastic computational elements (SCEs) for applications such as artificial neural networks and control systems. Recently, very promising new SCEs have been developed based on finite-state machines (FSMs). In this paper, we introduce new SCEs based on FSMs for the task of digital image processing. We present five digital image processing algorithms as case studies of practical applications of the technique. We compare the error tolerance, hardware area, and latency of stochastic implementations to those of conventional deterministic implementations using binary radix encoding. We also provide a rigorous analysis of a particular function, namely the stochastic linear gain function, which had only been validated experimentally in prior work.
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems 03/2014; PP(99):1-1. · 1.22 Impact Factor
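    A minimal sketch of the representation this line of work builds on (a plain Python simulation, not the paper's circuits): in the unipolar encoding, a value p in [0, 1] is the probability of observing a 1 in the stream, and multiplying two independent streams takes a single AND gate.

      import random

      def encode(p, n, rng):
          """Encode p in [0, 1] as an n-bit stochastic stream."""
          return [1 if rng.random() < p else 0 for _ in range(n)]

      def decode(stream):
          """Estimate the encoded value as the fraction of 1s."""
          return sum(stream) / len(stream)

      rng = random.Random(42)
      n = 4096
      a, b = encode(0.5, n, rng), encode(0.3, n, rng)
      product = [x & y for x, y in zip(a, b)]  # one AND gate per bit = multiply
      print(decode(product))                   # ~0.15 = 0.5 * 0.3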
  • Yanzi Zhu, Peiran Suo, Kia Bazargan
    ABSTRACT: Stochastic computing refers to a mode of computation in which numbers are treated as probabilities implemented as 0/1 bit streams, which is essentially a unary encoding scheme. Previous work has shown significant reductions in area and increases in fault tolerance for low- to medium-resolution values (6-10 bits). However, this comes at a very high latency cost. We propose a novel hybrid approach combining traditional binary with unary stochastic encoding, called binary stochastic. Like the binary representation, it is a positional number system, but its digits are fractions rather than only 0/1. We show how simple logic such as adders and multipliers can be implemented, and then show implementations of more complex functions such as the gamma correction function, tanh, absolute value, and exponentiation using both combinational and sequential binary stochastic logic. Our experiments show significant reductions in latency compared to unary stochastic encoding, while using significantly less area than binary implementations on FPGAs.
    Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays; 02/2014
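    One plausible toy reading of such a hybrid encoding (an assumption for illustration; the base L, digit count K, and digit layout here are mine, not the paper's): a value is split into K base-L digits, each carried as a short length-L unary stream, so L**K resolution costs K*L bits rather than the L**K bits a single unary/stochastic stream would need.

      L, K = 16, 2                       # bits per digit stream, number of digits

      def to_streams(x):
          """Encode x in [0, 1) as K unary digit streams of length L."""
          streams = []
          for _ in range(K):
              x *= L
              d = int(x)                 # digit in 0..L-1
              streams.append([1] * d + [0] * (L - d))
              x -= d
          return streams

      def decode(streams):
          """Weighted sum of the digit streams' duty cycles."""
          return sum((sum(s) / L) * L ** -i for i, s in enumerate(streams))

      print(decode(to_streams(0.7)))     # 0.69921875: 0.7 at 1/256 resolution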
  • ABSTRACT: Most digital systems operate on a positional representation of data, such as binary radix. An alternative is to operate on random bit streams where the signal value is encoded by the probability of obtaining a one versus a zero. This representation is much less compact than binary radix. However, complex operations can be performed with very simple logic. Furthermore, since the representation is uniform, with all bits weighted equally, it is highly tolerant of soft errors (i.e., bit flips). Both combinational and sequential constructs have been proposed for operating on stochastic bit streams. Prior work has shown that combinational logic can implement multiplication and scaled addition effectively, while linear finite-state machines (FSMs) can implement complex functions such as exponentiation and tanh effectively. Prior work on stochastic computation has largely been validated empirically. This paper provides a rigorous mathematical treatment of stochastic implementations of complex functions such as exponentiation and tanh using linear FSMs. It presents two new functions, an absolute value function and an exponentiation based on an absolute value, motivated by specific applications. Experimental results show that the linear FSM-based constructs for these functions have smaller area-delay products than the corresponding deterministic constructs. They are also much more tolerant of soft errors.
    IEEE Transactions on Computers 01/2014; 63(6):1474-1486. · 1.38 Impact Factor
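    The saturating-counter construction that is standard in this literature for the stochastic tanh (a simulation sketch, not the paper's rigorous analysis): an N-state up/down counter that outputs 1 in the upper half of its states computes approximately tanh(N*x/2) under the bipolar encoding x = 2p - 1.

      import math, random

      def stanh(p, n_states=8, n_bits=100_000, seed=0):
          rng, state, ones = random.Random(seed), n_states // 2, 0
          for _ in range(n_bits):
              up = rng.random() < p                 # input stream, P(1) = p
              state = min(state + 1, n_states - 1) if up else max(state - 1, 0)
              ones += state >= n_states // 2        # output bit
          return ones / n_bits

      print(stanh(0.6))                             # ~0.83
      print((math.tanh(8 * 0.2 / 2) + 1) / 2)       # ~0.83, tanh in bipolar terms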
  • ABSTRACT: Stochastic computing is a novel approach to real arithmetic, offering better error tolerance and lower hardware costs than conventional implementations. Stochastic modules are digital systems that process random bit streams representing real values in the unit interval. Stochastic modules based on finite state machines (FSMs) have been shown to realize complicated arithmetic functions much more efficiently than combinational stochastic modules. However, a general approach to synthesizing FSMs that realize arbitrary functions has been elusive. We describe a systematic procedure to design FSMs that implement arbitrary real-valued functions in the unit interval using the Taylor series approximation.
    Computer Design (ICCD), 2013 IEEE 31st International Conference on; 01/2013
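    For intuition only (the paper's FSM synthesis procedure is not reproduced here): the kind of truncated Taylor/Maclaurin approximation such a procedure would target, e.g. exp(-2x) on the unit interval at degree 4.

      from math import exp, factorial

      def taylor_exp_neg2(x, degree=4):
          # Maclaurin series of exp(-2x), truncated after `degree` terms
          return sum((-2 * x) ** k / factorial(k) for k in range(degree + 1))

      print(taylor_exp_neg2(0.5), exp(-1.0))   # 0.375 vs 0.3679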
  • N. Saraf, K. Bazargan
    ABSTRACT: Stochastic computing is an alternative approach to conventional real arithmetic. A stochastic computing module is a digital system that operates on random bit streams representing real numbers. The success of stochastic computing relies on the efficient generation of random bit streams encoding real values in the unit interval. We present the design of random bit stream generators based on finite state machines (FSMs) that emulate reversible Markov chains. We develop a general synthesis method to design FSMs for generating arbitrary probabilities with finite resolution. We show that our method uses fewer input random sources for the constant random bit streams needed in a computation than previous work. We further show that the output random bit stream quality and convergence times of our FSMs are reasonable.
    Computer-Aided Design (ICCAD), 2013 IEEE/ACM International Conference on; 01/2013
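    An illustrative construction in the same spirit (my sketch for intuition, not the paper's synthesis method): a saturating up/down counter driven by a single fair input bit is a reversible birth-death Markov chain with a uniform stationary distribution over its N states, so flagging any m of them as "output 1" yields a stream whose long-run fraction of ones is m/N.

      import random

      def fsm_bitstream(m, n_states, n_bits, seed=0):
          rng, state, ones = random.Random(seed), 0, 0
          for _ in range(n_bits):
              if rng.random() < 0.5:               # one fair random source
                  state = min(state + 1, n_states - 1)
              else:
                  state = max(state - 1, 0)
              ones += state < m                    # output 1 in m of N states
          return ones / n_bits

      # Successive output bits are correlated, which is why stream quality and
      # convergence time matter, as the abstract notes.
      print(fsm_bitstream(m=3, n_states=8, n_bits=200_000))   # converges to 3/8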
  • ABSTRACT: The paradigm of logical computation on stochastic bit streams has several key advantages compared to deterministic computation based on binary radix, including error tolerance and low hardware area cost. Prior research has shown that sequential logic operating on stochastic bit streams can compute non-polynomial functions, such as the tanh function, with less energy than conventional implementations. However, the functions that can be computed in this way are quite limited. For example, high-order polynomials and many non-polynomial functions cannot be computed using prior approaches. This paper proposes a new finite-state machine (FSM) topology for complex arithmetic computation on stochastic bit streams. It describes a general methodology for synthesizing such FSMs. Experimental results show that these FSM-based implementations are more tolerant of soft errors and less costly in terms of the area-time product than conventional implementations.
    Computer-Aided Design (ICCAD), 2012 IEEE/ACM International Conference on; 11/2012
  • ABSTRACT: Numerical integration is a widely used approach for computing an approximate result of a definite integral. Conventional digital implementations of numerical integration using binary radix encoding are costly in terms of hardware and have long computational delay. This work proposes a novel method for performing numerical integration based on the paradigm of logical computation on stochastic bit streams. In this paradigm, ordinary digital circuits are employed but they operate on stochastic bit streams instead of deterministic values; the signal value is encoded by the probability of obtaining a one versus a zero in the streams. With this type of computation, complex arithmetic operations can be implemented with very simple circuitry. However, such stochastic implementations typically have long computational delay, since long bit streams are required to encode precise values. This paper proposes a stochastic design for numerical integration characterized by both small area and short delay - in contrast to previous applications, a win on both metrics. The design is based on a mathematical analysis demonstrating that the summation of a large number of terms in the numerical integration can lead to a significant delay reduction. An architecture is proposed for this task. Experiments confirm that the stochastic implementation has smaller area and shorter delay than conventional implementations.
    Computer-Aided Design (ICCAD), 2012 IEEE/ACM International Conference on; 01/2012
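    A minimal sketch of the underlying effect (my illustration, not the proposed architecture): in a Riemann sum, each term f(x_i) only needs a coarse encoding, because averaging many independent short streams concentrates the total; here 8 bits per sample already estimate the integral well.

      import random

      def stochastic_integral(f, n_samples=10_000, bits_per_sample=8, seed=0):
          rng, total = random.Random(seed), 0
          for i in range(n_samples):
              x = (i + 0.5) / n_samples            # midpoint sample in [0, 1]
              p = f(x)                             # value in [0, 1]
              total += sum(rng.random() < p for _ in range(bits_per_sample))
          return total / (n_samples * bits_per_sample)

      print(stochastic_integral(lambda x: x * x))  # ~1/3, the integral of x^2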
  • ABSTRACT: The stochastic computational element (SCE) uses streams of random bits (stochastic bit streams) to perform computation with conventional digital logic gates. It can guarantee reliable computation using unreliable devices. In stochastic computing, the linear finite state machine (FSM) can be used to implement some sophisticated functions, such as the exponentiation and tanh functions, more efficiently than combinational logic. However, a general approach to synthesizing a linear FSM-based SCE for a target function has not been available. In this paper, we introduce three properties of the linear FSM used in stochastic computing and demonstrate a general approach to synthesizing a linear FSM-based SCE for a target function. Experimental results show that our approach produces circuits that are much more tolerant of soft errors than deterministic implementations, while the area-delay product of the circuits is less than that of deterministic implementations.
    01/2012;
  • ABSTRACT: Mounting concerns over variability, defects, and noise motivate a new approach for digital circuitry: stochastic logic, that is to say, logic that operates on probabilistic signals and so can cope with errors and uncertainty. Techniques for probabilistic analysis of circuits and systems are well established; we advocate a strategy for synthesis. In prior work, we described a methodology for synthesizing such stochastic logic. In this paper, we apply the concept to a reconfigurable architecture that implements processing operations on a datapath. We analyze cost as well as the sources of error: approximation, quantization, and random fluctuations. We study the effectiveness of the architecture on a collection of benchmarks for image processing. The stochastic architecture requires less area than conventional hardware implementations. Moreover, it is much more tolerant of soft errors (bit flips) than these deterministic implementations. This fault tolerance scales gracefully to very large numbers of errors.
    IEEE Transactions on Computers 02/2011; · 1.38 Impact Factor
  • ABSTRACT: FPGA placement and routing are still challenging problems. Given the increased diversity of logic and routing resources on FPGA chips, it seems appropriate to tackle the placement problem as a mapping between the nodes and edges of a circuit graph and compatible resources in the architecture graph. We explore utilizing graph isomorphism algorithms to perform FPGA placement. We use a hierarchical approach in which the circuit and architecture graphs are simultaneously clustered to reduce the size of the search space, and then a novel reductive graph product method is used to solve the isomorphism problem. The graph product algorithm is called reductive because it eliminates a linear number of candidates at every step of the search process, reducing the number of candidate nodes by approximately 1/3. Compared to the annealing-based placement tool VPR 5.0, we achieve approximately 40% improvement in placement runtime and improve critical path delay by about 7% and wirelength by 5%, while demanding 1.3% more channels on average.
    Proceedings of the ACM/SIGDA 19th International Symposium on Field Programmable Gate Arrays, FPGA 2011, Monterey, California, USA, February 27 - March 1, 2011; 01/2011
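    A hedged illustration of the framing only (generic VF2 subgraph matching via networkx, standing in for the clustered, reductive graph-product search, which is not reproduced here): placement viewed as mapping circuit-graph nodes and edges onto compatible resources of an architecture graph.

      import networkx as nx
      from networkx.algorithms import isomorphism

      circuit = nx.Graph([("lut0", "lut1"), ("lut1", "ff0")])  # toy netlist graph
      arch = nx.grid_2d_graph(3, 3)                            # toy routing fabric

      gm = isomorphism.GraphMatcher(arch, circuit)
      mapping = next(gm.subgraph_isomorphisms_iter(), None)
      print(mapping)   # e.g. {(0, 0): 'lut0', (0, 1): 'lut1', (0, 2): 'ff0'}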
  • P. Maidee, K. Bazargan
    ABSTRACT: This paper proposes two sets-of-pairs-of-functions-to-be-distinguished (SPFD)-based rewiring algorithms to be used in a multi-tier rewiring framework, which employs multiple rewiring techniques. The first algorithm has two unique features: 1) a satisfiability (SAT) instance is devised so that an unsuccessful rewiring can be identified very quickly, and 2) unlike binary decision diagram-based methods that require computing all SPFD pairs, our algorithm uses a few SAT instances to perform rewiring for a given wire without explicitly enumerating all SPFDs. Experimental results show that our algorithm is about three times faster than a conventional one under a simulated setting of such a framework, and that it scales well with the number of candidate wires considered. The efficacy of the framework can be further improved by the second proposed algorithm, which relies on a theory presented herein to allow adding a new wire outside of the restricted set of dominator nodes, a feature common in automatic-test-pattern-generation-based rewiring but absent in existing SPFD-based ones. Although this algorithm may suffer from long runtimes in the same way conventional SPFD-based techniques do, experiments show that the number of wires that can be rewired increases by 13% on average, and the number of alternative wires also increases.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 01/2011; · 1.09 Impact Factor
  • Pongstorn Maidee, Kia Bazargan
    ABSTRACT: Circuit rewiring can be used to explore a larger solution space by modifying circuit structure to suit a given optimization problem. Among the several rewiring techniques that have been proposed, SPFD-based rewiring has been shown to be more effective in terms of solution space coverage. However, its adoption in practice has been limited due to its long runtime. We propose a novel SAT-based algorithm that is much faster than traditional BDD-based methods. Unlike BDD-based methods, which completely specify all SPFD pairs using BDDs, our algorithm uses a few SAT instances to perform rewiring for a given wire without explicitly enumerating all SPFDs. Experimental results show that our algorithm's runtime is only 13% of that of a conventional one when each wire has at most 25 candidate wires, and that the runtime scales well with the number of candidate wires considered. Our approach evaluates each rewiring instance independently on the order of milliseconds, making deployment of SPFD-based rewiring inside the optimization loop of synthesis tools a possibility.
    Proceedings of the 15th Asia South Pacific Design Automation Conference, ASP-DAC 2010, Taipei, Taiwan, January 18-21, 2010; 01/2010
  • ABSTRACT: With ever-shrinking device geometries, process variations play an increased role in determining the delay of a digital circuit. Under such variations, a gate may lie on the critical path of a manufactured die with a certain probability, called the criticality probability. In this paper, we present a new technique to compute the statistical criticality information in a digital circuit under process variations by linearly traversing the edges in its timing graph and dividing it into "zones." We investigate the sources of error in using tightness probabilities for criticality computation with Clark's statistical maximum formulation. The errors are dealt with using a new clustering-based pruning algorithm, which greatly reduces the size of circuit-level cutsets, improving both accuracy and runtime over the current state of the art. On large benchmark circuits, our clustering algorithm gives about a 250× speedup compared with a pairwise pruning strategy, with similar accuracy in results. Coupled with a localized sampling technique, errors are reduced to around 5% of Monte Carlo simulations with large speedups in runtime.
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 04/2009; · 1.09 Impact Factor
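    The standard Clark max/tightness formulas referenced in the abstract, for two jointly Gaussian edge delays (a sketch of the building block, not the paper's pruning algorithm):

      from math import sqrt, exp, pi, erf

      def phi(x):   # standard normal pdf
          return exp(-x * x / 2) / sqrt(2 * pi)

      def Phi(x):   # standard normal cdf
          return 0.5 * (1 + erf(x / sqrt(2)))

      def clark_max(mu1, s1, mu2, s2, rho):
          """Tightness P(X >= Y) and E[max(X, Y)] for correlated Gaussians."""
          theta = sqrt(s1**2 + s2**2 - 2 * rho * s1 * s2)
          alpha = (mu1 - mu2) / theta
          tightness = Phi(alpha)
          mean_max = mu1 * Phi(alpha) + mu2 * Phi(-alpha) + theta * phi(alpha)
          return tightness, mean_max

      print(clark_max(10.0, 1.0, 9.5, 1.5, rho=0.3))   # tightness ~0.63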
  • ABSTRACT: With continued scaling, reliability is emerging as a critical challenge for the designers of digital circuits. The challenge stems in part from the lack of computationally efficient techniques for analyzing and optimizing circuits for reliability. To address this problem, we propose an exact analysis method based on circuit transformations. We also propose a hybrid method that combines exact analysis with probabilistic measures to estimate reliability, and we use such measures in a rewiring-based optimization framework to optimize reliability. Our hybrid approach offers a 56× speedup compared to a pure Monte Carlo simulation-based approach with only a 3.5% loss in accuracy. Our optimization framework improves reliability by about 10%, accompanied by a 6.9% reduction in circuit area.
    10th International Symposium on Quality of Electronic Design (ISQED 2009), 16-18 March 2009, San Jose, CA, USA; 01/2009
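    A sketch of the Monte Carlo baseline such methods are compared against (an illustration of the problem setup, not the paper's exact-analysis transformations): every gate output flips with probability eps, and reliability is the fraction of trials whose primary output matches the fault-free circuit; the toy circuit is the two-XOR sum path of a full adder.

      import random

      def noisy_sum_bit(a, b, cin, eps, rng):
          def gate(v):                   # gate output, flipped w.p. eps
              return v ^ (rng.random() < eps)
          return gate(gate(a ^ b) ^ cin)

      def reliability(eps=0.01, trials=100_000, seed=0):
          rng, ok = random.Random(seed), 0
          for _ in range(trials):
              a, b, c = (rng.getrandbits(1) for _ in range(3))
              ok += noisy_sum_bit(a, b, c, eps, rng) == (a ^ b ^ c)
          return ok / trials

      print(reliability())   # ~0.9802 for two gates at eps = 0.01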
  • ABSTRACT: Mounting concerns over variability, defects, and noise motivate a new approach for integrated circuits: the design of stochastic logic, that is to say, digital circuitry that operates on probabilistic signals and so can cope with errors and uncertainty. Techniques for probabilistic analysis are well established; we advocate a strategy for synthesis. In this paper, we present a reconfigurable architecture that implements the computation of arbitrary continuous functions with stochastic logic. We analyze the sources of error: approximation, quantization, and random fluctuations. We demonstrate the effectiveness of our method on a collection of benchmarks for image processing. Synthesis trials show that our stochastic architecture requires less area than conventional hardware implementations and achieves a large speedup compared to conventional software implementations. Most importantly, it is much more tolerant of soft errors (bit flips) than these deterministic implementations.
    Proceedings of the 19th ACM Great Lakes Symposium on VLSI 2009, Boston Area, MA, USA, May 10-12 2009; 01/2009
  • ABSTRACT: As CMOS devices are scaled down into the nanometer regime, concerns about reliability are mounting. Instead of viewing nano-scale characteristics as an impediment, technologies such as PCMOS exploit them as a source of randomness. The technology generates random numbers that are used in probabilistic algorithms. With the PCMOS approach, different voltage levels are used to generate different probability values. If many different probability values are required, this approach becomes prohibitively expensive. In this work, we demonstrate a novel technique for synthesizing logic that generates new probabilities from a given set of probabilities. Three different scenarios are considered in terms of whether the given probabilities can be duplicated and whether there is freedom to choose them. In the case that the given probabilities cannot be duplicated and are predetermined, we provide a solution that is FPGA-mappable. In the case that the given probabilities cannot be duplicated but can be freely chosen, we provide an optimal choice. In the case that the given probabilities can be duplicated and can be freely chosen, we demonstrate how to generate arbitrary decimal probabilities from small sets - a single probability or a pair of probabilities - through combinational logic.
    Computer-Aided Design - Digest of Technical Papers, 2009. ICCAD 2009. IEEE/ACM International Conference on; 01/2009
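    The two elementary identities this synthesis problem composes (standard facts for independent streams; the optimal choice and composition of source probabilities is the paper's contribution and is not reproduced here): an inverter maps p to 1 - p, and an AND gate maps independent p and q to p * q.

      p, q = 0.4, 0.5
      print(1 - p)              # NOT:  0.6
      print(p * q)              # AND:  0.2
      print(1 - (1 - p) * q)    # OR of p with NOT q: 0.7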
  • ABSTRACT: Future computing systems will feature many cores that run fast but might exhibit more faults than existing CMOS technologies. New software methodologies must be adopted to utilize the communication bandwidth and computational power of the few slow, reliable cores that could be employed in such systems to verify the results of the fast, faulty cores. Employing traditional Triple Module Redundancy (TMR) at the core instruction level would not be as effective due to its blind replication of computations. We propose two software development methods that utilize what we call Smart TMR (STMR) and fingerprinting to statistically monitor the results of computations and selectively replicate computations that exhibit faults. Experimental results show significant speedup and reliability improvement over traditional TMR approaches.
    DATE 2009; 01/2009
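    A toy sketch of the selective-replication idea (my illustration; the function names are hypothetical, not the paper's API): run each task on a fast, possibly faulty core, spot-check result fingerprints on a slow reliable core with some sampling probability, and re-execute only on disagreement.

      import random, zlib

      def fast_core(task, rng, fault_rate=0.05):
          result = task()
          if rng.random() < fault_rate:      # model a soft error
              result += 1
          return result

      def stmr(tasks, check_prob=0.2, seed=0):
          rng, results = random.Random(seed), []
          for task in tasks:
              r = fast_core(task, rng)
              if rng.random() < check_prob:  # statistical monitoring
                  golden = task()            # slow, reliable core
                  if zlib.crc32(repr(r).encode()) != zlib.crc32(repr(golden).encode()):
                      r = golden             # selective replication caught a fault
              results.append(r)
          return results

      print(stmr([(lambda i=i: i * i) for i in range(10)]))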
  • ABSTRACT: 3D technology is an attractive solution for reducing wirelength in a field-programmable gate array (FPGA). However, through-silicon vias (TSVs) are limited in number. In this paper, we propose a tileable switch module architecture based on the 3D disjoint switch module for 3D FPGAs. Experimental results over 20 MCNC benchmarks show a 62% reduction in the number of TSVs on average and small improvements in horizontal channel width and delay compared to the original 3D disjoint switch module.
    IEEE International Conference on 3D System Integration, 3DIC 2009, San Francisco, California, USA, 28-30 September 2009; 01/2009
  • P. Maidee, N. Hakim, K. Bazargan
    ABSTRACT: Field-programmable gate arrays (FPGAs) have gained wide acceptance among low- to medium-volume applications. However, there are gaps between FPGA and custom implementations in terms of area, performance, and power consumption. In recent years, specialized blocks - memories and multipliers in particular - have been shown to help reduce this gap. However, their usefulness has not been studied formally on a broad spectrum of designs. As FPGAs are prefabricated, an FPGA family must contain members of various sizes and combinations of specialized blocks to satisfy diverse design resource requirements. We formulate the family selection process as an "FPGA family composition" problem and propose an efficient algorithm to solve it. The technique was applied to an architecture similar to Xilinx Virtex FPGAs. The results show that a smart composition technique can reduce the expected silicon area by up to 55%. Providing multiplier blocks in FPGAs is also shown to reduce total area by 20% using the proposed algorithm.
    Field Programmable Logic and Applications, 2008. FPL 2008. International Conference on; 10/2008

Publication Stats

799 Citations
18.81 Total Impact Points

Institutions

  • 2001–2014
    • University of Minnesota Duluth
      • Department of Electrical Engineering
      Duluth, Minnesota, United States
  • 2003–2013
    • University of Minnesota Twin Cities
      • Department of Electrical and Computer Engineering
      Minneapolis, Minnesota, United States
  • 2009
    • Isfahan University of Technology
      Isfahan, Ostān-e Eşfahān, Iran
    • Amirkabir University of Technology
Tehran, Tehrān, Iran
  • 1999–2000
    • Northwestern University
      • Department of Electrical Engineering and Computer Science
      Evanston, IL, United States