
Mehdi Safarpour
University of Oulu · Department of Computer Science and Engineering
Doctor of Philosophy
36
Publications
5,323
Reads
105
Citations
Introduction
Computer Architecture, Embedded Design, Mixed Signal, DSP
Publications (36)
A method based on compressive sensing theory is proposed for sampling Fourier-sparse signals, enabling efficient implementation of analog-to-information converters. The solution reconstructs a Nyquist-rate high-resolution signal from Nyquist-rate low-resolution and sub-Nyquist-rate high-resolution samples. For implementation, an architecture based on cus...
An energy-efficient architecture for TPUs based on reduced-voltage operation. Errors are captured and corrected using ABFT, which makes aggressive voltage scaling possible.
Reduced voltage operation is an effective technique for substantial energy efficiency improvement in digital circuits. This brief introduces a simple approach for enabling reduced voltage operation of Deep Neural Network (DNN) accelerators by mere software modifications. Conventional approaches for enabling reduced voltage operation e.g., Timing Er...
This brief introduces an easy-to-integrate approach to enable significant power savings from reduced voltage operation of Graphics Processing Units (GPUs) used for acceleration of Deep Neural Network (DNN) models, with negligible overheads. Conventional approaches for enabling safe reduced voltage operation either compromise reliability or incur sig...
System-on-Chip (SoC) manufacturers use the Core Level Redundancy (CLR) scheme to cope with fabrication defects. By providing redundancy with extra cores and logic blocks, CLR ensures that performance is delivered even if a small number of the functional units are defective. CLR even enables selling lower end products if more cores have failed than needed by t...
Operating at reduced voltage is an effective technique for improving the energy efficiency of computing. However, the approach is constrained by its exacerbated sensitivity to Process, Voltage and Temperature (PVT) variations, which under throughput constraints challenges finding the energy minimizing voltage-frequency operating point. Commonly uti...
A simple method for enabling the low-voltage, energy-efficient operation provided by the near-threshold and sub-threshold voltage regions. The method allows a Fast Fourier Transform to operate at very low voltage and detects any computational errors that occur. When an error is detected, the voltage is adjusted so that errors are removed. In...
Emerging technologies, such as the Internet of Things (IoT), Deep Neural Network (DNN) based machine learning, and 6th generation wireless communications, impose substantial performance and energy efficiency demands for implementations. Thermal dissipation and energy supply constraints alone set stringent limits on the designs. In answer to the requ...
Operating at reduced voltage is an effective technique for improving the energy efficiency of computing. However, the approach is constrained by its exacerbated sensitivity to Process, Voltage, and Temperature (PVT) variations, which under throughput constraints challenges finding the energy minimizing voltage-frequency operating point. Commonly ut...
This paper presents simple techniques to significantly reduce the energy consumption of DNNs. Operating at reduced voltages offers substantial energy efficiency improvement but at the expense of increasing the probability of computational errors due to hardware faults. In this context, we targeted Deep Neural Networks (DNN) as emerging energy hungry b...
Chip manufacturers define voltage margins on top of the “best-case” operational voltage of their chips to ensure reliable functioning in the worst case settings. The margins guarantee correctness of operation, but at the cost of performance and power efficiency. Violating the margins is tempting to save energy, but might lead to timing errors. This...
Operating at reduced voltages offers substantial energy efficiency improvement but at the expense of increasing the probability of computational errors due to hardware faults. In this context, we targeted Deep Neural Networks (DNN) as emerging energy hungry building blocks in embedded applications. Without an error feedback mechanism, blind voltage...
This paper proposes a solution that makes voltage scaling possible using only vendor-provided HLS tools, improving the energy efficiency of FPGAs by 2x. Chip manufacturers define voltage margins on top of the “best-case” operational voltage of their chips to ensure reliable worst case functionality. The margins guarantee correctness of operati...
In this brief, an approach is proposed to achieve energy savings from reduced voltage operation. The solution detects timing errors by integrating Algorithm Based Fault Tolerance (ABFT) into a digital architecture. The approach has been studied with a systolic array matrix multiplier operating at reduced voltages, detecting errors on-the-fly to avoi...
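As a rough illustration of the ABFT idea named above, the following minimal sketch applies the classic column-checksum check to a small integer matrix product. It is not the architecture from the brief: the function names and the `fault` injection hook are hypothetical, and a software list-of-lists multiply stands in for the systolic array.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def abft_matmul(a, b, fault=None):
    """Multiply a by b and verify the result with column checksums.

    fault is an optional (row, col, delta) tuple added to the product to
    mimic a timing error. It is a hypothetical hook for demonstration only.
    Returns (product, ok), where ok is False if a mismatch was detected.
    """
    # Extend A with a checksum row holding the sum of each column of A.
    checksum_row = [sum(col) for col in zip(*a)]
    c_full = matmul(a + [checksum_row], b)
    c, predicted = c_full[:-1], c_full[-1]

    # Optionally corrupt one element of the product.
    if fault is not None:
        row, col, delta = fault
        c[row][col] += delta

    # In a fault-free product, each column sum of C equals the checksum row.
    actual = [sum(col) for col in zip(*c)]
    return c, actual == predicted


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
clean, clean_ok = abft_matmul(a, b)
faulty, faulty_ok = abft_matmul(a, b, fault=(0, 1, 1))
print(clean, clean_ok)
print(faulty, faulty_ok)
```

Integer inputs keep the comparison exact. With floating-point data the equality test would need a tolerance, which is a design choice this sketch leaves out.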
This paper introduces a novel arithmetic tracking algorithm for successive approximation ADCs, and presents its analysis. The algorithm utilizes low activity signal periods to cut the ADC energy dissipation by reducing the number of required bit-cycles. The approach determines the required step size, and bypasses conversion cycles when signal activ...
MATLAB and C implementation of compressive sensing reconstruction algorithms
In this contribution, it is proposed to limit the quantization search space of a successive approximation analog-to-digital converter through an analytic derivation of maximum possible sample-to-sample variation. The presented example design of the proposed ADC is an 8-bit 1MS/s ADC with SAR logic customized to incorporate this a priori information wh...
Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clocking to the minimum. To curb the share of stand-by power dissipation in those designs, ultra-low-leakage processes are employed in fabrication. Those limit the clocking rates significantly, reducing the com...
Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clocking to the minimum. To curb the share of stand-by power dissipation in those designs, near-threshold/sub-threshold operational points or ultra-low-leakage processes in fabrication are employed. Those limit...
In this contribution, it is proposed to limit the quantization search space of a successive approximation analog-to-digital converter through an analytic derivation of maximum possible sample-to-sample variation. The presented example design of the proposed ADC is an 8-bit 1MS/s ADC with SAR logic customized to incorporate this a priori information w...
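The effect of a narrowed search space on the number of bit-cycles can be sketched with a simple behavioral model. This is an illustration under stated assumptions, not the SAR logic of the paper: the input is expressed in LSB units, `max_delta` is a hypothetical bound on the sample-to-sample change in codes, and the function names are invented for the example.

```python
def sar_convert(vin, lo=0, hi=255):
    """Binary-search the largest code in [lo, hi] that does not exceed vin.

    vin is the input in LSB units (an assumption of this model).
    Returns (code, comparisons); each comparison models one bit-cycle.
    """
    cycles = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        cycles += 1
        if vin >= mid:      # comparator decision
            lo = mid
        else:
            hi = mid - 1
    return lo, cycles


def sar_convert_windowed(vin, prev_code, max_delta, n_bits=8):
    """Search only the codes reachable from prev_code within max_delta.

    max_delta is a hypothetical bound on the sample-to-sample change.
    If the true code lies outside the window, the result clips to its edge.
    """
    top = (1 << n_bits) - 1
    lo = max(0, prev_code - max_delta)
    hi = min(top, prev_code + max_delta)
    return sar_convert(vin, lo, hi)


print(sar_convert(104.3))                    # full 8-bit search
print(sar_convert_windowed(104.3, 100, 7))   # search limited to codes 93..107
```

In this model the full search always spends 8 comparisons on an 8-bit range, while a 15-code window needs at most 4. The saving depends entirely on how tight the variation bound is.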
An application-specific programmable processor is designed based on the analysis of a set of greedy recovery Compressive Sensing (CS) algorithms. The solution is flexible and customizable for a wide range of problem dimensions, as well as algorithms. The versatility of the approach is demonstrated by implementing Orthogonal Matching Pursuits, Approxi...
In most steganographic methods, increasing the capacity decreases the quality of the stego-image. In this paper, we propose to combine two existing techniques, Pixel Value Differencing and Gray Level Modification, to come up with a hybrid steganography scheme which can hide more information without having to compromise much on the...
This paper presents a novel image sharing method. The proposed method is based on the emerging theory of compressed sensing. We have exploited intrinsic features of compressed sensing theory to come up with a new image sharing scheme which provides scalable recovery and generates smaller shadow images (share images). The other potential application of o...