Mehdi Safarpour

University of Oulu · Department of Computer Science and Engineering

Doctor of Philosophy


Publications: 36
Reads: 5,323
Citations: 105

Publications (36)
Article
Full-text available
A method based on compressive sensing theory is proposed for sampling Fourier-sparse signals, enabling efficient implementation of analog-to-information converters. The solution reconstructs a Nyquist-rate high-resolution signal from Nyquist-rate low-resolution and sub-Nyquist-rate high-resolution samples. For implementation, an architecture based on cus...
Preprint
Full-text available
An energy-efficient architecture for TPUs based on reduced voltage operation is presented. Errors are captured and corrected using Algorithm Based Fault Tolerance (ABFT), which makes aggressive voltage scaling possible.
Preprint
Full-text available
Reduced voltage operation is an effective technique for substantial energy efficiency improvement in digital circuits. This brief introduces a simple approach for enabling reduced voltage operation of Deep Neural Network (DNN) accelerators by mere software modifications. Conventional approaches for enabling reduced voltage operation e.g., Timing Er...
Preprint
Full-text available
This brief introduces an easy-to-integrate approach to enable significant power savings from reduced voltage operation of Graphics Processing Units (GPUs) used for acceleration of Deep Neural Network (DNN) models, with negligible overheads. Conventional approaches for enabling safe reduced voltage operation either compromise reliability or incur sig...
Chapter
System-on-Chip (SoC) manufacturers use the Core Level Redundancy (CLR) scheme to cope with fabrication defects. By providing redundancy with extra cores and logic blocks, CLR ensures that performance is delivered even if a small number of the functional units are defective. CLR even enables selling lower end products if more cores have failed than needed by t...
Preprint
Full-text available
System-on-Chip (SoC) manufacturers use the Core Level Redundancy (CLR) scheme to cope with fabrication defects. By providing redundancy with extra cores and logic blocks, CLR ensures that performance is delivered even if a small number of the functional units are defective. CLR even enables selling lower end products if more cores have failed than needed by t...
Preprint
Full-text available
Operating at reduced voltage is an effective technique for improving the energy efficiency of computing. However, the approach is constrained by its exacerbated sensitivity to Process, Voltage and Temperature (PVT) variations, which under throughput constraints challenges finding the energy minimizing voltage-frequency operating point. Commonly uti...
Preprint
Full-text available
A simple method is presented for enabling low-voltage, energy-efficient operation, such as that provided by the near-threshold and sub-threshold voltage regions. The method allows the Fast Fourier Transform to operate at very low voltage and detects whether any computational errors occur. When an error is detected, the voltage is adjusted so that errors are removed. In...
Thesis
Full-text available
Emerging technologies, such as the Internet of Things (IoT), Deep Neural Network (DNN) based machine learning, and 6th generation wireless communications, impose substantial performance and energy efficiency demands for implementations. Thermal dissipation and energy supply constraints alone set stringent limits on the designs. In answer to the requ...
Article
Full-text available
Operating at reduced voltage is an effective technique for improving the energy efficiency of computing. However, the approach is constrained by its exacerbated sensitivity to Process, Voltage, and Temperature (PVT) variations, which under throughput constraints challenges finding the energy minimizing voltage-frequency operating point. Commonly ut...
Preprint
Full-text available
This paper presents simple techniques to significantly reduce the energy consumption of DNNs. Operating at reduced voltages offers substantial energy efficiency improvement but at the expense of increasing the probability of computational errors due to hardware faults. In this context, we targeted Deep Neural Networks (DNN) as emerging energy hungry b...
Article
Full-text available
Chip manufacturers define voltage margins on top of the “best-case” operational voltage of their chips to ensure reliable functioning in the worst case settings. The margins guarantee correctness of operation, but at the cost of performance and power efficiency. Violating the margins is tempting to save energy, but might lead to timing errors. This...
Preprint
Full-text available
Operating at reduced voltages offers substantial energy efficiency improvement but at the expense of increasing the probability of computational errors due to hardware faults. In this context, we targeted Deep Neural Networks (DNN) as emerging energy hungry building blocks in embedded applications. Without an error feedback mechanism, blind voltage...
Preprint
Full-text available
Operating at reduced voltages offers substantial energy efficiency improvement but at the expense of increasing the probability of computational errors due to hardware faults. In this context, we targeted Deep Neural Networks (DNN) as emerging energy hungry building blocks in embedded applications. Without an error feedback mechanism, blind voltage...
Preprint
Full-text available
This paper proposes a solution that makes voltage scaling possible using only vendor-provided HLS tools, improving the energy efficiency of FPGAs by 2x. Chip manufacturers define voltage margins on top of the “best-case” operational voltage of their chips to ensure reliable worst-case functionality. The margins guarantee correctness of operati...
Article
Full-text available
In this brief, an approach is proposed to achieve energy savings from reduced voltage operation. The solution detects timing errors by integrating Algorithm Based Fault Tolerance (ABFT) into a digital architecture. The approach has been studied with a systolic array matrix multiplier operating at reduced voltages, detecting errors on-the-fly to avoi...
Article
Full-text available
This paper introduces a novel arithmetic tracking algorithm for successive approximation ADCs, and presents its analysis. The algorithm utilizes low activity signal periods to cut the ADC energy dissipation by reducing the number of required bit-cycles. The approach determines the required step size, and bypasses conversion cycles when signal activ...
Data
MATLAB and C implementation of compressive sensing reconstruction algorithms
Data
MATLAB and C implementation of compressive sensing reconstruction algorithms
Data
MATLAB and C implementation of compressive sensing reconstruction algorithms
Data
MATLAB and C implementation of compressive sensing reconstruction algorithms
Conference Paper
Full-text available
In this contribution, it is proposed to limit the quantization search space of a successive approximation analog-to-digital converter through an analytic derivation of maximum possible sample-to-sample variation. The presented example design of the proposed ADC is an 8-bit 1MS/s ADC with SAR logic customized to incorporate this a priori information wh...
Chapter
Full-text available
Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clocking to the minimum. To curb the share of stand-by power dissipation in those designs, ultra-low-leakage processes are employed in fabrication. Those limit the clocking rates significantly, reducing the com...
Preprint
Full-text available
Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clocking to the minimum. To curb the share of stand-by power dissipation in those designs, near-threshold/sub-threshold operational points or ultra-low-leakage processes in fabrication are employed. Those limit...
Conference Paper
Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by utilizing sleep modes or slowing the clocking to the minimum. To curb the share of stand-by power dissipation in those designs, ultra-low-leakage processes are employed in fabrication. Those limit the clocking rates significantly, reducing the com...
Preprint
Full-text available
In this contribution, it is proposed to limit the quantization search space of a successive approximation analog-to-digital converter through an analytic derivation of maximum possible sample-to-sample variation. The presented example design of the proposed ADC is an 8-bit 1MS/s ADC with SAR logic customized to incorporate this a priori information w...
Conference Paper
Full-text available
An application-specific programmable processor is designed based on the analysis of a set of greedy recovery Compressive Sensing (CS) algorithms. The solution is flexible and customizable for a wide range of problem dimensions, as well as algorithms. The versatility of the approach is demonstrated by implementing Orthogonal Matching Pursuits, Approxi...
Conference Paper
Full-text available
In most steganographic methods, increasing the capacity leads to a decrease in the quality of the stego-image. In this paper, we propose to combine two existing techniques, Pixel Value Differencing and Gray Level Modification, to come up with a hybrid steganography scheme which can hide more information without having to compromise much on the...
Conference Paper
Full-text available
This paper presents a novel image sharing method. The proposed method is based on the emerging theory of compressed sensing. We have exploited intrinsic features of compressed sensing theory to come up with a new image sharing scheme which provides scalable recovery and generates smaller shadow images (share images). The other potential application of o...
