Tarek M. Taha’s research while affiliated with University of Dayton and other places


Publications (208)


Deep learning-based explainable approaches for RNA-seq gene expression data analysis
  • Conference Paper

October 2024

Shamima Nasrin · … · Simon Khan · Tarek M. Taha


Survey of Deep Learning Accelerators for Edge and Emerging Computing
  • Article
  • Full-text available

July 2024 · 91 Reads · 4 Citations

Electronics

The unprecedented progress in artificial intelligence (AI), particularly in deep learning algorithms coupled with ubiquitous internet-connected smart devices, has created a high demand for AI computing on edge devices. This review studied commercially available edge processors as well as processors still in industrial research stages. We categorized state-of-the-art edge processors based on their underlying architecture, such as dataflow, neuromorphic, and processing-in-memory (PIM) architectures. The processors are analyzed based on their performance, chip area, energy efficiency, and application domains. The supported programming frameworks, model compression, data precision, and CMOS fabrication process technology are discussed. Currently, most commercial edge processors utilize dataflow architectures. However, emerging non-von Neumann computing architectures have attracted the attention of the industry in recent years. Neuromorphic processors are highly efficient, performing computation with fewer synaptic operations, and several neuromorphic processors offer online training for secure and personalized AI applications. This review found that PIM processors show significant energy efficiency and consume less power than dataflow and neuromorphic processors. A future direction for the industry could be to implement state-of-the-art deep learning algorithms in emerging non-von Neumann computing paradigms for low-power computing on edge devices.



Survey of Deep Learning Accelerators for Edge and Emerging Computing

July 2024 · 159 Reads · 1 Citation



Survey of Deep Learning Accelerators for Edge and Emerging Computing

June 2024 · 447 Reads





Memristor Based Liquid State Machine With Method for In-Situ Training

January 2024 · 7 Reads

IEEE Transactions on Nanotechnology

Spiking neural network (SNN) hardware has gained significant interest due to its ability to process complex data in size, weight, and power (SWaP) constrained environments. Memristors, in particular, offer the potential to enhance SNN algorithms by providing analog domain acceleration with exceptional energy and throughput efficiency. Among the current SNN architectures, the Liquid State Machine (LSM), a form of Reservoir Computing (RC), stands out due to its low resource utilization and straightforward training process. In this paper, we present a custom memristor-based LSM circuit design with an online learning methodology. The proposed circuit implementing the LSM is designed using SPICE to ensure precise device-level accuracy. Furthermore, we explore liquid connectivity tuning to facilitate a real-time and efficient design process. To assess the performance of our system, we evaluate it on multiple datasets, including MNIST, TI-46 spoken digits, acoustic drone recordings, and musical MIDI files. Our results demonstrate comparable accuracy while achieving significant power and energy savings when compared to existing LSM accelerators. Moreover, our design exhibits resilience in the presence of noise and neuron misfires. These findings highlight the potential of a memristor-based LSM architecture to rival purely CMOS-based LSM implementations, offering robust and energy-efficient neuromorphic computing capabilities with memristive SNNs.
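The abstract above notes that the LSM, as a form of reservoir computing, is attractive because only the readout layer is trained. The following is a minimal software sketch of that idea, using a rate-based NumPy reservoir with a ridge-regression readout; all sizes and parameters are illustrative assumptions, and this does not model the paper's spiking memristor SPICE circuit or its in-situ training method.

```python
import numpy as np

# Minimal software sketch of the reservoir-computing idea behind an LSM:
# a fixed random recurrent "liquid" projects inputs into a high-dimensional
# state, and only a linear readout is trained (here via ridge regression).

rng = np.random.default_rng(0)
n_in, n_res, n_out = 4, 200, 2                      # hypothetical dimensions

W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))        # fixed input weights
W_res = rng.normal(0.0, 1.0, (n_res, n_res))        # fixed recurrent weights
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))   # keep spectral radius < 1

def run_reservoir(inputs):
    """Drive the fixed reservoir and collect its states, one per time step."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ u + W_res @ x)
        states.append(x.copy())
    return np.array(states)

# Toy sequences standing in for real sensor data and targets.
T = 500
U = rng.normal(size=(T, n_in))
Y = rng.normal(size=(T, n_out))

X = run_reservoir(U)                                # (T, n_res) liquid states
lam = 1e-3                                          # ridge regularization
W_out = np.linalg.solve(X.T @ X + lam * np.eye(n_res), X.T @ Y)

Y_hat = X @ W_out                                   # linear readout predictions
```

The point of the sketch is only that the recurrent "liquid" weights stay fixed and W_out alone is solved for, which is what makes LSM training comparatively straightforward; mapping that training step onto memristor hardware is the contribution the abstract describes.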



Citations (66)


... However, inference on devices faces significant constraints related to power consumption and computational capacity [5]. Currently, employing fixed-point (FXP) data for neural network inference has become the standard approach for DNN inference accelerators [6]. Numerous studies have optimized these accelerators, focusing on fixed-point computations to enhance both area efficiency and energy efficiency. ...

Reference:

Optimizing Area and Power of MAC Arrays in DNN Accelerators via Overflow-Aware Partial Sum Management
Survey of Deep Learning Accelerators for Edge and Emerging Computing

Electronics
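The snippet above notes that fixed-point (FXP) data has become the standard for DNN inference accelerators. Below is a minimal, generic sketch of symmetric fixed-point weight quantization; the 8-bit width, scaling rule, and function names are illustrative assumptions rather than details taken from the cited works.

```python
import numpy as np

def quantize_symmetric(w, bits=8):
    """Map real-valued weights to signed fixed-point codes plus a scale factor.
    The int8 storage below assumes bits <= 8."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax if np.any(w) else 1.0
    codes = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate real values from the integer codes."""
    return codes.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64)).astype(np.float32)   # toy weight matrix

codes, scale = quantize_symmetric(w, bits=8)
w_hat = dequantize(codes, scale)
print("max abs quantization error:", float(np.max(np.abs(w - w_hat))))
```

In a typical FXP accelerator the MAC array then operates on the integer codes, with the scale applied once after accumulation; keeping the arithmetic in narrow integers is what yields the area and energy savings the snippet refers to.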

... GCNs leverage Fourier transforms to analyze graph signals for social network analysis (Yu and Qin 2020). Additionally, FFT and WT are effective in EEG signal processing, emotion analysis, and radar data interpretation (Xu et al. 2024; Henderson et al. 2023). Integrating these methods with deep learning and SNNs offers a robust framework for analyzing complex datasets. ...

Spiking Neural Networks for LPI Radar Waveform Recognition with Neuromorphic Computing
  • Citing Conference Paper
  • May 2023
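The snippet above mentions FFT-based processing of EEG and radar signals. The following is a minimal, generic sketch of extracting frequency-band power features with NumPy's FFT; the synthetic signal, sampling rate, and band edges are arbitrary illustrations, not taken from the cited papers.

```python
import numpy as np

fs = 256.0                          # hypothetical sampling rate, Hz
t = np.arange(0, 4.0, 1.0 / fs)     # 4 seconds of samples

# Synthetic signal: a 10 Hz (alpha-band-like) component plus noise.
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10.0 * t) + 0.5 * rng.normal(size=t.size)

# One-sided power spectrum via the FFT.
freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
power = np.abs(np.fft.rfft(x)) ** 2

def band_power(lo, hi):
    """Total spectral power inside the frequency band [lo, hi) Hz."""
    mask = (freqs >= lo) & (freqs < hi)
    return float(power[mask].sum())

# Conventional EEG-style bands; the exact edges here are illustrative.
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
features = {name: band_power(lo, hi) for name, (lo, hi) in bands.items()}
print(features)     # band-power features that could feed a classifier or SNN
```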

... There are various neuromemristive RC implementations in the literature [15], for example Programming with Delayed Pulse (PDP) [16], clustering-based complex large-scale arrays [17], a reservoir framework with bi-stable memristive synapses [18], a doubly twisted toroidal structure based RC [19], and chaotic time series prediction [20]. Different frameworks of LSM on memristors have been investigated for various applications, such as mixed-signal systems [21], network topology realization [22], musical signal classification [23], and temporal signal classification [24]. Similarly, there are various ESN-based hardware implementations in the literature [25]-[27]. ...

Circuit Optimization Techniques for Efficient Ex-Situ Training of Robust Memristor Based Liquid State Machine
  • Citing Conference Paper
  • May 2023

... The study in Henderson et al. (2022) presents an auditory drone detection and identification system using SNNs and LSMs algorithms. The proposed approach achieves an accuracy of 97.13% for detection and 93.25% for identification tasks on a publicly available acoustic drone dataset. ...

Detection and Classification of Drones Through Acoustic Features Using a Spike-Based Reservoir Computer for Low Power Applications
  • Citing Conference Paper
  • September 2022

... At full utilization (all 1024 lanes computing in parallel) and assuming a reasonable switching time per gate of 3 ns [31,32], it would take (1024² × 10¹²) / (1024 × 1/(3×10⁻⁹)) = 3,072,000 s ≈ 35.56 days (2) until total failure (when every cell breaks down). Worse, under practical conditions, even a small number of failed devices can cause incorrect operation, so effective failure can occur much sooner. ...

Memristor Based Federated Learning for Network Security on the Edge using Processing in Memory (PIM) Computing
  • Citing Conference Paper
  • July 2022
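The citing snippet above (its Eq. (2)) estimates the time to total array failure from the lane count, the per-gate switching time, and a 10¹² figure that appears to be an assumed per-cell write endurance. A short check of that arithmetic under that interpretation:

```python
# Back-of-the-envelope check of the endurance estimate quoted above.
lanes = 1024                     # parallel compute lanes at full utilization
cells = 1024 ** 2                # total cells in the array
endurance = 1e12                 # assumed write endurance per cell (interpretation)
t_switch = 3e-9                  # switching time per gate, in seconds

total_switch_events = cells * endurance         # events until every cell wears out
events_per_second = lanes / t_switch            # array-wide switching rate

seconds = total_switch_events / events_per_second
print(f"{seconds:,.0f} s")                      # 3,072,000 s
print(f"{seconds / 86_400:.2f} days")           # ~35.56 days
```

Dividing the total tolerable switching events by the array-wide switching rate gives about 3.07 million seconds, roughly 35.6 days of continuous full-utilization operation.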

... There are various neuromemristive RC implementations in the literature [15], for example Programming with Delayed Pulse (PDP) [16], clustering-based complex large-scale arrays [17], a reservoir framework with bi-stable memristive synapses [18], a doubly twisted toroidal structure based RC [19], and chaotic time series prediction [20]. Different frameworks of LSM on memristors have been investigated for various applications, such as mixed-signal systems [21], network topology realization [22], musical signal classification [23], and temporal signal classification [24]. Similarly, there are various ESN-based hardware implementations in the literature [25]-[27]. ...

Memristor Based Circuit Design for Liquid State Machine Verified with Temporal Classification
  • Citing Conference Paper
  • July 2022

... Additionally, data movement between the processing units and external memory is optimized by the custom memory architecture, minimizing latency and maximizing throughput. Deeply pipelined multi-FPGA designs are explored to handle large models, with ongoing research on optimization [76]. ...

TRIM: A Design Space Exploration Model for Deep Neural Networks Inference and Training Accelerators
  • Citing Article
  • January 2022

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

... For robotic navigation and spatial awareness, a version of SLAM with lower processing costs that uses dynamic field theory (DFT) was investigated [12]. Compared to other typical SLAM algorithms, this implementation completes SLAM tasks with a similar level of accuracy but at only one-fifth of the memory cost. ...

An Implementation of Simultaneous Localization and Mapping Using Dynamic Field Theory
  • Citing Conference Paper
  • August 2021

... In digital pathology, nuclei segmentation and classification are essential tasks for accurate disease diagnosis and prognosis. These processes allow for the quantitative analysis of complex nucleus properties such as size, shape, and distribution [1]. Recent advances in deep learning-based methods have significantly improved these processes [9,13]. ...

Microscopic nuclei classification, segmentation, and detection with improved deep convolutional neural networks (DCNN)

Diagnostic Pathology

... Ref. [27] followed the same idea by regressively using error-compensated gyroscope data with different TCN network architectures and loss functions to obtain good estimates of attitude. Ref. [28] proposed a CNN which was trained to learn from the responses of a specific inertial sensor, offering near-real-time error correction. ...

Towards Improved Inertial Navigation by Reducing Errors Using Deep Learning Methodology

Applied Sciences