Rastislav Struharik

  • Professor (Full) at University of Novi Sad

About

45
Publications
5,216
Reads
259
Citations
Introduction
Rastislav Struharik is a full professor at the Department of Power, Electronics and Telecommunications Engineering, University of Novi Sad. His research covers Computer Architecture, Artificial Intelligence and Artificial Neural Networks. His current project is 'Hardware Acceleration of Machine Learning Algorithms.'
Current institution
University of Novi Sad
Current position
  • Professor (Full)
Additional affiliations
October 2016 - present
University of Novi Sad
Position
  • Functional Verification
October 2015 - present
University of Novi Sad
Position
  • Discrete Signals and Systems
October 2014 - present
University of Novi Sad
Position
  • Introduction to Hardware Description Languages
Education
October 2005 - December 2009
University of Novi Sad
Field of study
  • Electronics
October 1999 - May 2005
University of Novi Sad
Field of study
  • Electronics
October 1993 - January 1999
University of Novi Sad
Field of study
  • Electronics and Telecommunications

Publications

Article
Full-text available
Object detection is a popular image-processing technique, widely used in numerous applications for detecting and locating objects in images or videos. While being one of the fastest algorithms for object detection, Single-shot Multibox Detection (SSD) networks are also computationally very demanding, which limits their usage in real-time edge appli...
Article
Full-text available
This study presents a universal reconfigurable hardware accelerator for efficient processing of sparse decision trees, artificial neural networks and support vector machines. The main idea is to develop a hardware accelerator that will be able to directly process sparse machine learning models, resulting in shorter inference times and lower power c...
Article
Full-text available
This paper proposes a two-step Convolutional Neural Network (CNN) pruning algorithm and a resource-efficient field-programmable gate array (FPGA) CNN accelerator named “Argus”. The proposed CNN pruning algorithm first combines similar kernels into clusters, which are then pruned using the same regular pruning pattern. The pruning algorithm is carefully ta...
Article
Data movement between Convolutional Neural Network (CNN) accelerators and off-chip memory is critical to the overall power consumption. Minimizing power consumption is particularly important for low-power embedded applications. Specific CNN compute patterns offer a possibility of significant data reuse, leading to the idea of using spe...
Article
Full-text available
In this paper, a hardware accelerator for sparse support vector machines (SVMs) is proposed. We believe that the proposed accelerator is the first accelerator of this kind. The accelerator is designed for use in field-programmable gate array (FPGA) systems. Additionally, a novel algorithm for the pruning of SVM models is developed. The pruned SVM m...
Article
In this paper, we propose a novel Convolutional Neural Network hardware accelerator called CoNNA, capable of accelerating pruned, quantized CNNs. In contrast to most existing solutions, CoNNA offers a complete solution to the compressed CNN acceleration, being able to accelerate all layer types commonly found in contemporary CNNs. CoNNA is designed...
Article
Full-text available
This paper presents a hardware accelerator for sparse decision trees intended for FPGA applications. To the best of the authors’ knowledge, this is the first accelerator of this type. Besides the hardware accelerator itself, a novel algorithm for induction of sparse decision trees is also presented. Sparse decision trees can be attractive because they r...
Conference Paper
In this paper we propose a novel Convolutional Neural Network hardware accelerator, called CoNNA, capable of accelerating pruned, quantized CNNs. In contrast to most existing solutions, CoNNA offers a complete solution to the full, compressed CNN acceleration, being able to accelerate all layer types commonly found in contemporary CNNs. CoNNA is d...
Article
Convolutional Neural Networks (CNNs) are becoming a fundamental tool for machine learning. High performance and energy efficiency are of great importance for deployments of CNNs in many embedded applications. Energy consumption during CNN processing is dominated by memory access and since large networks do not fit in on-chip storage, they require e...
Article
In this paper a system for hardware-aided induction of decision tree ensembles using the evolutionary approach (Decision Tree Ensemble Evolution co-Processor—DTEEP) is proposed. DTEEP is used for hardware acceleration of the fitness evaluation, since it is shown that most of the ensemble inference time is spent on this task. The DTEEP co-processor...
Conference Paper
In this paper we propose a novel CNN hardware accelerator, called AIScale, capable of accelerating convolutional, pooling, fully-connected and adding CNN layers. In contrast to most existing solutions, AIScale offers a complete solution to the full CNN acceleration. AIScale is designed as a coarse-grained reconfigurable architecture, which uses rap...
Conference Paper
In fields like embedded vision, where algorithms are computationally expensive, hardware accelerators play a major role in high throughput applications. These accelerators could be implemented as hardwired IP cores or Application Specific Instruction-set Processors (ASIPs). While hardwired solutions often provide the best possible performance, they...
Conference Paper
Algorithms for data encryption are among the most important parts of modern communication systems. In this paper the results of a hardware implementation of the AES256 and TDES algorithms are presented. AES256 and TDES are implemented as an IP core with an AXI interface because of the constant growth of data transfer requirements in modern embedded systems, in...
Article
In this paper a co-processor for hardware-aided decision tree induction using an evolutionary approach (EFTIP) is proposed. EFTIP is used for hardware acceleration of the fitness evaluation task since this task is proven in the paper to be the execution time bottleneck. The EFTIP co-processor can significantly improve the execution time of a novel...
Conference Paper
This paper presents a novel algorithm for induction of full oblique decision trees (EFTI). The proposed algorithm is based on a special single-individual evolutionary algorithm, which evolves a full decision tree by modifying its structure and node coefficients during the evolution process. The EFTI algorithm is particularly well suited to be used in embedded...
Article
In this paper a universal reconfigurable computing architecture for hardware implementation of homogeneous and heterogeneous ensemble classifiers composed of decision trees (DTs), artificial neural networks (ANNs), and support vector machines (SVMs) is proposed. The following types of ensemble classifiers have been implemented in FPGA using propo...
Conference Paper
IP cores for the direct and inverse discrete cosine transformation are an important part of compression core implementation. In this paper the results for the implementation of the two-dimensional binary discrete cosine transformation (2D binDCT) and two-dimensional binary discrete cosine inverse transformation (2D binIDCT) algorithms which are discrete cosine...
Conference Paper
This paper presents four different architectures for the hardware acceleration of axis-parallel, oblique and non-linear decision tree ensemble classifier systems. Hardware architectures for the implementation of a number of ensemble combination rules are also presented. The proposed architectures are optimized for size, making them particularly int...
Conference Paper
This paper proposes four different hardware architectures for parallel implementation of decision trees forming an ensemble classifier. The proposed architectures can accelerate ensemble classifiers composed of axis-parallel, oblique and nonlinear decision trees (DTs). Hardware architectures for the implementation of a number of combinatio...
Article
Full-text available
This paper proposes a universal coarse-grained reconfigurable computing architecture for hardware implementation of decision trees (DTs), artificial neural networks (ANNs), and support vector machines (SVMs), suitable for both field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) implementation. Using this univers...
Conference Paper
In this paper an application of an evolutionary algorithm to oblique decision tree inference is presented. At the core of the new decision tree induction algorithm is a specific evolutionary algorithm called HereBoy. The performance of the proposed HBDT algorithm is studied and compared with eight existing decision tree building algorithms using standard benchma...
Conference Paper
This paper presents several IP cores for the hardware evolution of ensembles composed of oblique or non-linear decision trees. These cores can be implemented using FPGA or ASIC technology. Results of experiments obtained using 29 datasets from the standard UCI Machine Learning Repository database suggest that the FPGA implementations offer signi...
Conference Paper
We propose a new digital reconfigurable architecture for the implementation of machine learning classifiers. The architecture can be configured to implement a neural network, decision tree or a support vector machine type of classifier.
Article
Full-text available
In this paper, several hardware architectures for the realization of ensembles of axis-parallel, oblique and nonlinear decision trees (DTs) are presented. Hardware architectures for the implementation of a number of ensemble combination rules are also presented. These architectures are universal and can be used to combine predictions from any type...
Conference Paper
This paper presents a method for generating control signal sequences that allow one to compute an arbitrary n-input 1-output Boolean function using only two working memristors. The described approach is based on the use of a recursive Boolean formula, which enables the implementation of a Boolean function over the functionally complete basis {imply,...
Conference Paper
This paper introduces a novel programmable reconfigurable architecture, called vCell Matrix. The proposed architecture is based on the Cell Matrix architecture with several important modifications that allow simpler and faster reconfiguration. The new architecture is optimized for implementation using Xilinx FPGA devices. A special embedded system, b...
Conference Paper
This paper proposes several IP cores for the hardware implementation of the complete decision tree inference algorithm. Evolving decision trees in hardware is motivated by the significant improvement in the evolution time compared to the time needed for software evolution. Several architectures for the hardware evolution of single oblique or nonlin...
Conference Paper
We propose a new digital architecture for SVM classification. The architecture uses a kernel that is suited to digital implementation in embedded systems. It is then tested on a channel equalization problem where real-time performance is important and a hardware implementation of the classification is needed.
Conference Paper
Full-text available
In this paper several hardware implementations of decision trees (axis-parallel, oblique and non-linear) based on the concept of a universal node and a sequence of universal nodes are presented. The proposed hardware architectures are suitable for implementation in both Field Programmable Gate Arrays (FPGA) and Application Specific Integrated Circuits...
Article
Full-text available
To the best of our knowledge, this paper provides the very first solution to the hardware implementation of the complete decision tree inference algorithm. Evolving decision trees in hardware is motivated by a significant improvement in the evolution time compared to the time needed for software evolution and efficient use of decision tr...
Article
Several soft intellectual property (IP) core implementations of decision trees (axis-parallel, oblique and nonlinear) based on the concept of a universal node (UN) and a sequence of UNs are presented. The proposed IP cores are suitable for implementation in both field programmable gate arrays and application specific integrated circuits. Developed IP cores...
Article
In this paper an algorithm for the construction of neural networks from decision trees is presented. First, a decision tree is constructed using some standard algorithm, then an equivalent set of rules is extracted from that tree. A neural network is then formed using that set of rules. This neural network is then used as a basis for further learning that will...
Conference Paper
In this paper a novel voltage-controlled memristor model accounting for exponential ionic drift with respect to memristor voltage and nonlinear ionic drift with respect to barrier position is proposed. The model is implemented in Simscape™ language and the functionality of the model is tested using simulations of memristive digital logic and adapti...
Conference Paper
This paper presents a survey of the work done so far in designing hybrid CMOS/nanoelectronic computing architectures. Although there have been previous attempts at making such a survey [8-9], they seem to be incomplete at the current time, mostly due to the fact that three additional years have passed since their publication and several new computing ar...
Conference Paper
Full-text available
In this paper the design of a Huffman decoder FPGA core that is part of a motion JPEG system is presented. The core is designed and verified using the VHDL hardware description language. It is implemented and tested on a Virtex2 FPGA platform from Xilinx Inc.
Conference Paper
Full-text available
In this paper architectures for the 2D DCT/IDCT (Discrete Cosine Transform, Inverse Discrete Cosine Transform) are presented. These architectures were developed for the FPGA implementation. First, algorithms for the efficient 2D DCT/IDCT calculation are presented. Using these algorithms micro-architectures for the efficient FPGA implementation are...
Conference Paper
This paper presents the design of a soft core for the industry-standard 8051 microcontroller intended for FPGA implementation. It can be used in a wide range of applications such as embedded microcontroller systems, data computation and transfer, communication systems and professional audio and video. This core will be used as a part of video transmission...
Conference Paper
This paper presents a typical procedure for the design and verification of complex hardware systems using hardware description languages. VHDL is used as one of the two existing standards. All steps are illustrated on the design of the DLX CPU. The DLX is a RISC processor designed for teaching the principles of computer architecture. As an implementatio...
