Aleksandr Cariow

  • PhD, D.Sc., Professor (Full)
  • Head of the Algorithm Design Group at West Pomeranian University of Technology

About

79
Publications
17,283
Reads
345
Citations
Current institution
West Pomeranian University of Technology
Current position
  • Head of the Algorithm Design Group
Additional affiliations
September 1999 - present
West Pomeranian University of Technology
Position
  • Professor (Full)

Publications (79)
Article
Full-text available
The subject of this paper is the development of rationalized algorithms of discrete sinusoidal transform of type I for short sequences of length N = 2, 3, 4, 5, 6, 7, and 8. Here, by the word “rationalization”, we mean the reduction of the number of arithmetic operations required to implement the algorithms. The arithmetic complexity of the develop...
Article
Full-text available
The quaternion discrete Fourier transform (QDFT) is a powerful tool in modern digital signal processing, even though until recently this transformation seemed exotic. In recent years, quite a lot of publications have appeared devoted to effective ways to calculate this transformation. In particular, in one of our previous publications, we presented...
Article
Full-text available
This paper presents the type-II fast discrete Hartley transform (DHT-II) algorithms for input data sequences of lengths from 2 to 8. The starting point for developing the eight algorithms is the representation of DHT-II as a matrix–vector product. The underlying matrices usually have a good block structure. These matrices must then be successfully...
Article
Full-text available
Fast algorithms for type-five discrete cosine transform (DCT-V) for sequences of input data of short length in the range of two to eight are elaborated in the paper. A matrix–vector product representation of the DCT-V is the starting point for designing the algorithms. In each specific case, the DCT-V matrices have remarkable structural properties...
Article
Full-text available
The paper presents the original structure of a processing unit for multiplying quaternions. The idea of organizing the device is based on the use of fast Hadamard transform blocks. The operation principles of such a device are described. Compared to direct quaternion multiplication, the developed algorithm significantly reduces the number of multip...
Article
Full-text available
Toeplitz matrix–vector products are used in many digital signal processing applications. Direct methods for calculating such products require N^2 multiplications and N(N−1) additions, where N denotes the order of the Toeplitz matrix. In the case of large matrices, this operation becomes especially time intensive. However, matrix–vector products with...
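The direct-method operation counts quoted in this abstract can be checked with a minimal sketch. The function name `toeplitz_matvec_direct` and the first-column/first-row representation of the matrix are illustrative assumptions, not taken from the paper; the code shows only the naive baseline, not the authors' fast algorithm.

```python
def toeplitz_matvec_direct(first_col, first_row, x):
    """Naive Toeplitz matrix-vector product with operation counting.

    Hypothetical helper: first_col[i] holds T[i][0] and first_row[j] holds
    T[0][j], so that T[i][j] depends only on i - j.
    Returns (y, multiplications, additions).
    """
    n = len(x)
    if len(first_col) != n or len(first_row) != n:
        raise ValueError("first_col, first_row and x must have equal length")
    if n and first_col[0] != first_row[0]:
        raise ValueError("first_col[0] and first_row[0] must agree")

    mults = 0
    adds = 0
    y = []
    for i in range(n):
        acc = None
        for j in range(n):
            # Entry T[i][j] is read from the first column (below or on the
            # diagonal) or from the first row (above the diagonal).
            t_ij = first_col[i - j] if i >= j else first_row[j - i]
            term = t_ij * x[j]
            mults += 1
            if acc is None:
                acc = term          # first term of the row: no addition yet
            else:
                acc += term
                adds += 1
        y.append(acc)
    return y, mults, adds
```

For a matrix of order N the counters come out as N·N multiplications and N·(N−1) additions, which matches the figures stated for the direct method.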
Article
Full-text available
The subject of this work is the development of fast algorithms for the discrete sinusoidal transformation of the second type (DST-II) for sequences of input data of small length N = 2, 3, 4, 5, 6, 7, 8. The starting point for the development of algorithms is the well-known possibility of representing any discrete transformation in the form of a mat...
Article
The brief presents the results of synthesizing efficient algorithms for implementing the basic data-processing macro operations used in tessarine-valued neural networks. These macro operations primarily include the macro operation of multiplication of two tessarines: the macro operation of calculating the inner product of two tessarine-valued vecto...
Article
Full-text available
This paper proposes a new method for calculating the quaternion discrete Fourier transform for one-dimensional data. Although the computational complexity of the proposed method still belongs to the O(N log2 N) class, it allows us to reduce the total number of arithmetic operations required to perform it compared to other known methods for computing...
Article
Full-text available
The paper introduces a range of efficient algorithmic solutions for implementing the fundamental filtering operation in convolutional layers of convolutional neural networks on fully parallel hardware. Specifically, these operations involve computing M inner products between neighbouring vectors generated by a sliding time window from the input dat...
Article
Full-text available
The evolution of human society is inevitably associated with the widespread development of computer technologies and methods, and the constant evolution of the theory and practice of data processing, as well as the need to solve increasingly complex problems in computational intelligence, have inspired the use of complex and advanced mathematical m...
Article
Full-text available
Discrete cosine transforms (DCTs) are widely used in intelligent electronic systems for data storage, processing, and transmission. The popularity of using these transformations, on the one hand, is explained by their unique properties and, on the other hand, by the availability of fast algorithms that minimize the computational and hardware comple...
Article
Full-text available
This paper proposes fast algorithms for computing the discrete Fourier transform for real-valued sequences of lengths from 3 to 9. Since calculating the real-valued DFT using the complex-valued FFT is redundant regarding the number of needed operations, the developed algorithms do not operate on complex numbers. The algorithms are described in matr...
Article
Full-text available
Winograd’s algorithms are an effective tool for calculating the discrete Fourier transform (DFT). These algorithms described in well-known articles are traditionally represented either with the help of sets of recurrent relations or with the help of products of sparse matrices obtained on the basis of various methods of the DFT matrix factorization...
Article
Full-text available
A set of efficient algorithmic solutions suitable to the fully parallel hardware implementation of the short-length circular convolution cores is proposed. The advantage of the presented algorithms is that they require significantly fewer multiplications as compared to the naive method of implementing this operation. During the synthesis of the pre...
Article
This brief presents the results of a study of the possibilities of reducing the arithmetic complexity of computing basic operations in octonionic neural networks and also proposes new algorithmic solutions for efficiently performing these operations. Here, we primarily mean the operation of multiplying octonions, the operation of computing the dot...
Article
Full-text available
The article presents a parallel hardware-oriented algorithm designed to speed up the division of two octonions. The advantage of the proposed algorithm is that the number of real multiplications is halved as compared to the naive method for implementing this operation. In the synthesis of the discussed algorithm, the matrix representation of this o...
Article
Full-text available
In this article, we introduce a new discrete fractional transform for data sequences whose size is a composite number. The main kernels of the introduced transform are small-size discrete fractional Fourier transforms. Since the introduced transformation is not, in the generally known sense, a classical discrete fractional transform, we call it dis...
Article
Full-text available
This paper presents a new algorithm for multiplying two Kaluza numbers. Performing this operation directly requires 1024 real multiplications and 992 real additions. We presented in a previous paper an effective algorithm that can compute the same result with only 512 real multiplications and 576 real additions. More effective solutions have not ye...
Chapter
In this paper, we present several resource-efficient algorithmic solutions regarding the fully parallel hardware implementation of the basic filtering operation performed in the convolutional layers of convolution neural networks. In fact, these basic operations calculate two inner products of neighboring vectors formed by a sliding time window fro...
Article
Full-text available
This article presents an efficient algorithm for computing a 10-point DFT. The proposed algorithm reduces the number of multiplications at the cost of a slight increase in the number of additions in comparison with the known algorithms. Using a 10-point DFT for harmonic power system analysis can improve accuracy and reduce errors caused by spectral...
Article
Full-text available
In this article, we propose a set of efficient algorithmic solutions for computing short linear convolutions focused on hardware implementation in VLSI. We consider convolutions for sequences of length N = 2, 3, 4, 5, 6, 7, and 8. Hardwired units that implement these algorithms can be used as building blocks when designing VLSI-based accelerators f...
Article
Discrete cosine transforms are widely used in smart radioelectronic systems for the processing and analysis of incoming information. The popularity of these transforms is explained by the presence of fast algorithms that minimize the computational and hardware complexity of their implementation. A special place in the list of transformations t...
Preprint
In this work, a rationalized algorithm for calculating the quotient of two quaternions is presented which reduces the number of underlying real multiplications. Hardware for fast multiplication is much more expensive than hardware for fast addition. Therefore, reducing the number of multiplications in VLSI processor design is usually a desirable ta...
Article
In this paper a new discrete fractional transform for data vectors whose size N is a power of two is proposed. The basic operation of the introduced transform is a discrete fractional Hadamard transform. Since the described transform is not a classical discrete fractional Hadamard transform in its pure form, we called it a pseudo-fractional discr...
Preprint
Full-text available
In this paper, we present several resource-efficient algorithmic solutions regarding the fully parallel hardware implementation of the basic filtering operation performed in the convolutional layers of convolution neural networks. In fact, these basic operations calculate two inner products of neighboring vectors formed by a sliding time window fro...
Article
In this article, we analyze algorithmic ways to reduce the arithmetic complexity of calculating quaternion-valued linear convolution and also synthesize a new algorithm for calculating this convolution. During the synthesis of the discussed algorithm, we use the fact that quaternion multiplication may be represented as a matrix-vector product. The...
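The matrix-vector view of quaternion multiplication mentioned in this abstract can be illustrated with a minimal sketch. The function names and the component ordering (real part first, then i, j, k) are assumptions made for the example; the code shows only the direct product with 16 real multiplications and 12 real additions, not the reduced-complexity algorithm of the paper.

```python
def left_matrix(a):
    """4x4 real matrix L(a) such that L(a) @ b equals the Hamilton product a*b.

    Assumed layout: a = (a0, a1, a2, a3) stands for a0 + a1*i + a2*j + a3*k.
    """
    a0, a1, a2, a3 = a
    return [
        [a0, -a1, -a2, -a3],
        [a1,  a0, -a3,  a2],
        [a2,  a3,  a0, -a1],
        [a3, -a2,  a1,  a0],
    ]


def quaternion_multiply(a, b):
    """Direct quaternion product computed as a matrix-vector product."""
    return tuple(
        sum(row[col] * b[col] for col in range(4))
        for row in left_matrix(a)
    )
```

For instance, `quaternion_multiply((0, 1, 0, 0), (0, 0, 1, 0))` multiplies i by j and yields the components of k. Fast algorithms of the kind described in the abstract typically start from this matrix form and exploit its block structure.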
Article
Discrete orthogonal transforms including the discrete Fourier transform, the discrete Walsh transform, the discrete Hartley transform, the discrete Slant transform, etc. are extensively used in radio-electronic and telecommunication systems for data processing and transmission. The popularity of these transforms is explained by the presence of...
Article
Full-text available
Discrete orthogonal transforms such as the discrete Fourier transform, discrete cosine transform, discrete Hartley transform, etc., are important tools in numerical analysis, signal processing, and statistical methods. The successful application of transform techniques relies on the existence of efficient fast algorithms for their implementation. A...
Article
This paper proposes a new fast algorithm for calculating the discrete fractional Hadamard transform for data vectors whose size N is a power of two. A direct method for the calculation of the discrete fractional Hadamard transform requires O(N^2) m...
Chapter
This paper proposes an algorithm for computing the discrete fractional Fourier transform. The algorithm takes advantage of the special structure of the discrete fractional Fourier transformation matrix. This structure makes it possible to reduce the number of arithmetic operations required to calculate the discrete fractional Fourier transform.
Chapter
In this paper, we have proposed a novel VLSI-oriented parallel algorithm for quaternion-based rotation in 3D space. The advantage of our algorithm is a reduction in the number of multiplications, achieved by replacing some of them with less costly squarings. The algorithm uses Logan’s trick, which proposes to replace the calculation of the product of two num...
Preprint
Full-text available
This paper presents a structural design of a hardware-efficient module for implementing the basic operation of a convolutional neural network (CNN) with reduced implementation complexity. For this purpose we utilize a modification of the Winograd minimal filtering method as well as computation vectorization principles. This module calculates inner pr...
Chapter
In this work, a fast algorithm for quaternion-based 4D rotation is presented which reduces the number of underlying real multiplications. Performing a quaternion-based rotation using a rotation matrix takes 32 multiplications and 60 additions of real numbers while the proposed algorithm can compute the same result in only 16 real multiplications (or...
Conference Paper
In this chapter an algorithm for computing the Vandermonde matrix-vector product is presented. The main idea behind this algorithm is the use of Winograd’s formula for inner product computation. The multiplicative complexity of the proposed algorithm is less than the multiplicative complexity of the schoolbook (naïve) method of calculat...
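Winograd's inner-product formula referred to in this abstract can be sketched as follows for vectors of even length. The function names are illustrative assumptions, and the code shows only the general identity, not the Vandermonde-specific algorithm of the chapter.

```python
def pair_products(v):
    """Sum of products of adjacent pairs: v[0]*v[1] + v[2]*v[3] + ..."""
    return sum(v[2 * j] * v[2 * j + 1] for j in range(len(v) // 2))


def winograd_inner_product(x, y, xi=None, eta=None):
    """Inner product of x and y via Winograd's identity (even length only).

    For each pair of indices (2j, 2j+1):
        (x[2j] + y[2j+1]) * (x[2j+1] + y[2j])
            = x[2j]*x[2j+1] + x[2j]*y[2j] + x[2j+1]*y[2j+1] + y[2j]*y[2j+1]
    so subtracting xi = pair_products(x) and eta = pair_products(y) from the
    sum of these products leaves the ordinary inner product.
    xi and eta may be passed in when they were precomputed elsewhere.
    """
    n = len(x)
    if n != len(y) or n % 2:
        raise ValueError("vectors must have the same even length")
    if xi is None:
        xi = pair_products(x)
    if eta is None:
        eta = pair_products(y)
    mixed = sum(
        (x[2 * j] + y[2 * j + 1]) * (x[2 * j + 1] + y[2 * j])
        for j in range(n // 2)
    )
    return mixed - xi - eta
```

The saving appears when many inner products share operands, as in a matrix-vector product: `eta` for the vector is computed once, `xi` for each matrix row can be prepared in advance, and each inner product then needs only n/2 multiplications in the main sum.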
Article
Full-text available
This paper proposes an effective approach to the computation of the discrete fractional Fourier transform for an input vector of any length N. This approach uses specific structural properties of the discrete fractional Fourier transformation matrix. Thanks to these properties, the fractional Fourier transformation matrix can be decomposed into a s...
Article
Full-text available
In this paper, new schemes for a squarer, multiplier and divider of complex numbers are proposed. Traditional structural solutions for each of these operations require a number of general-purpose binary multipliers. The advantage of our solutions is that multiplications are eliminated and replaced by less costly squarers. We use...
Article
Full-text available
In this paper, we offer and discuss three efficient structural solutions for the hardware-oriented implementation of discrete quaternion Fourier transform basic operations with reduced implementation complexities. The first solution: a scheme for calculating sq product, the second solution: a scheme for calculating qt product, and the third solutio...
Article
This paper describes the methods for finding fast algorithms for computing matrix–vector products including the procedures based on the block-structured matrices. The proposed methods involve an analysis of the structural properties of matrices. The presented approaches are based on the well-known optimization techniques: the simulated annealing an...
Article
Full-text available
In this paper, we have proposed a novel VLSI-oriented approach to computing the rotation matrix entries from the quaternion coefficients. The advantage of this approach is the complete elimination of multiplications and replacing them by less costly squarings. Our approach uses Logan's identity, which proposes to replace the calculation of the prod...
Article
Full-text available
In this work a rationalized algorithm for calculating the quotient of two complex numbers is presented which reduces the number of underlying real multiplications. Performing a complex number division using the naive method takes 4 multiplications, 3 additions, 2 squarings and 2 divisions of real numbers while the proposed algorithm can comp...
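The naive operation count quoted in this abstract can be traced in a minimal sketch of textbook complex division. The function name and argument layout are assumptions for illustration; this is the baseline method only, not the rationalized algorithm proposed in the paper.

```python
def complex_divide_naive(a, b, c, d):
    """Compute (a + b*i) / (c + d*i) with the textbook formula.

    Operation count on real numbers: 4 multiplications, 3 additions
    (counting the subtraction as an addition), 2 squarings, 2 divisions.
    """
    denom = c * c + d * d          # 2 squarings, 1 addition
    if denom == 0:
        raise ZeroDivisionError("division by zero complex number")
    re = (a * c + b * d) / denom   # 2 multiplications, 1 addition, 1 division
    im = (b * c - a * d) / denom   # 2 multiplications, 1 addition, 1 division
    return re, im
```

Calling `complex_divide_naive(1.0, 2.0, 3.0, 4.0)` divides 1 + 2i by 3 + 4i; the result can be compared against Python's built-in `complex` division as a sanity check.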
Article
In this paper we introduce an efficient algorithm for the multiplication of biquaternions. The direct multiplication of two biquaternions requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of the Pauli numbers with 24 real multiplications and 64 real additions. During s...
Article
Full-text available
The article presents a computationally effective algorithm for calculating the multiresolution discrete Fourier transform (MrDFT). The algorithm is based on the idea of reducing the computational complexity which was introduced by Wen and Sandler [10] and utilizes the vectorization of calculating process at each stage of the considered transformati...
Article
Full-text available
This paper presents the derivation of a new algorithm for multiplying two Kaluza numbers. Performing this operation directly requires 1024 real multiplications and 992 real additions. The proposed algorithm can compute the same result with only 512 real multiplications and 576 real additions. The derivation of our algorithm is based on utilizing...
Article
We present an efficient algorithm to multiply two arbitrary biquaternions. The schoolbook multiplication of two biquaternions requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of the biquaternions with 24 real multiplications and 56 real additions. During synthesis...
Article
In this paper we introduce an efficient algorithm for the multiplication of biquaternions. The direct multiplication of two biquaternions requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of biquaternions with 24 real multiplications and 64 real additions. During sy...
Article
Full-text available
In this paper we introduce an efficient algorithm for the multiplication of split-octonions. The direct multiplication of two split-octonions requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of the split-octonions with 28 real multiplications and 92 real additions. Du...
Article
Full-text available
We present an efficient algorithm to multiply two hyperbolic octonions. The direct multiplication of two hyperbolic octonions requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of the hyperbolic octonions with 26 real multiplications and 92 real additions. During syn...
Article
Full-text available
In this work a rationalized algorithm for Dirac numbers multiplication is presented. This algorithm has low computational complexity and is well suited to FPGA implementation. Computing the product of two Dirac numbers using the naïve method takes 256 real multiplications and 240 real additions, while the proposed algorithm can comput...
Article
Full-text available
In this paper we present a hardware-oriented algorithm for calculating a constant matrix-vector product when all elements of the vector and matrix are complex numbers. Compared with the naive method of performing the same calculation, the proposed algorithm drastically reduces the number of multipliers required for FPGA implementation of complex-valued constant matr...
Article
Full-text available
In this paper we introduce an efficient algorithm for the multiplication of trigintaduonions. The direct multiplication of two trigintaduonions requires 1024 real multiplications and 992 real additions. We show how to compute a trigintaduonion product with 498 real multiplications and 943 real additions. During synthesis of the discussed algorithm we...
Article
Full-text available
This article offers strategies for the synthesis of fast algorithms for computing matrix-vector products. It considers a specific example of the synthesis of a fast algorithm for matrix-by-vector multiplication. The example makes it possible to track all the stages of construction of the algorithm, which was rationalized from the point of view...
Article
Full-text available
In this paper two different approaches to the rationalization of FDWT and IDWT basic operations execution with the reduced number of multiplications are considered. With regard to the well-known approaches, the direct implementation of the above operations requires 2L multiplications for the execution of FDWT and IDWT basic operation plus 2(L-1) ad...
Article
Full-text available
This paper proposes a fast algorithm for computing the discrete fractional Hadamard transform for an input vector of length N, where N is a power of two. A direct calculation of the discrete fractional Hadamard transform requires N^2 real multiplications, while in our algorithm the number of real multiplications is reduced to N log2 N.
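To give a sense of scale for the two multiplication counts stated in this abstract, the following minimal sketch evaluates both expressions for a given power-of-two length. The function name is an assumption for illustration, and the code only evaluates the stated formulas; it does not implement the transform itself.

```python
def multiplication_counts(n):
    """Return (direct, fast) real-multiplication counts for length n.

    direct = n^2 and fast = n * log2(n), the two figures quoted in the
    abstract; n must be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    log2_n = n.bit_length() - 1   # exact log2 for powers of two
    return n * n, n * log2_n
```

For N = 8 the counts are 64 and 24, and for N = 1024 they are 1,048,576 and 10,240, so the gap widens quickly as N grows.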
Article
Full-text available
In this paper we introduce an efficient algorithm for the multiplication of Pauli numbers. The direct multiplication of two Pauli numbers requires 64 real multiplications and 56 real additions. More effective solutions still do not exist. We show how to compute a product of the Pauli numbers with 24 conventional multiplications, 8 multiplications b...
Article
We propose an original algorithmic solution for multiplication of octonions. In previously published algorithms for computing the product of octonions the number of multiplications has been reduced at the cost of significantly increasing the number of additions and shifts. An advantage of the proposed solution is a 25% reduction in the number of multiplications needed...
Article
In this work a rationalized algorithm for Dirac numbers multiplication is presented. This algorithm has low computational complexity and is well suited to parallelization of computations. Computing the product of two Dirac numbers using the naïve method takes 256 real multiplications and 240 real additions, while the proposed algorithm...
Article
Full-text available
This paper presents a high-speed parallel 3x3 matrix multiplier structure. To reduce the hardware complexity of the multiplier structure, we propose to modify Makarov’s algorithm for 3x3 by 3x3 matrix multiplication. The process of matrix product calculation is successively decomposed so that a minimal set of multipliers and fewer adders are use...
Conference Paper
Full-text available
This paper presents rationalized approaches to computing Discrete Wavelet Transform (DWT) coefficients, based on a modified algorithm for executing the DWT base operation with fewer multiplications. These intuitive approaches make it possible to decrease the computational complexity of hardware operational units for DWT basic operation computation and...
Article
Full-text available
In this note we present an algorithm for calculating the vector-matrix product for vectors and matrices whose elements are complex numbers. Abstract. The article presents a rationalized algorithm for computing the vector-matrix product for data consisting of complex numbers. Compared with the method... the proposed algorithm is distinguished by...
Article
Full-text available
This work presents a vectorized algorithm for calculating Strassen’s matrix product. Unlike the “some recommendations” on implementing Strassen’s matrix multiplication proposed in other works, we offer specific computational procedures that make it possible to correctly describe the entire sequence of transformations needed to obtain the fin...
Article
Full-text available
The paper presents practical and effective algorithms for calculating the product of a Toeplitz/Hankel matrix and a vector; they are non-recursive modifications of Karatsuba's method. Unlike traditional algorithms, in this case using the FFT is not required. Realization of the developed algorithms involves the use of unconventional ways of choosing the elemen...
Article
Full-text available
The paper presents a fast algorithm for the calculation of a multiresolution discrete Fourier transform. The presented approach is based on the realization of the Fast Fourier Transform for each frequency resolution level. This algorithm allows reducing the number of complex multiplications and additions compared to the method consisting in the mul...
Article
Full-text available
This paper presents an algorithm for computing the discrete fractional Hadamard transform for an input vector of length 2^n. This algorithm allows a significant reduction in the number of arithmetic operations by taking advantage of the specific structure of the discrete fractional Hadamard transformation matrix. Abstract. The article presents...
Conference Paper
In this paper a new approach to the optimized implementation of the Discrete Wavelet Transform (DWT) is presented, which is based on the original algorithm for performing basic DWT procedure with reduced number of multiplications. This approach allows reduction of requirements for hardware cost and computing time and creates usable conditions for e...
Article
In this work a rationalized algorithm for calculating the product of sedenions is presented which reduces the number of underlying multiplications. Therefore, reducing the number of multiplications in VLSI processor design is usually a desirable task. The computation of a sedenion product using the naive method takes 256 multiplications and 240 add...
Data
Full-text available
This paper presents an algorithm for computing the discrete fractional Hadamard transform for an input vector of length 2^n. This algorithm allows a significant reduction in the number of arithmetic operations by taking advantage of the specific structure of the discrete fractional Hadamard transformation matrix. Keywords: discrete fractional Hadamard...
Article
Full-text available
In this paper we introduce an efficient algorithm for the multiplication of sedenions. The direct multiplication of two sedenions requires 256 real multiplications and 240 real additions. We show how to compute a sedenion product with 120 real multiplications and 344 real additions.
Article
We consider algorithmic aspects of improving calculations of the octonion product. Octonions together with quaternions represent a variety of hypercomplex numbers. An advantage of the suggested algorithm is that it halves the number of real multiplications needed to compute the octonion product compared to a straightforward naive wa...
Article
A rationalized algorithm for multiplying two quaternions is presented; in the general case, it requires fewer multiplication operations than the naive method of computation. (Rationalized algorithm for two quaternion multiplication).
Chapter
New hardware-oriented supervectorized algorithmic models for computing the discrete Fourier transform, convolution and correlation of numerical sequences are described. Being fully parallelized, the models make it possible to achieve minimal calculation time in comparison with known ones. Key words: digital signal processing, discrete Fourier transform, super...
Article
Full-text available
Abstract: This work presents a rationalized algorithm, synthesized by the authors, for multiplying two quaternions, which in the most general case requires fewer multiplication operations than the direct, naive method of computation. Keywords: quaternion multiplication, fast algorithms, matrix notation 1....
