Conference Paper

FPGA Implementation of Support Vector Machine Based Isolated Digit Recognition System

Authors:
  • PES University
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

In this paper, two schemes for FPGA implementation of a multi-class SVM-based isolated digit recognition system are proposed, one using only logic elements (LEs) and another using both a soft-core processor and LEs. One of the major contributions of this paper is the proposal to implement the decision function using only fixed-point arithmetic without compromising the recognition accuracy. Compared to the scheme which uses floating-point arithmetic, the proposed scheme reduces the number of LEs required by a factor of 3.29. The second scheme requires about 25 times less area than the first scheme. For the soft-core processor approach, a custom instruction is proposed for floating-point arithmetic. The speaker-dependent TI46 database of isolated digits is used for training and testing. Features are extracted using both Linear Predictive Coefficients (LPC) and Mel Frequency Cepstral Coefficients (MFCC), and the features are compressed using Self Organized Feature Mapping (SOFM). The compressed features are in turn used by the SVM classifier to evaluate the recognition accuracy and the hardware resources utilized. Both proposed schemes achieve 100% recognition accuracy when implemented on an Altera Cyclone II FPGA. The proposed schemes can also be used for speaker verification and speaker authentication applications. Since the scheme which uses the soft-core processor requires less area, it can be used for systems which require a large vocabulary size.
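The idea of evaluating the SVM decision function with fixed-point arithmetic only can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the linear kernel, the 12 fractional bits, the function names, and the toy support vectors. The sketch quantizes all operands to signed integers, uses integer multiply-accumulate only, and checks that the sign of the result (the class decision) matches the floating-point reference.

```python
FRAC_BITS = 12  # assumed number of fractional bits; not a value from the paper


def to_fixed(value, frac_bits=FRAC_BITS):
    """Quantize a real number to a signed fixed-point integer."""
    return int(round(value * (1 << frac_bits)))


def decision_float(support_vectors, coeffs, bias, x):
    """Reference: f(x) = sum_i coeff_i * <sv_i, x> + bias (linear kernel)."""
    return sum(
        c * sum(s * xi for s, xi in zip(sv, x))
        for sv, c in zip(support_vectors, coeffs)
    ) + bias


def decision_fixed(support_vectors, coeffs, bias, x, frac_bits=FRAC_BITS):
    """Same function using integer multiply-accumulate only.

    The result carries 3 * frac_bits fractional bits; only its sign is
    needed to classify.
    """
    xq = [to_fixed(v, frac_bits) for v in x]
    # Align the bias with the products, which have 3 * frac_bits fractional bits.
    acc = to_fixed(bias, frac_bits) << (2 * frac_bits)
    for sv, c in zip(support_vectors, coeffs):
        dot = sum(to_fixed(s, frac_bits) * xi for s, xi in zip(sv, xq))
        acc += to_fixed(c, frac_bits) * dot
    return acc


# Hypothetical toy model, for illustration only.
svs = [[0.5, -0.25], [-1.0, 0.75]]
coeffs = [0.8, -0.6]   # each entry plays the role of alpha_i * y_i
bias = 0.1
x = [0.3, 0.9]

f_ref = decision_float(svs, coeffs, bias, x)
f_fix = decision_fixed(svs, coeffs, bias, x)
print(f_ref < 0, f_fix < 0)
```

In hardware the accumulator width would have to be chosen so that the products cannot overflow; Python integers hide that constraint, so the sketch only shows the arithmetic, not the resource cost.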


... This accuracy was to the same degree of accuracy of software implementations using the same classification mechanism. In [80], Manikandan et al. proposed a FPGA implementation of multi-class SVM classification for isolated digit recognition. The implementation achieved 100% recognition accuracy for the speaker dependent TI46 database. ...
... By discussing the feasibility of implementation, we find that FPGA is the most suitable embedded system to realize the complete signal processing of the stress recognition. Since the previously mentioned studies [59,80,96] have shown that the SVM classifier can be well implemented in FPGA, we can adopt the approaches proposed there to implement the SVM in FPGA for the classification of the stress levels. However, we found that there are few articles about implementing the ECG based HR computation in FPGA. ...
... We have found that these operations can be implemented in FPGA with a fast and area-efficient approach. The previously mentioned studies [59,80,96] have shown that the SVM classifier can be well implemented in FPGA. The decision fusion with the voting method can be implemented by using a counter. ...
Thesis
In modern society, the stress of an individual has been found to be a common problem. Continuous stress can lead to various mental and physical problems, especially for people who regularly face emergency situations (e.g., firefighters): it may alter their actions and put them in danger. Therefore, it is meaningful to provide an assessment of the stress of an individual. Based on this idea, the Psypocket project is proposed, which aims at making a portable system able to analyze accurately the stress state of an individual based on his physiological, psychological and behavioural modifications. It should then offer solutions for feedback to regulate this state. The research of this thesis is an essential part of the Psypocket project. In this thesis, we discuss the feasibility and the interest of stress recognition from heterogeneous data. Not only physiological signals, such as Electrocardiography (ECG), Electromyography (EMG) and Electrodermal activity (EDA), but also reaction time (RT) are adopted to recognize different stress states of an individual. For the stress recognition, we propose an approach based on an SVM (Support Vector Machine) classifier. The results obtained show that the reaction time can be used to estimate the level of stress of an individual, with or without the physiological signals. Besides, we discuss the feasibility of an embedded system which would realize the complete data processing. Therefore, the study of this thesis can contribute to making a portable system that recognizes the stress of an individual in real time by adopting heterogeneous data such as physiological signals and RT.
... This accuracy was to the same degree of accuracy of software implementations using the same classification mechanism. In [13], Manikandan et al. proposed a FPGA implementation of multi-class SVM classification for isolated digit recognition. The implementation achieved 100% recognition accuracy for a speaker-dependent TI46 database. ...
... Previously mentioned researches [8,13,18] have shown that SVM can be well implemented in FPGA and achieve good performance to solve a classification problem. However, in the literature, there are few articles about implementing the ECG-based heart rate (HR) computation in FPGA. ...
Article
Full-text available
This work is part of the Psypocket project which aims to conceive an embedded system able to recognize the stress state of an individual based on physiological and behavioural modifications. In this paper, one of the physiological data, the electrocardiographic (ECG) signal, is focused on. The QRS complex is the most significant segment in this signal. By detecting its position, the heart rate can be learnt. In this paper, a field-programmable gate array (FPGA) architecture for QRS complex detection is proposed. The detection algorithm adopts the integer Haar transform for ECG signal filtering and a maximum finding strategy to detect the location of R peak of the QRS complex. The ECG data are originally recorded by double-precision decimal with the sampling frequencies of 2000 Hz. For the FPGA implementation, they should be converted to integers with rounding operation. To find the best multiplying factor for rounding, the comparison is performed in MATLAB. Besides, to reduce the computation load in FPGA, the feasibility of the reduction in the sampling frequency is tested in MATLAB. The FPGA Cyclone EP3C5F256C6 is used as the target chip, and all the components of the system are implemented in VHSIC hardware description language. The testing results show that the proposed FPGA architecture achieves a high detection accuracy (98.41%) and a good design efficiency in terms of silicon consumption and operation speed. The proposed architecture will be adopted as a core unit to make a FPGA system for stress recognition.
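The steps named in this abstract (scaling to integers with rounding, integer Haar filtering, and a maximum search for the R peak) can be sketched as follows. This is only an illustration under stated assumptions: the multiplying factor of 100, the single transform level, the lifting form of the integer Haar transform, the function names, and the sample values are all hypothetical and are not taken from the paper.

```python
def scale_to_int(values, factor):
    """Convert real-valued samples to integers by scaling and rounding."""
    return [int(round(v * factor)) for v in values]


def integer_haar(samples):
    """One level of an integer Haar transform on an even-length list.

    approx = floor((a + b) / 2), detail = a - b, using only integer
    additions, subtractions and shifts.
    """
    pairs = list(zip(samples[0::2], samples[1::2]))
    approx = [(a + b) >> 1 for a, b in pairs]
    detail = [a - b for a, b in pairs]
    return approx, detail


def inverse_integer_haar(approx, detail):
    """Exactly undo integer_haar, showing that the transform is lossless."""
    samples = []
    for s, d in zip(approx, detail):
        a = s + ((d + 1) >> 1)
        samples.extend([a, a - d])
    return samples


def find_peak(samples):
    """Index of the largest sample (first one on ties)."""
    best = 0
    for i, v in enumerate(samples):
        if v > samples[best]:
            best = i
    return best


# Hypothetical short window around one heartbeat, for illustration only.
window = [0.01, 0.02, 0.05, 0.90, 0.30, 0.02, 0.00, 0.01]
ints = scale_to_int(window, 100)      # assumed multiplying factor
approx, detail = integer_haar(ints)
peak = find_peak(approx)              # position in the half-rate signal
print(approx, detail, peak)
```

Because the approximation band runs at half the input rate, a peak index found there corresponds to a pair of original samples, which is one reason the trade-off between sampling frequency and detection accuracy has to be checked separately.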
... FPGAs present a level of flexibility that general purpose CPUs cannot reach. Also, their computational power by energy consumption rate surpasses GPUs', with the newest models offering even more computational power than late GPU models [1,2]. In the last years, the main producers introduced SoC/FPGA boards, incorporating in the same encapsulation a regular processor, an FPGA and a high-speed communication bus between them. ...
... FPGA promises have attracted implementation of several classification methods. An SVM application for single digit recognition using an FPGA implementation can be found in [20]. Parallel to improvements in the SVM algorithm and evolution of the FPGA devices themselves, new propositions for SVM implementations appeared in [21,22,23]. ...
Article
Full-text available
Classification techniques development constitutes a foundation for machine learning evolution, which has become a major part of the current mainstream of Artificial Intelligence research lines. However, the computational cost associated with these techniques limits their use in resource constrained embedded platforms. As the classification task is often combined with other high computational cost functions, efficient performance of the main modules is fundamental requirements to achieve hard real-time speed for the whole system. Graph-based machine learning techniques offer a powerful framework for building classifiers. Optimum-Path Forest (OPF) is a graph-based classifier presenting the interesting ability to provide nonlinear classes separation surfaces. This work proposes a SoC/FPGA based design and implementation of an architecture for embedded applications, presenting a hardware converted algorithm for an OPF classifier. Comparison of the achieved results with an embedded processor software implementation shows accelerations of the OPF classification from 2.18 to 9 times, which permits to expect real-time performance to embedded applications.
... ASR systems need to be really fast because their applications have a specific characteristic which is the real time response in practice [5]. However, knowing that SVM model has high computational load and it is time-consuming, especially for large-scale problems, it is needed an optimization algorithm to accelerate the recognition process [26]. Then, in order to reduce the processing time, it is used the Particle Swarm Optimization (PSO) technique which is a bio-inspired algorithm for searching the best position of an individual's group [15,30,37]. ...
Article
Full-text available
This paper proposes the implementation of an Automatic Speech Recognition (ASR) process through extraction of Mel-Frequency Cepstral Coefficients (MFCCs) from voice signal commands, application of the Discrete Cosine Transform (DCT) in these coefficients, Support Vector Machine (SVM) training optimized by the Particle Swarm Optimization (PSO) technique in order to speed up the whole process and using One Against All (OAA) multiclass SVM classification. The main contribution is in training phase that it is the combination of SVM with PSO algorithm, resulting in computational load and processing time reduction. This novel algorithm is called here as PSO-SVM hybrid training application and its performance is shown as the experimental results of voice signal commands in Brazilian Portuguese language. Such commands comprise 10 isolated digits (from zero to nine) and 20 action commands such as “go ahead”, “finish”, “pause”, etc.; that is, there are 30 different pattern types (classes) to be separated (recognized). The process is speaker independent type, that is, the voice bank used in training is different from the one used in tests. The obtained results presented success rates of 92% to 99% during the tests for the classifier using RBF kernel function. Besides, the comparison section shows that this technique is 25 times faster than the recognition without optimization and also, it presents 10% of improvement in recognition success rate when compared to the well-known technique, Gaussian Mixture Models (GMM) algorithm. In addition, the proposed algorithm can be applied in any data processing board for voice signals (DSP, FPGA, DSPIC, ...).
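The One Against All (OAA) decision rule mentioned in this abstract can be sketched in a few lines: one binary classifier is trained per class, and the predicted class is the one whose classifier returns the largest decision value. The function name, the class labels, and the decision values below are hypothetical placeholders, not results from the paper.

```python
def one_against_all(decision_values):
    """Return the label whose binary classifier gives the largest value.

    decision_values maps each class label to the output of the classifier
    trained to separate that class from all the others.
    """
    return max(decision_values, key=decision_values.get)


# Hypothetical decision values for three of the 30 classes.
scores = {"zero": -0.4, "one": 0.7, "pause": 0.1}
predicted = one_against_all(scores)
print(predicted)
```

With 30 classes this scheme needs 30 binary classifiers per utterance, which is part of why recognition time matters in such a system.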
... For example, some resource-limited devices are not equipped with floating-point processing units to avoid battery draining, speed-up computation, or reduce chip sizes. Consequently, it is quite interesting to learn models, which can be described in fixed point format with a restricted number of bits [11,12,50,51,52,53,54,55,56,57,58,59]. This implies revising and restructuring the hypothesis spaces (in the case of the conventional Vapnik's Structural Risk Minimization [SRM] or PAC-Bayes Theory) or the learning procedures (in the case of Algorithmic Stability), which will rely only on a fixed number of bits κ: the smaller is κ, the simpler is the resulting model. ...
... refer to results on Minimum Description Length [110,111]). For example, some resource-limited devices are not equipped with floating point processing units, to avoid battery draining, to speed-up computation or to reduce the size of the chip: consequently, it is important to learn models which can be described with few bits in a fixed point format [112,113,114,115,116,75,117,118,119,120,121,122]. ...
Chapter
Quantum computing represents a promising paradigm for solving complex problems, such as large-number factorization, exhaustive search, optimization, and mean and median computation. On the other hand, supervised learning deals with the classical induction problem where an unknown input-output relation is inferred from a set of data that consists of examples of this relation. Lately, because of the rapid growth of the size of datasets, the dimensionality of the input and output space, and the variety and structure of the data, conventional learning techniques have started to show their limits. Considering these problems, the purpose of this chapter is to illustrate how quantum computing can be useful for addressing the computational issues of building, tuning, and estimating the performance of a model learned from data.
... FPGAs. This is a significant difference from SVM-based classification since FPGA implementation of SVM is known to be a complex task in itself [31], [32], [33]. ...
... Feature extraction of speech is done using several different techniques, the most common being Linear Predictive Coding (LPC), Perceptual Linear Predictive (PLP) and Mel Frequency Cepstral Coefficients (MFCC). LPC is a time-domain technique which varies in the amplitude of the speech signal due to noise. The preferred technique for feature extraction is MFCC with 13 coefficients [4,5], where the features are generated by transforming the signal into frequency domain. In general, Cepstral features are more compact and de-correlated. The experiment based on TIDIGITS corpus demonstrates the effectiveness of proposed techniques leading to higher speech recognition accuracy. ...
Article
Full-text available
Background/Objectives: Recent research focuses on low-power design techniques. This has been mainly inspired by the demand for hand-held electronic devices, which have to consume less power. Neural network based classifiers are widely used in speech recognition, which needs higher power. Methods/Statistical Analysis: A hybrid approach for designing the architecture of a Multi-Layer Perceptron (MLP) based Neural Network (NN) for speech recognition is proposed using the bipartite tabular method and the banking organization method. This approach is prototyped in Xilinx xc3s1200. The back propagation neural network is trained in MATLAB using the TIDIGITS corpus. Using the optimized weights from MATLAB, the proposed prototyped low-power architecture is evaluated in Xilinx. Findings: The outcome shows a considerable reduction both in area and power. Conclusion/Improvements: Reductions of 33% in switching power and 25% in average power are noted. There was a 2% reduction in the resources used by the proposed architecture.
... FPGA is an integrated circuit designed to be configured by a customer or a designer after manufacturing, hence field programmable. FPGAs not only offer parallelism but also flexible design, saving in cost and design cycle [19]. The realization of DEMS on FPGA is governed by the testing of unknown data patterns with already trained classifiers. ...
Article
Full-text available
A Support Vector Machine based Dynamic Energy Management System (DEMS) to handle the Smart Micro Grid (SMG) in islanding mode is proposed and developed in this paper. DEMS controls the charge-discharge transactions of the energy storage modules installed in the SMG, thereby handling the supply-demand imbalance. The proposed system also performs Demand Response Program based Load Management in the island, as frequency control becomes crucial for an islanded SMG. DEMS, being a self-decisive system, provides a unique solution enabling the distributed control of the SMG to be performed autonomously. A Field Programmable Gate Array (FPGA) is used to realize the proposed system, as the response time is critical, particularly when the SMG is islanded. The DEM scheme is validated in a simulated MATLAB environment.
... Speaker recognition is a biometric modality that uses an individual's voice for recognition purposes [1]. The speaker recognition process relies on features influenced by both the physical structure of an individual's vocal tract and the behavioral characteristics of the individual [2]. Speaker recognition technology as a non-contact identification technology, in the judicial, military, and information services, etc. Embedded speaker recognition system, at present, is usually based on DSP processors and other hardware platforms, training and recognition time-consuming [3], bad real-time. ...
Article
Full-text available
On hard-core processors such as DSPs, existing embedded speech recognition systems spend considerable time on training and recognition. This paper presents an FPGA-based implementation of a speech recognition system built on the principle of vector quantization. In vector quantization using a genetic algorithm for speaker recognition systems, the parallel hardware structure can greatly reduce the computation time. Tests show that the implementation effectively reduces the time spent on training and recognition while preserving the recognition rate. (C) 2011 Published by Elsevier Ltd.
... Its main purpose was providing the basis to detect accuracy and reliability of telemetry converter. The test platform was one large application equipment gathered with signal generator, automatic detection, automatic measurement and data analysis [1,2]. The core of the missile equipments were digital quantity telemetry converter, instructions converter, comprehensive measuring controller. ...
Article
Full-text available
Based on the current state of automatic test systems at home and abroad, and on an analysis of the functions and performance of the integrated test system, a design of the integrated test system is proposed, with an FPGA as the core logic controller of the hardware circuit. The hardware design includes digital signal source output modules, an analog output module and a PCM codec module. The design of the hardware circuit is the main subject described. In addition, a detailed analysis of some key technologies in the design process is given. Data exchange with the host computer goes through the PCI card, and the data link and bandwidth can be expanded in accordance with actual needs. The entire system follows a modular design principle, which gives it strong scalability.
... The system architecture consists of a general-purpose 32-bit microprocessor and several slave coprocessors that accelerate the most intensive calculations, achieving significant reduction in execution time when compared with a conventional software-based application. FPGAs have also been used for implementing some specific part of a speech or speaker recognition algorithm [18,19,20,21,22], although none of them integrate, jointly, the extraction and the matching stages. For example, in [21] an efficient extractor of MFCC parameters for automatic speech recognition is proposed using a low-cost FPGA. ...
Article
Full-text available
Nowadays, biometrics is considered as a promising solution in the market of security and personal verification. Applications such as financial transactions, law enforcement or network management security are already benefitting from this technology. Among the different biometric modalities, speaker verification represents an accurate and efficient way of authenticating a person’s identity by analyzing his/her voice. This identification method is especially suitable in real-life scenarios or when a remote recognition over the phone is required. The processing of a signal of voice, in order to extract its unique features, that allows distinguishing an individual to confirm or deny his/her identity is, usually, a process characterized by a high computational cost. This complexity imposes that many systems, based on microprocessor clocked at hundreds of MHz, are unable to process samples of voice in real-time. This drawback has an important effect, since in general, the response time needed by the biometric system affects its acceptability by users. The design based on FPGA (Field Programmable Gate Arrays) is a suited way to implement systems that require a high computational capability and the resolution of algorithms in real-time. Besides, these devices allow the design of complex digital systems with outstanding performance in terms of execution time. This paper presents the implementation of a MFCC (Mel-Frequency Cepstrum Coefficients)—SVM (Support Vector Machine) speaker verification system based on a low-cost FPGA. Experimental results show that our system is able to verify a person’s identity as fast as a high-performance microprocessor based on a Pentium IV personal computer.
... Three classification techniques used successfully i.e. HMM, ANN and SVM were reported in [18,25,28,37]. Unfortunately, this type of system suffers from limitation in vocabulary size. ...
Article
Full-text available
In this paper speech theories and some methodological concerns about feature extraction and classification techniques widely used in speech recognition system are surveyed and discussed. The shortage of isolated word speech recognition is addressed as compared to its phoneme-based counterpart. This paper could be regarded as a very early stage towards methodology establishment in searching for better accuracy and less complexity system which has more generic properties. It is hoped that the system can classify speech regardless of the varieties across languages or accents. Speaker independency (SI) manner speech recognition system is required for this application and in fact, in many other potential applications as much as a telephonic network (large database consists of many different speakers) is a primary requirement. Isolated-word ASR for fixed vocabularies has been successfully implemented using HMM, ANN and SVM but suffers from lack of adaptability to other languages and increase in complexity as number of vocabularies increases. Conversely, phonemes, the smallest unit of human speech sounds are apparently more feasible to represent the basic building block for cross-language mapping. In fact, the phonetic transcription systems such as IPA and SAMPA are widely recognized and standardized for several languages in the world. This paper intends to investigate the phoneme-based potential as language independent phonetic units to overcome the lack of available training data so as to achieve a more generic speech recognizer. Keywords-component; Speech recognition system; Isolated word; Phoneme; Multilingual detection; Speaker independent;
... The hardware we propose has been designed based on Support Vector Machines (SVM) learning paradigm, which have a solid theoretical background and more clear formulation compared to neural networks [4][5]. Most significant contributions to this field, report FPGA implementation of SVMs for specific target applications [6][7][8] and problems dealing with relatively simple dataset and binary classification problems. However there are a few works focused into generic applications: In [9] ongoing research into a generic and versatile architecture for SVM classification is described. ...
Conference Paper
Full-text available
We present a successful design for a high-performance, low-resource-consuming hardware for Support Vector Classification and Support Vector Regression. The system has been implemented on a low cost FPGA device and exploits the advantages of parallel processing to compute the feed forward phase in support vector machines. In this paper we show that the same hardware can be used for classification problems and regression problems, and we show satisfactory results on an image recognition problem by SV multiclass classification and on a function estimation problem by SV regression.
... There is only very little work investigating the effects of reduced working precision on SVM classification in floating-point arithmetic. In [17, 18] no significant loss in classification accuracy compared to the IEEE double precision floating-point classification has been reported when experimentally reducing the floating-point mantissa length to ≈ 30% and 50% respectively, but also here this is done by experiments only. Similar results have been reported for SVM classification performed in a logarithmic number system [19, 20]. ...
Article
Full-text available
There is growing interest in performing ever more complex classification tasks on mobile and embedded devices in real-time, which results in the need for efficient implementations of the respective algorithms. Support vector machines (SVMs) represent a powerful class of nonlinear classifiers, and reducing the working precision represents a promising approach to achieving efficient implementations of the SVM classification phase. However, the relationship between SVM classification accuracy and the arithmetic precision used is not yet sufficiently understood. We investigate this relationship in floating-point arithmetic and illustrate that often a large reduction in the working precision of the classification process is possible without loss in classification accuracy. Moreover, we investigate the adaptation of bounds on allowable SVM parameter perturbations in order to estimate the lowest possible working precision in floating-point arithmetic. Among the three representative data sets considered in this paper, none requires a precision higher than 15 bit, which is a considerable reduction from the 53 bit used in double precision floating-point arithmetic. Furthermore, we demonstrate analytic bounds on the working precision for SVMs with Gaussian kernel providing good predictions of possible reductions in the working precision without sacrificing classification accuracy.
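The effect of a reduced working precision on a Gaussian-kernel SVM decision can be explored with a small simulation. The sketch below is an assumption-laden illustration, not the method of the paper: it emulates a shorter mantissa by rounding every operand and intermediate result to a given number of bits, and the helper names, toy support vectors, and parameter values are all hypothetical. The 15-bit setting simply echoes the figure quoted in the abstract.

```python
import math


def round_mantissa(x, bits):
    """Round x to a floating-point value with a `bits`-bit mantissa."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)          # x = m * 2**e with 0.5 <= |m| < 1
    scale = 1 << bits
    return math.ldexp(round(m * scale) / scale, e)


def rbf_decision(support_vectors, coeffs, bias, gamma, x, bits=None):
    """Gaussian-kernel decision value, optionally at reduced precision.

    With bits=None the computation runs in ordinary double precision.
    """
    if bits is None:
        q = lambda v: v
    else:
        q = lambda v: round_mantissa(v, bits)
    total = q(bias)
    for sv, c in zip(support_vectors, coeffs):
        dist2 = 0.0
        for s, xi in zip(sv, x):
            diff = q(q(s) - q(xi))
            dist2 = q(dist2 + q(diff * diff))
        kernel = q(math.exp(-q(gamma) * dist2))
        total = q(total + q(q(c) * kernel))
    return total


# Hypothetical toy model, for illustration only.
svs = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [1.0, -1.0]
x = [0.2, 0.1]

full = rbf_decision(svs, coeffs, 0.0, 0.5, x)
reduced = rbf_decision(svs, coeffs, 0.0, 0.5, x, bits=15)
print(full > 0, reduced > 0)
```

A sample is only at risk of being misclassified when its full-precision decision value lies closer to zero than the accumulated rounding error, which is the intuition behind bounding the allowable perturbation.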
Article
In the process of neutron detection, due to the inelastic scattering and slow neutron capture, a neutron-gamma hybrid radiation field is formed, which increases the complexity of neutron detection. Organic scintillators are widely used in fast neutron detection because of their high scintillation efficiency, short decay time, and good detection efficiency. Pulse shape discrimination (PSD) is a critical technology to discriminate neutrons and gamma rays in organic scintillators according to the decay time of different particles. Traditional PSD methods include time-domain methods and frequency-domain methods. In recent years, various machine learning (ML) models, such as artificial neural network (ANN), support vector machine (SVM), and gaussian mixture model (GMM) have also been applied to neutron-gamma discrimination. In this work, we briefly analyze the luminescence mechanism of organic scintillators, the PSD principle, and some critical physical characteristics of organic scintillators. Later, we review organic scintillators and neutron-gamma discrimination methods by classification and describe various evaluation indexes of neutron-gamma discrimination methods’ performance in organic scintillators. Finally, the development trend of organic scintillators and PSD methods is prospected.
Article
Most state-of-the-art machine-learning (ML) algorithms do not consider the computational constraints of implementing the learned model on embedded devices. These constraints are, for example, the limited depth of the arithmetic unit, the memory availability, or the battery capacity. We propose a new learning framework, the Algorithmic Risk Minimization (ARM), which relies on Algorithmic-Stability, and includes these constraints inside the learning process itself. ARM allows one to train advanced resource-sparing ML models and to efficiently deploy them on smart embedded systems. Finally, we show the advantages of our proposal on a smartphone-based Human Activity Recognition application by comparing it to a conventional ML approach.
Article
In the field of Test & Control Instrument, multi-channel signal sources are needed as the excitation in the electrical parameters measurement. This paper presents a realization method of multiplexed signal source which combined the advantages of DDS principle with high speed programmable logic device FPGA, fully reflect the characteristics of fast switching frequency, high precision with DDS technology, and abundant FPGA logic resources. This design realized the optimization of DDS resources primarily through FPGA, made use of timing multi-threaded design ROM shared access, innovated an adjustable amplitude method, namely, applied FPGA internal procedures to modulate the normalized waveform sampling digital.
Article
Smartphones emerge from the incorporation of new services and features into mobile phones, allowing to implement advanced functionalities for the final users. The implementation of Machine Learning (ML) algorithms on the smartphone itself, without resorting to remote computing systems, allow to achieve such goals without expensive data transmission. However, smartphones are resource-limited devices and, as such, suffer from many issues, which are typical of stand-alone devices, such as limited battery capacity and processing power. We show in this paper how to build a thrifty classifier by exploiting bit-based hypothesis spaces and local Rademacher Complexities. The resulting classifier is tested on a real-world Human Activity Recognition application, implemented on a Samsung Galaxy S II smartphone.
Article
This paper presents the implementation of a speaker-verification system on field programmable gate array. The algorithm is executed by software over an embedded system that includes a MicroBlaze microprocessor connected to a vector floating-point unit (VFPU). The VFPU is designed to speed up the resolution of any vector floating-point operation involved in the verification algorithm, whereas the microprocessor manages the control of the process and executes the rest of operations. With a clock frequency of 40 MHz, the system is capable of executing the complete algorithm in real time, processing a voice frame in 9.1 ms. The same verification process was carried out for two different systems: 1) an ARM Cortex A8 microprocessor; and 2) configuring MicroBlaze with the scalar floating-point unit provided by Xilinx. The experimental results show that when comparing our proposed system against both systems, the number of clock cycles is reduced by a factor of 11.2x and 15.4x, respectively. The main advantage of the VFPU is its flexibility, which allows quick adaptation of the software to the potential changes produced in both the system and the user requirements. The algorithm was tested over a public database that contains the utterances of different users acquired under different environmental conditions, providing good recognition rates.
Article
Mobile devices are resource-limited systems that provide a large number of services and features. Smartphones, for example, implement advanced functionalities and services for the final user, in addition to conventional communication capabilities. Machine Learning algorithms can help in providing such advanced functionalities, but mobile systems suffer from issues related to their resource-limited nature like, for example, limited battery capacity and processing power and, therefore, even simple pattern recognition activities can become too demanding, in this respect. We propose here a method to design a Human Activity Recognition algorithm, which takes in account the fact that only limited resources are available for its execution. In particular, we restrict the hypothesis space of possible recognition models by applying some advanced concepts from Statistical Learning Theory, so to force the selection of models with good generalization ability but low computational complexity. Then, the learned model can be effectively implemented on a mobile and resource-limited device: the experiments, carried out on a current-generation smartphone, show the benefits of the proposed approach in terms of both model accuracy and battery duration.
Conference Paper
A Dynamic Energy Management (DEM) controller which is capable of taking decisions based on the status of the grid-connected smart microgrid has been developed using Support Vector Machine (SVM) and Artificial Neural Networks (ANN). The proposed control strategy involves the decisions for the dynamic charge-discharge transactions in the energy storage systems like battery and pumped hydro (PH) units connected to the smart microgrid in order to maintain a real time balance of generation and load. A comparison has been made based on the realizations of both SVM model and ANN model on SPARTAN 3AN Field Programmable Gate Array (FPGA) and the results show that SVM implementation is better than ANN implementation. The projected DEM system when tested with the existing laboratory model of a smart microgrid results in sustainable supply of power as the SVM based DEM controller monitors power flow in the lines and provides an optimal solution.
Conference Paper
We present a design scheme for the SVM decision function based on a hardware-friendly kernel on an FPGA device. This scheme is suitable for classification and regression problems. We adopt the ModelSim simulation platform for SVM classification and regression experiments. In the classification experiments, the hardware implementation obtains the same classification accuracy as the LIBSVM package when an appropriate fixed-point precision is used. We carried out a preliminary study of the precision of the input parameters in SVC under fixed-point arithmetic, and obtained the minimum number of bits for the SVR input parameters that does not reduce the performance of the SVM classifier. In the regression experiments, the mean square error of the hardware implementation is less than 0.004, indicating good regression performance.
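The fixed-point evaluation described above can be illustrated with a small sketch. It is not the authors' design: the abstract does not state the kernel or the word length, so a plain linear kernel and 8 fractional bits are assumed here, and the names `to_fixed`, `decision_float` and `decision_fixed` are hypothetical.

```python
def to_fixed(x, frac_bits):
    """Quantize a real number to a signed integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))


def decision_float(support_vectors, coefs, bias, x):
    """Floating-point reference: sum_i coef_i * <sv_i, x> + bias (linear kernel assumed)."""
    total = bias
    for sv, c in zip(support_vectors, coefs):
        total += c * sum(s * v for s, v in zip(sv, x))
    return total


def decision_fixed(support_vectors, coefs, bias, x, frac_bits=8):
    """Same decision function using integer arithmetic only.

    Products of two fixed-point values carry 2*frac_bits fractional bits,
    so the dot product is shifted back before the next multiplication.
    """
    xq = [to_fixed(v, frac_bits) for v in x]
    acc = 0  # accumulator with 2*frac_bits fractional bits
    for sv, c in zip(support_vectors, coefs):
        svq = [to_fixed(s, frac_bits) for s in sv]
        dot = sum(a * b for a, b in zip(svq, xq))  # 2*frac_bits fractional bits
        dot >>= frac_bits                          # arithmetic shift, rounds toward -inf
        acc += to_fixed(c, frac_bits) * dot        # 2*frac_bits fractional bits again
    acc += to_fixed(bias, frac_bits) << frac_bits  # align bias with the accumulator
    return acc / float(1 << (2 * frac_bits))       # convert back only for inspection


# Toy model (made-up numbers, exactly representable with 8 fractional bits).
SVS = [[1.0, 2.0], [-1.0, 0.5]]
COEFS = [0.5, -0.25]
BIAS = 0.125
SAMPLE = [0.75, -0.5]
```

In a classifier only the sign of the result matters, which is why a modest word length can reproduce the floating-point decisions; the required number of bits depends on the data and has to be found experimentally, as the abstract indicates.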
Conference Paper
This paper presents the design of a digital hardware implementation based on Support Vector Machines (SVMs), for the task of multi-speaker phoneme recognition. The One-against-one multiclass SVM method, with the Radial Basis Function (RBF) kernel was considered. Furthermore, a priority scheme was also included in the architecture, in order to forecast the three most likely phonemes. The designed system was synthesised on a Xilinx Virtex-II XC2V3000 FPGA, and evaluated with the TIMIT corpus. This phoneme recognition system is intended to be implemented on a dedicated chip, along with the Discrete Wavelet Transforms (DWTs) for feature extraction, to further improve the resultant performance.
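One-against-one voting with an RBF kernel, followed by a ranking of the three most voted classes, can be sketched as follows. The abstract does not describe how its priority scheme ranks phonemes, so ranking by vote count is an assumption, and `make_prototype_decider` is a hypothetical stand-in for the trained binary SVMs (it simply compares kernel similarity to one prototype per class).

```python
import math
from itertools import combinations


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def make_prototype_decider(prototypes, gamma=1.0):
    """Hypothetical pairwise classifier: the class whose prototype is more similar wins."""
    def decide(class_a, class_b, x):
        sim_a = rbf_kernel(x, prototypes[class_a], gamma)
        sim_b = rbf_kernel(x, prototypes[class_b], gamma)
        return class_a if sim_a >= sim_b else class_b
    return decide


def top_k_by_votes(classes, pairwise_decide, x, k=3):
    """Run every pairwise classifier once and return the k classes with most votes."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    # Ties are broken by the original class order so the result is deterministic.
    return sorted(classes, key=lambda c: (-votes[c], classes.index(c)))[:k]


# Toy prototypes for four phoneme labels (made-up 2-D features).
PROTOTYPES = {"aa": [0.0, 0.0], "iy": [1.0, 0.0], "uw": [0.0, 1.0], "eh": [3.0, 3.0]}
CLASSES = list(PROTOTYPES)
```

With N classes this scheme evaluates N(N-1)/2 binary classifiers per input, which is the main cost a hardware design has to account for.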
Article
Support vector machine (SVM) is one of the state-of-the-art tools for linear and nonlinear pattern classification. One of the design issues in an SVM classifier is reducing the number of support vectors without compromising the classification accuracy. A technique denoted as diminishing learning (DL) has already been proposed in the literature for an SVM based multi-class isolated digit recognition system using the speaker dependent TI46 database of isolated digits. In this paper, the computational complexity of SVM and SVM-DL based isolated digit recognition systems is studied, and the computation time for both classifiers is evaluated by system-on-programmable-chip (SOPC) implementation of the recognition system on an Altera Cyclone II Series FPGA using the Nios II soft-core processor. The number of support vectors is reduced by 38.28–90.25 % when SVM-DL is used for the isolated digit recognition problem. This in turn reduces the classification time for SVM-DL by 31.45–91.78 % over SVM. Recognition accuracies of 97 and 98 % are achieved for the SVM classifier with and without the DL technique, respectively. The study confirms that the order in which the classes are classified affects the recognition accuracy. For the TI46 database, an increase of about 2–4 % in recognition accuracy is obtained by choosing the optimum order for the SVM-DL classifier. The proposed SOPC implementation of the SVM-DL based recognition system can also be employed for various other pattern recognition applications, such as face recognition, character recognition and target recognition.
Article
The sequential minimal optimization (SMO) algorithm has been extensively employed to train the support vector machine (SVM). This work presents an efficient application specific integrated circuit chip design for sequential minimal optimization. This chip is implemented as an intellectual property core, suitable for use in an SVM-based recognition system on a chip. The proposed SMO chip was tested and found to be fully functional, using a prototype system based on the Altera DE2 board with a Cyclone II 2C70 field-programmable gate array.
Article
In this paper, a novel approach for phoneme classification using a binary feature vector and a correlation based classifier is proposed. The input speech segmentation is carried out using the Average Level Crossing Rate (ALCR) information. A 513-point binary feature vector is generated for each phoneme segment detected by the ALCR boundaries. The phoneme recognition is based on the uniqueness of the frequency content of each phoneme. Instead of a Hidden Markov Model, Artificial Neural Network or Support Vector Machine based classifier, a simple correlation classifier is employed, which uses the correlation between a feature vector and the set of feature vectors generated from the training data. The proposed approach is simpler and requires fewer computational resources than other pattern classification techniques. The performance of the proposed phoneme recognition system has been evaluated using real-time speech input, and the recognition performance is satisfactory.
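A correlation classifier of this kind can be sketched in a few lines. The abstract does not say which correlation measure is used, so the Pearson correlation coefficient is assumed here, the feature vectors are shortened to six points instead of 513, and the function names are hypothetical.

```python
import math


def correlation(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def classify(feature, training):
    """Return the label whose best-matching training vector correlates most with feature."""
    return max(
        training,
        key=lambda label: max(correlation(feature, t) for t in training[label]),
    )


# Toy binary training vectors for two phoneme labels (made-up data).
TRAINING = {
    "a": [[1, 1, 0, 0, 1, 0]],
    "s": [[0, 0, 1, 1, 0, 1]],
}
```

No model is trained here: classification is a direct comparison against stored vectors, which is what keeps the computational cost low relative to the classifiers the abstract lists.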
Conference Paper
A number of techniques have been proposed in the literature for phoneme based speech recognition systems. In this paper, a technique for automatic phoneme recognition using zero-crossings (ZC) and the magnitude sum function (MSF) is proposed. The number of zero-crossings and the magnitude sum function per frame are extracted, and a minimum distance classifier is proposed to recognize the phonemes in each frame with these features. In order to increase the recognition accuracy of phonemes, a finite state machine is also proposed. The performance of the proposed phoneme recognition system is evaluated using the TTS database and compared with the system using Linear Predictive Coefficients (LPC) feature inputs. Phoneme recognition accuracies of 70.93% and 55.25% are obtained for the system using LPC and the one using ZC along with MSF, respectively. However, using the finite state machine proposed in this paper, 100% recognition accuracy is obtained for both techniques. The computational costs required for recognizing various sentences using both feature extraction techniques are evaluated. It is observed that the proposed technique requires about 9.3 times lower computational cost than the one using LPC. The proposed phoneme recognition system is also implemented on an Altera Cyclone II FPGA using the Nios II soft-core processor and custom instructions. The custom instructions for floating point arithmetic and the minimum distance classifier provide acceleration factors of 41 and 1.87, respectively. The technique proposed here is also applicable to speech inputs from other databases.
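The two per-frame features and the minimum distance classifier can be sketched as below. The abstract does not give exact definitions, so this assumes that a zero-crossing is a sign change between adjacent samples, that the magnitude sum function is the sum of absolute sample values in a frame, and that distance is Euclidean; the template values are made up.

```python
def zero_crossings(frame):
    """Count sign changes between consecutive samples (zero is treated as positive)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def magnitude_sum(frame):
    """Sum of absolute sample values in the frame."""
    return sum(abs(s) for s in frame)


def extract_features(frame):
    """Two-element feature vector (ZC, MSF) for one frame."""
    return (zero_crossings(frame), magnitude_sum(frame))


def min_distance_class(feature, templates):
    """Return the label of the template closest to feature (squared Euclidean distance)."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(templates, key=lambda label: dist2(feature, templates[label]))


# Hypothetical per-phoneme templates: (zero-crossings, magnitude sum).
TEMPLATES = {"s": (4, 9), "a": (0, 10)}
```

Both features need only comparisons, additions and absolute values, which is consistent with the lower computational cost the abstract reports relative to LPC. In practice the two features have very different ranges, so some scaling before the distance computation would likely be needed.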
Conference Paper
A number of techniques have been proposed in the literature for phoneme based speech recognition systems. In this paper, a technique for automatic phoneme recognition using zero-crossings (ZC) and the magnitude sum function (MSF) is proposed. The number of zero-crossings and the magnitude sum function per frame are extracted, and a minimum distance classifier is proposed to recognize the phonemes in each frame with these features. In order to increase the recognition accuracy of phonemes, a finite state machine is also proposed. The performance of the proposed phoneme recognition system is evaluated using the TTS database and compared with the system using Linear Predictive Coefficients (LPC) feature inputs. Phoneme recognition accuracies of 70.93% and 55.25% are obtained for the system using LPC and the one using ZC along with MSF, respectively. However, using the finite state machine proposed in this paper, 100% recognition accuracy is obtained for both techniques. The computational costs required for recognizing various sentences using both feature extraction techniques are evaluated. It is observed that the proposed technique requires about 9.3 times lower computational cost than the one using LPC. The proposed technique is adopted for the implementation of the phoneme recognition system on the Texas Instruments TMS320C6713 floating point processor. Different ways to reduce the recognition time on the target device are explored and reported in this paper. The technique proposed here is also applicable to speech inputs from other databases.
Article
To facilitate the application of support vector machines (SVMs) in embedded systems, we propose and test a parallel and scalable digital architecture based on the sequential minimal optimization (SMO) algorithm for training SVMs. By taking advantage of the mature and popular SMO algorithm, the numerical instability issues that may exist in traditional numerical algorithms are avoided. The error cache updating task, which dominates the computation time of the algorithm, is mapped into multiple processing units working in parallel. Experiment results show that using the proposed architecture, SVM training problems can be solved effectively with inexpensive fixed-point arithmetic and good scalability can be achieved. This architecture overcomes the drawbacks of the previously proposed SVM hardware that lacks the necessary flexibility for embedded applications, and thus is more suitable for embedded use, where scalability is an important concern.
Article
In this paper, a Texas Instruments TMS320C6713 DSP based real-time speech recognition system using a Modified One Against All Support Vector Machine (SVM) classifier is proposed. The major contributions of this paper are the study and evaluation of the performance of the classifier using three feature extraction techniques, and a proposal for minimizing the computation time of the classifier. From this study, it is found that recognition accuracies of 93.33%, 98.67% and 96.67% are achieved for the classifier using Mel Frequency Cepstral Coefficients (MFCC), zero-crossing (ZC) and zero-crossing with peak amplitude (ZCPA) features, respectively. To reduce the computation time required for the systems, two techniques are proposed: one using an optimum threshold for the SVM classifier and another using linear assembly. The ZC based system requires the least computation time, and the above techniques reduce the execution time by factors of 6.56 and 5.95, respectively. For the purpose of comparison, the speech recognition system is also implemented using an Altera Cyclone II FPGA with the Nios II soft processor and custom instructions. Of the two approaches, the DSP approach requires 87.40% fewer clock cycles. Custom design of the recognition system on the FPGA without using the soft-core processor would have resulted in less computational complexity. The proposed classifier is also found to reduce the number of support vectors by a factor of 1.12–3.73 when applied to speaker identification and isolated letter recognition problems. The techniques proposed here can be adapted for various other SVM based pattern recognition systems.
Article
Full-text available
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
Article
Full-text available
Support vector machines (SVMs) have improved generalization performance over other classical optimization techniques. Here, we introduce an SVM-based approach for linear array processing and beamforming. The development of a modified cost function is presented and it is shown how it can be applied to the problem of linear beamforming. Finally, comparison examples are included to show the validity of the new minimization approach.
Conference Paper
Full-text available
In this paper we present an analysis of the minimal hardware precision required to implement support vector machine (SVM) classification within a logarithmic number system architecture. Support vector machines are fast emerging as a powerful machine-learning tool for pattern recognition, decision-making and classification. Logarithmic number systems (LNS) utilize the property of logarithmic compression for numerical operations. Within the logarithmic domain, multiplication and division can be treated simply as addition or subtraction. Hardware computation of these operations is significantly faster with reduced complexity. Leveraging the inherent properties of LNS, we are able to achieve significant savings over double-precision floating point in an implementation of an SVM classification algorithm.
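The property that multiplication and division reduce to addition and subtraction in the logarithmic domain can be shown with a minimal sketch. The sign-plus-base-2-logarithm encoding and the function names are assumptions for illustration; a hardware LNS would store the logarithm in a fixed number of bits rather than as a Python float.

```python
import math


def to_lns(x):
    """Encode a nonzero real number as (sign, log2 of its magnitude)."""
    if x == 0:
        raise ValueError("zero has no finite logarithm and needs a special encoding")
    return (1 if x > 0 else -1, math.log2(abs(x)))


def from_lns(value):
    """Decode (sign, log2 magnitude) back to a real number."""
    sign, log_mag = value
    return sign * 2.0 ** log_mag


def lns_mul(a, b):
    """Multiplication: multiply the signs, add the logarithms."""
    return (a[0] * b[0], a[1] + b[1])


def lns_div(a, b):
    """Division: multiply the signs, subtract the logarithms."""
    return (a[0] * b[0], a[1] - b[1])
```

For example, `from_lns(lns_mul(to_lns(8.0), to_lns(-0.5)))` recovers the product of 8 and -0.5 using a single addition of logarithms. Addition and subtraction of values are the expensive operations in LNS and are not covered by this sketch.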
Article
Full-text available
A new support vector machine (SVM) algorithm for coherent robust demodulation in orthogonal frequency-division multiplexing (OFDM) systems is proposed. We present a complex regression SVM formulation specifically adapted to a pilots-based OFDM signal. This novel proposal provides a simpler scheme than an SVM classification method. The feasibility of our approach is substantiated by computer simulation results obtained for IEEE 802.16 broadband fixed wireless channel models. These experiments allow us to scrutinize the performance of the OFDM-SVM system and the suitability of the ε-Huber cost function in the presence of non-Gaussian impulse noise interfering with OFDM pilot symbols.
Article
Full-text available
Algorithms that produce classifiers with large margins, such as support vector machines (SVMs), AdaBoost, etc., are receiving more and more attention in the literature. A real application of SVMs for synthetic aperture radar automatic target recognition (SAR/ATR) is presented and the result is compared with conventional classifiers. The SVMs are tested for classification both in closed and open sets (recognition). Experimental results showed that SVMs outperform conventional classifiers in target classification. Moreover, SVMs with Gaussian kernels are able to form a local “bounded” decision region around each class that presents better rejection of confusers.
Article
The purpose of this book is to explain the theoretical issues and implementational techniques related to the fascinating field of speech coding. This chapter is organized as follows: an overview of speech coding is provided first, with the structure, properties, and applications of speech coders explained; the different classes of speech coders are described next, followed by speech production and modeling, covering properties of speech signals and a very simple coding system. High-level explanation of the human auditory system is given, where the system properties are used to develop efficient coding schemes. Activities of standard bodies and many standardized coders are discussed in the next section, followed by issues related to analysis and implementation of algorithms. A brief summary is given at the end of the chapter.
Conference Paper
This paper describes preliminary performance results of a reconfigurable hardware implementation of a support vector machine classifier, aimed at brain-computer interface applications, which require real-time decision making in a portable device. The main constraint of the design was that it should reach a classification decision within the time span of an evoked potential recording epoch of 300 ms, which was readily achieved for moderate-sized support vector sets. Despite its fixed-point implementation, the FPGA-based model achieves classification accuracies equivalent to those of its software-based, floating-point counterparts.
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Article
From the publisher: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequences analysis, etc., and are now established as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and its applications. The concepts are introduced gradually in accessible and self-contained stages, while the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally, the book and its associated web site will guide practitioners to updated literature, new applications, and on-line software.
Conference Paper
We propose a novel equalization scheme for the direct sequence (DS) based ultra wideband (UWB) indoor system, utilizing support vector machines (SVMs). SVMs are an effective learning technique that is gaining increasing attention in pattern classification. In this work, a bank of SV classifiers is trained on received signal chips in order to extract the subsequent corresponding symbols. Simulation results confirm the superb efficiency of the proposed technique for many transmission scenarios, especially in Line-of-Sight cases where only a few pilot blocks are used for training.
Conference Paper
In this paper, a novel method for voiced/unvoiced/silence classification of speech using the support vector machine (SVM) is proposed. This classifier can correctly classify speech frames into voiced, unvoiced and silence frames. The comparison of experimental results shows that the proposed method outperforms other traditional methods. The performance of the SVM for different kernel functions in the experiment is analyzed and discussed as well.
Article
A neural network system which combines a self-organizing feature map and a multilayer perceptron for the problem of isolated word speech recognition is presented. A new method combining self-organization learning and K-means clustering is used for the training of the feature map, and an efficient adaptive nearby-search coding method based on the 'locality' of the self-organization is designed. The coding method is shown to save about 50% of the computation without degradation in recognition rate compared to full-search coding. Various experiments for different choices of parameters in the system were conducted on the TI 20-word database, with best recognition rates as high as 99.5% for both speaker-dependent and multispeaker-dependent tests.