Ensemble of Hybrid CNN-ELM Model for Image
Classification
Suresh Prasad Kannojia
ICT Research Lab, Department of Computer Science
University of Lucknow
Lucknow, India
Gaurav Jaiswal
ICT Research Lab, Department of Computer Science
University of Lucknow
Lucknow, India
Abstract— To leverage the feature representation of CNN and the fast classification learning of ELM, an ensemble of hybrid CNN-ELM models is proposed for image classification. In this model, image representation features are learned by a Convolutional Neural Network (CNN) and fed to an Extreme Learning Machine (ELM) for classification. Three hybrid CNN-ELMs are ensembled in parallel, and the final output is computed by majority voting over these classifiers' outputs. The experiments show that this ensemble model improves classification confidence and accuracy. The model has been benchmarked on the MNIST dataset and effectively improves accuracy in comparison with a single hybrid CNN-ELM classifier, reaching up to 99.33%. The proposed ensemble model has also been compared with core CNN, core ELM, and hybrid CNN-ELM, and achieves competitive accuracy.
Keywords— Ensemble model; CNN-ELM; Convolutional neural network; Extreme learning machine; Image classification
I. INTRODUCTION
Image classification is a basic yet active problem of computer vision that attracts many researchers. The performance of a classifier depends on the feature representation of the image. The convolutional neural network is one of the best feature representation techniques; it represents abstracted higher-level to lower-level features in its higher to lower layers [1]. The CNN (Convolutional Neural Network) is inspired by the visual receptive fields observed in the cat's visual cortex experiments of Hubel and Wiesel [2]. The CNN was introduced by LeCun et al. [3] in 1998 to improve feature extraction and to maintain shape information along with other features. After the work of A. Krizhevsky et al. [4], the CNN became popular in the research community. However, a large amount of training data requires a large amount of training time for a convolutional neural network. To reduce this learning-time constraint, the convolutional neural network is used to extract convolutional features and the extreme learning machine is used as a multiclass classifier. The extreme learning machine is a fast-trainable single-hidden-layer feedforward neural network, invented by G. B. Huang [5].
To improve classification performance, several fusion methods have been developed by combining different feature extraction and classification methods. A CNN-ELM classifier was proposed by Y. Zeng et al. [6] for traffic sign recognition and achieved human-level accuracy. L. Guo et al. [7] proposed a hybrid CNN-ELM with improved accuracy and validated its performance on the MNIST dataset. F. Gurpinar et al. [8] replaced the ELM with a kernel ELM for classification, which minimizes the mean absolute error; face features are extracted from a pre-trained deep convolutional network and kernel extreme learning machines are used for classification. Y. Yoo et al. [9] proposed a novel fast-learning CNN-ELM architecture whose core is a local-image (local receptive field) version of the ELM adopting random feature learning. Q. Weng et al. [10] combined an ELM classifier with the CNN-learned features, in place of the fully connected layers of the CNN, for land-use classification.
To leverage the feature representation of CNN and the fast learning of ELM, and to increase the classification confidence of the classifier, an ensemble of hybrid CNN-ELM models has been proposed for image classification. This model is composed of three main components: the convolutional neural network, the ELM classifier, and the majority voting ensemble. Each base classifier consists of a CNN and an ELM and is trained on a different partition of the dataset. Three such base classifiers are combined by majority voting over their outputs. This ensemble improves classification confidence and accuracy. Figure 1 shows the block diagram of the proposed ensemble model for image classification.
The main idea behind the proposed method is to make the final decision by majority voting of different classifiers, each learned in a different environment with a different dataset. Each base classifier is trained on a different partition of the MNIST dataset [3], and the ensemble method outperforms a single CNN-ELM trained on the whole MNIST dataset. This also indicates that the proposed model can learn effectively on distributed datasets and achieve improved accuracy.
Figure 1: Block diagram of ensemble of Hybrid CNN-ELM model for
image classification
2018 5th International Conference on Signal Processing and Integrated Networks (SPIN)
978-1-5386-3045-7/18/$31.00 ©2018 IEEE 538
The remainder of the paper is organized as follows: Section 2 describes the proposed ensemble of hybrid CNN-ELM models for image classification, component by component. In Section 3, the proposed ensemble model is implemented and benchmarked on the MNIST dataset; experimental results are evaluated on standard performance metrics and compared with core CNN, ELM, and CNN-ELM methods. Section 4 concludes the paper.
II. PROPOSED MODEL
This ensemble model is composed of three main components: the convolutional neural network, the ELM classifier, and the majority voting ensemble.
A. Convolutional Neural Network:
A convolutional neural network consists of mainly three types of layers: the convolutional layer, the pooling layer, and the softmax layer. In the convolutional layer, the input image is convolved with multiple kernels; the CNN preserves spatial information and generates multiple feature maps. The pooling layer reduces the size of a feature map by a spatially invariant average or maximum operation. Together, the convolutional and pooling layers compose the feature extraction module, and in this paper the convolutional neural network is exploited as a feature extractor. In the softmax layer, the softmax activation function is used to map an input feature map to a class value. The convolutional network architecture is shown in Figure 2.
The main benefit of CNNs is that they are easier to train and have far fewer parameters than fully connected networks with the same number of hidden units.
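The convolution-and-pooling feature extraction described above can be illustrated with a minimal NumPy sketch for a single-channel image and one kernel (illustrative only; the paper's actual extractor is a multi-layer Keras CNN):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size          # drop any ragged border
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.random.default_rng(0).random((28, 28))  # MNIST-sized input
kernel = np.random.default_rng(1).standard_normal((3, 3))  # one 3x3 filter
fmap = conv2d_valid(image, kernel)                 # -> (26, 26) feature map
pooled = max_pool2d(fmap)                          # -> (13, 13) after pooling
```

Stacking several such convolution/pooling stages, each with many kernels, yields the hierarchy of feature maps the paper uses as input to the ELM.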
B. Extreme Learning Machine:
The extreme learning machine (ELM) was proposed for training single-hidden-layer feedforward neural networks (SLFNs) [5]. Figure 3 shows the architecture of the extreme learning machine. In an ELM, the nodes of the hidden layer are randomly initialized and then fixed, without iterative tuning. The only free parameters that need to be learned are the connections (weights) between the hidden layer and the output layer. The ELM thus trains an SLFN in two main stages: random feature mapping and linear parameter solving.
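The two stages, random feature mapping followed by an analytic least-squares solve for the output weights, can be sketched in NumPy as follows (a toy version with hypothetical sizes, not the paper's implementation):

```python
import numpy as np

def train_elm(X, Y, n_hidden=64, seed=0):
    """Stage 1: fixed random input weights; Stage 2: pseudo-inverse solve."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random, never tuned
    b = rng.standard_normal(n_hidden)                 # random biases, fixed
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ Y                      # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Tiny two-class sanity check on well-separated synthetic data.
X = np.vstack([np.random.default_rng(1).normal(-2, 0.3, (50, 2)),
               np.random.default_rng(2).normal(+2, 0.3, (50, 2))])
Y = np.zeros((100, 2)); Y[:50, 0] = 1; Y[50:, 1] = 1  # one-hot targets
W, b, beta = train_elm(X, Y)
acc = np.mean(predict_elm(X, W, b, beta) == np.argmax(Y, axis=1))
```

Because the only learned parameters (beta) come from a single pseudo-inverse, training is far faster than gradient-based tuning of the whole network.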
C. Majority Voting Ensemble:
Majority voting is used as the final decision-making step of this model. The majority voting scheme first collects the class label predicted by each base classifier and counts the votes received by each label. The final prediction is assigned to the label with the majority of votes. The number of base classifiers should always be odd. In the case of an equal vote, the mode operation is applied.
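A minimal Python sketch of this voting rule (illustrative; the vote values below are made up):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the per-classifier labels predicted for one sample.

    With an odd number of base classifiers a two-way tie cannot occur;
    for ties among 3+ distinct labels, Counter.most_common acts as a
    mode, returning the most common label (earliest seen wins a tie).
    """
    return Counter(predictions).most_common(1)[0][0]

# Three base classifiers vote on four test samples.
votes = [
    [7, 7, 1],   # majority -> 7
    [3, 3, 3],   # unanimous -> 3
    [0, 8, 8],   # majority -> 8
    [2, 5, 2],   # majority -> 2
]
final = [majority_vote(v) for v in votes]
```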
D. Proposed Model:
This paper proposes an ensemble of hybrid CNN-ELM models for image classification. Figure 4 shows the architecture of this ensemble model. Each base classifier consists of a CNN and an ELM and is trained on a different partition of the dataset. The convolutional neural network (CNN) implicitly learns and extracts convolutional features from the training images. The ELM is exploited as a multi-class classifier and trained on the convolutional features. Three such base classifiers are combined by majority voting over their outputs. The training algorithm of the proposed ensemble of hybrid CNN-ELM models is given below:
Figure 3: Extreme learning machine classifier architecture
Figure 2: Block diagram of typical convolutional neural network
Training Algorithm for Ensemble of N-Hybrid CNN-ELM Model:
INPUT: Training images [X_train, Y_train], test images X_test.
OUTPUT: Trained model, predicted test image class label Ypred.
PROCEDURE:
  Divide the training images [X_train, Y_train] into N parts
  For each base classifier C_i in N:
    Extract convolutional features of the training images [X_itrain, Y_itrain]
    Train the ELM classifier with the divided training images [X_itrain, Y_itrain]
    Compute the predicted class label Ypred_i for the test images [X_itest, Y_itest]
  End For
  Perform majority voting on [Ypred_1, ..., Ypred_i, ..., Ypred_N]
  Return Ypred
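The training procedure above can be sketched end to end in NumPy. As a simplification, a plain ELM stands in for each CNN+ELM base classifier (in the paper, the classifier input would be CNN-extracted features rather than raw vectors), and toy two-class data stands in for MNIST:

```python
import numpy as np

def train_base_classifier(X, Y, n_hidden=128, seed=0):
    """One base classifier: random hidden layer + analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    beta = np.linalg.pinv(np.tanh(X @ W + b)) @ Y    # least-squares solve
    return lambda Xt: np.argmax(np.tanh(Xt @ W + b) @ beta, axis=1)

def train_ensemble_and_predict(X, Y, X_test, n=3):
    # Step 1: divide the training set into n parts, one per base classifier.
    parts = np.array_split(np.arange(len(X)), n)
    # Step 2: train each base classifier on its part; predict on the test set.
    preds = np.stack([train_base_classifier(X[p], Y[p], seed=i)(X_test)
                      for i, p in enumerate(parts)])
    # Step 3: majority vote per test sample (n is odd, so no two-way ties).
    return np.array([np.bincount(col).argmax() for col in preds.T])

# Toy two-class data in place of MNIST digits.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(-2, 0.4, (90, 5)), rng.normal(+2, 0.4, (90, 5))])
labels = np.array([0] * 90 + [1] * 90)
Y = np.eye(2)[labels]                      # one-hot targets for the ELM solve
order = rng.permutation(180)               # shuffle so every part sees both classes
X, Y, labels = X[order], Y[order], labels[order]
y_pred = train_ensemble_and_predict(X, Y, X)
```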
III. EXPERIMENT AND RESULTS
A. Experimental Setup:
In this ensemble method, the CNN is implemented in the Python-based deep learning framework Keras with a Theano backend, and the ELM is implemented in Python. For the ensemble, we first train the CNN for feature extraction. The CNN contains 3 convolutional layers (with convolutional filter size (3, 3) and 100 filters per convolutional layer) and 3 max pooling layers (with pooling size (2, 2)), which extract the low-level, mid-level, and high-level features. The extracted convolutional features are then normalized into the range [-1, +1] and fed into the extreme learning machine for classification. Each ELM classifier is trained on a partition of the MNIST dataset and fine-tuned over different numbers of hidden-layer neurons. Three such hybrid CNN-ELMs are ensembled in parallel, and their outputs are combined by majority voting to increase classification confidence and improve accuracy. This ensemble model has been evaluated on the MNIST test dataset over standard performance metrics. The experiment was performed on a cluster with 16 CPU cores and 64 GB of RAM.
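The [-1, +1] feature normalization mentioned above presumably refers to per-feature min-max scaling; the paper does not give details, so the following NumPy sketch is one plausible reading:

```python
import numpy as np

def scale_features(F, eps=1e-12):
    """Min-max scale each feature column into [-1, +1] before the ELM stage."""
    fmin = F.min(axis=0)
    fmax = F.max(axis=0)
    # eps guards against division by zero for constant feature columns.
    return 2.0 * (F - fmin) / (fmax - fmin + eps) - 1.0

F = np.random.default_rng(0).random((100, 50)) * 37.0  # stand-in CNN features
Fn = scale_features(F)
```

In practice the scaling statistics (fmin, fmax) would be computed on the training features and reused for the test features.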
B. Results:
The proposed ensemble method is evaluated on the MNIST benchmark dataset over standard performance metrics. Table 1 shows the per-digit classification results on the MNIST dataset. These performance metrics are calculated from the predicted and actual class labels of the MNIST test images. The precision score measures exactness, while the recall score measures completeness; the F1 score is the harmonic mean of precision and recall. The average scores show that the model is accurate as well as both exact and complete.
C. Comparison:
To measure the accuracy improvement, the proposed model has been compared with CNN, ELM, and CNN-ELM. The proposed ensemble model achieves an improved accuracy of up to 99.33%. Table 2 compares the experimental results of CNN, ELM, CNN-ELM, and the ensemble model. The main benefit of this method is that it achieves maximum accuracy by combining classifiers.
Figure 4: Architecture of Ensemble of Hybrid CNN-ELM model
CL: Convolutional Layer, MP: Max-pooling Layer, IL: Input Layer, HL: Hidden Layer, OL: Output Layer
Table I. Evaluation of ensemble of hybrid CNN-ELM model on MNIST
dataset
Table II. Comparison of accuracy on MNIST dataset with CNN, ELM, CNN-
ELM
Methods Accuracy
CNN 99.20%
ELM 97.54%
CNN-ELM 99.24%
Ensemble of CNN-ELM (ours) 99.33%
IV. CONCLUSION
This paper has proposed an ensemble of hybrid CNN-ELM models for image classification. The model leverages the feature representation learning of the convolutional neural network, the fast learning of the extreme learning machine, and final decision making by majority voting. The effectiveness of the model is evaluated on the MNIST benchmark dataset over standard performance metrics. The experimental results show that the ensemble method outperforms the core CNN, core ELM, and hybrid CNN-ELM. This also indicates that the proposed model can learn effectively on distributed datasets and achieve improved accuracy.
ACKNOWLEDGMENT
We are thankful to the Central Facility of Computational Research, University of Lucknow, for providing access to the KRISHNA cluster. The second author is also thankful to the UGC for providing a UGC-SRF fellowship to sustain his research.
REFERENCES
[1] Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1-127.
[2] Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology, 148. doi: 10.1113/jphysiol.1959.sp006308.
[3] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
[4] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[5] Huang, G. B., Zhu, Q. Y., & Siew, C. K. (2006). Extreme learning machine: theory and applications. Neurocomputing, 70(1), 489-501.
[6] Zeng, Y., Xu, X., Fang, Y., & Zhao, K. (2015). Traffic sign recognition using deep convolutional networks and extreme learning machine. In International Conference on Intelligent Science and Big Data Engineering (pp. 272-280). Springer International Publishing.
[7] Guo, L., & Ding, S. (2015). A hybrid deep learning CNN-ELM model and its application in handwritten numeral recognition. Journal of Computational Information Systems, 11(7), 2673-2680.
[8] Gurpinar, F., Kaya, H., Dibeklioglu, H., & Salah, A. (2016). Kernel ELM and CNN based facial age estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 80-86).
[9] Yoo, Y., & Oh, S.-Y. (2016). Fast training of convolutional neural network classifiers through extreme learning machines. In 2016 International Joint Conference on Neural Networks (IJCNN). IEEE.
[10] Weng, Q., Mao, Z., Lin, J., & Guo, W. (2017). Land-use classification via extreme learning classifier based on deep convolutional features. IEEE Geoscience and Remote Sensing Letters.
Class     Precision  Recall   F1 Score  Accuracy
0         0.9980     0.9919   0.9949    0.9919
1         0.9991     0.9930   0.9960    0.9930
2         0.9903     0.9951   0.9927    0.9952
3         0.9921     0.9940   0.9931    0.9941
4         0.9949     0.9919   0.9934    0.9919
5         0.9922     0.9899   0.9910    0.9900
6         0.9885     0.9968   0.9927    0.9969
7         0.9951     0.9913   0.9932    0.9913
8         0.9938     0.9969   0.9954    0.9970
9         0.9881     0.9920   0.9901    0.9921
Average   0.9932     0.9933   0.9932    0.9933
2018 5th International Conference on Signal Processing and Integrated Networks (SPIN)
541
... -Inception-v3 [41,32,17,16] -InceptionResNetv2 [40,2,32] -ResNet152 [19,2,32] -DenseNet201 [22,32,8] -MobileNet [21,17] -AlexNet [24,10] -VGG16 [44,34] -Custom architectures [36,23] ...
... The first solution consists of a neural network trained on each model's output. In the second one [23,29] voting was used. Sun et al. [39] proposed an approach consisting of a dedicated ensemble for each class predicted by a classifier. ...
Chapter
Full-text available
Substantial body of work has been devoted to skin cancer recognition in high-resolution medical images. However, nowadays photos of skin lesions can be taken by mobile phones, where quality of image is reduced. The aim of this contribution is report on skin cancer recognition when applying machine learning to “low resolution” images. Experiments have been performed on the dataset from the ISIC 2018 Challenge. KeywordsSkin cancerMachine LearningDeep Neural Network
... Long et al. [20] applied double pseudo-inverse weights to improve ELM in tumor diagnosis, reducing the risk of delay due to misdiagnosis. Kannojia et al. [21] tested CNN-ELM on MINIST and got the accuracy of 99.33%. Zhu et al. [22] employed ELM to predict the impact low-pressure data of a mine with GA to optimize initial weights. ...
Article
Full-text available
This paper constructs an adaptive Kalman estimator for SOC of lithium-ion batteries based on Ah integration method and genetic algorithm (GA) optimized extreme learning machine (ELM). Firstly, the ELM is introduced to SOC estimation as it has a small amount of computation and perfect generalization performance. Secondly, GA is selected to optimize ELM so as to avoid overfitting problems. Thirdly, an adaptive Kalman estimator is established to denoise and predict the SOC with results from Ah integration and ELM. Finally, training and testing datasets are collected from standard driving cycles and mixed cycles under charging processes at different temperatures. Experimental results show that the GA-ELM-based Kalman estimator outperforms the single ELM strategy and Kalman filter in terms of SOC estimation accuracy. Specifically, both the root mean square error and the mean absolute error are less than 1.2%.
... In this method, the number of base classifiers should always be odd. In the case of an equal vote, the mode function is applied [30]. e proposed B-ALL disease detection and classification model, which is an ensemble framework of four models, is displayed in Figure 4. e four selected networks are in parallel and combine the output module with the ensemble technique to improve the classification confidence and accuracy. ...
Article
Full-text available
Introduction. Acute lymphoblastic leukemia (ALL) is the most common type of leukemia, a deadly white blood cell disease that impacts the human bone marrow. ALL detection in its early stages has always been riddled with complexity and difficulty. Peripheral blood smear (PBS) examination, a common method applied at the outset of ALL diagnosis, is a time-consuming and tedious process that largely depends on the specialist’s experience. Materials and Methods. Herein, a fast, efficient, and comprehensive model based on deep learning (DL) was proposed by implementing eight well-known convolutional neural network (CNN) models for feature extraction on all images and classification of B-ALL lymphoblast and normal cells. After evaluating their performance, four best-performing CNN models were selected to compose an ensemble classifier by combining each classifier’s pretrained model capabilities. Results. Due to the close similarity of the nuclei of cancerous and normal cells, CNN models alone had low sensitivity and poor performance in diagnosing these two classes. The proposed model based on the majority voting technique was adopted to combine the CNN models. The resulting model achieved a sensitivity of 99.4, specificity of 96.7, AUC of 98.3, and accuracy of 98.5. Conclusion. In classifying cancerous blood cells from normal cells, the proposed method can achieve high accuracy without the operator’s intervention in cell feature determination. It can thus be recommended as an extraordinary tool for the analysis of blood samples in digital laboratory equipment to assist laboratory specialists.
... In contrast, weighted voting means that different weights or rights are attached to different models so that the models with higher weight could affect more on the determination of prediction [33]. Kannojia proposed their hybrid classification model combining CNN and Extreme Learning Machine as an image classifier [34]. By applying ensemble learning with the majority voting rule, the accuracy of their model was increased to 99.33% compared with 99.24% by a single model and benchmarked on the MNIST dataset. ...
Preprint
i>Abstract — One of the most prevalent diseases, skin cancer, has been proven to be treatable at an early stage. Thus, techniques that allow individuals to identify skin cancer symptoms early are in great demand. This paper proposed an interactive skin lesion diagnosis system based on the ensemble of multiple sophisticated CNN models for image classification. The performance of ResNet50, ResNeXt50, ResNeXt101, EfficientNetB4, Mobile-NetV2, MobileNetV3, and MnasNet are investigated separately as ensemble components. Then, using various criteria, we constructed ensembles and compared the accuracy they achieved. Moreover, we designed a method to update the ensemble for new data and examined its performance. In addition, a few natural language processing (NLP) techniques were used to make our system more user-friendly. To integrate all the functionalities, we built a user interface with PyQt5. As a result, MobileNetV3 achieved 91.02% as the best accuracy among all single models; ensemble weighted by cubic precision achieved 92.84% accuracy as the highest one in this study; a notable improvement in accuracy demonstrated the effectiveness of the model updating approach, and a system with all of the desired features was successfully developed. These findings benefit in two aspects. For model performance, applying cubic precisions can increase ensemble learning classification accuracy. For the developed diagnosis system, it can aid in the
Article
Machine learning and deep learning methods have become exponentially more accurate. These methods are now as precise as experts of respective fields, so it is being used in almost all areas of life. Nowadays, people have more faith in machines than men, so, in this vein, deep learning models with the concept of transfer learning of CNN are used to detect and classify diabetic retinopathy and its different stages. The backbone of various CNN-based models such as InceptionResNetV2, InceptionV3, Xception, MobileNetV2, VGG19, and DenceNet201 are used to classify this vision loss disease. In these base models, transfer learning has been applied by adding some layers like batch normalization, dropout, and dense layers to make the model more effective and accurate for the given problem. The training of the resulting models has been done for the Kaggle retinopathy 2019 dataset with about 3662 fundus fluorescein angiography colored images. Performance of all six trained models have been measured on the test dataset in terms of precision, recall, F1 score, macro average, weighted average, confusion matrix, and accuracy. A confusion matrix is based on maximum class probability prediction that is the incapability of the confusion matrix. The ROC-AUC of different classes and the models are analyzed. ROC-AUC is based on the actual probability of different categories. The results obtained from this study show that InceptionResNetV2 is proven the best model for diabetic retinopathy detection and classification, among other models considered here. It can work accurately in case of less training data. Thus, this model may detect and classify diabetic retinopathy automatically and accurately at an early stage. So it would be beneficial for humans to reduce the effects of diabetes. As a result of this, the impact of diabetes on vision loss can be minimized, and that would be a blessing in the medical field.
Article
Full-text available
One of the challenging issues in high-resolution remote sensing images is classifying land-use scenes with high quality and accuracy. An effective feature extractor and classifier can boost classification accuracy in scene classification. This letter proposes a deep-learning-based classification method, which combines convolutional neural networks (CNNs) and extreme learning machine (ELM) to improve classification performance. A pretrained CNN is initially used to learn deep and robust features. However, the generalization ability is finite and suboptimal, because the traditional CNN adopts fully connected layers as classifier. We use an ELM classifier with the CNN-learned features instead of the fully connected layers of CNN to obtain excellent results. The effectiveness of the proposed method is tested on the UC-Merced data set that has 2100 remotely sensed land-use-scene images with 21 categories. Experimental results show that the proposed CNN-ELM classification method achieves satisfactory results.
Article
Full-text available
Theoretical results strongly suggest that in order to learn the kind of complicated functions that can repre- sent high-level abstractions (e.g. in vision, language, an d other AI-level tasks), one needs deep architec- tures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult opti mization task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This paper d iscusses the motivations and principles regarding learning algorithms for deep architectures, in particular those exploiting as building blocks unsupervised learning of single-layer models such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.
Article
Full-text available
Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day
Article
It is clear that the learning speed of feedforward neural networks is in general far slower than required and it has been a major bottleneck in their applications for past decades. Two key reasons behind may be: (1) the slow gradient-based learning algorithms are extensively used to train neural networks, and (2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these conventional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden layer feedforward neural networks (SLFNs) which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide good generalization performance at extremely fast learning speed. The experimental results based on a few artificial and real benchmark function approximation and classification problems including very large complex applications show that the new algorithm can produce good generalization performance in most cases and can learn thousands of times faster than conventional popular learning algorithms for feedforward neural networks.1
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 dif- ferent classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implemen- tation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry
Conference Paper
Traffic sign recognition is an important but challenging task, especially for automated driving and driver assistance. Its accuracy depends on two aspects: feature exactor and classifier. Current popular algorithms mainly use convolutional neural networks (CNN) to execute feature extraction and classification. Such methods could achieve impressive results but usually on the basis of an extremely huge and complex network. What’s more, since the fully-connected layers in CNN form a classical neural network classifier, which is trained by conventional gradient descent-based implementations, the generalization ability is limited. The performance could be further improved if other favorable classifiers are used instead and extreme learning machine (ELM) is just the candidate. In this paper, a novel CNN-ELM model is proposed, which integrates the CNN’s terrific capability of feature learning with the outstanding generalization performance of ELM. Firstly CNN learns deep and robust features and then ELM is used as classifier to conduct a fast and excellent classification. Experiments on German traffic sign recognition benchmark (GTSRB) demonstrate that the proposed method can obtain competitive results with state-of-the-art algorithms with less computation time.
Article
Convolutional Neural Network (CNN) is a typical algorithm structure of deep learning and it has applied in image recognition field widely. Based on CNN, this paper puts forward a novel hybrid deep learning model CNN-ELM. In this model, CNN is in charge of feature extraction and Extreme Learning Machine (ELM) performs as a classifier to complete the final classification so as to integrate the synergy of two classifiers. We have took experiments on MNIST digit database. Comparisons with CNN and ELM on the same database, CNN-ELM achieved an error rate of 0.67%, which indicate that CNN-ELM has achieved better generalization performance.