Journal of Artificial Intelligence Research & Advances
ISSN: 2395-6720 (Online)
Volume 6, Issue 1
www.stmjournals.com
Cascade CNN Framework for Low Resolution Image
Classification
Suresh Prasad Kannojia, Gaurav Jaiswal*
ICT Research Lab, Department of Computer Science, University of Lucknow, Lucknow, Uttar Pradesh, India
Abstract
Low resolution images contain less visual information, so classifying them is difficult. To overcome this problem, a cascade CNN framework for low resolution image classification is proposed. In this framework, a super resolution CNN (SRCNN) enlarges the low resolution image into a super resolution image. The convolutional features of this super resolution image are fused with the low resolution convolutional features extracted by a low resolution CNN (LRCNN) feature extractor. A deep neural network classifier is trained on these fused CNN features and classifies low resolution images using the learned fused features. The proposed cascade framework has been evaluated on the benchmark image datasets MNIST and CIFAR10 and achieves competitive accuracy.
Keywords: Low resolution, cascade CNN, super resolution, feature fusion, image
classification
*Author for Correspondence E-mail: gauravjais88@gmail.com
INTRODUCTION
With the growth of generated data and the limitations of computing power, low resolution images are produced and processed by many real-world applications, e.g. biometrics, pedestrian recognition and object recognition [1-3]. A low resolution image contains fewer pixels and less visual information than its higher resolution counterpart. Lowering the resolution degrades the visual information of an image, whether that information is simple or complex, and this degradation directly affects how well the image can be classified. Several studies have examined the relation between the degradation of visual information and classification, and they show that classification performance drops as image resolution decreases [4, 5].
In the literature, Chevalier et al. proposed LR-CNN, which leverages a super resolution layer to improve fine-grained classification [6]. They further improved low resolution classification by integrating privileged information into deep CNNs for fine-grained classification [7]. Cai et al. also proposed a super resolution CNN framework to handle the classification of low resolution images [8]. Wei et al. presented an algorithm that learns a sparse image transformation by coupling the sparse structures of image pairs from both the HR and LR spaces, achieving competitive classification accuracy [9]. However, fusing the features of a low resolution image with those of its super resolution counterpart has not yet been explored for low resolution image classification.
To handle low resolution images and their classification, a cascade CNN framework for low resolution image classification is proposed in this paper. The main idea behind this framework is to exploit a super resolution CNN to generate a super resolution image from the low resolution image, thereby improving its visual information, and to fuse the resulting features. The combination of low resolution and super resolution features carries more visual information while preserving the original low resolution features, so the cascade framework improves the classification of low resolution images. The framework is evaluated on the standard MNIST [10] and CIFAR10 [11] image datasets and achieves competitive performance.
The remainder of the paper is organized as follows. The next section describes the proposed cascade framework for low resolution image classification and its components. After that, the proposed framework is experimentally evaluated on the standard image datasets MNIST and CIFAR10. The final section concludes the paper.
PROPOSED FRAMEWORK
The proposed cascade CNN framework for low resolution image classification mainly consists of two modules: feature extraction and classification. The feature extraction module extracts the low resolution feature using the LRCNN feature extractor and the super resolution feature using SRCNN followed by a CNN feature extractor. The extracted super resolution and low resolution features are concatenated and fed into the classifier module, which learns from the fused features and classifies the image accordingly. The architecture of the proposed framework is shown in Figure 1, and a minimal sketch of how the two modules could be wired together is given below. Its components are described in the following subsections.
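As an illustration only, and not the authors' released code, the following Keras sketch shows one way the two modules could be wired into a single model. The sub-models srcnn, sr_extractor and lr_extractor are placeholders (possible structures are sketched in the following subsections), and the bilinear pre-upscaling step and the 512-unit classifier head are assumptions based on the description above and on Table 1.

```python
from tensorflow.keras import layers, models

def build_cascade(lr_shape, scale, srcnn, sr_extractor, lr_extractor, num_classes=10):
    """Wire the feature extraction and classification modules into one model."""
    lr_image = layers.Input(shape=lr_shape)
    # Bilinear pre-upscaling, since SRCNN refines an already enlarged image.
    upscaled = layers.UpSampling2D(size=scale, interpolation='bilinear')(lr_image)
    sr_image = srcnn(upscaled)                    # low resolution -> super resolution image
    sr_feature = sr_extractor(sr_image)           # convolutional feature of the SR image
    lr_feature = lr_extractor(lr_image)           # convolutional feature of the LR image
    fused = layers.Concatenate()([lr_feature, sr_feature])   # feature fusion
    hidden = layers.Dense(512, activation='relu')(fused)
    outputs = layers.Dense(num_classes, activation='softmax')(hidden)
    return models.Model(lr_image, outputs, name='cascade_cnn')
```

Both feature extractors are assumed to end in a Flatten layer, so that the two features can be concatenated as vectors.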
Super Resolution CNN Feature
To extract the super resolution CNN feature from a low resolution image, the image is first converted into a super resolution image by a super resolution method. Several state-of-the-art super resolution methods are available, e.g. deep convolutional super resolution [12, 13] and sparse coding super resolution [14, 15]. Here, SRCNN is adopted for super resolution image generation in the proposed framework [12]. SRCNN directly learns an end-to-end mapping between low resolution and high resolution images. The architecture of SRCNN is shown in Figure 2.
The super resolution feature is then extracted from the super resolution image by a trained convolutional neural network. This feature carries additional visual information, which improves the predictive ability of the classifier.
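A minimal SRCNN sketch in Keras is given below, following the three-layer 9-1-5 design reported by Dong et al. [12]; the single-channel input and 'same' padding are assumptions made here to keep the example self-contained.

```python
from tensorflow.keras import layers, models

def build_srcnn(height, width, channels=1):
    """Three-layer SRCNN: patch extraction, non-linear mapping, reconstruction."""
    inputs = layers.Input(shape=(height, width, channels))
    x = layers.Conv2D(64, (9, 9), padding='same', activation='relu')(inputs)  # patch extraction
    x = layers.Conv2D(32, (1, 1), padding='same', activation='relu')(x)       # non-linear mapping
    outputs = layers.Conv2D(channels, (5, 5), padding='same')(x)              # reconstruction
    return models.Model(inputs, outputs, name='srcnn')
```

Such a model would be trained separately with a pixel-wise MSE loss on pairs of bicubically upscaled low resolution images and their high resolution originals before being used in the cascade.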
Fig. 1: Architecture of Cascade CNN Framework for Low Resolution Image Classification.
Fig. 2: Architecture of SRCNN [12].
Low Resolution CNN Feature
The low resolution feature is extracted from the low resolution image by a trained convolutional neural network. A convolutional neural network automatically learns the spatial information of the image and stores these spatial feature maps across its layers. The benefit of this low resolution feature is that it preserves the original features of the image for training the classifier.
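As a sketch only, the low resolution feature extractor could look as follows; the layer configuration mirrors the MNIST column of Table 1, while the 'same' padding and the final Flatten layer are assumptions made here so that the extractor also handles very small inputs such as 7x7.

```python
from tensorflow.keras import layers, models

def build_lr_feature_extractor(height, width, channels=1):
    """Convolutional feature extractor for the low resolution input image."""
    inputs = layers.Input(shape=(height, width, channels))
    x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)
    x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x = layers.MaxPooling2D((2, 2))(x)
    features = layers.Flatten()(x)   # spatial feature maps flattened into a feature vector
    return models.Model(inputs, features, name='lr_feature_extractor')
```

A structurally similar extractor, applied to the SRCNN output, would yield the super resolution feature described in the previous subsection.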
Feature Fusion and Classification
To exploit the benefits of both the low and super resolution features, the two are combined by feature fusion. The fusion step is inspired by the feature concatenation used in subspace learning [16]. Here, the low and super resolution features are concatenated without dimension reduction and fed to a neural network classifier, which assigns the image to its output class.
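A minimal sketch of the fusion and classification step is shown below, assuming the low and super resolution features have already been extracted as NumPy arrays; the optimizer, batch size and number of epochs are illustrative choices, not values reported in this paper.

```python
import numpy as np
from tensorflow.keras import layers, models

def train_fused_classifier(lr_features, sr_features, labels, num_classes=10):
    """Concatenate LR and SR feature vectors and train a dense classifier on them."""
    # Feature fusion: plain concatenation, without any dimension reduction.
    fused = np.concatenate([lr_features, sr_features], axis=1)

    classifier = models.Sequential([
        layers.Input(shape=(fused.shape[1],)),
        layers.Dense(512, activation='relu'),
        layers.Dense(num_classes, activation='softmax'),
    ])
    classifier.compile(optimizer='adam',
                       loss='sparse_categorical_crossentropy',
                       metrics=['accuracy'])
    classifier.fit(fused, labels, epochs=10, batch_size=128, validation_split=0.1)
    return classifier
```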
EXPERIMENT AND RESULTS
Experimental Setup
To generate low resolution image datasets from the original MNIST and CIFAR10 datasets, the images of both datasets are resized to multiple lower resolutions. For the MNIST dataset, images of 28×28, 21×21, 14×14 and 7×7 resolution are generated; for the CIFAR10 dataset, images of 32×32, 24×24, 16×16 and 8×8 resolution are generated (a minimal resizing sketch is given below). Figure 3 shows a sample image from each dataset together with its generated low resolution versions.
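The following sketch illustrates one way to generate such low resolution variants for MNIST, assuming bilinear resizing with tf.image.resize; the interpolation method is an assumption, as the paper does not state which one was used.

```python
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load MNIST and add a channel axis so the images can be resized as a batch.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train[..., None].astype('float32') / 255.0

# Resize to each target resolution used in the experiments.
low_res_train = {
    size: tf.image.resize(x_train, (size, size), method='bilinear').numpy()
    for size in (28, 21, 14, 7)
}
```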
Here, owing to computing power and training time constraints, models already trained on these datasets are used for feature extraction. The layer-wise architecture of the CNN feature extractor for each dataset is given in Table 1. The proposed cascade CNN framework is implemented with the Python libraries Keras and TensorFlow, and it is trained on each generated low resolution image dataset of MNIST and CIFAR10.
RESULTS AND DISCUSSION
The proposed framework is evaluated on each generated low resolution image dataset of MNIST and CIFAR10, using accuracy as the performance metric. Its performance is also compared with that of a simple CNN model on the same datasets. The performance comparison of the simple CNN framework and the cascade CNN framework on the two datasets is given in Tables 2 and 3, respectively.
Comparison graphs between the simple CNN and the proposed cascade CNN for both datasets are shown in Figure 4. The graphs clearly show that the proposed cascade CNN framework outperforms the simple CNN, and that the proposed framework classifies lower resolution images with high accuracy.
Fig. 3: Original Images with Their Low Resolution Versions for MNIST [10] and CIFAR10 [11].
Table 1: Layer-wise Architectural Details of CNN for MNIST and CIFAR10 Datasets.

CNN architecture for MNIST:
  Layer           Parameters         Activation
  Conv2D          32, size=(3, 3)    ReLU
  Conv2D          32, size=(3, 3)    ReLU
  MaxPooling2D    size=(2, 2)        -
  Conv2D          64, size=(3, 3)    ReLU
  Conv2D          64, size=(3, 3)    ReLU
  MaxPooling2D    size=(2, 2)        -
  Dense           512                ReLU

CNN architecture for CIFAR10:
  Layer           Parameters         Activation
  Conv2D          32, size=(3, 3)    ReLU
  Conv2D          32, size=(3, 3)    ReLU
  Conv2D          32, size=(3, 3)    ReLU
  MaxPooling2D    size=(2, 2)        -
  Dropout         0.25               -
  Conv2D          64, size=(3, 3)    ReLU
  Conv2D          64, size=(3, 3)    ReLU
  Conv2D          64, size=(3, 3)    ReLU
  MaxPooling2D    size=(2, 2)        -
  Dropout         0.25               -
  Dense           512                ReLU
Table 2: Performance Comparison of the Simple CNN Framework and the Proposed Cascade CNN Framework over the MNIST Dataset.

  Image Resolution   Accuracy (Simple CNN)   Accuracy (Proposed Cascade CNN)
  28×28              0.9952                  0.9954
  21×21              0.9938                  0.9950
  14×14              0.9920                  0.9939
  7×7                0.9687                  0.9915
Table 3: Performance Comparison of the Simple CNN Framework and the Proposed Cascade CNN Framework over the CIFAR10 Dataset.

  Image Resolution   Accuracy (Simple CNN)   Accuracy (Proposed Cascade CNN)
  32×32              0.7686                  0.7712
  24×24              0.7247                  0.7697
  16×16              0.6678                  0.7263
  8×8                0.5796                  0.6771
Fig. 4: Comparison Graph between the Simple CNN and the Proposed Cascade CNN Framework over the MNIST and CIFAR10 Datasets (accuracy versus image resolution for each dataset).
CONCLUSION
The proposed cascade framework is implemented and evaluated on the MNIST and CIFAR10 datasets and their lower resolution variants. The framework classifies lower resolution images with high accuracy, and the experimental results show that it performs better than a simple CNN based image classifier on low resolution image classification.
ACKNOWLEDGEMENT
This work is supported by a UGC SRF Fellowship (3894/NET June 2013). We are thankful to the University Grants Commission for providing the fellowship to sustain this research.
REFERENCES
1. Mudunuri Sivaram Prasad, Biswas Soma. Low Resolution Face Recognition across Variations in Pose and Illumination. IEEE Trans Pattern Anal Mach Intell. 2016; 38(5): 1034-1040p.
2. Liu Yun-Fu, Jing-Ming Guo, Che-Hao Chang. Low Resolution Pedestrian Detection Using Light Robust Features and Hierarchical System. Pattern Recogn. 2014; 47(4): 1616-1625p.
3. Wang Zhangyang, Shiyu Chang, Yingzhen Yang, et al. Studying Very Low Resolution Recognition Using Deep Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016; 4792-4800p.
4. Kannojia Suresh Prasad, Jaiswal Gaurav. Effects of Varying Resolution on Performance of CNN based Image Classification: An Experimental Study. International Journal of Computer Sciences and Engineering (IJCSE). 2018; 6(9): 451-456p.
5. Ullman Shimon, et al. Atoms of Recognition in Human and Computer Vision. Proceedings of the National Academy of Sciences. 2016; 113(10): 2744-2749p.
6. Chevalier M, Thome N, Cord M, et al. LR-CNN for Fine-Grained Classification with Varying Resolution. IEEE International Conference on Image Processing. 2015; 3101-3105p.
7. Chevalier Marion, Nicolas Thome, Gilles Hénaff, et al. Classifying Low-Resolution Images by Integrating Privileged Information in Deep CNNs. Pattern Recogn Lett. 2018; 116: 29-35p.
8. Cai D, Chen K, Qian Y, et al. Convolutional Low-Resolution Fine-Grained Classification. Pattern Recogn Lett. 2019; 119: 166-171p.
9. Wei Xian, Yuanxiang Li, Hao Shen, et al. Joint Learning Sparsifying Linear Transformation for Low-Resolution Image Synthesis and Recognition. Pattern Recognit. 2017; 66(C): 412-424p.
10. LeCun Y, Bottou L, Bengio Y, et al. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE. 1998; 86(11): 2278-2324p.
11. Krizhevsky Alex, Geoffrey Hinton. Learning Multiple Layers of Features from Tiny Images. Technical Report, University of Toronto. 2009; 1(4): 7p.
12. Dong Chao, Chen Change Loy, Kaiming He, et al. Learning a Deep Convolutional Network for Image Super-Resolution. In European Conference on Computer Vision, Springer, Cham. 2014; 184-199p.
13. Dong C, Loy CC, He K, et al. Image Super-Resolution Using Deep Convolutional Networks. IEEE Trans Pattern Anal Mach Intell. 2016; 38(2): 295-307p.
14. Gu Shuhang, Wangmeng Zuo, Qi Xie, et al. Convolutional Sparse Coding for Image Super-Resolution. In Proceedings of the IEEE International Conference on Computer Vision. 2015; 1823-1831p.
15. Osendorfer Christian, Hubert Soyer, Patrick Van Der Smagt. Image Super-Resolution with Fast Approximate Convolutional Sparse Coding. In International Conference on Neural Information Processing. Springer, Cham. 2014; 250-257p.
16. Fu Yun, Liangliang Cao, Guodong Guo, et al. Multiple Feature Fusion by Subspace Learning. In Proceedings of the 2008 International Conference on Content-Based Image and Video Retrieval. ACM, 2008; 127-134p.
Cite this Article
Suresh Prasad Kannojia, Gaurav Jaiswal. Cascade CNN Framework for Low Resolution Image Classification. Journal of Artificial Intelligence Research & Advances. 2019; 6(1): 39-43p.