Deep learning review and discussion of its future
development
Zhiying Hao*
University of Electronic Science and Technology of China, No.4, Block 2, North Jianshe Road,
Chenghua District, Chengdu, Sichuan, China
Abstract. This paper is a summary of the main algorithms of deep learning and
a brief discussion of its future development. The first part introduces the concept
of deep learning and its advantages and disadvantages. The second part
demonstrates several deep learning algorithms. The third part introduces the
application areas of deep learning. The fourth part combines the above algorithms
and applications to explore the subsequent development of deep learning. The
last part summarizes the full paper.
1 Introduction
As early as 1952, IBM's Arthur Samuel designed a program for learning checkers. It could
build new models by observing the moves of the pieces and use them to improve its playing
skill. In 1959, machine learning was proposed as a field of study that could give a machine
a certain skill without the need for deterministic programming. Over the course of machine
learning's development, various models have been proposed, including deep learning. Because
of its complicated structure and the large amount of computation it requires, its computing
cost was very high, so it received little attention at first. However, with the great
improvement in computer performance, the excellent performance of deep learning has enabled
it to rise rapidly and become one of the hottest research areas. In this paper, the main
deep learning models are briefly summarized, and the development prospects of deep learning
are analyzed and discussed at the end.
2 Introduction to deep learning
2.1 What is Deep Learning
Deep learning is a branch of machine learning [1]. It is a class of algorithms that
attempts to learn high-level abstractions of data using multiple processing layers
composed of complex structures or multiple nonlinear transformations. Within machine
learning, deep learning is an algorithm based on representation learning of data. The
concept of deep learning is defined relative to shallow learning. Shallow machine
learning models such as Support Vector Machines and
* Corresponding author: zhiyinghao@std.uestc.edu.cn
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons
Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).
MATEC Web of Conferences 277, 02035 (2019) https://doi.org/10.1051/matecconf/201927702035
JCMME 2018
Logistic Regression were introduced in the 1990s. These shallow machine learning models
have only one hidden layer or no hidden layer at all, as shown in Fig. 1. Deep learning,
by contrast, is built on multiple hidden layers; its essence is a multi-layer neural
network. Deep learning uses the output of the previous layer as the input of the next
layer to learn highly abstract data features.
Fig. 1. A single-layer neural network
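This layer-stacking can be sketched in a few lines of plain Python. The weights, biases, and input below are arbitrary illustrative values, not taken from the paper; the point is only that each layer's output becomes the next layer's input.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid nonlinearity."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]

# Two stacked layers: the output of layer 1 is the input of layer 2.
x = [0.5, -1.0]
h = dense(x, weights=[[0.1, 0.4], [-0.3, 0.2]], biases=[0.0, 0.1])  # hidden layer
y = dense(h, weights=[[0.7, -0.5]], biases=[0.2])                   # output layer
print(h, y)
```

Each additional `dense` call would add one more hidden layer; a "deep" network is simply a long chain of such compositions.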
Like machine learning in general, deep learning can be categorized into supervised
learning, semi-supervised learning, and unsupervised learning. At present, the classical
deep learning frameworks include Convolutional Neural Networks, Restricted Boltzmann
Machines [2], Deep Belief Networks [3], and Generative Adversarial Networks [4]. These
algorithms are introduced briefly in the next section.
2.2 Advantages and disadvantages of deep learning
Deep learning has shown better performance than traditional neural networks. After a deep
neural network has been trained and properly tuned for a certain task such as image
classification, it saves a great deal of computation and can complete a large amount of
work in a short time. Deep learning is also flexible. For traditional algorithms,
adjusting the model usually requires extensive changes to the code; for a fixed deep
learning network framework, adjusting the model only requires adjusting the parameters,
which gives deep learning great flexibility. A deep learning framework can thus be
improved continuously toward a near-optimal state. Deep learning is also more general:
it can be modelled around problems rather than being limited to a single fixed problem.
Deep learning has some shortcomings as well. First of all, its training cost is relatively
high. Computer hardware performance has improved greatly, and some simple neural networks
can be trained on common computing modules; however, training more complex neural networks
still requires relatively expensive high-performance computing modules. Although the price
of such modules has dropped greatly compared with earlier generations, the demand for such
hardware still keeps the training cost of deep learning relatively high. Beyond the
economic cost, training a neural network requires a large amount of data to reach a
satisfactory level, and a sufficient amount of data is often difficult to obtain. Secondly,
deep learning cannot directly learn knowledge. Although models such as AlphaGo Zero, which
can learn without prior knowledge, have emerged, most deep learning frameworks still rely
on manually labelled features for training. The workload of labelling large-scale datasets
is enormous, which further increases the training cost of deep learning. Another point is
that deep learning lacks sufficient theoretical support. Although deep learning has achieved good
results in various application fields, there is still no complete and rigorous theoretical
derivation to explain the deep learning model at this stage, which limits the follow-up study
and the improvement of deep learning.
3 Main Deep Learning Algorithm Introduction
3.1 Convolutional Neural Network
The convolutional neural network, as seen in Fig. 2, is a feedforward neural network in
which the convolution operation lets each neuron respond to the surrounding units covered
by its convolution kernel; it performs excellently in large-scale image processing. A
convolutional neural network typically consists of one or more convolutional layers and a
fully connected layer, and also includes pooling layers for aggregation. Convolutional
neural networks give good results in image and speech recognition while requiring fewer
parameters than other deep neural networks. These advantages make the convolutional neural
network one of the most commonly used deep learning models. Its basic structure is briefly
introduced below.
Fig. 2. Convolutional Neural Network, LeNet-5 [5]
3.1.1 Convolutional layer.
In the convolutional layer, the convolutional neural network convolves the input with
multiple convolution kernels, generating one feature map for each kernel.
The convolution operation has the following advantages:
1. The weight sharing mechanism on the same feature map reduces the number of
parameters;
2. Local connectivity enables convolutional neural networks to take into account the
characteristics of adjacent pixels when processing images;
3. Recognition is insensitive to the position of the object in the image (translation
invariance).
These advantages also make it possible to use a convolutional layer instead of a fully
connected layer in some models to speed up the training process.
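As an illustrative sketch (not code from the original paper), a valid 2D convolution with a single kernel can be written in plain Python; the same small kernel is applied at every position, which is the weight-sharing mechanism described above. As in most deep learning libraries, this actually computes cross-correlation, conventionally called convolution.

```python
def conv2d(image, kernel):
    """Valid 2D convolution (no padding, stride 1) over a 2D list."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # Same kernel weights at every position: weight sharing.
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A 3x3 image convolved with a 2x2 difference kernel gives a 2x2 feature map.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
feature_map = conv2d(image, kernel)
print(feature_map)  # [[-4, -4], [-4, -4]]
```

Local connectivity is also visible here: each output value depends only on the small window of adjacent pixels under the kernel.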
3.1.2 Pooling layer
After obtaining features by convolution, we hope to use these features for classification.
However, the amount of feature data obtained is often very large and prone to over-fitting.
Therefore, we aggregate statistics over features at different locations; this aggregation
operation is called pooling. In the convolutional neural network, the pooling layer is used
for feature filtering after image convolution, making the subsequent classification more
tractable.
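A minimal sketch of this aggregation, assuming the common non-overlapping max-pooling variant (the paper does not specify which pooling function it has in mind):

```python
def max_pool(feature_map, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[i + di][j + dj]
             for di in range(size) for dj in range(size))
         for j in range(0, w - size + 1, size)]
        for i in range(0, h - size + 1, size)
    ]

fm = [[1, 3, 2, 0],
      [4, 2, 1, 5],
      [0, 1, 8, 2],
      [3, 2, 1, 6]]
pooled = max_pool(fm)
print(pooled)  # [[4, 5], [3, 8]]
```

The 4x4 map shrinks to 2x2, which is exactly the data reduction that makes over-fitting less likely.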
3.1.3 Fully connected layer
After pooling layer is the fully connected layer, its role is to pull the feature map into a one-
dimensional vector. The working mode of the fully connected layer is similar to that of a
traditional neural network. The fully connected layer contains parameters in approximately
90% of the convolutional neural network, which allows us to map the neural network forward
into a vector of fixed length. We can grant this vector to a particular image class or use it as
a feature vector in subsequent processes.
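A hedged sketch of these last two steps, with made-up feature maps and weights purely for illustration:

```python
def flatten(feature_maps):
    """Pull a stack of 2D feature maps into a single 1D vector."""
    return [v for fm in feature_maps for row in fm for v in row]

def fully_connected(vector, weights, biases):
    """Map the fixed-length vector to one score per class."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

maps = [[[1, 2], [3, 4]], [[0, 1], [1, 0]]]   # two 2x2 feature maps
vec = flatten(maps)                            # length 8 fixed vector
scores = fully_connected(vec, weights=[[0.1] * 8, [-0.1] * 8], biases=[0.0, 0.5])
predicted_class = max(range(len(scores)), key=scores.__getitem__)
print(vec, predicted_class)
```

The weight matrix here has `classes x vector_length` entries, which illustrates why the fully connected layer dominates the parameter count.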
3.2 Deep Belief Network
The deep belief network is a probability generation model. Compared with the neural network
which is a traditional discriminative model, the generated model is to establish a joint
distribution between observation data and labels, and to evaluate both P (Observation|Label)
and P (Label|Observation) while the discriminative model has only evaluated the latter, that
is, P (Label|Observation).
The deep confidence network consists of multiple restricted Boltzmann layers, a typical
neural network type as shown. These networks are "restricted" to a visible layer and a hidden
layer, with connections between the layers, but there are no connections between the cells
within the layer. The hidden layer unit is trained to capture the correlation of higher order
data represented in the visible layer.
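The two quantities above are linked by Bayes' rule: a generative model that captures the joint distribution can recover the discriminative quantity, while the reverse is not true.

```latex
P(\text{Label}, \text{Observation}) = P(\text{Observation} \mid \text{Label})\, P(\text{Label}),
\qquad
P(\text{Label} \mid \text{Observation}) = \frac{P(\text{Observation} \mid \text{Label})\, P(\text{Label})}{P(\text{Observation})}.
```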
3.3 Restricted Boltzmann Machine
A Restricted Boltzmann Machine is a randomly generated neural network that can learn the
probability distribution through the input data set. It is a Boltzmann Machine's problem, but
the qualified model must be a bipartite graph. The model contains visible cells corresponding
to the input parameters and hidden cells corresponding to the training results. Each edge of
the figure must be connected to a visible unit and a hidden unit. In contrast, the Boltzmann
machine (unrestricted) contains the edges between hidden cells, making it a recurrent neural
network. This limitation of the constrained Boltzmann machine makes it possible to have a
more efficient training algorithm than the general Boltzmann machine, especially for the
gradient divergence algorithm.
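In the standard formulation (not spelled out in the original text), the restricted Boltzmann machine assigns an energy to each configuration of visible units v and hidden units h, with weight matrix W and bias vectors a and b, and the model probability follows a Boltzmann distribution:

```latex
E(\mathbf{v}, \mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h},
\qquad
P(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z},
```

where Z is the partition function. Because there are no intra-layer connections, the conditionals P(h|v) and P(v|h) factorize over units, which is what makes contrastive divergence training efficient.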
Boltzmann machines and their variants have been successfully applied to tasks such as
collaborative filtering, classification, dimensionality reduction, image retrieval,
information retrieval, language processing, automatic speech recognition, and time series
modeling. Restricted Boltzmann machines in particular have been used in dimensionality
reduction, classification, collaborative filtering, feature learning, and topic modeling.
Depending on the task, the restricted Boltzmann machine can be trained using supervised
learning or unsupervised learning.
3.4 Generative Adversarial Network
The Generated Adversarial Network was proposed in 2014. The Generative Adversarial
Network uses two models, a generative model and a discriminative model. The
discriminative model determines whether the given picture is a real picture, and the
generative model creates a picture as close to the ground truth as possible. The generated
model is designed to generate a picture that can spoof the discriminative model, and the
discriminative model distinguishes the picture generated by the generated model from the
real picture. The two models are trained simultaneously, each growing stronger through
this adversarial process, until they eventually reach a steady state.
Generative adversarial networks are very versatile: they can be used not only for the
generation and discrimination of images, but also for other kinds of data.
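The adversarial training described above is formalized in [4] as a two-player minimax game between the generator G and the discriminator D over the value function V(D, G):

```latex
\min_{G} \max_{D} V(D, G) =
\mathbb{E}_{\mathbf{x} \sim p_{\text{data}}(\mathbf{x})}\big[\log D(\mathbf{x})\big]
+ \mathbb{E}_{\mathbf{z} \sim p_{\mathbf{z}}(\mathbf{z})}\big[\log\big(1 - D(G(\mathbf{z}))\big)\big].
```

At the steady state mentioned above, the generator's distribution matches the data distribution and the discriminator outputs 1/2 everywhere.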
4 Deep Learning Application
4.1 Image processing
Manually selecting features is a very laborious approach, and tuning them takes a great
deal of time. Because manual selection is unstable, we instead let the computer learn the
features automatically, which can be realized through deep learning.
In image recognition, deep learning uses multi-layer neural networks to pre-process
images, extract features, and process those features.
Taking the convolutional neural network as an example: it establishes a multi-layer neural
network, uses the convolutional layer to perform convolution operations that extract
feature values, and then processes and trains on the data through the pooling layer and
the fully connected layer, as described in Section 3.1 above.
Although neural network image recognition cannot yet reach the accuracy of the human eye,
a neural network can process a large amount of image data far more efficiently than manual
recognition. For the huge volumes of data that cannot be processed manually, using neural
networks brings significant improvement.
In addition, deep learning provides an approach for face recognition technology. Face
recognition is a biometric technology that identifies people based on facial feature
information. Face recognition products have been widely used in finance, justice, the
military, public security, border inspection, government, aerospace, electric power,
factories, education, medical care, and many other enterprises and institutions. As the
technology matures further and social acceptance improves, face recognition will be
applied in more fields and has promising development prospects. The characteristics of
neural networks make it possible to avoid overly complex feature extraction when applied
to face recognition, which is beneficial for hardware implementation.
4.2 Audio data processing
Deep learning has had a profound impact on speech processing. Almost every solution in
the field of speech recognition now incorporates one or more algorithms based on neural
models.
Speech recognition is basically divided into three main parts, namely signal level, noise
level and language level. The signal level extracts the speech signal and enhances the signal,
or performs appropriate pre-processing, cleaning, and feature extraction. The noise level
divides the different features into different sounds. The language level combines the sounds
into words and then combines them into sentences.
At the signal level, there are various neural-model-based techniques for extracting and
enhancing the speech signal. Classical feature extraction can likewise be replaced by more
complex and efficient neural-network-based methods, which greatly improves efficiency and
accuracy. The noise and language levels also employ a variety of deep learning techniques,
and different neural architectures are used for both sound-level and language-level
classification.
5 Discussion of the future development of deep learning
5.1 Representation Learning
The core of deep learning is the abstraction and understanding of features, so feature
learning plays a very important role in it. Since the essence of deep learning is a
multi-layer neural network, some useful information is lost as features are extracted and
passed to subsequent layers; however, extracting too many features may lead to
over-fitting. Representation learning, which studies how to accurately extract the
required features while avoiding over-fitting, may therefore be one of the core issues in
deep learning research. Progress on this problem will greatly help the classification and
generalization abilities of neural networks.
5.2 Unsupervised Learning
As mentioned above, training a supervised neural network requires a large amount of
labelled data. The labelling workload is very large, adding considerable extra cost to
the training of the neural network. If machines could complete this work instead of
humans, the cost of network training would be greatly reduced. Unsupervised learning is
useful not only for removing the need for labels, but also for systems such as the Go
program AlphaGo Zero. The emergence of AlphaGo Zero proves that, in some applications,
machines can achieve excellent training results even without the foundation of human
prior knowledge. Applying unsupervised learning in such areas, letting machines learn
without being limited by the current human knowledge base, may contribute to
technological breakthroughs in these fields. Moreover, research on unsupervised learning
is still not intensive, as most research focuses on supervised learning, so unsupervised
learning has rich research potential. In fact, unsupervised learning has recently become
one of the hottest research areas; in my opinion, it is also one of the most valuable
future directions for deep learning.
5.3 Theory Complement
One of the shortcomings of deep learning is the lack of complete theoretical support,
which brings considerable controversy to the field, and this lack of theory does hinder
the development of deep learning. With the increasing attention paid to deep learning in
recent years, research has become more and more in-depth, and the theory of deep learning
is constantly improving.
Nevertheless, the existing theory is still not enough to rigorously explain the inner
principles of deep learning. At present, research relies on partial theory combined with
empirical testing and experiment-based methods. Without further theoretical breakthroughs,
improving a model's performance depends only on tuning parameters, which can easily lead
research into a bottleneck.
Therefore, for the future development of deep learning, obtaining complete theoretical
support is very important. In the course of research, the theory of deep learning needs
to be continuously improved until it is sufficient to explain the inner principles of
these models.
5.4 Perspective of Deep Learning Application
The fourth section of this paper discussed two application areas of deep learning, image
recognition and speech processing, which are its two main application areas. In addition,
deep learning has recently also been applied to natural language processing.
Specific applications include autonomous driving, intelligent dialogue robots such as
Siri, image classification, medical image processing, and so on. Different deep learning
frameworks tend to have slightly different application scenarios; convolutional neural
networks, for example, are mainly used for image processing. In medical image processing,
the use of convolutional neural networks for brain tumour segmentation has achieved an
accuracy of more than 90%. Also in medicine, convolutional neural networks can be used to
recognize brain images associated with Alzheimer's disease, and more accurate diagnoses
can be obtained when combined with manual judgement. Medicine is only one part of deep
learning's applications. In general, well-trained machines can often compute details that
are hard for humans to obtain, reducing people's workload and improving the quality of
the results. For example, to deter red-light violations at intersections, the usual
approach is to capture an image with a camera and then manually read the license plate
before issuing a penalty. Manually viewing and recording the images is a very tedious and
inefficient job. If the captured image is instead recognized by a deep neural network and
the license plate number is automatically extracted and entered into the system, this not
only saves manpower but also greatly improves efficiency.
I believe that deep learning applications will continue to develop in transportation,
medicine, language, automation, and other fields too numerous to list here. Although deep
learning cannot yet completely replace human work, combining it with human effort can
greatly improve work efficiency.
6 Conclusion
In this paper, we introduced some of the main algorithms of deep learning and some
conjectures about its future development. Deep learning has already been researched in
depth, has a wide range of application scenarios, and has been put to practical use in
real life with excellent performance. However, there is still much to explore in deep
learning and neural networks, with great follow-up research space and considerable
application potential.
References
1. Guo Y, Liu Y, Oerlemans A, et al. Deep learning for visual understanding: A review.
Neurocomputing, 2016, 187(C):27-48.
2. Zhang C-X, Ji N-N, Wang G-W. Restricted Boltzmann Machines. Chinese Journal of
Engineering Mathematics, 2015.
3. Hinton G E, Osindero S, Teh Y W. A fast learning algorithm for deep belief nets.
Neural Computation, 2006, 18(7):1527-1554.
4. Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks.
Advances in Neural Information Processing Systems, 2014: 2672-2680.
5. LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document
recognition. Proceedings of the IEEE, 1998, 86(11):2278-2324.
MATEC Web of Conferences 277, 02035 (2019) https://doi.org/10.1051/matecconf/201927702035
JCMME 2018