Article

Sensor-based human activity recognition using fuzzified deep CNN architecture with λ max method

Abstract

Purpose: This work aims to develop a novel fuzzy associator rule-based fuzzified deep convolutional neural network (FDCNN) architecture for the classification of smartphone sensor-based human activity recognition. The work focuses on fusing the λ max method for weight initialization, as a data normalization technique, to achieve high classification accuracy.

Design/methodology/approach: The major contribution of this work is the FDCNN architecture, which is first fused with a fuzzy logic-based data aggregator. The statistical parameters of the University of California, Irvine (UCI) data set are normalized before being fed to the convolutional neural network layers. The FDCNN model with the λ max method is instrumental in ensuring faster convergence with improved accuracy in sensor-based human activity recognition. An impact analysis with hyper-parameter tuning on the proposed model validates the appropriateness of the results.

Findings: The proposed FDCNN model with the λ max method outperformed state-of-the-art models, attaining an overall accuracy of 97.89% and an overall F1 score of 0.9795.

Practical implications: The proposed fuzzy associate rule layer (FAL) is responsible for feature association based on fuzzy rules and regulates the uncertainty in the sensor data caused by signal interference and noise. The normalized data is also subjectively grouped based on the FAL kernel structure weights assigned with the λ max method.

Social implications: The work contributes a novel FDCNN architecture that can support those keen on advancing human activity recognition (HAR).

Originality/value: A novel FDCNN architecture is implemented with appropriate FAL kernel structures.
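The abstract does not spell out how the λ max weights are computed. One common reading, borrowed from fuzzy decision methods such as the analytic hierarchy process, derives weights from the principal eigenvector of a pairwise comparison matrix, where λ max is its largest eigenvalue. A minimal sketch under that assumption follows; the min-max normalization and the comparison matrix C are illustrative, not the authors' exact pipeline.

```python
import numpy as np

def minmax_normalize(X):
    """Scale each feature column into [0, 1] before the CNN layers."""
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0) + 1e-12)

def lambda_max_weights(C):
    """Weights from the principal eigenvector of a pairwise comparison
    matrix C; lambda_max is the associated (largest) eigenvalue."""
    vals, vecs = np.linalg.eig(C)
    k = np.argmax(vals.real)
    w = np.abs(vecs[:, k].real)
    return vals[k].real, w / w.sum()

# Illustrative 3x3 comparison matrix over three feature groups
C = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])
lam_max, weights = lambda_max_weights(C)
print(lam_max, weights)  # lambda_max = 3 for a perfectly consistent 3x3 matrix
```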

... Utilizing various neural network architectures in HAR systems has been pivotal in advancing the field. Convolutional Neural Networks (CNNs) excel in capturing spatial hierarchies [49], [50], Recurrent Neural Networks (RNNs) adeptly handle sequential data [51] and Graph Convolutional Networks (GCNs) [52] effectively model relational information. Hybrid approaches often combine these networks to leverage their complementary strengths, addressing the multifaceted nature of human activities that are temporal, spatial, and contextually rich [53]. ...
Conference Paper
Full-text available
The advent of Beyond 5G (B5G) and the anticipated arrival of 6G have had a remarkable impact on various aspects of human life. Next-generation Human Activity Recognition (HAR) systems are poised to advance healthcare, create smart environments, and enhance overall well-being. The imperative for next-gen HAR systems lies in their capability to be intelligent, privacy-preserving, and deeply accurate. These systems, leveraging the cutting-edge capabilities of B5G and 6G, such as the Re-configurable Intelligent Surface (RIS), aim to revolutionize the monitoring process and intelligently discern various human activities. Hence, this paper introduces B5gActiv, a smart RIS-enhanced HAR system. B5gActiv utilizes the fractional wavelet transform to effectively highlight time and frequency features of activities from the measured channel state information (CSI) reflected from the RIS. Afterward, these features are used to train a recurrent neural network that captures the temporal characteristics of the input, thereby promoting activity recognition. Moreover, B5gActiv integrates diverse modules that enhance the deep model's overall robustness and its ability to perform well in the presence of noise. B5gActiv was evaluated in two different realistic scenarios, including non-line-of-sight and multi-floor setups, proving its efficacy. In particular, B5gActiv outperforms benchmark techniques and delivers an activity recognition accuracy of 89.7%.
... Similarly, a wrist-worn accelerometer was used in 37 to distinguish eight different activities. Gomathi et al. 38 developed a Fuzzy associator rule-based fuzzified deep convolutional neural network architecture to classify wearable sensor-based human activity recognition. The lambda max method was fused for weight initialization to ensure data normalization and faster convergence. ...
Article
Full-text available
With the development of deep learning, numerous models have been proposed for human activity recognition to achieve state-of-the-art recognition on wearable sensor data. Despite the improved accuracy achieved by previous deep learning models, activity recognition remains a challenge, often attributed to the complexity of some specific activity patterns. Existing deep learning models proposed to address this have often recorded high overall recognition accuracy, while low recall and precision are recorded on some individual activities due to the complexity of their patterns. Some existing models that have focused on tackling these issues are often bulky and complex. Since most embedded systems have resource constraints in terms of their processor, memory and battery capacity, it is paramount to propose efficient, lightweight activity recognition models that require limited resource consumption yet are still capable of achieving state-of-the-art recognition of activities, with high individual recall and precision. This research proposes a high-performance, low-footprint deep learning model with a squeeze-and-excitation block to address this challenge. The squeeze-and-excitation block consists of a global average-pooling layer and two fully connected layers, which were placed to extract the flattened features in the model, with best-fit reduction ratios in the squeeze-and-excitation block. The squeeze-and-excitation block served as channel-wise attention, which adjusted the weight of each channel to build more robust representations, enabling our network to become more responsive to essential features while suppressing less important ones. By using the best-fit reduction ratio in the squeeze-and-excitation block, the parameters of the fully connected layer were reduced, which helped the model increase its responsiveness to essential features. Experiments on three publicly available datasets (PAMAP2, WISDM, and UCI-HAR) showed that the proposed model outperformed the existing state of the art with fewer parameters, and increased the recall and precision of some individual activities compared to the baseline and the existing models.
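For reference, a squeeze-and-excitation block of the kind described above (global average pooling followed by two fully connected layers acting as channel-wise attention) can be sketched in a few lines of PyTorch; the 1-D pooling and the reduction ratio of 4 are illustrative assumptions, not the paper's best-fit values.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Channel-wise attention: squeeze (global average pool), then excite
    (two fully connected layers with a bottleneck of size channels/reduction)."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)  # squeeze over the time axis
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # bottleneck FC layer
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, x):             # x: (batch, channels, time)
        w = self.pool(x).squeeze(-1)  # (batch, channels)
        w = self.fc(w).unsqueeze(-1)  # (batch, channels, 1)
        return x * w                  # re-weight each channel
```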
Article
Purpose: This study aims to reduce data bias during human activity and increase the accuracy of activity recognition. Design/methodology/approach: A convolutional neural network and a bidirectional long short-term memory model are used to automatically capture feature information of time series from raw sensor data, and a self-attention mechanism is used to learn and select potential relationships among essential time points. The proposed model has been evaluated on six publicly available data sets, verifying that performance is significantly improved by combining the self-attention mechanism with deep convolutional networks and recursive layers. Findings: The proposed method significantly improves accuracy over the state-of-the-art method across different data sets, demonstrating its superiority in intelligent sensor systems. Originality/value: Using deep learning frameworks, especially self-attention mechanisms for activity recognition, greatly improves recognition accuracy.
Article
Full-text available
The vast proliferation of sensor devices and Internet of Things enables the applications of sensor-based activity recognition. However, there exist substantial challenges that could influence the performance of the recognition system in practical scenarios. Recently, as deep learning has demonstrated its effectiveness in many areas, plenty of deep methods have been investigated to address the challenges in activity recognition. In this study, we present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition. We first introduce the multi-modality of the sensory data and provide information for public datasets that can be used for evaluation in different challenge tasks. We then propose a new taxonomy to structure the deep methods by challenges. Challenges and challenge-related deep methods are summarized and analyzed to form an overview of the current research progress. At the end of this work, we discuss the open issues and provide some insights for future directions.
Article
Full-text available
Human activity recognition (HAR) remains a challenging yet crucial problem to address in computer vision. HAR is primarily intended to be used with other technologies, such as the Internet of Things, to assist in healthcare and eldercare. With the development of deep learning, automatic high-level feature extraction has become a possibility and has been used to optimize HAR performance. Furthermore, deep-learning techniques have been applied in various fields for sensor-based HAR. This study introduces a new methodology using convolutional neural networks (CNN) with varying kernel dimensions along with bi-directional long short-term memory (BiLSTM) to capture features at various resolutions. The novelty of this research lies in the effective selection of the optimal video representation and in the effective extraction of spatial and temporal features from sensor data using traditional CNN and BiLSTM. The Wireless Sensor Data Mining (WISDM) and UCI datasets are used for the proposed methodology, in which data are collected through diverse devices, including accelerometers and gyroscopes. The results indicate that the proposed scheme is efficient in improving HAR. Unlike other available methods, the proposed method improved accuracy, attaining a higher score on the WISDM dataset than on the UCI dataset (98.53% vs. 97.05%).
Article
Full-text available
Recently, human activity recognition (HAR) has begun to adopt deep learning as a substitute for traditional shallow learning techniques that rely on hand-crafted features. CNNs, in particular, have set the latest state of the art on various HAR datasets. However, deep models often require more computing resources, which limits their applications in embedded HAR. Although many successful methods have been proposed to reduce the memory and FLOPs of CNNs, they often involve special network architectures designed for visual tasks, which are not suitable for deep HAR tasks with time-series sensor signals due to the remarkable discrepancy between the two domains. Therefore, it is necessary to develop lightweight deep models for HAR. As the filter is the basic unit in constructing CNNs, we must ask whether redesigning smaller filters is applicable to deep HAR. In this paper, inspired by this idea, we propose a lightweight CNN using redesigned Lego filters for HAR. A set of lower-dimensional filters is used as Lego bricks to be stacked into conventional filters, which does not rely on any special network structure. To our knowledge, this is the first paper that proposes a lightweight CNN for HAR in the ubiquitous and wearable computing arena. The experimental results on five public HAR datasets, the UCI-HAR, OPPORTUNITY, UNIMIB-SHAR, PAMAP2 and WISDM datasets, indicate that our novel Lego-CNN approach can greatly reduce memory and computation cost over a CNN, while maintaining comparable accuracy. We believe that the proposed approach could be combined with the existing state-of-the-art HAR architecture and easily deployed onto wearable devices for real HAR applications.
Article
Full-text available
In the past years, traditional pattern recognition methods have made great progress. However, these methods rely heavily on manual feature extraction, which may hinder the model's generalization performance. With the increasing popularity and success of deep learning methods, using these techniques to recognize human actions in mobile and wearable computing scenarios has attracted widespread attention. In this paper, a deep neural network that combines convolutional layers with long short-term memory (LSTM) is proposed. This model can extract activity features automatically and classify them with few model parameters. LSTM is a variant of the recurrent neural network (RNN), which is more suitable for processing temporal sequences. In the proposed architecture, the raw data collected by mobile sensors is fed into a two-layer LSTM followed by convolutional layers. In addition, a global average pooling layer (GAP) is applied to replace the fully connected layer after convolution to reduce model parameters. Moreover, a batch normalization layer (BN) is added after the GAP layer to speed up convergence, producing clear gains. The model performance was evaluated on three public datasets (UCI, WISDM, and OPPORTUNITY). The overall accuracy of the model is 95.78% on the UCI-HAR dataset, 95.85% on the WISDM dataset, and 92.63% on the OPPORTUNITY dataset. The results show that the proposed model has higher robustness and better activity detection capability than some of the reported results. It can not only adaptively extract activity features, but also has fewer parameters and higher accuracy.
Article
Full-text available
Wearable sensor-based human activity recognition has been widely used in many fields. Considering that a multi-sensor recognition system is not suitable for practical applications and long-term activity monitoring, this paper proposes a single wearable accelerometer-based human activity recognition approach. To improve the reliability of the recognition system and remove redundant features that have no effect on recognition accuracy, wavelet energy spectrum features and a novel feature selection method are introduced. For each activity sample, wavelet energy spectrum features of the acceleration signal are extracted, and the activity is represented by a feature set that includes the wavelet energy spectrum features together with features of other attributes. Then, considering the limitations of single-filter feature selection methods, this paper proposes an ensemble-based filter feature selection (EFFS) approach to optimize the feature set. Features that are robust to sensor placement and highly distinguishable between different activities are selected. In the experiment, acceleration data around the waist is collected, and two classifiers, k-nearest neighbour (KNN) and support vector machine (SVM), are utilized to verify the effectiveness of the proposed features and the EFFS method. Experimental results show that the wavelet energy spectrum features can increase the discrimination between different activities and significantly improve the activity recognition accuracy. Compared with four other popular feature selection methods, the proposed EFFS approach provides higher accuracy with fewer features.
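As a rough illustration of the wavelet energy spectrum features described above, the relative energy of each wavelet sub-band can be computed with PyWavelets; the db4 wavelet and the decomposition depth are assumptions, since the abstract does not state the paper's choices.

```python
import numpy as np
import pywt

def wavelet_energy_spectrum(signal, wavelet="db4", level=4):
    """Relative energy of each wavelet sub-band of a 1-D acceleration signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)  # [cA4, cD4, ..., cD1]
    energies = np.array([np.sum(c ** 2) for c in coeffs])
    return energies / energies.sum()  # normalize so the features sum to 1
```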
Article
Full-text available
Deep Learning (DL), a successful promising approach for discriminative and generative tasks, has recently proved its high potential in 2D medical imaging analysis; however, physiological data in the form of 1D signals have yet to be beneficially exploited from this novel approach to fulfil the desired medical tasks. Therefore, in this paper we survey the latest scientific research on deep learning in physiological signal data such as electromyogram (EMG), electrocardiogram (ECG), electroencephalogram (EEG), and electrooculogram (EOG). We found 147 papers published between January 2018 and October 2019 inclusive from various journals and publishers. The objective of this paper is to conduct a detailed study to comprehend, categorize, and compare the key parameters of the deep-learning approaches that have been used in physiological signal analysis for various medical applications. The key parameters of deep-learning approach that we review are the input data type, deep-learning task, deep-learning model, training architecture, and dataset sources. Those are the main key parameters that affect system performance. We taxonomize the research works using deep-learning method in physiological signal analysis based on: (1) physiological signal data perspective, such as data modality and medical application; and (2) deep-learning concept perspective such as training architecture and dataset sources.
Article
Full-text available
In this paper, we focus on data-driven approaches to human activity recognition (HAR). Data-driven approaches rely on good-quality data during training; however, a shortage of high-quality, large-scale, and accurately annotated HAR datasets exists for recognizing activities of daily living (ADLs) within smart environments. The contributions of this paper involve improving the quality of an openly available HAR dataset for the purpose of data-driven HAR and proposing a new ensemble of neural networks as a data-driven HAR classifier. Specifically, we propose a homogeneous ensemble neural network approach for the purpose of recognizing activities of daily living within a smart home setting. Four base models were generated and integrated using a support function fusion method, which involved computing an output decision score for each base classifier. The contribution of this work also involved exploring several approaches to resolving conflicts between the base models. Experimental results demonstrated that distributing data at a class level greatly reduces the number of conflicts that occur between the base models, leading to increased performance prior to the application of conflict resolution techniques. Overall, the best HAR performance of 80.39% was achieved through distributing data at a class level in conjunction with a conflict resolution approach, which involved calculating the difference between the highest and second highest predictions per conflicting model and awarding the final decision to the model with the highest differential value.
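The conflict-resolution rule quoted at the end, awarding the decision to the base model with the largest gap between its highest and second-highest class scores, translates directly into code. The sketch below is a hypothetical reading of that rule, not the authors' implementation.

```python
import numpy as np

def resolve_conflict(probs_per_model):
    """probs_per_model: one per-class probability vector per base model.
    Pick the model with the largest margin between its top two scores,
    then return that model's predicted class."""
    margins = [np.sort(p)[-1] - np.sort(p)[-2] for p in probs_per_model]
    winner = int(np.argmax(margins))
    return int(np.argmax(probs_per_model[winner]))
```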
Article
Full-text available
Sensor-based human activity recognition aims at detecting various physical activities performed by people with ubiquitous sensors. Different from existing deep learning-based methods, which mainly extract black-box features from the raw sensor data, we propose a hierarchical multi-view aggregation network based on multi-view feature spaces. Specifically, we first construct various views of feature spaces for each individual sensor in terms of white-box features and black-box features. Our model then learns a unified representation for the multi-view features by aggregating views in a hierarchical context at the feature, position and modality levels. We design three aggregation modules, one for each level of aggregation. Based on the ideas of non-local operations and attention, our fusion method is able to capture the correlation between features and leverage the relationships across different sensor positions and modalities. We comprehensively evaluate our method on 12 human activity benchmark datasets, and the resulting accuracy outperforms the state-of-the-art approaches.
Article
Full-text available
Deep learning (DL) has proven to be a powerful paradigm for the classification of large-scale image data sets. Deep learning networks such as CNNs require a large number of labeled samples for training. However, labeled data is often difficult, expensive and time-consuming to obtain. In this study, we propose a semi-supervised approach that fuses Fuzzy-Rough C-Means clustering with convolutional neural networks (CNNs), so that knowledge is learnt simultaneously from intra-model and inter-model relationships to form the final data representation to be classified, which achieves better performance. The idea behind this is to reduce uncertainty in terms of vagueness and indiscernibility by using Fuzzy-Rough C-Means clustering, and specifically to remove noise samples from the raw data by using the CNN. The framework of our proposed semi-supervised approach uses abundant unlabeled data together with few labeled data to train the FRCNN model. To show the effectiveness of our model, we used four benchmark large-scale image datasets and compared it with state-of-the-art supervised, unsupervised, and semi-supervised learning methods for image classification.
Article
Full-text available
We have compared the performance of different machine learning techniques for human activity recognition. Experiments were conducted using a benchmark dataset in which each subject wore a device in the pocket and another on the wrist. The dataset comprises thirteen activities, including physical activities, common postures, working activities and leisure activities. We apply a methodology known as the activity recognition chain, a sequence of steps involving preprocessing, segmentation, feature extraction and classification, for the traditional machine learning methods; we also tested convolutional deep learning networks that operate on raw data instead of computed features. The results show that combining the two sensors does not necessarily improve accuracy. We determined that the best results are obtained by the extremely randomized trees approach, operating on precomputed features and on data obtained from the wrist sensor. The deep learning architectures did not produce competitive results with the tested architecture.
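A minimal scikit-learn sketch of the winning configuration reported above, extremely randomized trees on precomputed features; the synthetic matrix merely stands in for the wrist-sensor feature table.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 60))     # stand-in for precomputed wrist-sensor features
y = rng.integers(0, 13, size=500)  # thirteen activity classes, as in the benchmark

clf = ExtraTreesClassifier(n_estimators=300, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```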
Article
Full-text available
Providing accurate information about human activity is an important task in a smart city environment. Human activity is complex, and it is important to use the best technology and to benefit from machine learning to learn about human activity. Although there has been interest in recording human activities over the past decade, major aspects remain to be addressed in order to take advantage of technology in understanding human activity. In this paper, an AdaBoost ensemble classifier is used to recognize human activity data taken from body sensors. Ensemble classifiers achieve better performance by using a weighted combination of several classifier models. Many researchers have shown the efficiency of ensemble classifiers in different real-world problems. Experimental results have shown the feasibility of AdaBoost ensemble classifiers, which achieve better performance for automated human activity recognition using human body sensors. The results show that ensemble classifiers based on the AdaBoost algorithm significantly improve the performance of automated human activity recognition (HAR).
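A comparable sketch for the AdaBoost ensemble described above; scikit-learn's default decision-stump base learner is used, since the abstract does not name the weak classifier.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))    # stand-in for body-sensor features
y = rng.integers(0, 6, size=500)  # six activity labels

# AdaBoost fits weak learners sequentially and combines them with learned weights
clf = AdaBoostClassifier(n_estimators=100, random_state=1).fit(X, y)
print(clf.score(X, y))
```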
Article
Full-text available
Although there have been many studies in the field of Human Activity Recognition, the relationship between what we do and where we do it has been little explored. The objective of this paper is to propose a machine learning-based approach to address the challenge of the 1st UCAmI Cup, which is the recognition of 24 activities of daily living using a dataset that allows this relationship to be explored, since it contains data collected from four sources: binary sensors, an intelligent floor, and proximity and acceleration sensors. The CRISP-DM methodology for data mining projects was followed in this work. A Java desktop application was developed to perform the synchronization and classification tasks. As a result, the accuracy achieved in classifying the 24 activities using 10-fold cross-validation on the training dataset was 92.1%, but an accuracy of 60.1% was obtained on the test dataset. The low classification accuracy might be caused by the class imbalance of the training dataset; therefore, more labeled data are necessary for training the algorithm. Although we could not obtain an optimal result, it is possible to iterate on the methodology to look for ways to improve the obtained results.
Article
Full-text available
Purpose: Currently, ubiquitous smartphones embedded with various sensors provide a convenient way to collect raw sequence data. These data bridge the gap between human activity and multiple sensors. Human activity recognition has been widely used in many aspects of our daily life, such as medical security, personal safety and living assistance. Design/methodology/approach: To provide an overview, the authors survey and summarize some important technologies and key issues of human activity recognition, including activity categorization, feature engineering and typical algorithms presented in recent years. The authors first introduce the characteristics of embedded sensors and discuss their features, and survey some data labeling strategies for obtaining ground truth labels. Then, following the process of human activity recognition, they discuss the methods and techniques of raw data preprocessing and feature extraction, and summarize some popular algorithms used in model training and activity recognition. Third, they introduce some interesting application scenarios of human activity recognition and provide some available data sets as ground truth data to validate proposed algorithms. Findings: The authors summarize their viewpoints on human activity recognition, discuss the main challenges and point out some potential research directions. Originality/value: It is hoped that this work will serve as a stepping stone for those interested in advancing human activity recognition.
Article
Full-text available
The research area of Ambient Assisted Living (AAL) has led to the development of Activity Recognition Systems (ARS) based on Human Activity Recognition (HAR). These systems improve the quality of life and the health care of the elderly and dependent people. However, before making them available to end users, it is necessary to evaluate their performance in recognising Activities of Daily Living (ADL), using dataset benchmarks in experimental scenarios. For that reason, the scientific community has developed and provided a large number of datasets for HAR. Therefore, identifying which ones to use in the evaluation process and which techniques are the most appropriate for prediction of HAR in a specific context is not a trivial task and is key to further progress in this area of research. This work presents a Systematic Review of Literature (SRL) of the sensor-based datasets used to evaluate ARS. On the one hand, an analysis of different variables taken from indexed publications related to this field was performed. The sources of information are journals, proceedings and books located in specialised databases. The analysed variables characterise publications by year, database, type, quartile, country of origin and destination, using scientometrics, which allowed identification of the dataset most used by researchers. On the other hand, descriptive and functional variables were analysed for each of the identified datasets: occupation, annotation, approach, segmentation, representation, feature selection, balancing and addition of instances, and classifier used for recognition. This paper provides an analysis of the sensor-based datasets used in HAR to date, identifying the most appropriate dataset for evaluating ARS and the classification techniques that generate better results.
Article
Full-text available
Medical informatics encompasses a huge amount of medical resources to enhance the storage, retrieval and use of these resources in healthcare. Advances have been made in monitoring the health of patients and providing the details to caretakers who may be in remote areas. This can be done in real time with the help of internet access. Because the patient is monitored in real time, the caretaker can provide suggestions regarding the essential signs of the patient's body condition through a video conference. In this paper, we propose a system to report the progress of the elderly in an appropriate manner with the help of healthcare technology, and to deliver the progress report to remote caretakers using smartphones and video. Through this method, abnormalities can be identified at early stages so that doctors can treat them without difficulty. This can improve the physical and mental health of patients. The system requires certain very low-cost sensors, some electronic devices, a smartphone for communication and a WSN.
Article
Full-text available
Human activity recognition systems are developed as part of a framework to enable continuous monitoring of human behaviours in the area of ambient assisted living, sports injury detection, elderly care, rehabilitation, and entertainment and surveillance in smart home environments. The extraction of relevant features is the most challenging part of the mobile and wearable sensor-based human activity recognition pipeline. Feature extraction influences the algorithm performance and reduces computation time and complexity. However, current human activity recognition relies on handcrafted features that are incapable of handling complex activities especially with the current influx of multimodal and high dimensional sensor data. With the emergence of deep learning and increased computation powers, deep learning and artificial intelligence methods are being adopted for automatic feature learning in diverse areas like health, image classification, and recently, for feature extraction and classification of simple and complex human activity recognition in mobile and wearable sensors. Furthermore, the fusion of mobile or wearable sensors and deep learning methods for feature learning provide diversity, offers higher generalisation, and tackles challenging issues in human activity recognition. The focus of this review is to provide in-depth summaries of deep learning methods for mobile and wearable sensor-based human activity recognition. The review presents the methods, uniqueness, advantages and their limitations. We not only categorise the studies into generative, discriminative and hybrid methods but also highlight their important advantages. Furthermore, the review presents classification and evaluation procedures and discusses publicly available datasets for mobile sensor human activity recognition. Finally, we outline and explain some challenges to open research problems that require further research and improvements.
Article
Full-text available
Human action recognition is a fundamental challenge in robotics systems. In this paper, we propose a lightweight action recognition architecture based on deep neural networks (DNNs) using only RGB data. The proposed architecture consists of a convolutional neural network (CNN), long short-term memory (LSTM) units, and a temporal-wise attention model. First, the CNN is used to extract spatial features to distinguish objects from the background with both local and semantic characteristics. Second, two kinds of LSTM networks are applied to the spatial feature maps of different CNN layers (the pooling layer and the fully connected layer) to extract temporal motion features. Then, a temporal-wise attention model is designed after the LSTM to learn which parts of which frames are more important. Lastly, a joint optimization module is designed to explore the intrinsic relations between the two kinds of LSTM features. Experimental results demonstrate the efficiency of the proposed method.
Conference Paper
Full-text available
Human activity recognition using smart home sensors is one of the bases of ubiquitous computing in smart environments and a topic undergoing intense research in the field of ambient assisted living. The increasingly large number of data sets calls for machine learning methods. In this paper, we introduce a deep learning model that learns to classify human activities without using any prior knowledge. For this purpose, a Long Short Term Memory (LSTM) recurrent neural network was applied to three real-world smart home datasets. The results of these experiments show that the proposed approach outperforms the existing ones in terms of accuracy and performance.
Article
Full-text available
Sensor-based activity recognition seeks profound high-level knowledge about human activity from multitudes of low-level sensor readings. Conventional pattern recognition approaches have made tremendous progress in the past years. However, most of those approaches rely heavily on heuristic hand-crafted feature extraction, which dramatically hinders their generalization performance. Additionally, those methods often produce unsatisfactory results for unsupervised and incremental learning tasks. Meanwhile, the recent advancement of deep learning makes it possible to perform automatic high-level feature extraction and thus achieve promising performance in many areas. Since then, deep learning-based methods have been widely adopted for sensor-based activity recognition tasks. In this paper, we survey and highlight the recent advancement of deep learning approaches for sensor-based activity recognition. Specifically, we summarize the existing literature from three aspects: sensor modality, deep model and application. We also present a detailed discussion and propose grand challenges for future directions.
Article
Full-text available
Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders and is traditionally performed by a sleep expert who assigns a sleep stage to each 30 s of signal, based on the visual inspection of signals such as electroencephalograms (EEG), electrooculograms (EOG), electrocardiograms (ECG) and electromyograms (EMG). In this paper, we introduce the first end-to-end deep learning approach that performs automatic temporal sleep stage classification from multivariate and multimodal polysomnography (PSG) signals. We build a general deep architecture that can extract information from EEG, EOG and EMG channels and pools the learnt representations into a final softmax classifier. The architecture is light enough to be distributed in time in order to learn from the temporal context of each sample, namely the previous and following data segments. Our model, which is unique in its ability to learn a feature representation from multiple modalities, is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights into the spatio-temporal distribution of the signal of interest: a good trade-off for optimal classification performance measured with balanced accuracy is to use 6 EEG channels with some EOG and EMG channels. Also, exploiting one minute of data before and after each data segment to be classified offers the strongest improvement when a limited number of channels is available. Our approach aims to improve a key step in the study of sleep disorders. Like sleep experts, our system exploits the multivariate and multimodal character of PSG signals to deliver state-of-the-art classification performance at a very low complexity cost.
Article
Full-text available
Recently, deep learning (DL) methods have been introduced very successfully into human activity recognition (HAR) scenarios in ubiquitous and wearable computing. In particular, the prospect of overcoming the need for manual feature design, combined with superior classification capabilities, renders deep neural networks very attractive for real-life HAR applications. Even though DL-based approaches now outperform the state of the art in a number of recognition tasks in the field, substantial challenges remain. Most prominently, issues with real-life datasets, typically including imbalanced datasets and problematic data quality, still limit the effectiveness of activity recognition using wearables. In this paper we tackle such challenges through ensembles of deep Long Short Term Memory (LSTM) networks. We have developed modified training procedures for LSTM networks and combine sets of diverse LSTM learners into classifier collectives. We demonstrate, both formally and empirically, that ensembles of deep LSTM learners outperform individual LSTM networks. Through an extensive experimental evaluation on three standard benchmarks (Opportunity, PAMAP2, Skoda) we demonstrate the excellent recognition capabilities of our approach and its potential for real-life applications of human activity recognition.
Article
Full-text available
Sensor-based motion recognition integrates the emerging area of wearable sensors with novel machine learning techniques to make sense of low-level sensor data and provide rich contextual information in real-life applications. Although the Human Activity Recognition (HAR) problem has been drawing the attention of researchers, it is still a subject of much debate due to the diverse nature of human activities and their tracking methods. Finding the best predictive model for this problem while considering different sources of heterogeneity can be very difficult to analyze theoretically, which stresses the need for an experimental study. Therefore, in this paper, we first create the most complete dataset, focusing on accelerometer sensors, with various sources of heterogeneity. We then conduct an extensive analysis of feature representations and classification techniques (the most comprehensive comparison yet, with 293 classifiers) for activity recognition. Principal component analysis is applied to reduce the feature vector dimension while keeping essential information. The average classification accuracy of eight sensor positions is reported to be 96.44% ± 1.62% with 10-fold evaluation, whereas an accuracy of 79.92% ± 9.68% is reached in the subject-independent evaluation. This study presents significant evidence that we can build predictive models for the HAR problem under more realistic conditions, and still achieve highly accurate results.
Chapter
Full-text available
Mobile phones are no longer merely a luxury; they have become a significant need in a rapidly evolving, fast-track world. This paper proposes a spatial context recognition system in which certain types of human physical activities are recognized using accelerometer and gyroscope data generated by a mobile device, with a focus on reducing processing time. The benchmark Human Activity Recognition dataset considered for this work was acquired from the UCI Machine Learning Repository, which is available in the public domain. Our experiment shows that Principal Component Analysis, used for dimensionality reduction, extracts 70 principal components from the 561 features of the raw data while maintaining the most discriminative information. A Multi Layer Perceptron classifier was tested on the principal components. We found that the Multi Layer Perceptron reaches an overall accuracy of 96.17% with 70 principal components, compared to 98.11% with all 561 features, reducing the time taken to build a model from 658.53 s to 128.00 s.
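The dimensionality-reduction pipeline reported above, 561 features projected onto 70 principal components and then fed to a multilayer perceptron, maps onto a short scikit-learn sketch; the synthetic matrix stands in for the UCI HAR feature table.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 561))   # stand-in for the 561 UCI HAR features
y = rng.integers(0, 6, size=1000)  # six activity labels

model = make_pipeline(PCA(n_components=70), MLPClassifier(max_iter=300))
model.fit(X, y)
print(model.score(X, y))
```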
Article
Full-text available
Human activity recognition (HAR) tasks have traditionally been solved using engineered features obtained by heuristic processes. Current research suggests that deep convolutional neural networks are suited to automate feature extraction from raw sensor inputs. However, human activities are made of complex sequences of motor movements, and capturing this temporal dynamics is fundamental for successful HAR. Based on the recent success of recurrent neural networks for time series domains, we propose a generic deep framework for activity recognition based on convolutional and LSTM recurrent units, which: (i) is suitable for multimodal wearable sensors; (ii) can perform sensor fusion naturally; (iii) does not require expert knowledge in designing features; and (iv) explicitly models the temporal dynamics of feature activations. We evaluate our framework on two datasets, one of which has been used in a public activity recognition challenge. Our results show that our framework outperforms competing deep non-recurrent networks on the challenge dataset by 4% on average; outperforming some of the previous reported results by up to 9%. Our results show that the framework can be applied to homogeneous sensor modalities, but can also fuse multimodal sensors to improve performance. We characterise key architectural hyperparameters’ influence on performance to provide insights about their optimisation.
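A compact PyTorch sketch in the spirit of the convolutional-plus-LSTM framework described above; the layer sizes are illustrative, not the evaluated configuration.

```python
import torch
import torch.nn as nn

class ConvLSTMHAR(nn.Module):
    """Convolutional feature extractor followed by LSTM recurrent units."""
    def __init__(self, n_channels=9, n_classes=6, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.lstm = nn.LSTM(64, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                      # x: (batch, channels, time)
        f = self.conv(x)                       # (batch, 64, time)
        out, _ = self.lstm(f.transpose(1, 2))  # (batch, time, hidden)
        return self.head(out[:, -1])           # classify from the last step
```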
Article
Full-text available
Smartphone platforms, equipped with a rich set of sensors, enable mobile sensing applications that support users in both personal sensing and large-scale community sensing. In such mobile sensing applications, the position/placement of the phone relative to the user's body provides valuable context information. For example, in physical activity recognition using motion sensors, the position of the phone provides important information, since the sensors generate different signals when the phone is carried in different positions, and this makes it difficult to successfully identify the activities with sensor data coming from different positions. In this paper, we investigate whether it is possible to successfully identify phone positions using only accelerometer data, the most commonly used sensor data in physical activity recognition studies, rather than using additional sensors. Additionally, we explore how much this position information increases the activity recognition accuracy compared with position-independent activity recognition. For this purpose, we collected activity data from 15 participants carrying three phones in different positions, performing the activities of walking, running, sitting, standing, climbing up/down stairs, riding a bus, making a phone call, interacting with an application on the smartphone, and sending an SMS. The collected data is processed with the Random Forest classifier. According to the position recognition results, using the basic accelerometer features that are also used in activity recognition achieves an accuracy of 77.34%; however, this ratio increases to 85% when the basic features are combined with angular features calculated from the orientation of the phone. According to the activity recognition experiments, on average the results are similar for position-specific and position-independent recognition. Only for the pocket case was a 2% increase observed.
Article
Full-text available
While the potential benefits of smart home technology are widely recognized, a lightweight design is needed for the benefits to be realized at a large scale. We introduce the CASAS "smart home in a box", a lightweight smart home design that is easy to install and provides smart home capabilities out of the box with no customization or training. We discuss types of data analysis that have been performed by the CASAS group and can be pursued in the future by using this approach to designing and implementing smart home technologies.
Article
Full-text available
Recognizing direct relationships between variables connected in a network is a pervasive problem in biological, social and information sciences as correlation-based networks contain numerous indirect relationships. Here we present a general method for inferring direct effects from an observed correlation matrix containing both direct and indirect effects. We formulate the problem as the inverse of network convolution, and introduce an algorithm that removes the combined effect of all indirect paths of arbitrary length in a closed-form solution by exploiting eigen-decomposition and infinite-series sums. We demonstrate the effectiveness of our approach in several network applications: distinguishing direct targets in gene expression regulatory networks; recognizing directly interacting amino-acid residues for protein structure prediction from sequence alignments; and distinguishing strong collaborations in co-authorship social networks using connectivity information alone. In addition to its theoretical impact as a foundational graph theoretic tool, our results suggest network deconvolution is widely applicable for computing direct dependencies in network science across diverse disciplines.
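The closed-form solution referenced here is usually written as G_dir = G_obs (I + G_obs)^{-1}; under an eigen-decomposition this reduces to mapping each eigenvalue λ to λ/(1 + λ). The sketch below assumes a diagonalizable (e.g. symmetric correlation) matrix whose eigenvalues have been scaled so the series of indirect paths converges.

```python
import numpy as np

def network_deconvolution(G_obs):
    """Remove indirect effects: G_dir = G_obs @ inv(I + G_obs), computed by
    scaling each eigenvalue lam of G_obs to lam / (1 + lam)."""
    vals, vecs = np.linalg.eig(G_obs)
    vals_dir = vals / (1.0 + vals)
    return (vecs @ np.diag(vals_dir) @ np.linalg.inv(vecs)).real
```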
Chapter
The reduction in the size of convolution filters has been shown to be effective in image classification models. Smaller filters reduce the computation and the number of parameters used in the convolution-layer operations while increasing the efficiency of the representation. The authors present a deep architecture for classification with improved performance. The main objective of this architecture is to improve the overall performance of the network through a new design based on CONVblock. The proposal is evaluated on two classification databases, CIFAR-10 and MNIST. The experimental results demonstrate the effectiveness of the proposed method. This architecture achieves an error of 1.4% on CIFAR-10 and 0.055% on MNIST.
Article
Successfully employing ultrasonic testing to distinguish a flaw in close proximity to another flaw or geometrical feature depends on the wavelength and the bandwidth of the ultrasonic transducer. This explains why the frequency is commonly increased in ultrasonic testing in order to improve the axial resolution. However, as the frequency increases, the penetration depth of the propagating ultrasonic waves is reduced due to an attendant increase in attenuation. The nondestructive testing research community is consequently very interested in finding methods that combine high penetration depth with high axial resolution. This work aims to improve the compromise between the penetration depth and the axial resolution by using a convolutional neural network to separate overlapping echoes in time traces in order to estimate the time-of-flight and amplitude. The originality of the proposed framework consists in its training of the neural network using data generated in simulations. The framework was validated experimentally to detect flat bottom holes in an aluminum block with a minimum depth corresponding to λ/4.
Article
Purpose: The paper aims to develop a novel method for the classification of different physical activities of a human being, using fabric sensors. This method focuses mainly on classifying physical activity as either normal action or a violent attack on a victim, and verifies its validity. Design/methodology/approach: The system is realized as a protective jacket that can be worn by the subject. Stretch sensors, pressure sensors and a 9-degrees-of-freedom accelerometer are strategically woven into the jacket. The jacket has an internal bus system made of conductive fabric that connects the sensors to the Flora chip, which acts as the data acquisition unit for the data generated. Different activities such as staying still, standing up, walking, twist-jump-turn, dancing and violent action are performed. The jacket in this study is worn by a healthy subject. The main phases of the activity recognition method undertaken in this study are the placement of sensors, the pre-processing of data and the deployment of machine learning models for classification. Findings: The effectiveness of the method was validated in a controlled environment. Certain challenges were also faced in building the experimental setup for collecting data from the hardware. The most tedious challenge is collecting the data without the noise and error created by voltage fluctuations when the sensors are stretched. The results show that the support vector machine classifier can classify different activities and is able to differentiate normal actions from violent attacks with an accuracy of 98.8%, which is superior to other methods and algorithms. Practical implications: This study leads to an understanding of human physical movement under violent activity. The results show that, compared with normal physical motion, which even includes a form of dance, the data collected during violent physical motion is quite different. This jacket construction with woven sensors can capture every dimension of the physical motion, adding features to the data on which the machine learning model is built. Originality/value: Unlike other studies, where sensors are placed on isolated parts of the body, in this study the fabric sensors are woven into the fabric itself to collect the data and to achieve maximum accuracy, instead of using isolated wearable sensors. This method, together with fabric pressure and stretch sensors, can provide key data and accurate feedback information when the victim is being attacked or is in a normal state of action.
Article
Change detection (CD) is the process of identifying dissimilarities between two or more co-registered multitemporal images. In this paper, we introduce an α-cut induced fuzzy layer into the deep neural network (αFDNN). Deep neural networks for change detection normally rely on the pre-classified labels produced by clustering, but these pre-classified labels are coarse and ambiguous and cannot highlight the changed information accurately. This challenge can be addressed by encapsulating local information and fuzzy logic in the deep neural network, which has the advantage of enhancing the changed information and reducing the effect of speckle noise. As the first step in change detection, a fused difference image is generated from the mean-ratio and log-ratio images using the Stationary Wavelet Transform (SWT). This not only eliminates the impact of speckle noise but also identifies the trend of change well, thanks to the shift-invariance property. Pseudo-classification is performed as the next step using Fuzzy C-Means (FCM) clustering. Then, we apply the reformulated α-cut induced fuzzy deep neural network to generate the final change map, which yields a final representation of the data more suitable for classification and clustering. This also results in a noteworthy improvement in the change detection result. The efficacy of the algorithm is analyzed through a parameter study. Experimental results on three Synthetic Aperture Radar (SAR) datasets demonstrate the superior performance of the proposed method compared to state-of-the-art change detection methods.
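As a reference point for the α-cut operation that gives the αFDNN layer its name, the cut simply thresholds fuzzy membership grades into a crisp support; the threshold value and the use of FCM memberships below are generic, not the paper's settings.

```python
import numpy as np

def alpha_cut(memberships, alpha=0.5):
    """Crisp support of a fuzzy membership map: keep grades >= alpha."""
    return (np.asarray(memberships) >= alpha).astype(float)

# e.g. FCM "changed" membership grades for five pixels
print(alpha_cut([0.1, 0.45, 0.5, 0.8, 0.95], alpha=0.5))  # [0. 0. 1. 1. 1.]
```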
Article
Deep neural networks are a class of powerful machine learning models that use successive layers of non-linear processing units to extract features from data. However, the training process of such networks is computationally intensive and relies on commonly used optimization methods that do not guarantee optimum performance. Furthermore, deep learning methods are often sensitive to noise in the data and do not operate well where data are incomplete. An alternative, yet little explored, method for enhancing deep learning performance is the use of fuzzy systems. Fuzzy systems have previously been used in conjunction with neural networks. This survey explores the different ways in which deep learning is improved with fuzzy logic systems. The techniques are classified based on how the two paradigms are combined. Finally, real-life applications of the models are also explored.
Article
This paper presents a novel deep learning framework for the inter-patient electrocardiogram (ECG) heartbeat classification. A symbolization approach especially designed for ECG is introduced, which can jointly represent the morphology and rhythm of the heartbeat and alleviate the influence of inter-patient variation through baseline correction. The symbolic representation of the heartbeat is used by a multi-perspective convolutional neural network (MPCNN) to learn features automatically and classify the heartbeat. We evaluate our method for the detection of the supraventricular ectopic beat (SVEB) and ventricular ectopic beat (VEB) on MIT-BIH arrhythmia dataset. Compared with the state-of-the-art methods based on manual features or deep learning models, our method shows superior performance: the overall accuracy of 96.4%, F1 scores for SVEB and VEB of 76.6% and 89.7%, respectively. The ablation study on our method validates the effectiveness of the proposed symbolization approach and joint representation architecture, which can help the deep learning model to learn more general features and improve the ability of generalization for unseen patients. Because our method achieves a competitive inter-patient heartbeat classification performance without complex handcrafted features or the intervention of the human expert, it can also be adjusted to handle various other tasks relative to ECG classification.
Article
For artificial intelligence (AI) to effectively mimic humans, understanding humans, and more specifically human emotion, is important. Sentiment analysis aims to automatically uncover the underlying sentiment or emotions that humans hold towards an entity. Emotion in text data is highly ambiguous. In this paper, we consider the sentence-level sentiment classification task and propose a novel type of convolutional neural network combined with fuzzy logic, called the Fuzzy Convolutional Neural Network (FCNN), along with its associated learning algorithm. The new model integrates a modified convolutional neural network (CNN) into the fuzzy logic domain. The proposed model benefits from the use of fuzzy membership degrees to produce more refined outputs, thereby reducing the ambiguities in the emotional aspects of sentiment classification. It also benefits from extracting high-level emotional features through its convolutional representation. We compare the performance of our proposed approach with a conventional CNN for sentiment classification. The experimental results indicate that the proposed FCNN outperforms the conventional methods for the sentiment classification task.
Article
Smart user devices are becoming increasingly ubiquitous and useful for detecting the user’s context and his/her current activity. This work analyzes and proposes several techniques to improve the robustness of a Human Activity Recognition (HAR) system that uses accelerometer signals from different smartwatches and smartphones. This analysis reveals some of the challenges associated with both device heterogeneity and the different use of smartwatches compared to smartphones. When using smartwatches to recognize whole body activities, the arm movements introduce additional variability giving rise to a significant degradation in HAR. In this analysis, we describe and evaluate several techniques which successfully address these challenges when using smartwatches and when training and testing with different devices and/or users.
Article
Electronic health records (EHRs) contain critical information useful for clinical studies, and early assessment of patients' mortality in intensive care units is of great importance. In this paper, a Deep Rule-Based Fuzzy System (DRBFS) is proposed to develop an accurate in-hospital mortality prediction for intensive care unit (ICU) patients employing a large number of input variables. Our main contribution is a system capable of dealing with big data containing heterogeneous mixed categorical and numeric attributes. In DRBFS, the hidden layer in each unit is represented by interpretable fuzzy rules. Benefiting from the strength of soft partitioning, a modified supervised fuzzy k-prototype clustering has been employed for fuzzy rule generation. Following the stacked approach, the same input space is kept in every base building unit of DRBFS. The training set, in addition to random shifts obtained from random projections of the prediction results of the current base building unit, is presented as the input of the next base building unit. A cohort of 10,972 adult admissions was selected from the Medical Information Mart for Intensive Care (MIMIC-III) data set, in which 9.31% of patients died in the hospital. A heterogeneous feature set covering the first 48 hours of ICU admission was extracted for in-hospital mortality prediction. The required preprocessing and appropriate feature extraction were applied. To avoid biased assessments, performance indexes were calculated using holdout validation. We evaluated our proposed method against several common classifiers, including naïve Bayes (NB), decision trees (DT), Gradient Boosting (GB), Deep Belief Networks (DBN) and D-TSK-FC. The area under the receiver operating characteristic curve (AUROC) for NB, DT, GB, DBN, D-TSK-FC and our proposed method was 73.51%, 61.81%, 72.98%, 70.07%, 66.74% and 73.90%, respectively. Our results demonstrate that DRBFS outperforms various methods while maintaining interpretable rule bases. Besides, benefiting from specific clustering methods, DRBFS scales well to large heterogeneous data sets.
Article
With the widespread adoption of various sensors embedded in mobile devices, the analysis of human daily activities has become more common and straightforward. This task now arises in a range of applications such as healthcare monitoring, fitness tracking and user-adaptive systems, where a general model capable of instantaneous activity recognition for an arbitrary user is needed. In this paper, we present a user-independent deep learning-based approach for online human activity classification. We propose using Convolutional Neural Networks for local feature extraction together with simple statistical features that preserve information about the global form of the time series. Furthermore, we investigate the impact of time-series length on recognition accuracy and limit it to 1 s, which makes continuous real-time activity classification possible. The accuracy of the proposed approach is evaluated on the commonly used WISDM and UCI datasets, which contain labeled accelerometer data from 36 and 30 users respectively, and in a cross-dataset experiment. The results show that the proposed model demonstrates state-of-the-art performance while requiring low computational cost and no manual feature engineering.
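A minimal PyTorch sketch of this hybrid design, combining convolutional local features with global statistics of the raw window; all layer sizes and the 50 Hz window below are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ConvPlusStats(nn.Module):
    """Sketch of the idea above: convolutional local features concatenated
    with simple global statistics of the raw window. All sizes are assumed."""
    def __init__(self, n_channels=3, n_classes=6):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        # 32 conv features + mean and std per raw channel
        self.head = nn.Linear(32 + 2 * n_channels, n_classes)

    def forward(self, x):                    # x: (batch, channels, time)
        local = self.conv(x).squeeze(-1)     # (batch, 32)
        stats = torch.cat([x.mean(-1), x.std(-1)], dim=1)  # global form of the series
        return self.head(torch.cat([local, stats], dim=1))

logits = ConvPlusStats()(torch.randn(8, 3, 50))  # 1 s window at 50 Hz (assumed)
```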
Article
Purpose In sensor-based activity recognition, most previous studies have focused on single activities such as body posture, ambulation and simple daily activities; few works have analyzed complex concurrent activities. The purpose of this paper is to classify them using a statistical modeling approach. Design/methodology/approach In this study, the recognition problem of concurrent activities is explored within the framework of a parallel hidden Markov model (PHMM), where two basic HMMs model the upper limb movements and lower limb states, respectively. Statistical time-domain and frequency-domain features are extracted and then processed by the principal component analysis method for classification. To recognize specific concurrent activities, PHMM merges the information from both channels (by combining probabilities) to make the final decision. Findings Four studies are investigated to validate the effectiveness of the proposed method. The results show that PHMM can classify 12 daily concurrent activities with an average recognition rate of 93.2 per cent, which is superior to a regular HMM and several single-frame classification approaches. Originality/value A statistical modeling approach based on PHMM is investigated and proves effective in concurrent activity recognition. This may provide more accurate feedback on people's behaviors. Practical implications The research may be significant in the field of pervasive healthcare, supporting a variety of practical applications such as elderly care, ambient assisted living and remote monitoring.
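The merging step can be illustrated in a few lines: if the two channels are assumed conditionally independent given the activity, their per-class log-likelihoods (placeholder numbers here, e.g. obtained from each HMM's forward algorithm) combine additively:

```python
import numpy as np

# Hypothetical per-class log-likelihoods from the two HMM channels
# (upper-limb model and lower-limb model), e.g. via the forward algorithm.
log_p_upper = np.log(np.array([0.10, 0.60, 0.30]))
log_p_lower = np.log(np.array([0.20, 0.50, 0.30]))

# Merge by combining probabilities: under a channel-independence assumption,
# the joint log-likelihood is the sum of the per-channel log-likelihoods.
joint = log_p_upper + log_p_lower
predicted_activity = int(np.argmax(joint))
print(predicted_activity)  # index of the most likely concurrent activity
```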
Article
Background: Drowsiness is one of the major factors that cause crashes in the transportation industry. Drowsiness detection systems can alert drowsy operators and potentially reduce the risk of crashes. In this study, a Google-Glass-based drowsiness detection system was developed and validated. Methods: The proximity sensor of Google Glass was used to monitor eye blink frequency. A simulated driving study was carried out to validate the system. Driving performance and eye blinks were compared between the alert and drowsy states while driving. Results: Drowsy drivers blinked more frequently, produced longer braking response times and showed increased lane deviation compared to when they were alert. A threshold algorithm applied to the proximity sensor signal reliably detected eye blinks, proving the feasibility of using Google Glass to detect operator drowsiness. Applications: This technology provides a new platform for detecting operator drowsiness and has the potential to reduce drowsiness-related crashes in driving and aviation.
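A threshold-based blink detector of this kind is straightforward to sketch; the threshold value and the synthetic signal below are illustrative, not the study's calibration:

```python
import numpy as np

def count_blinks(proximity, threshold=2.5):
    """Count eye blinks as upward crossings of a proximity threshold.
    The threshold value here is illustrative, not the study's calibration."""
    above = proximity > threshold
    # A blink is counted each time the signal rises through the threshold.
    return int(np.sum(above[1:] & ~above[:-1]))

signal = np.array([1.0, 1.2, 3.1, 3.0, 1.1, 1.0, 2.9, 1.2])  # synthetic samples
print(count_blinks(signal))  # 2 blinks
```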
Conference Paper
Mobile sensing and computing applications usually require time-series inputs from sensors, such as accelerometers, gyroscopes, and magnetometers. Some applications, such as tracking, can use sensed acceleration and rate of rotation to calculate displacement based on physical system models. Other applications, such as activity recognition, extract manually designed features from sensor inputs for classification. Such applications face two challenges. On one hand, on-device sensor measurements are noisy. For many mobile applications, it is hard to find a distribution that exactly describes the noise in practice. Unfortunately, calculating target quantities based on physical system and noise models is only as accurate as the noise assumptions. Similarly, in classification applications, although manually designed features have proven to be effective, it is not always straightforward to find the most robust features to accommodate diverse sensor noise patterns and heterogeneous user behaviors. To this end, we propose DeepSense, a deep learning framework that directly addresses the aforementioned noise and feature customization challenges in a unified manner. DeepSense integrates convolutional and recurrent neural networks to exploit local interactions among similar mobile sensors, merge local interactions of different sensory modalities into global interactions, and extract temporal relationships to model signal dynamics. DeepSense thus provides a general signal estimation and classification framework that accommodates a wide range of applications. We demonstrate the effectiveness of DeepSense using three representative and challenging tasks: car tracking with motion sensors, heterogeneous human activity recognition, and user identification with biometric motion analysis. DeepSense significantly outperforms the state-of-the-art methods for all three tasks. In addition, we show that DeepSense is feasible to implement on smartphones and embedded devices thanks to its moderate energy consumption and low latency.
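A minimal sketch of the convolutional-then-recurrent pattern described above, with per-interval convolutions feeding a GRU; layer sizes and window layout are assumptions, not the published DeepSense architecture:

```python
import torch
import torch.nn as nn

class ConvRecurrent(nn.Module):
    """Minimal sketch of the conv-then-recurrent pattern: per-interval
    convolutions extract local interactions, a GRU models signal dynamics.
    Layer sizes are assumptions, not the published architecture."""
    def __init__(self, n_channels=6, n_classes=6):
        super().__init__()
        self.local = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.temporal = nn.GRU(64, 128, batch_first=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, x):          # x: (batch, intervals, channels, samples)
        b, t, c, s = x.shape
        z = self.local(x.reshape(b * t, c, s)).reshape(b, t, 64)
        _, h = self.temporal(z)    # h: (1, batch, 128)
        return self.head(h[0])

logits = ConvRecurrent()(torch.randn(4, 10, 6, 25))  # 10 intervals of 25 samples
```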
Conference Paper
Human physical activity recognition based on wearable sensors has applications relevant to our daily life, such as healthcare. How to achieve high recognition accuracy at low computational cost is an important issue in ubiquitous computing. Rather than exploring handcrafted features from time-series sensor signals, we assemble signal sequences of accelerometers and gyroscopes into a novel activity image, which enables Deep Convolutional Neural Networks (DCNN) to automatically learn the optimal features from the activity image for the activity recognition task. Our proposed approach is evaluated on three public datasets and outperforms the state of the art in terms of recognition accuracy and computational cost.
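The activity-image idea can be sketched as a simple row-stacking of the sensor axes; the published construction arranges the signal rows more carefully, so this is only a simplified illustration:

```python
import numpy as np

def make_activity_image(acc, gyro):
    """Stack tri-axial accelerometer and gyroscope windows row-wise into a
    2-D array that a 2-D CNN can consume. This is a simplified sketch; the
    published construction arranges the signal rows more carefully."""
    # acc, gyro: (3, T) windows of the same length T
    image = np.vstack([acc, gyro]).astype(np.float32)   # (6, T)
    return image[None, :, :]                            # add a channel axis: (1, 6, T)

img = make_activity_image(np.random.randn(3, 128), np.random.randn(3, 128))
print(img.shape)  # (1, 6, 128)
```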
Article
Deep learning (DL) is an emerging and powerful paradigm that allows large-scale task-driven feature learning from big data. However, typical DL is a fully deterministic model that sheds no light on reducing data uncertainty. In this paper, we show how to introduce concepts from fuzzy learning into DL to overcome the shortcomings of fixed representations. The bulk of the proposed fuzzy system is a hierarchical deep neural network that derives information from both fuzzy and neural representations. The knowledge learnt from these two views is then fused to form the final data representation to be classified. The effectiveness of the model is verified on three practical tasks, image categorization, high-frequency financial data prediction and brain MRI segmentation, all of which contain a high level of uncertainty in the raw data. The fuzzy deep learning paradigm greatly outperforms other non-fuzzy and shallow learning approaches on these tasks.
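As a rough sketch of fusing the two views, one can concatenate a neural representation with fuzzy membership degrees of the raw features; concatenation as the fusion operator and the Gaussian memberships below are assumptions, not the paper's exact fusion layer:

```python
import numpy as np

def fuzzify(x, centers, sigma=1.0):
    """Fuzzy view: membership degrees of each feature w.r.t. a few prototypes."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * sigma ** 2)).ravel()

def fuse(neural_repr, raw_features, centers):
    """Fuse the neural view with the fuzzy view by concatenation (assumed)."""
    return np.concatenate([neural_repr, fuzzify(raw_features, centers)])

fused = fuse(np.random.randn(16), np.random.randn(4), np.array([-1.0, 0.0, 1.0]))
print(fused.shape)  # (16 + 4*3,) = (28,)
```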
Article
We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
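The reparameterization itself is a one-liner, w = g · v / ‖v‖, as in this sketch:

```python
import numpy as np

def weight_norm_forward(v, g, x):
    """Weight normalization: w = g * v / ||v||, decoupling the weight
    vector's length (g) from its direction (v / ||v||)."""
    w = g * v / np.linalg.norm(v)
    return w @ x

v = np.array([3.0, 4.0])      # direction parameter (||v|| = 5)
g = 2.0                       # length parameter, learned independently
x = np.array([1.0, 1.0])
print(weight_norm_forward(v, g, x))  # 2 * (3 + 4) / 5 = 2.8
```

PyTorch ships this reparameterization as torch.nn.utils.weight_norm, so the manual version above is rarely needed in practice.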
Article
Human activities are inherently translation invariant and hierarchical. Human activity recognition (HAR), a field that has garnered much attention in recent years due to its high demand in various application domains, uses time-series sensor data to infer activities. In this paper, a deep convolutional neural network (convnet) is proposed to perform efficient and effective HAR using smartphone sensors by exploiting the inherent characteristics of activities and 1D time-series signals, while providing a way to automatically and data-adaptively extract robust features from raw data. Experiments show that convnets indeed derive relevant and more complex features with every additional layer, although the difference in feature complexity decreases with each additional layer. Exploiting a wider time span of local temporal correlation (kernel sizes of 1 × 9 to 1 × 14) and a small pooling size (1 × 2 to 1 × 3) are shown to be beneficial. Convnets also achieved an almost perfect classification of moving activities, especially very similar ones previously perceived to be very difficult to classify. Lastly, convnets outperform other state-of-the-art data mining techniques in HAR on the benchmark dataset collected from 30 volunteer subjects, achieving an overall performance of 94.79% on the test set with raw sensor data, and 95.75% with additional temporal fast Fourier transform information from the HAR data set.
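A sketch of a 1-D convnet in the spirit of these findings, using a 1 × 9 kernel and 1 × 2 max pooling; the depths, widths, and the 9-channel, 128-sample window are assumptions:

```python
import torch
import torch.nn as nn

# Two conv blocks with a wide (size-9) kernel and small (size-2) pooling,
# mirroring the kernel/pooling ranges reported above; sizes are assumed.
model = nn.Sequential(
    nn.Conv1d(9, 64, kernel_size=9, padding=4), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(64, 64, kernel_size=9, padding=4), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Flatten(),
    nn.Linear(64 * 32, 6),     # 128-sample window -> 32 after two poolings
)
logits = model(torch.randn(8, 9, 128))  # 9 sensor channels, 128 samples
```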
Conference Paper
Recognizing human activities from temporal streams of sensory observations is a very important task in a wide variety of context recognition applications. Especially for time-series sensory data, a method that takes into account the inherent sequential characteristics of the data is needed. Moreover, activities are hierarchical in nature, inasmuch as complex activities can be decomposed into a number of simpler ones. In this paper, we propose a two-stage continuous hidden Markov model (CHMM) approach to activity recognition using accelerometer and gyroscope data gathered from a smartphone. The proposed method consists of first-level CHMMs for coarse classification, which separate stationary and moving activities, and second-level CHMMs for fine classification, which assign the data to their corresponding activity classes. Random Forest (RF) variable importance measures are exploited to determine the optimal feature subsets for both coarse and fine classification. Experiments show that, with a significantly reduced number of features, the proposed method shows competitive performance in comparison to other classification algorithms, achieving an overall accuracy of 91.76%.
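The two-stage decision logic can be sketched independently of the underlying CHMMs; the three callables below are placeholders for the trained per-stage models, and the activity labels are illustrative:

```python
import numpy as np

STATIONARY = ["sitting", "standing", "lying"]
MOVING = ["walking", "upstairs", "downstairs"]

def two_stage_predict(features, coarse, fine_stationary, fine_moving):
    """Two-stage decision: a coarse model first separates stationary from
    moving, then the matching fine model picks the activity class."""
    if coarse(features) == "stationary":
        return STATIONARY[fine_stationary(features)]
    return MOVING[fine_moving(features)]

# Illustrative stand-ins for trained models:
pred = two_stage_predict(
    np.zeros(10),
    coarse=lambda f: "stationary",
    fine_stationary=lambda f: 1,
    fine_moving=lambda f: 0,
)
print(pred)  # "standing"
```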
Article
This paper reports a novel deep architecture referred to as Maxout network In Network (MIN), which can enhance model discriminability and facilitate the process of information abstraction within the receptive field. The proposed network adopts the framework of the recently developed Network In Network structure, which slides a universal approximator, a multilayer perceptron (MLP) with rectifier units, to extract features. Instead of this MLP, we employ a maxout MLP to learn a variety of piecewise linear activation functions and to mitigate the vanishing-gradient problem that can occur when using rectifier units. Moreover, batch normalization is applied to reduce the saturation of maxout units by pre-conditioning the model, and dropout is applied to prevent overfitting. Finally, average pooling is used in all pooling layers to regularize the maxout MLP in order to facilitate information abstraction in every receptive field while tolerating changes of object position. Because average pooling preserves all features in the local patch, the proposed MIN model can enforce the suppression of irrelevant information during training. Our experiments demonstrate state-of-the-art classification performance when the MIN model is applied to the MNIST, CIFAR-10 and CIFAR-100 datasets, and comparable performance on the SVHN dataset.
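A maxout unit is easy to state precisely: k parallel linear maps whose element-wise maximum forms the output, which lets the network learn its own piecewise-linear activation. A minimal PyTorch sketch (sizes assumed):

```python
import torch
import torch.nn as nn

class Maxout(nn.Module):
    """Maxout unit: k parallel linear maps, output the element-wise maximum.
    This learns a piecewise-linear activation instead of a fixed rectifier."""
    def __init__(self, d_in, d_out, k=4):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out * k)
        self.d_out, self.k = d_out, k

    def forward(self, x):
        z = self.linear(x).view(*x.shape[:-1], self.d_out, self.k)
        return z.max(dim=-1).values

y = Maxout(32, 16)(torch.randn(8, 32))  # -> shape (8, 16)
```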
Article
Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization. It also acts as a regularizer, in some cases eliminating the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.9% top-5 validation error (and 4.8% test error), exceeding the accuracy of human raters.
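The training-mode forward pass of Batch Normalization is compact enough to state directly; this numpy sketch omits the running averages that replace the batch statistics at inference time:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch Normalization (training-mode forward pass): normalize each
    feature with the mini-batch mean/variance, then scale and shift."""
    mu = x.mean(axis=0)                      # per-feature mini-batch mean
    var = x.var(axis=0)                      # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # zero mean, unit variance
    return gamma * x_hat + beta              # learned scale and shift

x = np.random.randn(64, 10) * 3.0 + 5.0      # a poorly scaled mini-batch
y = batch_norm_forward(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))  # ~0 and ~1
```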
Conference Paper
Activity recognition (AR), or in other words context recognition, is an active area of research in the domain of pervasive and mobile computing, with direct applications to the life quality and health of users. Related studies aim to classify different daily human activities with high accuracy using various types of sensors. Having become a substantial part of our daily lives through their sensing capabilities, smartphones are now feasible platforms that enable people to use AR technologies without being obliged to wear extra devices. However, because users carry these devices at different positions, such as in a pocket or a bag, attaining accurate results with directly applied classification models becomes challenging. In this paper, we focus on the phone position uncertainty problem and compare the classification results of position-independent and position-dependent classification models.
Article
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and is based on adaptive estimates of lower-order moments of the gradients. The method is computationally efficient, has modest memory requirements and is well suited to problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The method exhibits invariance to diagonal rescaling of the gradients by adapting to the geometry of the objective function. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, by which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. We demonstrate that Adam works well in practice when compared experimentally to other stochastic optimization methods.
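The update rule is short enough to state in full; this numpy sketch applies it to a toy quadratic (the learning rate is raised above the paper's 1e-3 default purely so the toy example converges quickly):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: biased first/second moment estimates, bias
    correction, then a per-parameter scaled step."""
    m = b1 * m + (1 - b1) * grad            # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * grad ** 2       # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)               # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 101):                     # minimize ||theta - 1||^2
    grad = 2 * (theta - 1.0)
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta.round(3))                       # approaches the minimizer at 1.0
```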
Article
We first generalize the Perron-Frobenius results for nonnegative matrices to nonnegative fuzzy matrices, showing that such fuzzy matrices have a (dominant) nonnegative fuzzy eigenvalue. This result is then used to extend Leontief's closed input-output analysis to fuzzy economies and to obtain a fuzzy model of the world's economy.
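Since the λ max method of the present article builds on such dominant fuzzy eigenvalues, a small worked example may help. As an illustrative sketch only (not the paper's formal construction), the following iterates the max-min composition from the all-ones vector; the iteration decreases monotonically to a fixed point x* with A ∘ x* = x*, a fuzzy eigenvector of A:

```python
import numpy as np

def max_min_compose(A, x):
    """Max-min composition used in fuzzy matrix algebra:
    (A o x)_i = max_j min(A_ij, x_j)."""
    return np.max(np.minimum(A, x[None, :]), axis=1)

# A small nonnegative fuzzy matrix (entries in [0, 1]); values are illustrative.
A = np.array([[0.9, 0.4, 0.1],
              [0.3, 0.8, 0.5],
              [0.2, 0.6, 0.7]])

# Iterating from the all-ones vector decreases monotonically and reaches a
# fixed point x* with A o x* = x*, i.e. an eigenvector under max-min algebra.
x = np.ones(3)
for _ in range(10):
    x = max_min_compose(A, x)
print(x)  # fixed point [0.9, 0.8, 0.7]; its entries come from A's rows
```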