Conference Paper

Improving CNN-based activity recognition by data augmentation and transfer learning


Abstract

Activity classification is a challenging problem due to large signal dimensionality, high intra- and inter-subject variability in activity patterns, the presence of transitional classes showing a mixture of patterns, and the dominance of the null class. Supervised learning has been the prevalent choice, with deep neural networks (DNNs) showing promising potential. Deep learning, however, requires a large number of labeled samples, which are difficult to acquire, especially from vulnerable older people. In this paper we implement three different convolutional neural network architectures trained on data from older people, incorporating Bayesian optimization for efficient hyper-parameter tuning. We exploit various augmentation methods for time series to make invariant predictions and also cross-utilize knowledge about the physical activity of younger persons in order to improve generalization in our models designed for older adults.
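As a rough illustration of the hyper-parameter search described above, the sketch below tunes a single learning rate against a synthetic stand-in objective. For brevity it uses random search rather than full Bayesian optimization, and every name and value is illustrative, not taken from the paper:

```python
import numpy as np

def validation_error(lr):
    """Stand-in objective: pretend validation error is minimised near lr = 1e-3."""
    return (np.log10(lr) + 3.0) ** 2

rng = np.random.default_rng(0)
best_lr, best_err = None, np.inf
for _ in range(20):
    lr = 10 ** rng.uniform(-5, -1)   # sample the learning rate log-uniformly
    err = validation_error(lr)
    if err < best_err:
        best_lr, best_err = lr, err
```

A Bayesian optimizer would replace the uniform sampling with a surrogate model that proposes the next candidate from past evaluations; the outer evaluate-and-keep-best loop stays the same.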


... Because it is difficult to distinguish between foreground and background categories of accelerometer signals, it is difficult to apply the above signal-synthesizing methods to sensor-data augmentation. However, because accelerometer signals have characteristics of time-series data, augmentation methods [4,12,18,24,33] using signal processing (e.g., jittering, scaling, rotation, and random sampling) are widely used in the field of HAR. ...
... However, collecting a sufficient amount of human-activity data is difficult, because it is very time-consuming and laborious. Therefore, several studies have applied data augmentation techniques [4,12,18,24,33]. Doing so is essential for HAR tasks so that the quantity of training data will be sufficient to improve the activity recognition rate of deep-learning methods [4,12,18,24,33]. ...
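The signal-processing transformations named in these excerpts (jittering, scaling, rotation) can be sketched for a tri-axial accelerometer window roughly as follows; the window size and noise levels here are illustrative assumptions, not settings from any cited paper:

```python
import numpy as np

def jitter(x, sigma=0.05, rng=None):
    """Add zero-mean Gaussian noise to every sample."""
    rng = rng or np.random.default_rng()
    return x + rng.normal(0.0, sigma, size=x.shape)

def scale(x, sigma=0.1, rng=None):
    """Multiply each channel by a random factor close to 1."""
    rng = rng or np.random.default_rng()
    factors = rng.normal(1.0, sigma, size=(1, x.shape[1]))
    return x * factors

def rotate(x, rng=None):
    """Apply a random 3-D rotation, simulating a change in sensor orientation."""
    rng = rng or np.random.default_rng()
    # Rotation matrix from a random axis and angle (Rodrigues' formula).
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(-np.pi, np.pi)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    R = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return x @ R.T

# A toy window: 128 samples x 3 axes.
window = np.ones((128, 3))
augmented = rotate(scale(jitter(window)))
```

Each transform is label-preserving by design: the underlying activity is assumed invariant to small noise, amplitude changes, and sensor orientation.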
Article
Full-text available
Human activity recognition (HAR) using an accelerometer can provide valuable information for understanding user context. Therefore, several studies have been conducted using deep learning to increase the recognition rate of activity classification. However, the existing dataset that is publicly available for HAR tasks contains limited data. Previous works have applied data augmentation methods that simply transform the entire accelerometer-signal dataset. However, the label of the augmented signal cannot be easily recognized by humans, and the augmentation methods cannot ensure that the label of the signal is preserved. Therefore, we propose a novel data augmentation method that reflects the characteristics of the sensor signal and can preserve the label of the augmented signal by generating partially occluded data of the accelerometer signals. To generate the augmented data, we apply time-warping, which deforms the time-series data in the time direction. We handle jittering effects and subsequently apply data masking to drop out a part of the input signals. We compare the performance of the proposed augmentation method with that of conventional methods by using two public datasets and an activity recognition model based on convolutional neural networks. The experimental results show that the proposed augmentation method improves the recognition rate of the activity classification model, regardless of the dataset. Additionally, the proposed method shows superior performance over conventional methods on the two datasets.
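A minimal sketch of the two operations this abstract describes, time-warping followed by masking (partial occlusion) of the signal, could look like this in NumPy; the knot count, mask length, and toy signal are my assumptions, not the paper's settings:

```python
import numpy as np

def time_warp(x, sigma=0.2, knots=4, rng=None):
    """Warp the time axis with a smooth random speed curve and re-sample
    each channel by linear interpolation."""
    rng = rng or np.random.default_rng()
    n = x.shape[0]
    # Random speed profile at a few knots, interpolated to every time step.
    knot_pos = np.linspace(0, n - 1, knots + 2)
    speeds = np.clip(rng.normal(1.0, sigma, size=knots + 2), 0.1, None)
    speed = np.interp(np.arange(n), knot_pos, speeds)
    warped_t = np.cumsum(speed)
    warped_t = warped_t / warped_t[-1] * (n - 1)   # keep the original span
    return np.stack([np.interp(np.arange(n), warped_t, x[:, c])
                     for c in range(x.shape[1])], axis=1)

def mask(x, max_len=32, rng=None):
    """Zero out (occlude) one random contiguous segment."""
    rng = rng or np.random.default_rng()
    out = x.copy()
    length = rng.integers(1, max_len + 1)
    start = rng.integers(0, x.shape[0] - length + 1)
    out[start:start + length] = 0.0
    return out

# Toy tri-axial window: a sine wave repeated on 3 channels.
window = np.sin(np.linspace(0, 6 * np.pi, 128))[:, None].repeat(3, axis=1)
augmented = mask(time_warp(window))
```

The warp deforms the signal in the time direction while the mask drops out part of the input, mirroring the partial-occlusion idea in the abstract.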
... Different types of data augmentation have been used to enlarge the training dataset [12,20]: rotation, permutation, jittering and scaling performed for the original signal, or local averaging as a down-sampling technique and shuffling in the feature space [21]. However, the specific augmentation, as well as the optimization of the classification window length (epoch, time window, segment, observation) was performed in each study for its specific datasets. ...
... Transfer learning is a method that prepares a classification model for one dataset and uses this pretrained model as a base for training a model for another similar dataset. For example, training a pretrained model based on younger population groups and using it as the initial condition to train a model for older people, as was carried out by [19,20]. The additional training of pretrained models using data from specific objects and environments improves their accuracy and decreases the training time relative to newly trained models or existing models. ...
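The warm-start idea described here can be illustrated with a toy linear model (not the CNNs used in the cited studies): pre-train on a large "source" cohort, then fine-tune briefly on a small, distribution-shifted "target" cohort. All data and sizes below are synthetic:

```python
import numpy as np

def train_logreg(X, y, w=None, lr=0.5, epochs=200):
    """Gradient-descent logistic regression; `w` warm-starts the weights."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def log_loss(X, y, w):
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
true_w = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
# "Younger cohort": plenty of labelled data.
Xs = rng.normal(size=(500, 5)); ys = (Xs @ true_w > 0).astype(float)
# "Older cohort": few labelled samples, shifted input distribution.
Xt = rng.normal(loc=0.2, size=(30, 5)); yt = (Xt @ true_w > 0).astype(float)

w_source = train_logreg(Xs, ys)                           # pre-train on source
w_scratch = train_logreg(Xt, yt, epochs=10)               # few steps from scratch
w_transfer = train_logreg(Xt, yt, w=w_source, epochs=10)  # few steps, warm start
```

With the same training budget on the small target set, the warm-started model reaches a lower loss than the one trained from scratch, which is the effect the excerpt describes.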
Article
Full-text available
Due to technological developments, wearable sensors for monitoring the behavior of farm animals have become cheaper, have a longer lifespan and are more accessible for small farms and researchers. In addition, advancements in deep machine learning methods provide new opportunities for behavior recognition. However, the combination of the new electronics and algorithms is rarely used in precision livestock farming (PLF), and its possibilities and limitations are not well-studied. In this study, a CNN-based model for the feeding behavior classification of dairy cows was trained, and the training process was analyzed considering the training dataset and the use of transfer learning. Commercial acceleration measuring tags, connected by BLE, were fitted to cow collars in a research barn. Based on a dataset including 33.7 cow × days (21 cows recorded during 1–3 days) of labeled data and an additional free-access dataset with similar acceleration data, a classifier with F1 = 93.9% was developed. The optimal classification window size was 90 s. In addition, the influence of the training dataset size on the classifier accuracy was analyzed for different neural networks using the transfer learning technique. As the size of the training dataset increased, the rate of accuracy improvement decreased, and beyond a certain point the use of additional training data can be impractical. A relatively high accuracy was achieved with few training data when the classifier was trained using randomly initialized model weights, and a higher accuracy was achieved when transfer learning was used. These findings can be used to estimate the necessary dataset size for training neural network classifiers intended for other environments and conditions.
... Furthermore, we increase the number of input data instances before feeding them into our classifier. This data augmentation method is known to help avoid overfitting concerns caused by small sample datasets [31]. Since the number of collected instances in our experiments may not be enough, we doubled the size of our dataset by applying an augmentation method during training, but not testing. ...
... Many other HAR studies have been implemented with deep learning methods, such as convolutional and recurrent approaches [9], [13], [14], [26]. In this sense, a thorough survey is reported in [3] where new challenges and trends are identified for this area. ...
Article
Full-text available
Physical inactivity is one of the main risk factors for mortality, and its relationship with the main chronic diseases has experienced intensive medical research. A well-known method for assessing people’s activity is the use of accelerometers implanted in wearables and mobile phones. However, a series of main critical issues arise in the healthcare context related to the limited amount of available labelled data to build a classification model. Moreover, the discrimination ability of activities is often challenging to capture since the variety of movement patterns in a particular group of patients (e.g. obesity or geriatric patients) is limited over time. Consequently, the proposed work presents a novel approach for Human Activity Recognition (HAR) in healthcare to avoid this problem. This proposal is based on semi-supervised classification with Encoder-Decoder Convolutional Neural Networks (CNNs) using a combination strategy of public labelled and private unlabelled raw sensor data. In this sense, the model will be able to take advantage of the large amount of unlabelled data available by extracting relevant characteristics in these data, which will increase the knowledge in the innermost layers. Hence, the trained model can generalize well when used in real-world use cases. Additionally, real-time patient monitoring is provided by Apache Spark streaming processing with sliding windows. For testing purposes, a real-world case study is conducted with a group of overweight patients in the healthcare system of Andalusia (Spain), classifying close to 30 TBs of accelerometer sensor-based data. The proposed HAR streaming deep-learning approach properly classifies movement patterns in real-time conditions, crucial for long-term daily patient monitoring.
... Transfer learning has been successfully implemented in many computer vision tasks [32] and for time series classification tasks [31], such as EEG sleep-staging [33,34], and importantly, towards accelerometry-based falls prediction [27] and within physical activity recognition [35,36]. ...
Article
Full-text available
The emergence of digital technologies such as smartphones in healthcare applications has demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. Deep Convolutional Neural Networks (DCNN) may capture a richer representation of healthy and MS-related ambulatory characteristics from the raw smartphone-based inertial sensor data than standard feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework leveraged the ambulatory information learned on human activity recognition (HAR) tasks collected from wearable smartphone sensor data. It was demonstrated that fine-tuning TL DCNN HAR models towards MS disease recognition tasks outperformed previous Support Vector Machine (SVM) feature-based methods, as well as DCNN models trained end-to-end, by upwards of 8–15%. A lack of transparency of “black-box” deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus people with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants.
Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, to inform better disease management techniques, and enable the development of better therapeutic interventions.
... The image classification method based on deep convolutional neural networks promotes the development of neural information processing systems [36]. On the basis of CNN, the ability of activity recognition is improved by using data augmentation and transfer learning [37]. The framework of activity recognition based on CNN is developed [38]. ...
Article
Full-text available
The field of activity recognition emerged relatively early and has attracted countless researchers. With the continuous development of science and technology, research on human activity recognition is deepening and becoming richer. Nowadays, whether in medicine, education, sports, or the smart home, many fields have developed a strong interest in activity recognition, and a series of research results have been put into real production and life. Smartphones have become quite popular, the underlying technology is increasingly mature, and a variety of sensors have emerged, making research on activity recognition based on mobile-phone sensors both necessary and feasible. This article uses an Android smartphone's acceleration sensor to collect data on six basic human activities: walking, running, standing, sitting, going upstairs, and going downstairs. The classic deep learning model CNN (convolutional neural network) is used to fuse the multidimensional mobile data, with TensorFlow used for model training and test evaluation. The resulting model is finally transplanted to an Android phone to complete the mobile-end activity recognition system.
... Transfer learning has been successfully implemented in many computer vision tasks [32] and for time series classification tasks [31], such as EEG sleep-staging [33], [34], and importantly, towards accelerometry based falls prediction [27] and within physical activity recognition [35], [36]. ...
Preprint
Full-text available
The emergence of digital technologies such as smartphones in healthcare applications has demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. In this work, deep convolutional neural networks (DCNN) applied to smartphone inertial sensor data were shown to better distinguish healthy from MS participant ambulation, compared to standard Support Vector Machine (SVM) feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework utilised the ambulatory information learned on Human Activity Recognition (HAR) tasks collected from similar smartphone-based sensor data. A lack of transparency of "black-box" deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus persons with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants. Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, to inform better disease management techniques, and enable the development of better therapeutic interventions.
... Data augmentation has been widely used for the sake of increasing the size and diversity of training datasets by introducing a range of simple image transformations, hence increasing the generalization capability of the considered deep network [71]. Many transfer learning approaches in literature have considered data augmentation when dealing with datasets having limited sizes [54,72-75]. Transfer learning based RIQA algorithms have also commonly used data augmentation in order to increase the variability of their training datasets for more enhanced performance [33,60,62]. ...
Article
Full-text available
Retinal image quality assessment (RIQA) is essential to assure that images used for medical analysis are of sufficient quality for reliable diagnosis. A modified VGG16 network with transfer learning is introduced in order to classify retinal images into good or bad quality images. Both spatial and wavelet detail subbands are compared as inputs to the modified VGG16 network. Three public retinal image datasets captured with different imaging devices are used, both individually and collectively. Superior performance was attained by the modified VGG16 network, where accuracies in the range of 99–100% were achieved regardless of whether retinal images from the same or different sources were considered and whether the spatial or wavelet images were used. The implemented RIQA algorithm was also found to outperform other RIQA deep learning algorithms from literature by 1.5–10% and to achieve accuracies that are up to 32% higher than traditional RIQA methods for the same dataset.
... Some studies have adopted DAs for sensor-based HAR using deep learning [35,36,37,38,39]. However, the DAs they adopted are limited to the following: ...
Preprint
In the research field of activity recognition, although it is difficult to collect a large amount of measured sensor data, there has not been much discussion about data augmentation (DA). In this study, I propose Octave Mix as a new synthetic-style DA method for sensor-based activity recognition. Octave Mix is a simple DA method that combines two types of waveforms by intersecting low and high frequency waveforms using frequency decomposition. In addition, I propose a DA ensemble model and its training algorithm to acquire robustness to the original sensor data while retaining a wide variety of feature representations. I conducted experiments to evaluate the effectiveness of my proposed method using four different benchmark datasets of sensor-based activity recognition. As a result, my proposed method achieved the best estimation accuracy. Furthermore, I found that ensembling two DA strategies, Octave Mix with rotation and mixup with rotation, makes it possible to achieve higher accuracy.
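A simplified sketch of frequency-decomposition mixing in the spirit of Octave Mix might look as follows; the moving-average low-pass filter and the toy "walk"/"run" signals are my simplifications, not the author's actual decomposition:

```python
import numpy as np

def lowpass(x, k=9):
    """Crude low-pass: centred moving average along the time axis."""
    kernel = np.ones(k) / k
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, x)

def octave_mix(x1, x2, k=9):
    """Combine the low-frequency part of x1 with the high-frequency
    residual of x2, and vice versa (frequency-decomposition mixing)."""
    low1, low2 = lowpass(x1, k), lowpass(x2, k)
    high1, high2 = x1 - low1, x2 - low2
    return low1 + high2, low2 + high1

# Two toy signals: slow trend plus fast oscillation.
t = np.linspace(0, 1, 256)[:, None]
walk = np.sin(2 * np.pi * 2 * t) + 0.1 * np.sin(2 * np.pi * 40 * t)
run = np.sin(2 * np.pi * 3 * t) + 0.3 * np.sin(2 * np.pi * 50 * t)
mixed_a, mixed_b = octave_mix(walk, run)
```

Note the decomposition is lossless: the two mixed outputs sum to the sum of the two inputs, since each input's low and high parts are split exactly.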
... In the context of HAR, data augmentation has been previously used with inertial measurements extracted from wearable sensors by Eyobu and Han [33], who applied local averaging as a down-sampling technique and shuffling. Also, using similar data, Kalouris et al. [34] applied a set of domain specific transformations, such as rotation, scaling, jittering and so forth. Hernandez et al. [35] worked with hand points and applied data warping on the magnitude and the temporal location of motion signals. ...
Article
Full-text available
Recent advances in big data systems and databases have made it possible to gather raw unlabeled data at unprecedented rates. However, labeling such data constitutes a costly and timely process. This is especially true for video data, and in particular for human activity recognition (HAR) tasks. For this reason, methods for reducing the need of labeled data for HAR applications have drawn significant attention from the research community. In particular, two popular approaches developed to address the above issue are data augmentation and domain adaptation. The former attempts to leverage problem-specific, hand-crafted data synthesizers to augment the training dataset with artificial labeled data instances. The latter attempts to extract knowledge from distinct but related supervised learning tasks for which labeled data is more abundant than the problem at hand. Both methods have been extensively studied and used successfully on various tasks, but a comprehensive comparison of the two has not been carried out in the context of video data HAR. In this work, we fill this gap by providing ample experimental results comparing data augmentation and domain adaptation techniques on a cross-viewpoint, human activity recognition task from pose information.
... This study aims to build upon previous CNN-based approaches for the identification of inhaler events, by also investigating acceleration strategies. CNNs have been established as a reliable state-of-the-art, data-driven approach for biosignal classification [28][29][30][31][32]. The adaptation of acceleration approaches, including filter pruning, scalar pruning and vector quantization, aims to lead to lower computational complexity and higher energy efficiency, facilitating IoT targeted implementations. ...
Article
Full-text available
Effective management of chronic constrictive pulmonary conditions lies in proper and timely administration of medication. As a series of studies indicates, medication adherence can effectively be monitored by successfully identifying actions performed by patients during inhaler usage. This study focuses on the recognition of inhaler audio events during usage of pressurized metered dose inhalers (pMDI). Aiming at real-time performance, we investigate deep sparse coding techniques including convolutional filter pruning, scalar pruning and vector quantization, for different convolutional neural network (CNN) architectures. The recognition performance has been assessed on three healthy subjects following both within and across subjects modeling strategies. The selected CNN architecture classified drug actuation, inhalation and exhalation events, with 100%, 92.6% and 97.9% accuracy, respectively, when assessed in a leave-one-subject-out cross-validation setting. Moreover, sparse coding of the same architecture with an increasing compression rate from 1 to 7 resulted in only a small decrease in classification accuracy (from 95.7% to 94.5%), obtained by random (subject-agnostic) cross-validation. A more thorough assessment on a larger dataset, including recordings of subjects with multiple respiratory disease manifestations, is still required in order to better evaluate the method’s generalization ability and robustness.
... The types of recognizable activities include sitting, standing, walking, walking upstairs or downstairs, and lying down, as well as transition events. Recognition is performed based on supervised learning via support vector machines, while more recent work [53] has shown that the incorporation of deep learning techniques can further improve recognition accuracy. The method also addresses data inconsistencies that are caused by device-relative issues or possible sensor misplacement during monitoring of physiological activity [54]. ...
Article
The implications of frailty for older adults' health status and autonomy necessitate the understanding and effective management of this widespread condition as a priority for modern societies. Despite its importance, we still stand far from early detection, effective management and prevention of frailty. One of the most important reasons for this is the lack of sensitive instruments able to identify frailty and pre-frailty conditions early. The FrailSafe system provides a novel approach to this complex medical, social and public health problem. It aspires to identify the most important components of frailty, construct cumulative metrics serving as biomarkers, and apply this knowledge and expertise for self-management and prevention. This paper presents a high-level overview of the FrailSafe system architecture providing details on the monitoring sensors and devices, the software front-ends for the interaction of the users with the system, as well as the back-end part including the data analysis and decision support modules. Data storage, remote processing and security issues are also discussed. The evaluation of the system by older individuals from three different countries highlighted the potential of frailty prediction strategies based on information and communication technology (ICT).
Article
The Ambient Assisted Living (AAL) research area focuses on generating innovative technology, products, and services to support medical care and rehabilitation for the elderly, with the purpose of extending the time these people can live independently, whether or not they suffer from neurodegenerative diseases or a disability. This important area is responsible for the development of Activity Recognition Systems (ARS), which are a valuable tool for identifying the type of activity carried out by the elderly, in order to provide them with effective assistance that allows them to carry out daily activities normally. This article reviews the literature and the evolution of the different data mining techniques applied to this health sector, presenting the metrics of recent experiments for researchers in this area of knowledge. The objective is to review highly relevant research on reinforcement- and transfer-based learning, and then outline the different components of the RTLHAR model for the identification and adaptation of learning focused on the recognition of human activities.
Conference Paper
Physical activity recognition in patients with Parkinson's Disease (PwPD) is challenging due to the lack of large enough and good quality motion data for PwPD. A common approach to this obstacle involves the use of models trained on better quality data from healthy patients. Models can struggle to generalize across these domains due to motor complications affecting the movement patterns in PwPD and differences in sensor axes orientations between data. In this paper, we investigated the generalizability of a deep convolutional neural network (CNN) model trained on a young, healthy population to PD, and the role of data augmentation on alleviating sensor position variability. We used two publicly available healthy datasets, PAMAP2 and MHEALTH. Both datasets had sensor placements on the chest, wrist, and ankle with 9 and 10 subjects, respectively. A private PD dataset was utilized as well. The proposed CNN model was trained on PAMAP2 in k-fold cross-validation based on the number of subjects, with and without data augmentation, and tested directly on MHEALTH and PD data. Without data augmentation, the trained model resulted in 48.16% accuracy on MHEALTH and 0% on the PD data when directly applied with no model adaptation techniques. With data augmentation, the accuracies improved to 87.43% and 44.78%, respectively, indicating that the method compensated for the potential sensor placement variations between data. Clinical Relevance: Wearable sensors and machine learning can provide important information about the activity level of PwPD. This information can be used by the treating physician to make appropriate clinical interventions such as rehabilitation to improve quality of life.
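The subject-wise cross-validation mentioned above (k folds based on the number of subjects) amounts to leave-one-subject-out splitting, which can be sketched as follows; the toy subject IDs are illustrative:

```python
import numpy as np

def subject_folds(subject_ids):
    """Yield (train_idx, test_idx) pairs, holding out one subject per fold
    (leave-one-subject-out: k equals the number of distinct subjects)."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        test = np.where(subject_ids == s)[0]
        train = np.where(subject_ids != s)[0]
        yield train, test

# Six windows belonging to three subjects.
ids = [0, 0, 1, 1, 1, 2]
folds = list(subject_folds(ids))
```

Splitting by subject rather than by window prevents samples from the same person from leaking between the training and test sets, which matters given the high inter-subject variability of activity patterns.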
Article
The problem of human action recognition has attracted the interest of several researchers due to its significant use in many applications. With the great success of deep learning methods in most areas, researchers have switched from traditional methods based on hand-crafted feature extractors to recent deep learning-based techniques for action recognition. In the present research work, we propose a learning approach for human activity recognition in the elderly based on a convolutional neural network (LAHAR-CNN). The CNN model is used to extract features from the dataset; then, a multilayer perceptron (MLP) classifier is used for action classification. It has been widely admitted that features learned using a CNN model on a large dataset can be successfully transferred to an action recognition task with a small training dataset. The proposed method is evaluated on the well-known MSRDailyActivity 3D dataset. It has shown impressive results that exceed the performances obtained in the state of the art using the same dataset, reaching 99.4%. Furthermore, our proposed approach predicts human activity (HA) from a single frame sample, which demonstrates its robustness. Hence, the proposed model is ranked at the top of the list of space-time techniques.
Article
Full-text available
Research in human activity recognition (HAR) requires a huge amount of data, but it is not easy to collect such measured sensor data. Moreover, data augmentation (DA) has seen little application in HAR. This study proposes Octave Mix as a novel synthetic-style DA method for sensor-based HAR. The proposed method uses frequency decomposition to intersect low- and high-frequency waveforms. In addition, we propose a DA ensemble method and a training algorithm to ensure robustness to the original sensor data while remaining applicable to various feature representations. To evaluate the Octave Mix method’s effectiveness, we conduct experiments using four different benchmark datasets of sensor-based HAR and achieve high estimation accuracy in our results. Furthermore, we demonstrate that ensembling two DA strategies, Octave Mix with rotation and mixup with rotation, achieves higher accuracy.
Article
Full-text available
The physiological monitoring of older people using wearable sensors has shown great potential in improving their quality of life and preventing undesired events related to their health status. Nevertheless, creating robust predictive models from data collected unobtrusively in home environments can be challenging, especially for the vulnerable ageing population. Under that premise, we propose an activity recognition scheme for older people exploiting feature extraction and machine learning, along with heuristic computational solutions to address the challenges due to inconsistent measurements in non-standardized environments. In addition, we compare the customized pipeline with deep learning architectures, such as convolutional neural networks, applied to raw sensor data without any pre- or post-processing adjustments. The results demonstrate that the generalizable deep architectures can compensate for inconsistencies during data acquisition, providing a valuable alternative.
Article
Full-text available
Background Frailty is a common clinical syndrome in ageing population that carries an increased risk for adverse health outcomes including falls, hospitalization, disability, and mortality. As these outcomes affect the health and social care planning, during the last years there is a tendency of investing in monitoring and preventing strategies. Although a number of electronic health record (EHR) systems have been developed, including personalized virtual patient models, there are limited ageing population oriented systems. Methods We exploit the openEHR framework for the representation of frailty in ageing population in order to attain semantic interoperability, and we present the methodology for adoption or development of archetypes. We also propose a framework for a one-to-one mapping between openEHR archetypes and a column-family NoSQL database (HBase) aiming at the integration of existing and newly developed archetypes into it. Results The requirement analysis of our study resulted in the definition of 22 coherent and clinically meaningful parameters for the description of frailty in older adults. The implemented openEHR methodology led to the direct use of 22 archetypes, the modification and reuse of two archetypes, and the development of 28 new archetypes. Additionally, the mapping procedure led to two different HBase tables for the storage of the data. Conclusions In this work, an openEHR-based virtual patient model has been designed and integrated into an HBase storage system, exploiting the advantages of the underlying technologies. This framework can serve as a base for the development of a decision support system using the openEHR’s Guideline Definition Language in the future.
Article
Full-text available
Multidimensional data that occur in a variety of applications in clinical diagnostics and health care can naturally be represented by multidimensional arrays (i.e., tensors). Tensor decompositions offer valuable and powerful tools for latent concept discovery that can handle effectively missing values and noise. We propose a seamless, application-independent feature extraction and multiple-instance (MI) classification method, which represents the raw multidimensional, possibly incomplete, data by means of learning a high-order dictionary. The effectiveness of the proposed method is demonstrated in two application scenarios: (i) prediction of frailty in older people using multisensor recordings and (ii) breast cancer classification based on histopathology images. The proposed method outperforms or is comparable to the state-of-the-art multiple-instance learning classifiers highlighting its potential for computer-assisted diagnosis and health care support.
Conference Paper
Full-text available
As the global community becomes more interested in improving the quality of life of older people and preventing undesired events related to their health status, the development of sophisticated devices and analysis algorithms for monitoring everyday activities is more necessary than ever. Wearable devices are among the most popular solutions from a hardware point of view, while machine learning techniques have proven very powerful for behavioral monitoring. Nevertheless, creating robust models from data collected unobtrusively in home environments can be challenging, especially for the vulnerable ageing population. Under that premise, we propose an activity recognition scheme for older people along with heuristic computational solutions to address the challenges due to inconsistent measurements in non-standardized environments.
Article
Full-text available
Human activity recognition systems are developed as part of a framework to enable continuous monitoring of human behaviours in the areas of ambient assisted living, sports injury detection, elderly care, rehabilitation, and entertainment and surveillance in smart home environments. The extraction of relevant features is the most challenging part of the mobile and wearable sensor-based human activity recognition pipeline. Feature extraction influences algorithm performance and reduces computation time and complexity. However, current human activity recognition relies on handcrafted features that are incapable of handling complex activities, especially with the current influx of multimodal and high-dimensional sensor data. With the emergence of deep learning and increased computational power, deep learning and artificial intelligence methods are being adopted for automatic feature learning in diverse areas like health, image classification, and recently, for feature extraction and classification of simple and complex human activity recognition in mobile and wearable sensors. Furthermore, the fusion of mobile or wearable sensors and deep learning methods for feature learning provides diversity, offers higher generalisation, and tackles challenging issues in human activity recognition. The focus of this review is to provide in-depth summaries of deep learning methods for mobile and wearable sensor-based human activity recognition. The review presents the methods, their uniqueness, advantages, and limitations. We not only categorise the studies into generative, discriminative and hybrid methods but also highlight their important advantages. Furthermore, the review presents classification and evaluation procedures and discusses publicly available datasets for mobile sensor human activity recognition. Finally, we outline and explain some challenges and open research problems that require further research and improvement.
Conference Paper
Full-text available
The problem of data augmentation in feature space is considered. A new architecture, denoted the FeATure TransfEr Network (FATTEN), is proposed for the modeling of feature trajectories induced by variations of object pose. This architecture exploits a parametrization of the pose manifold in terms of pose and appearance. This leads to a deep encoder/decoder network architecture, where the encoder factors into an appearance and a pose predictor. Unlike previous attempts at trajectory transfer, FATTEN can be efficiently trained end-to-end, with no need to train separate feature transfer functions. This is realized by supplying the decoder with information about a target pose and the use of a multi-task loss that penalizes category- and pose-mismatches. In result, FATTEN discourages discontinuous or non-smooth trajectories that fail to capture the structure of the pose manifold, and generalizes well on object recognition tasks involving large pose variation. Experimental results on the artificial ModelNet database show that it can successfully learn to map source features to target features of a desired pose, while preserving class identity. Most notably, by using feature space transfer for data augmentation (w.r.t. pose and depth) on SUN-RGBD objects, we demonstrate considerable performance improvements on one/few-shot object recognition in a transfer learning setup, compared to current state-of-the-art methods.
Article
Full-text available
Wearable technologies play a central role in human-centered Internet-of-Things applications. Wearables leverage machine learning algorithms to detect events of interest such as physical activities and medical complications. A major obstacle to large-scale utilization of current wearables is that their computational algorithms need to be rebuilt from scratch upon any change in the configuration. Retraining of these algorithms requires a significant amount of labeled training data, a process that is labor-intensive and time-consuming. We propose an approach for automatic retraining of the machine learning algorithms in real time without the need for any labeled training data. We measure the inherent correlation between observations made by an old sensor view, for which trained algorithms exist, and the new sensor view, for which an algorithm needs to be developed. Our multi-view learning approach can be used in both online and batch modes. By applying the autonomous multi-view learning in the batch mode, we achieve an accuracy of 83.7% in activity recognition, an improvement of 9.3% due to the automatic labeling of the data in the new sensor node. In addition to the lower computational cost of incremental training, the online learning algorithm results in an accuracy of 82.2% in activity recognition.
Article
Full-text available
In this paper, we propose a method of improving Convolutional Neural Networks (CNN) by determining the optimal alignment of weights and inputs using dynamic programming. Conventional CNNs convolve learnable shared weights, or filters, across the input data. The filters use a linear matching of weights to inputs using an inner product between the filter and a window of the input. However, it is possible that there exists a more optimal alignment of weights. Thus, we propose the use of Dynamic Time Warping (DTW) to dynamically align the weights to optimized input elements. This dynamic alignment is useful for time series recognition due to the complexities of temporal relations and temporal distortions. We demonstrate the effectiveness of the proposed architecture on the Unipen online handwritten digit and character datasets, the UCI Spoken Arabic Digit dataset, and the UCI Activities of Daily Life dataset.
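To make the alignment idea concrete, here is a sketch of the classic dynamic-programming DTW distance that underlies the approach above (this is the standard alignment primitive, not the paper's DTW-aligned CNN; function and variable names are illustrative):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences.

    D[i, j] holds the minimal cumulative cost of aligning a[:i] with b[:j];
    each step may repeat an element of either sequence (temporal warping).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # repeat b[j-1]
                                 D[i, j - 1],      # repeat a[i-1]
                                 D[i - 1, j - 1])  # advance both
    return D[n, m]

# The same bump pattern performed slowly and quickly.
slow = np.array([0., 0., 1., 2., 1., 0., 0.])
fast = np.array([0., 1., 2., 1., 0.])
```

Because DTW warps the time axis, the stretched and compressed versions of the same pattern match at zero cost, which a fixed inner product (as in a standard convolution) cannot achieve.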
Article
Full-text available
Monitoring of activities of daily living (ADL) using wearable sensors can provide an objective indication of the activity levels or restrictions experienced by patients or the elderly. The current study presents a two-sensor ADL classification method designed and tested specifically with elderly subjects. Ten healthy elderly subjects took part in laboratory testing with six types of daily activities. Two inertial measurement units were attached to the thigh and the trunk of each subject. The results indicated an overall misdetection rate of 2.8%. The findings of the current study can be used as a first step towards a more comprehensive activity monitoring technology specifically designed for the aging population.
Article
Full-text available
TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with particularly strong support for training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model in contrast to existing systems, and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
Article
Full-text available
Epileptiform discharges in interictal electroencephalography (EEG) form the mainstay of epilepsy diagnosis and localization of seizure onset. Visual analysis is rater-dependent and time consuming, especially for long-term recordings, while computerized methods can provide efficiency in reviewing long EEG recordings. This paper presents a machine learning approach for automated detection of epileptiform discharges (spikes). The proposed method first detects spike patterns by calculating similarity to a coarse shape model of a spike waveform and then refines the results by identifying subtle differences between actual spikes and false detections. Pattern classification is performed using support vector machines in a low dimensional space on which the original waveforms are embedded by locality preserving projections. The automatic detection results are compared to experts' manual annotations (101 spikes) on a whole-night sleep EEG recording. The high sensitivity (97%) and the low false positive rate (0.1 min⁻¹), calculated by intra-patient cross-validation, highlight the potential of the method for automated interictal EEG assessment.
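As an illustration of the first stage described above (similarity to a coarse spike shape), a normalised cross-correlation detector can be sketched as follows; the template, signal, and function names are hypothetical, not the paper's model:

```python
import numpy as np

def template_scores(signal, template):
    """Normalised cross-correlation of each sliding window with a spike
    template; scores near 1 indicate a close shape match regardless of
    amplitude."""
    w = len(template)
    tpl = (template - template.mean()) / (template.std() + 1e-12)
    scores = np.empty(len(signal) - w + 1)
    for i in range(len(scores)):
        seg = signal[i:i + w]
        seg = (seg - seg.mean()) / (seg.std() + 1e-12)
        scores[i] = float(seg @ tpl) / w
    return scores

template = np.array([0., 1., 4., 1., 0.])   # coarse spike shape
signal = np.zeros(40)
signal[20:25] = 3 * template                # one scaled spike at sample 20
scores = template_scores(signal, template)
```

In a full pipeline, windows whose score exceeds a threshold would be passed to a second-stage classifier (SVMs in the paper) to separate true spikes from false detections.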
Article
Full-text available
Human activity recognition (HAR) tasks have traditionally been solved using engineered features obtained by heuristic processes. Current research suggests that deep convolutional neural networks are suited to automate feature extraction from raw sensor inputs. However, human activities are made of complex sequences of motor movements, and capturing this temporal dynamics is fundamental for successful HAR. Based on the recent success of recurrent neural networks for time series domains, we propose a generic deep framework for activity recognition based on convolutional and LSTM recurrent units, which: (i) is suitable for multimodal wearable sensors; (ii) can perform sensor fusion naturally; (iii) does not require expert knowledge in designing features; and (iv) explicitly models the temporal dynamics of feature activations. We evaluate our framework on two datasets, one of which has been used in a public activity recognition challenge. Our results show that our framework outperforms competing deep non-recurrent networks on the challenge dataset by 4% on average; outperforming some of the previous reported results by up to 9%. Our results show that the framework can be applied to homogeneous sensor modalities, but can also fuse multimodal sensors to improve performance. We characterise key architectural hyperparameters’ influence on performance to provide insights about their optimisation.
Conference Paper
Full-text available
A variety of real-life mobile sensing applications are becoming available, especially in the life-logging, fitness tracking and health monitoring domains. These applications use mobile sensors embedded in smart phones to recognize human activities in order to get a better understanding of human behavior. While progress has been made, human activity recognition remains a challenging task. This is partly due to the broad range of human activities as well as the rich variation in how a given activity can be performed. Using features that clearly separate between activities is crucial. In this paper, we propose an approach to automatically extract discriminative features for activity recognition. Specifically, we develop a method based on Convolutional Neural Networks (CNN), which can capture local dependency and scale invariance of a signal as it has been shown in speech recognition and image recognition domains. In addition, a modified weight sharing technique, called partial weight sharing, is proposed and applied to accelerometer signals to get further improvements. The experimental results on three public datasets, Skoda (assembly line activities), Opportunity (activities in kitchen), Actitracker (jogging, walking, etc.), indicate that our novel CNN-based approach is practical and achieves higher accuracy than existing state-of-the-art methods.
Article
Full-text available
This paper focuses on human activity recognition (HAR) problem, in which inputs are multichannel time series signals acquired from a set of body-worn inertial sensors and outputs are predefined human activities. In this problem, extracting effective features for identifying activities is a critical but challenging task. Most existing work relies on heuristic hand-crafted feature design and shallow feature learning architectures, which cannot find those distinguishing features to accurately classify different activities. In this paper, we propose a systematic feature learning method for HAR problem. This method adopts a deep convolutional neural networks (CNN) to automate feature learning from the raw inputs in a systematic way. Through the deep architecture, the learned features are deemed as the higher level abstract representation of low level raw time series signals. By leveraging the labelled information via supervised learning, the learned features are endowed with more discrimi-native power. Unified in one model, feature learning and classification are mutually enhanced. All these unique advantages of the CNN make it out-perform other HAR algorithms, as verified in the experiments on the Opportunity Activity Recognition Challenge and other benchmark datasets.
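The basic feature-learning unit of such a CNN is a 1-D convolution over a multichannel sensor window followed by a non-linearity. A minimal NumPy sketch of that operation (illustrative only; a trained network would learn the kernel weights rather than draw them at random):

```python
import numpy as np

def conv1d(x, kernels, stride=1):
    """Valid 1-D convolution of a (T, C) sensor window with (K, W, C)
    kernels, followed by ReLU: the core feature extractor of a CNN for HAR."""
    T, C = x.shape
    K, W, _ = kernels.shape
    out_len = (T - W) // stride + 1
    out = np.zeros((out_len, K))
    for k in range(K):
        for i in range(out_len):
            seg = x[i * stride:i * stride + W]   # local window, all channels
            out[i, k] = np.sum(seg * kernels[k])
    return np.maximum(out, 0.0)                  # ReLU activation

rng = np.random.default_rng(3)
window = rng.normal(size=(128, 3))               # 128 samples, 3 accelerometer axes
kernels = rng.normal(size=(8, 5, 3))             # 8 filters of temporal width 5
fmap = conv1d(window, kernels)
```

Stacking such layers (with pooling between them) yields the "higher level abstract representation" of the raw time series that the abstract refers to.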
Article
Full-text available
Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
Article
Full-text available
Many real-world applications that focus on addressing the needs of a human require information about the activities being performed by the human in real time. While advances in pervasive computing have led to the development of wireless and non-intrusive sensors that can capture the necessary activity information, current activity recognition approaches have so far experimented on either scripted or pre-segmented sequences of sensor events related to activities. In this paper we propose and evaluate a sliding-window-based approach to perform activity recognition in an online or streaming fashion, recognizing activities as and when new sensor events are recorded. To account for the fact that different activities are best characterized by different window lengths of sensor events, we incorporate time-decay and mutual-information-based weighting of sensor events within a window. Additional contextual information, in the form of the previous activity and the activity of the previous window, is also appended to the feature describing a sensor window. The experiments conducted to evaluate these techniques on real-world smart home datasets suggest that combining mutual-information-based weighting of sensor events with past contextual information in the feature leads to the best performance for streaming activity recognition.
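The time-decay weighting mentioned above can be sketched as follows: each sensor event in the window contributes to the feature vector with a weight that decays exponentially with its age relative to the most recent event (this is an illustrative reconstruction; the function names, decay rate, and event encoding are assumptions, not the paper's code):

```python
import numpy as np

def decay_weights(timestamps, rate=0.1):
    """Exponential time-decay weights: recent sensor events count more."""
    t_last = timestamps[-1]
    return np.exp(-rate * (t_last - np.asarray(timestamps, dtype=float)))

def weighted_sensor_counts(events, timestamps, n_sensors, rate=0.1):
    """Feature vector: decay-weighted count of firings per sensor in the window."""
    w = decay_weights(timestamps, rate)
    feat = np.zeros(n_sensors)
    for sensor_id, weight in zip(events, w):
        feat[sensor_id] += weight
    return feat

# Window of 4 sensor events (sensor ids) with their timestamps in seconds.
events = [2, 0, 2, 1]
timestamps = [0.0, 5.0, 9.0, 10.0]
feat = weighted_sensor_counts(events, timestamps, n_sensors=3)
```

Events fired just before the current moment dominate the feature, while stale events from the start of the window contribute little, which suits streaming recognition where the window may span more than one activity.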
Conference Paper
Full-text available
A rapidly growing aging population presents many challenges to health and aged care services around the world. Recognising and understanding the activities performed by elderly is an important research area that has the potential to address these challenges and healthcare needs of the 21st century by enabling a wide range of valuable applications such as remote health monitoring. A key enabling technology for such applications is wireless sensors. However we must first overcome a number of challenges that are technological, social and economic, before being able to realize such applications using pervasive technologies.
Conference Paper
Full-text available
Activity recognition systems have been found to be very effective for tracking users' activities in research areas like healthcare and assisted living. Wearable accelerometers that can help in classifying Physical Activities (PA) have been made available by MEMS technology. State-of-the-art PA classification systems use threshold-based techniques and Machine Learning (ML) algorithms. Each PA may exhibit inter-subject and intra-subject variability, which is a major drawback for both threshold-based and machine learning techniques. Because the empirical data needed to train classifiers for ML clustering algorithms are often lacking, there is a need for a mechanism that requires less training data for PA clustering. This paper describes a novel personalized PA recognition framework based on a semi-supervised clustering approach that avoids fixed-threshold techniques and traditional clustering methods by using a single accelerometer. The proposed methodology requires a limited amount of data to compute the (initial) centroids for PA clusters and achieved an accuracy of about 93% on average; moreover, it has the potential to recognize subjects' behavioral shifts and exceptional events such as falls.
Conference Paper
Full-text available
This paper addresses the lack of a commonly used, standard dataset and established benchmarking problems for physical activity monitoring. A new dataset - recorded from 18 activities performed by 9 subjects, wearing 3 IMUs and a HR-monitor - is created and made publicly available. Moreover, 4 classification problems are benchmarked on the dataset, using a standard data processing chain and 5 different classifiers. The benchmark shows the difficulty of the classification tasks and exposes new challenges for physical activity monitoring.
Article
Full-text available
Smart phones comprise a large and rapidly growing market. These devices provide unprecedented opportunities for sensor mining since they include a large variety of sensors, including an acceleration sensor (accelerometer), location sensor (GPS), direction sensor (compass), audio sensor (microphone), image sensor (camera), proximity sensor, light sensor, and temperature sensor. Combined with the ubiquity and portability of these devices, these sensors provide us with an unprecedented view into people's lives, and an excellent opportunity for data mining. But there are obstacles to sensor mining applications, due to the severe resource limitations (e.g., power, memory, bandwidth) faced by mobile devices. In this paper we discuss these limitations, their impact, and propose a solution based on our WISDM (WIreless Sensor Data Mining) smart phone-based sensor mining architecture.
Conference Paper
Full-text available
We deployed 72 sensors of 10 modalities in 15 wireless and wired networked sensor systems in the environment, in objects, and on the body to create a sensor-rich environment for the machine recognition of human activities. We acquired data from 12 subjects performing morning activities, yielding over 25 hours of sensor data. We report the number of activity occurrences observed during post-processing, and estimate that over 13000 and 14000 object and environment interactions occurred. We describe the networked sensor setup and the methodology for data acquisition, synchronization and curation. We report on the challenges and outline lessons learned and best practice for similar large scale deployments of heterogeneous networked sensor systems. We evaluate data acquisition quality for on-body and object integrated wireless sensors; there is less than 2.5% packet loss after tuning. We outline our use of the dataset to develop new sensor network self-organization principles and machine learning techniques for activity recognition in opportunistic sensor configurations. Eventually this dataset will be made public.
Conference Paper
Full-text available
Human action recognition is gaining interest from many computer vision researchers because of its wide variety of potential applications, for instance surveillance, advanced human-computer interaction, content-based video retrieval, and athletic performance analysis. In this research, we focus on recognizing human actions such as waving, punching, and clapping. We choose an exemplar-based, sequential, single-layered approach using Dynamic Time Warping (DTW) because of its robustness against variation in the speed or style of performing an action, and because it requires less training data than state-based approaches. To improve the recognition rate, we perform body-part tracking using a depth camera to recover human joint positions in a 3D real-world coordinate system. We build our feature vector from joint orientations along the time series, which is invariant to human body size. Dynamic Time Warping is then applied to the resulting feature vector. We examine our approach on several actions and confirm through several experiments that our method works well; further benchmarking experiments are planned for the near future.
Conference Paper
Full-text available
Feature extraction for activity recognition in context-aware ubiquitous computing applications is usually a heuristic process, informed by underlying domain knowledge. Relying on such explicit knowledge is problematic when aiming to generalize across different application domains. We investigate the potential of recent machine learning methods for discovering universal features for context-aware applications of activity recognition. We also describe an alternative data representation based on the empirical cumulative distribution function of the raw data, which effectively abstracts from absolute values. Experiments on accelerometer data from four publicly available activity recognition datasets demonstrate the significant potential of our approach to address both contemporary activity recognition tasks and next generation problems such as skill assessment and the detection of novel activities.
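The ECDF-based representation mentioned above can be sketched as follows: for each axis of the window, sample the inverse empirical cumulative distribution function at a fixed number of evenly spaced quantiles (one common variant also appends the per-axis mean). This is an illustrative reconstruction under those assumptions, not the authors' code:

```python
import numpy as np

def ecdf_features(window, n_points=15):
    """ECDF representation of a (T, C) sensor window: per axis, sample the
    inverse empirical CDF at n_points quantiles and append the axis mean.
    The result abstracts from absolute signal values."""
    feats = []
    qs = np.linspace(0.0, 1.0, n_points)
    for axis in range(window.shape[1]):
        col = window[:, axis]
        feats.append(np.quantile(col, qs))   # inverse ECDF samples
        feats.append([col.mean()])           # absolute-level summary
    return np.concatenate(feats)

window = np.random.default_rng(1).normal(size=(128, 3))
feat = ecdf_features(window)                 # 3 axes x (15 quantiles + mean)
```

Because quantiles are invariant to the ordering of samples and insensitive to the raw amplitude scale, the representation transfers across datasets better than raw-value histograms with fixed bin edges.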
Conference Paper
Full-text available
Human activity recognition is a thriving research field. There are lots of studies in different sub-areas of activity recognition proposing different methods. However, unlike other applications, there is lack of established benchmarking problems for activity recognition. Typically, each research group tests and reports the performance of their algorithms on their own datasets using experimental setups specially conceived for that specific purpose. In this work, we introduce a versatile human activity dataset conceived to fill that void. We illustrate its use by presenting comparative results of different classification techniques, and discuss about several metrics that can be used to assess their performance. Being an initial benchmarking, we expect that the possibility to replicate and outperform the presented results will contribute to further advances in state-of-the-art methods.
Conference Paper
While convolutional neural networks (CNNs) have been successfully applied to many challenging classification applications, they typically require large datasets for training. When the availability of labeled data is limited, data augmentation is a critical preprocessing step for CNNs. However, data augmentation for wearable sensor data has not been deeply investigated yet. In this paper, various data augmentation methods for wearable sensor data are proposed. The proposed methods and CNNs are applied to the classification of the motor state of Parkinson’s Disease patients, which is challenging due to small dataset size, noisy labels, and large intra-class variability. Appropriate augmentation improves the classification performance from 77.54% to 86.88%.
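The jittering, scaling, and rotation transforms referred to above can be sketched in NumPy as follows (an illustrative reconstruction of these standard time-series augmentations, not the paper's implementation; the noise levels are assumed values):

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(x, sigma=0.05):
    """Add Gaussian noise to each sample (simulates sensor noise)."""
    return x + rng.normal(0.0, sigma, x.shape)

def scale(x, sigma=0.1):
    """Multiply each channel by a random factor (simulates gain drift)."""
    factors = rng.normal(1.0, sigma, (1, x.shape[1]))
    return x * factors

def rotate(x):
    """Apply a random 3-D rotation (simulates a differently worn sensor)."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(-np.pi, np.pi)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    # Rodrigues' rotation formula gives an orthogonal rotation matrix.
    R = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    return x @ R.T

# A toy 3-axis accelerometer window: 128 samples of gravity on the z-axis.
window = np.tile([0.0, 0.0, 9.81], (128, 1))
augmented = rotate(scale(jitter(window)))    # composed label-preserving variants
rotated = rotate(window)                     # rotation alone preserves magnitudes
```

Each transform is label-preserving for most activities, so the training set can be multiplied many-fold without new recordings, which is exactly the situation with small, noisy wearable datasets.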
Conference Paper
Human activity recognition (HAR) in ubiquitous computing is beginning to adopt deep learning to substitute for well-established analysis techniques that rely on hand-crafted feature extraction and classification techniques. From these isolated applications of custom deep architectures it is, however, difficult to gain an overview of their suitability for problems ranging from the recognition of manipulative gestures to the segmentation and identification of physical activities like running or ascending stairs. In this paper we rigorously explore deep, convolutional, and recurrent approaches across three representative datasets that contain movement data captured with wearable sensors. We describe how to train recurrent approaches in this setting, introduce a novel regularisation approach, and illustrate how they outperform the state-of-the-art on a large benchmark dataset. Across thousands of recognition experiments with randomly sampled model configurations we investigate the suitability of each model for different tasks in HAR, explore the impact of hyperparameters using the fANOVA framework, and provide guidelines for the practitioner who wants to apply deep learning in their problem setting.
Conference Paper
Human physical activity recognition based on wearable sensors has applications relevant to our daily life such as healthcare. How to achieve high recognition accuracy with low computational cost is an important issue in the ubiquitous computing. Rather than exploring handcrafted features from time-series sensor signals, we assemble signal sequences of accelerometers and gyroscopes into a novel activity image, which enables Deep Convolutional Neural Networks (DCNN) to automatically learn the optimal features from the activity image for the activity recognition task. Our proposed approach is evaluated on three public datasets and it outperforms state-of-the-arts in terms of recognition accuracy and computational cost.
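A simplified sketch of the activity-image idea: stack the accelerometer and gyroscope channels row-wise into a 2-D array that a CNN can convolve over. (The original method additionally permutes and repeats rows so that every pair of channels becomes adjacent; this illustrative version keeps only the basic stacking and normalisation.)

```python
import numpy as np

def activity_image(acc, gyro):
    """Turn (T, 3) accelerometer and (T, 3) gyroscope windows into a
    (6, T) grayscale-like image via per-row min-max normalisation."""
    img = np.vstack([acc.T, gyro.T])               # (6 channels, T samples)
    lo = img.min(axis=1, keepdims=True)
    hi = img.max(axis=1, keepdims=True)
    return ((img - lo) / (hi - lo + 1e-12) * 255).astype(np.uint8)

T = 64
rng = np.random.default_rng(2)
img = activity_image(rng.normal(size=(T, 3)), rng.normal(size=(T, 3)))
```

Treating the stacked signals as an image lets standard 2-D convolutions capture correlations across channels as well as across time, which is what lets the DCNN learn features directly from raw signals.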
Conference Paper
Statistics show that the population of Europe is aging and that, in the near future, human assistance for elderly persons will be prohibitively expensive. This paper analyses the possibilities of implementing a supervision system capable of monitoring a person's activity in his/her home without violating privacy. The main idea is to collect information from various sensors placed in the house and on mobile devices and to infer the most probable sequence of activities performed by the supervised person. A Hidden Markov chain method is adapted for activity chain recognition.
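Inferring the most probable activity sequence from a hidden Markov model is typically done with the Viterbi algorithm; a minimal log-domain sketch follows (the activities, sensors, and probabilities are hypothetical, chosen only to illustrate the decoding):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most probable hidden-state sequence for an HMM (log-domain Viterbi).
    pi: initial state probs, A[i, j]: transition i->j, B[i, o]: emission."""
    T, N = len(obs), len(pi)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        trans = logd[:, None] + np.log(A)      # (from, to) scores
        back[t] = trans.argmax(axis=0)         # best predecessor per state
        logd = trans.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):              # backtrack
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two activities (0=sleeping, 1=cooking) observed through two sensor types
# (0=bedroom motion, 1=kitchen motion); probabilities are made up.
pi = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.95, 0.05],
              [0.10, 0.90]])
states = viterbi([0, 0, 1, 1, 1], pi, A, B)
```

The transition matrix encodes that activities persist (large diagonal), so the decoder smooths over occasional misleading sensor events instead of switching activity on every observation.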
Conference Paper
In order to plan and deliver health care in a world with an increasing number of older people, human motion monitoring is a must in their surveillance, since the related information is crucial for understanding their physical status. In this article, we focus on physiological function and motor performance and present a light human motion identification scheme, together with preliminary evaluation results, which will be further exploited within the FrailSafe Project. For this purpose, a large number of time- and frequency-domain features, extracted from the sensor signals (accelerometer and gyroscope) and concatenated into a single feature vector, are evaluated in a subject-dependent cross-validation setting using SVMs. The mean classification accuracy reaches 96%. In a further step, feature ranking and selection is performed prior to subject-independent classification using the ReliefF ranking algorithm. The classification model is evaluated using feature subsets of different sizes in order to reveal the best dimensionality of the feature vector. The achieved accuracy is 97%, a slight improvement over previous approaches evaluated on the same dataset. However, such an improvement can be considered significant given that it is achieved with lighter processing using a smaller number of features.
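A few of the time- and frequency-domain features commonly extracted from such sensor windows can be sketched as below (these are standard examples of the feature families named in the abstract, not the project's actual feature set; the sampling rate and signal are illustrative):

```python
import numpy as np

def window_features(sig, fs=50.0):
    """Common time- and frequency-domain features for one sensor channel."""
    feats = {
        "mean": float(sig.mean()),
        "std": float(sig.std()),
        "rms": float(np.sqrt(np.mean(sig ** 2))),
        "zero_crossings": int(np.count_nonzero(
            np.diff(np.signbit(sig).astype(int)))),
    }
    # Power spectrum of the mean-removed signal.
    spec = np.abs(np.fft.rfft(sig - sig.mean())) ** 2
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    feats["dominant_freq"] = float(freqs[np.argmax(spec)])
    feats["spectral_energy"] = float(spec.sum() / len(sig))
    return feats

fs = 50.0
t = np.arange(0, 2.0, 1.0 / fs)              # 2-second window at 50 Hz
sig = np.sin(2 * np.pi * 3.0 * t)            # 3 Hz oscillation, e.g. gait cadence
feats = window_features(sig, fs)
```

Concatenating such per-channel dictionaries across all sensor channels gives the single feature vector that is then fed to the SVM, and ranking methods like ReliefF prune it to the most discriminative subset.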
Article
Empirical analysis serves as an important complement to theoretical analysis for studying practical Bayesian optimization. Often empirical insights expose strengths and weaknesses inaccessible to theoretical analysis. We define two metrics for comparing the performance of Bayesian optimization methods and propose a ranking mechanism for summarizing performance within various genres or strata of test functions. These test functions serve to mimic the complexity of hyperparameter optimization problems, the most prominent application of Bayesian optimization, but with a closed form which allows for rapid evaluation and more predictable behavior. This offers a flexible and efficient way to investigate functions with specific properties of interest, such as oscillatory behavior or an optimum on the domain boundary.
Chapter
The correct identification of Activities of Daily Living (ADL) is a fundamental task in implementing effective remote monitoring of vulnerable users, with particular regard to the elderly. In this paper, a comparison of the performance of two ADL classification algorithms developed by the authors is presented. The first algorithm exploits a threshold mechanism, while the other is based on Principal Component Analysis (PCA). The threshold-based algorithm provides reasonable performance in classifying different ADL. Moreover, its threshold-definition mechanism is flexible and adaptable to several application contexts thanks to the use of Receiver Operating Characteristic (ROC) theory, which allows threshold values to be defined on the basis of constraints on the system's sensitivity and specificity. The advantage of the PCA approach lies in the possibility of improving the system's specificity in classifying different kinds of ADL and in reducing the complexity of the classification problem. The developed strategy allows ADL classification with sensitivity and specificity in line with real applications in the Ambient Assisted Living (AAL) context.
Article
In this paper, a large-scale evaluation of time-domain and frequency-domain features of electroencephalographic and electrocardiographic signals for seizure detection was performed. For the classification we relied on the support vector machine (SVM) algorithm. The seizure detection architecture was evaluated on three subjects; the achieved detection accuracy was above 90% for two of them and slightly below 90% for the third.
Article
Quantifying daily physical activity in older adults can provide relevant monitoring and diagnostic information about fall risk and frailty. In this study, we introduce instrumented shoes capable of recording movement and foot-loading data unobtrusively throughout the day. The recorded data were used to devise an activity classification algorithm. Ten elderly persons wore the instrumented shoe system, consisting of insoles inside the shoes and inertial measurement units on the shoes, and performed a series of activities of daily living as part of a semi-structured protocol. We hypothesized that foot loading, orientation, and elevation can be used to classify postural transitions, locomotion, and walking type. Additional sensors worn at the right thigh and the trunk were used as a reference, along with an event marker. An activity classification algorithm was built on a decision tree incorporating rules inspired by movement biomechanics. The algorithm showed excellent performance with respect to the reference system, with an overall accuracy of 97% across all activities. It recognized all postural transitions and all locomotion periods with elevation changes, and proved robust against small changes of its tuning parameters. This instrumented shoe system is suitable for daily activity monitoring in elderly persons and can additionally provide gait parameters which, combined with activity parameters, can supply useful clinical information regarding the mobility of elderly persons.
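A biomechanics-inspired decision tree of the kind described can be sketched as a few nested threshold rules. The feature names and threshold values below are purely illustrative assumptions, not the rules used in the cited work:

```python
def classify_posture(foot_load, trunk_angle_deg):
    """Toy rule-based classifier in the spirit of a biomechanics-inspired
    decision tree. foot_load is the fraction of body weight on the insoles;
    trunk_angle_deg is trunk inclination. Both thresholds are illustrative."""
    if foot_load < 0.2:              # little weight on the feet
        return "sitting_or_lying"
    if abs(trunk_angle_deg) > 45:    # loaded feet but inclined trunk
        return "bending"
    return "standing_or_walking"
```

Rules like these are attractive in wearable systems because they are cheap to evaluate online and their behavior degrades gracefully when thresholds are slightly mistuned.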
Conference Paper
Data augmentation using label preserving transformations has been shown to be effective for neural network training to make invariant predictions. In this paper we focus on data augmentation approaches to acoustic modeling using deep neural networks (DNNs) for automatic speech recognition (ASR). We first investigate a modified version of a previously studied approach using vocal tract length perturbation (VTLP) and then propose a novel data augmentation approach based on stochastic feature mapping (SFM) in a speaker adaptive feature space. Experiments were conducted on Bengali and Assamese limited language packs (LLPs) from the IARPA Babel program. Improved recognition performance has been observed after both cross-entropy (CE) and state-level minimum Bayes risk (sMBR) training of DNN models.
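Label-preserving transformations of this kind can also be applied directly to accelerometer time series, as in the jittering and scaling augmentations mentioned earlier. A minimal sketch of those two, with noise magnitudes chosen only for illustration:

```python
import random

def jitter(signal, sigma=0.05):
    """Add independent zero-mean Gaussian noise to each sample
    (label-preserving; sigma=0.05 is an illustrative default)."""
    return [x + random.gauss(0.0, sigma) for x in signal]

def scale(signal, sigma=0.1):
    """Multiply the whole window by a single random factor near 1,
    simulating a change in sensor gain or movement intensity."""
    factor = random.gauss(1.0, sigma)
    return [x * factor for x in signal]
```

Because both transformations preserve the activity label, each recorded window can yield several synthetic training windows, which is one way to compensate for the scarcity of labeled data from older adults.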
Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learning algorithm to the task at hand. In this work, we consider the automatic tuning problem within the framework of Bayesian optimization, in which a learning algorithm's generalization performance is modeled as a sample from a Gaussian process (GP). The tractable posterior distribution induced by the GP leads to efficient use of the information gathered by previous experiments, enabling optimal choices about what parameters to try next. Here we show how the effects of the Gaussian process prior and the associated inference procedure can have a large impact on the success or failure of Bayesian optimization. We show that thoughtful choices can lead to results that exceed expert-level performance in tuning machine learning algorithms. We also describe new algorithms that take into account the variable cost (duration) of learning experiments and that can leverage the presence of multiple cores for parallel experimentation. We show that these proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization on a diverse set of contemporary algorithms including latent Dirichlet allocation, structured SVMs and convolutional neural networks.
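The core of GP-based Bayesian optimization is the posterior mean and variance that the surrogate assigns to an untried hyper-parameter setting. A minimal pure-Python sketch of that posterior, assuming an RBF kernel with an illustrative length scale (a real implementation would use a GP library and learn the kernel hyper-parameters):

```python
import math

def rbf(x, y, ls=0.3):
    """Squared-exponential kernel; ls=0.3 is an illustrative length scale."""
    return math.exp(-(x - y) ** 2 / (2 * ls ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(xs, ys, x_star, noise=1e-6):
    """Posterior mean and variance at x_star given observations (xs, ys):
    mean = k*^T K^-1 y,  var = k(x*,x*) - k*^T K^-1 k*."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_star) for a in xs]
    alpha = solve(K, ys)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)
```

An acquisition function such as expected improvement would then be maximized over these posterior means and variances to choose the next hyper-parameter setting to evaluate.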
Conference Paper
This paper proposes a new ensemble classifier based on Dynamic Time Warping (DTW), and demonstrates how it can be used to combine information from multiple time-series sensors, to relate them to the activities of the person wearing them. The training data for the system comprises a set of short time samples for each sensor and each activity, which are used as templates for DTW, and time series for each sensor are classified by assessing their similarity to these templates. To arrive at a final classification, results from separate classifiers are combined using a voting ensemble. The approach is evaluated on data relating to six different activities of daily living (ADLs) from the MIT Placelab dataset, using hip, thigh and wrist sensors. It is found that the overall average accuracy in recognising all six activities ranges from 45.5% to 57.2% when using individual sensors, but this increases to 84.3% when all three sensors are used together in the ensemble. The results compare well with other published results in which different classification algorithms were used, indicating that the ensemble DTW classification approach is a promising one.
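The per-sensor similarity computation underlying such an ensemble can be sketched with the classic dynamic-programming form of DTW; using absolute difference as the local cost is an assumption for illustration:

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between
    two 1-D sequences, with absolute difference as the local cost."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

In a template-based classifier of the kind described, each window would be assigned the label of its nearest template under this distance, with per-sensor decisions combined by majority vote.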
Personalizing EEG-based affective models with transfer learning
  • W.-L. Zheng
  • B.-L. Lu
W.-L. Zheng and B.-L. Lu, "Personalizing EEG-based affective models with transfer learning," in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016.
Meeting challenges of activity recognition for ageing population in real life settings
  • A Papagiannaki
  • E I Zacharaki
  • K Deltouzos
  • R Orselli
  • A Freminet
  • S Cela
  • E Aristodemou
  • M Polycarpou
  • M Kotsani
  • A Benetos
A. Papagiannaki, E. I. Zacharaki, K. Deltouzos, R. Orselli, A. Freminet, S. Cela, E. Aristodemou, M. Polycarpou, M. Kotsani, A. Benetos and others, "Meeting challenges of activity recognition for ageing population in real life settings," in 2018 IEEE 20th International Conference on e-Health Networking, Applications and Services (Healthcom), 2018.