Article

Deep learning analysis based on multi-sensor fusion data for hemiplegia rehabilitation training system for stroke patients


Abstract

By recognizing the motion of the healthy side, a lower-limb exoskeleton robot can provide therapy to the affected side of stroke patients. To improve the accuracy of motion intention recognition based on sensor data, research based on deep learning was carried out. Eighty healthy subjects, simulating stroke patients, performed gait experiments in five different gait environments (flat ground, 10° upslope and downslope, and upstairs and downstairs). To facilitate the training and classification of the neural network, this paper presents template processing schemes that adapt to different data formats. A novel hybrid network model based on a convolutional neural network (CNN) and long short-term memory (LSTM) is constructed. To mitigate the data-sparsity problem, a spatial-temporal-embedded LSTM model (SQLSTM), which combines spatial-temporal influence with the LSTM model, is proposed. The proposed CNN-SQLSTM model is evaluated on a real trajectory dataset, and the results demonstrate its effectiveness. The proposed method will be used to guide the control strategy design of a robot system for active rehabilitation training.
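To make the hybrid architecture concrete, here is a minimal PyTorch sketch of a CNN-LSTM model of the kind described: 1D convolutions extract per-timestep spatial features from multi-sensor windows, and an LSTM models their temporal dynamics. The channel count, window length and layer sizes are illustrative assumptions, and the spatial-temporal embedding that distinguishes the authors' SQLSTM is not reproduced here.

```python
# Minimal CNN-LSTM sketch for window-based gait classification.
# Shapes and hyperparameters are illustrative, not the authors' values.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, n_channels=9, n_classes=5, hidden=64):
        super().__init__()
        # 1D convolutions extract per-timestep spatial features
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # LSTM models the temporal dynamics of the CNN features
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):          # x: (batch, channels, time)
        f = self.cnn(x)            # (batch, 64, time)
        f = f.transpose(1, 2)      # (batch, time, 64)
        _, (h, _) = self.lstm(f)   # h: (1, batch, hidden)
        return self.fc(h[-1])      # logits for the five gait environments

logits = CNNLSTM()(torch.randn(8, 9, 128))  # 8 windows, 9 sensor channels, 128 samples
```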


... Meanwhile, Qi et al. [218] proposed a multi-layer RNN composed of an LSTM module and a dropout layer (LSTM-RNN) for the remote operation of surgical robots. Moreover, the field of medical rehabilitation has also seen a variety of research, including rehabilitation robots [219] and intelligent prostheses [220]. ...
... There are several deep learning models that can be used in rehabilitation, depending on the specific task and type of data being analyzed [58]. ...
Article
Full-text available
Background: As the field of sensor-based rehabilitation continues to expand, it is important to gain a comprehensive understanding of its current research landscape. This study aimed to conduct a bibliometric analysis to identify the most influential authors, institutions, journals, and research areas in this field. Methods: A search of the Web of Science Core Collection was performed using keywords related to sensor-based rehabilitation in neurological diseases. The search results were analyzed with CiteSpace software using bibliometric techniques, including co-authorship analysis, citation analysis, and keyword co-occurrence analysis. Results: Between 2002 and 2022, 1103 papers were published on the topic, with slow growth from 2002 to 2017, followed by a rapid increase from 2018 to 2022. The United States was the most active country, while the Swiss Federal Institute of Technology had the highest number of publications among institutions. Sensors published the most papers. The top keywords included rehabilitation, stroke, and recovery. The clusters of keywords comprised machine learning, specific neurological conditions, and sensor-based rehabilitation technologies. Conclusions: This study provides a comprehensive overview of the current state of sensor-based rehabilitation research in neurological diseases, highlighting the most influential authors, journals, and research themes. The findings can help researchers and practitioners to identify emerging trends and opportunities for collaboration and can inform the development of future research directions in this field.
... Human activity recognition (HAR) can be used to monitor a user's behaviours, analyse them, and consequently assist the user in his/her daily life or provide activity histories to specialists for evaluation. The applications of HAR include health monitoring [1,2], rehabilitation [3], fitness [4], home automation [5], and safety [6]. ...
Article
Full-text available
Health monitoring, rehabilitation, and fitness are just a few domains where human activity recognition can be applied. In this study, a deep learning approach has been proposed to recognise ambulation and fitness activities from data collected by five participants using smart insoles. Smart insoles, consisting of pressure and inertial sensors, allowed for seamless data collection while minimising user discomfort, laying the baseline for the development of a monitoring and/or rehabilitation system for everyday life. The key objective has been to enhance the deep learning model performance through several techniques, including data segmentation with overlapping technique (2 s with 50% overlap), signal down-sampling by averaging contiguous samples, and a cost-sensitive re-weighting strategy for the loss function for handling the imbalanced dataset. The proposed solution achieved an Accuracy and F1-Score of 98.56% and 98.57%, respectively. The Sitting activities obtained the highest degree of recognition, closely followed by the Spinning Bike class, but fitness activities were recognised at a higher rate than ambulation activities. A comparative analysis was carried out both to determine the impact that pre-processing had on the proposed core architecture and to compare the proposed solution with existing state-of-the-art solutions. The results, in addition to demonstrating how deep learning solutions outperformed those of shallow machine learning, showed that in our solution the use of data pre-processing increased performance by about 2%, optimising the handling of the imbalanced dataset and allowing a relatively simple network to outperform more complex networks, reducing the computational impact required for such applications.
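As a rough illustration of the two pre-processing steps the abstract credits with much of the gain, the sketch below segments a multi-channel signal into overlapping windows and derives inverse-frequency class weights for a cost-sensitive loss. The 2 s window with 50% overlap follows the abstract; the 50 Hz rate and channel count are assumptions for the example.

```python
# Overlapping segmentation and inverse-frequency class re-weighting.
import numpy as np

def sliding_windows(signal, win_len, overlap=0.5):
    """signal: (time, channels). Returns (n_windows, win_len, channels)."""
    step = int(win_len * (1 - overlap))
    return np.stack([signal[s:s + win_len]
                     for s in range(0, len(signal) - win_len + 1, step)])

def class_weights(labels, n_classes):
    """Loss weights inversely proportional to class frequency."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return counts.sum() / (n_classes * np.maximum(counts, 1.0))

windows = sliding_windows(np.random.randn(3000, 16), win_len=100)  # 2 s at 50 Hz
```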
... A number of systems are targeted for stroke patient rehabilitation. Zhang and Zhang [40] developed a hybrid network model for guiding the control strategy design of a robotic system for active rehabilitation training of stroke patients. The hybrid network combines a CNN and a spatial-temporal-embedded long-short-term memory (SQLSTM) model. ...
Article
Full-text available
Hemiplegia is a condition caused by brain injury that affects a significant percentage of the population. Patients suffering from this condition experience varying degrees of weakness, spasticity, and motor impairment on the left or right side of the body. This paper proposes an automatic feature selection and construction method based on grammatical evolution (GE) for radial basis function (RBF) networks that can classify the hemiplegia type between patients and healthy individuals. The proposed algorithm is tested on a dataset containing entries from the accelerometer sensors of the RehaGait mobile gait analysis system, which are placed on various parts of the patients' bodies. The collected data were split into 2-second windows and underwent a manual pre-processing and feature extraction stage. The extracted data are then presented as input to the proposed GE-based method to create new, more efficient features, which are in turn introduced as input to an RBF network. The paper's experimental part involved testing the proposed method against four classification methods: an RBF network, a multi-layer perceptron (MLP) trained with the Broyden–Fletcher–Goldfarb–Shanno (BFGS) training algorithm, a support vector machine (SVM), and a GE-based parallel tool for data classification (GenClass). The test results revealed that the proposed solution had the highest classification accuracy (90.07%) compared to the other four methods.
Article
Autonomous underwater vehicles (AUVs) suffer from feature loss when recognizing small targets underwater. Current algorithms usually address the problem with multi-scale feature extraction, but this method increases the computational effort of the algorithm. In addition, low underwater light and turbid water result in incomplete information on target features. This paper proposes an enhanced dilated convolution framework (EHDC) for underwater blurred target recognition. First, the framework extracts small-target features through hybrid dilated convolution networks, enlarging the receptive field of the algorithm without increasing its computational cost. Second, the proposed algorithm learns spatial semantic features through an adaptive correlation matrix and compensates for the missing features of the target. Finally, spatial semantic features and visual features are fused for the recognition of small underwater blurred targets. Experiments show that the proposed method improves recognition accuracy by 1.04% compared to existing methods when recognizing small underwater blurred targets.
Article
Full-text available
Electromyography (EMG) is one of the most extensively utilised biological signals for predicting human motor intention, which is an essential element in human-robot collaboration platforms. Studies on motion intention prediction from EMG signals have often concentrated on either classification or regression models of muscle activity. In this study, we leverage the information from the EMG signals to detect the subject's intentions in generating motion commands for a robot-assisted upper limb rehabilitation platform. The EMG signals are recorded from ten healthy subjects' biceps muscle, and the movements of the upper limb evaluated are voluntary elbow flexion and extension along the sagittal plane. The signals are filtered through a fifth-order Butterworth filter. A number of features were extracted from the filtered signals, namely waveform length (WL), mean absolute value (MAV), root mean square (RMS), standard deviation (SD), minimum (MIN) and maximum (MAX). Several different classifiers, viz. Linear Discriminant Analysis (LDA), Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN), were investigated for their efficacy in accurately classifying the pre-intention and intention classes based on the significant features identified (MIN and MAX) via the Extremely Randomised Tree feature selection technique. It was observed from the present investigation that the DT classifier yielded an excellent classification, with accuracies of 100%, 99% and 99% on the training, testing and validation datasets, respectively, based on the identified features. The findings of the present investigation are significant for facilitating the rehabilitation phase of patients based on their actual capability and, hence, would eventually yield more active participation from them.
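The six time-domain features listed are standard and easy to state precisely; a small sketch of their usual definitions, applied to one filtered EMG window:

```python
# Standard definitions of the six time-domain features, per EMG window.
import numpy as np

def emg_features(x):
    """x: 1D filtered EMG window."""
    return {
        "WL":  np.sum(np.abs(np.diff(x))),  # waveform length
        "MAV": np.mean(np.abs(x)),          # mean absolute value
        "RMS": np.sqrt(np.mean(x ** 2)),    # root mean square
        "SD":  np.std(x),                   # standard deviation
        "MIN": np.min(x),                   # minimum
        "MAX": np.max(x),                   # maximum
    }

feats = emg_features(np.random.randn(200))
```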
Article
Full-text available
In this paper, we introduce a new mode of mechanomyography (MMG) signal capture for enhancing the performance of human-machine interfaces (HMIs) through modulation of normal pressure at the sensor location. Utilizing this novel approach, increased MMG signal resolution is enabled by a tunable degree of freedom normal to the sensor-skin contact area. We detail the mechatronic design, experimental validation, and user study of an armband with embedded acoustic sensors demonstrating this capacity. The design is motivated by the nonlinear viscoelasticity of the tissue, which increases with the normal surface pressure. This, in theory, results in higher conductivity of mechanical waves and hypothetically allows interfacing with deeper muscles, thus enhancing the discriminative information content of the signal space. Ten subjects (seven able-bodied and three trans-radial amputees) participated in a study consisting of the classification of hand gestures through MMG while increasing levels of contact force were administered. Four MMG channels were positioned around the forearm and placed over the flexor carpi radialis, brachioradialis, extensor digitorum communis, and flexor carpi ulnaris muscles. A total of 852 spectrotemporal features were extracted (213 features per channel) and passed through a Neighborhood Component Analysis (NCA) technique to select the most informative neurophysiological subspace of the features for classification. A linear support vector machine (SVM) then classified the intended motion of the user. The results indicate that increasing the normal force level between the MMG sensor and the skin can improve the discriminative power of the classifier, and the corresponding pattern can be user-specific. These results have significant implications for embedding MMG sensors in sockets for prosthetic limb control and HMI.
Article
Full-text available
Recently, computer vision and deep learning technology have been applied in various gait rehabilitation studies. Since the long short-term memory (LSTM) network has proven excellent at learning sequence feature representations, we propose an LSTM-based lower-limb joint trajectory prediction method for conducting active rehabilitation on a rehabilitation robotic system. Based on synergy theory, our approach exploits the fact that the upcoming lower-limb joint trajectory, i.e. limb intention, can be generated from the joint angles of the preceding upper-limb swing, acquired from Kinect, an advanced computer vision platform for motion tracking. A customized Kinect-treadmill data acquisition platform was built for this study. With this platform, data were acquired from ten healthy subjects at four different walking speeds to obtain the upper- and lower-limb swing joint angles computed from Kinect visual signals. The hip and knee angles on one side, representing the lower-limb intention, are then predicted from the preceding elbow and shoulder angles on the opposite side via a trained LSTM model. The results indicate that the trained LSTM model estimates the lower-limb intentions well, and the feasibility of using Kinect visual signals is validated as well.
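A minimal sketch of the mapping this describes, posed as sequence regression: an LSTM receives the opposite-side elbow and shoulder angle sequence and outputs hip and knee angles. The hidden size and sequence length are assumptions, not the paper's configuration.

```python
# Sequence regression sketch: upper-limb angles in, lower-limb angles out.
import torch
import torch.nn as nn

class LimbIntentLSTM(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)      # hip and knee angles

    def forward(self, upper):                 # upper: (batch, time, 2) elbow/shoulder
        out, _ = self.lstm(upper)
        return self.head(out)                 # (batch, time, 2) predicted angles

pred = LimbIntentLSTM()(torch.randn(4, 60, 2))   # 4 gait segments, 60 frames each
```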
Conference Paper
Full-text available
In assistive technologies designed for patients with extremely limited motor or communication capabilities, it is of significant importance to accurately predict the intention of the user in a timely manner. This paper presents a new framework for the early prediction of the user's intent via their eye gaze. The objects seen in the displayed images, and the order of their selection, are identified from the spatial and temporal information of the gaze. By employing a combination of a convolutional neural network (CNN) and long short-term memory (LSTM), early prediction of the user's intention is enabled. The proposed framework is tested using experimental data obtained from eight subjects. Results demonstrate an average accuracy of 82.27% across all considered intended tasks for early prediction, confirming the effectiveness of the proposed method.
Conference Paper
Full-text available
An interactive tangible interface has been developed to capture and communicate emotions between people who are missing and longing for loved ones. EmoEcho measures the wearer's pulse, touch and movement to provide varying vibration patterns on the partner device. During an informal evaluation of two prototype devices, users acknowledged how EmoEcho could help counter the negative feeling of missing someone and liked the range of feedback offered. In general, we believe tangible interfaces offer a non-obtrusive means of interpreting and communicating emotions to others.
Conference Paper
Full-text available
When it comes to attention and notification management, most previous attempts to visualise notifications and smartphone usage have focused on digital representations on a screen that is not fully embedded in the user's environment. Today, constant developments in hardware and embedded systems, including mini displays, LEDs, actuation and digital fabrication, have begun to provide new opportunities for representing data physically in the surrounding environment. In this paper, we introduce a new way of visualising notification data using physical representations that are deeply integrated with the physical space and everyday objects. Based on our preliminary design and prototype, we identify a variety of design challenges for embedded data representation, and suggest opportunities for future research.
Article
Full-text available
Urban spaces have a great impact on people's emotions and behaviour. A number of factors affect our brain responses to a space. This paper presents a novel urban place recommendation approach based on modelling in-situ EEG data. The research leverages newly affordable electroencephalogram (EEG) headsets, which have the capability to sense mental states such as meditation and attention levels. These emerging devices have been utilized in understanding how human brains are affected by the surrounding built environments and natural spaces. In this paper, mobile EEG headsets have been used to detect mental states at different types of urban places. By analysing and modelling brain activity data, we were able to classify three different places according to the mental state signatures of the users, and create an association map to guide and recommend people to therapeutic places that lessen brain fatigue and increase mental rejuvenation. Our mental state classifier achieved an accuracy of 90.8%. NeuroPlace breaks new ground not only as a mobile ubiquitous brain monitoring system for urban computing, but also as a system that can advise urban planners on the impact of specific urban planning policies and structures. We present and discuss the challenges in making our initial prototype more practical, robust, and reliable as part of our on-going research. In addition, we present some enabling applications using the proposed architecture.
Article
Full-text available
Over the past few years, there has been a noticeable advancement in environmental models and information fusion systems taking advantage of the recent developments in sensor and mobile technologies. However, little attention has been paid so far to quantifying the relationship between environment changes and their impact on our bodies in real-life settings. In this paper, we identify a data-driven approach based on direct and continuous sensor data to assess the impact of the surrounding environment on physiological changes and emotion. We aim at investigating the potential of fusing on-body physiological signals, environmental sensory data and on-line self-report emotion measures in order to achieve the following objectives: 1) model the short-term impact of the ambient environment on the human body, 2) predict emotions based on on-body sensors and environmental data. To achieve this, we have conducted a real-world study 'in the wild' with on-body and mobile sensors. Data was collected from participants walking around Nottingham city centre, in order to develop analytical and predictive models. Multiple regression, after allowing for possible confounders, showed a noticeable correlation between noise exposure and heart rate. Similarly, UV and environmental noise have been shown to have a noticeable effect on changes in ElectroDermal Activity (EDA). Air pressure demonstrated the greatest contribution towards the detected changes in body temperature and motion. Also, significant correlation was found between air pressure and heart rate. Finally, decision fusion of the classification results from different modalities is performed. To the best of our knowledge, this work presents the first attempt at fusing and modelling data from environmental and physiological sources collected from sensors in a real-world setting.
Conference Paper
Full-text available
In this paper we address the problem of human action recognition from video sequences. Inspired by the exemplary results obtained via automatic feature learning and deep learning approaches in computer vision, we focus our attention on learning salient spatial features via a convolutional neural network (CNN) and then mapping their temporal relationship with the aid of Long Short-Term Memory (LSTM) networks. Our contribution in this paper is a deep fusion framework that more effectively exploits spatial features from CNNs together with temporal features from LSTM models. We also extensively evaluate their strengths and weaknesses. We find that by combining both sets of features, the fully connected features effectively act as an attention mechanism that directs the LSTM to interesting parts of the convolutional feature sequence. The significance of our fusion method is its simplicity and effectiveness compared to other state-of-the-art methods. The evaluation results demonstrate that this hierarchical multi-stream fusion method outperforms single-stream mapping methods, achieving high accuracy and out-performing current state-of-the-art methods on three widely used databases: UCF11, UCFSports, jHMDB.
Article
Full-text available
Despite the successes in capturing continuous distributions, the application of generative adversarial networks (GANs) to discrete settings, like natural language tasks, is rather restricted. The fundamental reason is the difficulty of back-propagation through discrete random variables combined with the inherent instability of the GAN training objective. To address these problems, we propose Maximum-Likelihood Augmented Discrete Generative Adversarial Networks. Instead of directly optimizing the GAN objective, we derive a novel and low-variance objective using the discriminator's output that corresponds to the log-likelihood. Compared with the original, the new objective is proved to be consistent in theory and beneficial in practice. The experimental results on various discrete datasets demonstrate the effectiveness of the proposed approach.
Article
Full-text available
We introduce a novel approach to training generative adversarial networks, where we train a generator to match a target distribution that converges to the data distribution at the limit of a perfect discriminator. This objective can be interpreted as training a generator to produce samples that lie on the decision boundary of a current discriminator in training at each update, and we call a GAN trained using this algorithm a boundary-seeking GAN (BS-GAN). This approach can be used to train a generator with discrete output when the generator outputs a parametric conditional distribution. We demonstrate the effectiveness of the proposed algorithm with discrete image data. In contrast to the proposed algorithm, we observe that the recently proposed Gumbel-Softmax technique for re-parametrizing the discrete variables does not work for training a GAN with discrete data. Finally, we notice that the proposed boundary-seeking algorithm works even with continuous variables, and demonstrate its effectiveness with two widely used image data sets, SVHN and CelebA.
Chapter
Full-text available
Biosignals have become an important indicator not only for medical diagnosis and subsequent therapy, but also for passive health monitoring. Extracting meaningful features from biosignals can help people understand the human functional state, so that upcoming harmful symptoms or diseases can be alleviated or avoided. There are two main approaches commonly used to derive useful features from biosignals: hand-engineering and deep learning. The majority of the research in this field focuses on hand-engineered features, which require domain-specific experts to design algorithms to extract meaningful features. In recent years, several studies have employed deep learning to automatically learn features from raw biosignals, making feature extraction algorithms less dependent on humans. These studies have also demonstrated promising results in a variety of biosignal applications. In this survey, we review different types of biosignals and the main approaches to extracting features from the signal in the context of biomedical applications. We also discuss challenges and limitations of the existing approaches, and possible future research.
Article
Full-text available
As a new way of training generative models, Generative Adversarial Nets (GAN) that uses a discriminative model to guide the training of the generative model has enjoyed considerable success in generating real-valued data. However, it has limitations when the goal is for generating sequences of discrete tokens. A major reason lies in that the discrete outputs from the generative model make it difficult to pass the gradient update from the discriminative model to the generative model. Also, the discriminative model can only assess a complete sequence, while for a partially generated sequence, it is non-trivial to balance its current score and the future one once the entire sequence has been generated. In this paper, we propose a sequence generation framework, called SeqGAN, to solve the problems. Modeling the data generator as a stochastic policy in reinforcement learning (RL), SeqGAN bypasses the generator differentiation problem by directly performing gradient policy update. The RL reward signal comes from the GAN discriminator judged on a complete sequence, and is passed back to the intermediate state-action steps using Monte Carlo search. Extensive experiments on synthetic data and real-world tasks demonstrate significant improvements over strong baselines.
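At its core, the SeqGAN generator update is a REINFORCE-style policy gradient. In generic policy-gradient notation (the symbols here follow the usual convention and are not necessarily the paper's exact formulation), the generator $G_\theta$ is updated by

$$\nabla_\theta J(\theta) \simeq \sum_{t=1}^{T}\mathbb{E}_{Y_{1:t-1}\sim G_\theta}\Big[\sum_{y_t}\nabla_\theta G_\theta(y_t \mid Y_{1:t-1})\; Q^{G_\theta}_{D_\phi}(Y_{1:t-1}, y_t)\Big],$$

where the action value $Q^{G_\theta}_{D_\phi}$ of a partial sequence is estimated by completing it with Monte Carlo rollouts and scoring the completed sequences with the discriminator $D_\phi$.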
Article
Full-text available
Human activity recognition (HAR) in ubiquitous computing is beginning to adopt deep learning to substitute for well-established analysis techniques that rely on hand-crafted feature extraction and classification techniques. From these isolated applications of custom deep architectures it is, however, difficult to gain an overview of their suitability for problems ranging from the recognition of manipulative gestures to the segmentation and identification of physical activities like running or ascending stairs. In this paper we rigorously explore deep, convolutional, and recurrent approaches across three representative datasets that contain movement data captured with wearable sensors. We describe how to train recurrent approaches in this setting, introduce a novel regularisation approach, and illustrate how they outperform the state-of-the-art on a large benchmark dataset. Across thousands of recognition experiments with randomly sampled model configurations we investigate the suitability of each model for different tasks in HAR, explore the impact of hyperparameters using the fANOVA framework, and provide guidelines for the practitioner who wants to apply deep learning in their problem setting.
Article
Full-text available
Human activity recognition (HAR) tasks have traditionally been solved using engineered features obtained by heuristic processes. Current research suggests that deep convolutional neural networks are suited to automate feature extraction from raw sensor inputs. However, human activities are made of complex sequences of motor movements, and capturing this temporal dynamics is fundamental for successful HAR. Based on the recent success of recurrent neural networks for time series domains, we propose a generic deep framework for activity recognition based on convolutional and LSTM recurrent units, which: (i) is suitable for multimodal wearable sensors; (ii) can perform sensor fusion naturally; (iii) does not require expert knowledge in designing features; and (iv) explicitly models the temporal dynamics of feature activations. We evaluate our framework on two datasets, one of which has been used in a public activity recognition challenge. Our results show that our framework outperforms competing deep non-recurrent networks on the challenge dataset by 4% on average; outperforming some of the previous reported results by up to 9%. Our results show that the framework can be applied to homogeneous sensor modalities, but can also fuse multimodal sensors to improve performance. We characterise key architectural hyperparameters’ influence on performance to provide insights about their optimisation.
Conference Paper
Full-text available
Despite the widespread installation of accelerometers in almost all mobile phones and wearable devices, activity recognition using accelerometers is still immature due to the poor recognition accuracy of existing recognition methods and the scarcity of labeled training data. We consider the problem of human activity recognition using triaxial accelerometers and deep learning paradigms. This paper shows that deep activity recognition models (a) provide better recognition accuracy of human activities, (b) avoid the expensive design of handcrafted features in existing systems, and (c) utilize the massive unlabeled acceleration samples for unsupervised feature extraction. Moreover, a hybrid approach of deep learning and hidden Markov models (DL-HMM) is presented for sequential activity recognition. This hybrid approach integrates the hierarchical representations of deep activity recognition models with the stochastic modeling of temporal sequences in the hidden Markov models. We show substantial recognition improvement on real world datasets over state-of-the-art methods of human activity recognition using triaxial accelerometers.
Conference Paper
Full-text available
Visual features are of vital importance for human action understanding in videos. This paper presents a new video representation, called trajectory-pooled deep-convolutional descriptor (TDD), which shares the merits of both hand-crafted features and deep-learned features. Specifically, we utilize deep architectures to learn discriminative convolutional feature maps, and conduct trajectory-constrained pooling to aggregate these convolutional features into effective descriptors. To enhance the robustness of TDDs, we design two normalization methods to transform convolutional feature maps, namely spatiotemporal normalization and channel normalization. The advantages of our features come from (i) TDDs are automatically learned and contain high discriminative capacity compared with those hand-crafted features; (ii) TDDs take account of the intrinsic characteristics of temporal dimension and introduce the strategies of trajectory-constrained sampling and pooling for aggregating deep-learned features. We conduct experiments on two challenging datasets: HMDB51 and UCF101. Experimental results show that TDDs outperform previous hand-crafted features and deep-learned features. Our method also achieves superior performance to the state of the art on these datasets (HMDB51 65.9%, UCF101 91.5%).
Conference Paper
Full-text available
Analysis of human movement is an important research area, especially for health applications. In order to assess the quality of life of people with mobility problems like Parkinson's disease (PD) or stroke patients, it is crucial to monitor their daily life activities. The main goal of this work is to characterize basic activities and their transitions using a single sensor located at the waist. This paper presents a novel postural detection algorithm which is able to detect and identify six different postural transitions: sit-to-stand, stand-to-sit, bending up/down, lying-to-sit and sit-to-lying, with a sensitivity of 86.5% and specificity of 95%. The algorithm has been tested on 31 healthy volunteers and 8 PD patients, who performed a total of 545 and 176 transitions respectively. The proposed algorithm is suitable to be implemented in real-time systems for on-line monitoring applications.
Conference Paper
Full-text available
This work proposes a system for rating shops and for monitoring the cell phone-based emotion responses of customers in a shopping mall environment. To measure customer satisfaction in a shopping environment, a mobile, non-intrusive and comfortable wearable biosensor is used to measure the Electrodermal Activity (EDA) of the shopper. The users' proximity to the store is detected using NFC tags that report to the custom application on the mobile phone. The custom emotion recognition software analyses these streams of data in real-time and associates emotion levels with each event. The aim of this project is to demonstrate the possibility of using pervasive affective computing to explicate consumer behavior towards the stores in shopping malls. By enhancing services and improving advertising campaigns, retailers can trigger positive emotional states, which ultimately contribute to a positive and memorable shopping experience.
Article
Full-text available
The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation.
Article
Full-text available
One of the significant challenges in the upper-limb-prosthetics research field is to identify appropriate interfaces that utilize the full potential of current state-of-the-art neuroprostheses. As the new generation of such prostheses paces towards approximating the human physiological performance in terms of movement dexterity and sensory feedback, it is clear that current non-invasive interfaces are still severely limited. Surface electromyography, the interface ubiquitously used in the field, is riddled with several shortcomings. Gesture recognition, an interface pervasively used in wearables and mobile devices, shows strong potential as a non-invasive upper-limb prosthetic interface. This study aims at showcasing its potential in the field by using gyroscope sensors. To this end, we (1) explore the viability of Dynamic Time Warping as a classification method for upper-limb prosthetics and (2) look for appropriate sensor locations on the body. Results indicate an optimal classification rate of 97.53% (σ = 8.74) using a sensor located proximal to the endpoint performing a gesture.
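Since Dynamic Time Warping is the core of the classifier evaluated here, a compact reference implementation may help. This is the textbook O(nm) dynamic program, not the authors' code; a 1-NN rule over these distances then assigns the gesture label.

```python
# Textbook DTW distance between two multi-dimensional gesture sequences.
import numpy as np

def dtw_distance(a, b):
    """a, b: (time, dims) arrays, e.g. gyroscope sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local distance
            D[i, j] = cost + min(D[i - 1, j],            # insertion
                                 D[i, j - 1],            # deletion
                                 D[i - 1, j - 1])        # match
    return D[n, m]

d = dtw_distance(np.random.randn(50, 3), np.random.randn(60, 3))
```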
Article
Full-text available
The real-time monitoring of human movement can provide valuable information regarding an individual's degree of functional ability and general level of activity. This paper presents the implementation of a real-time classification system for the types of human movement associated with the data acquired from a single, waist-mounted triaxial accelerometer unit. The major advance proposed by the system is to perform the vast majority of signal processing onboard the wearable unit using embedded intelligence. In this way, the system distinguishes between periods of activity and rest, recognizes the postural orientation of the wearer, detects events such as walking and falls, and provides an estimation of metabolic energy expenditure. A laboratory-based trial involving six subjects was undertaken, with results indicating an overall accuracy of 90.8% across a series of 12 tasks (283 tests) involving a variety of movements related to normal daily activities. Distinction between activity and rest was performed without error; recognition of postural orientation was carried out with 94.1% accuracy, classification of walking was achieved with less certainty (83.3% accuracy), and detection of possible falls was made with 95.6% accuracy. Results demonstrate the feasibility of implementing an accelerometry-based, real-time movement classifier using embedded intelligence.
Article
Trajectory prediction is one of the key capabilities for robots to safely navigate and interact with pedestrians. Critical insights from human intention and behavioral patterns need to be integrated to effectively forecast long-term pedestrian behavior. Thus, we propose a framework incorporating a Mutable Intention Filter and a Warp LSTM (MIF-WLSTM) to simultaneously estimate human intention and perform trajectory prediction. The Mutable Intention Filter is inspired by particle filtering and genetic algorithms, where particles represent intention hypotheses that can be mutated throughout the pedestrian's motion. Instead of predicting sequential displacement over time, our Warp LSTM learns to generate offsets on a full trajectory predicted by a nominal intention-aware linear model, which considers the intention hypotheses during filtering process. Through experiments on a publicly available dataset, we show that our method outperforms baseline approaches and demonstrate the robust performance of our method under abnormal intention-changing scenarios.
Article
The recognition of human lower limb jump phases plays an important role in measuring the degree of rehabilitation and in the control of exoskeletons. However, one of the challenges is that recognition accuracy using sEMG signals is low. In this paper, we propose two types of long short-term memory (LSTM) models for offline and online recognition of jump sequences. The recognition accuracies of bidirectional LSTM and convolutional LSTM (ConvLSTM) on sEMG reach 97.84% and 97.44%, respectively. When the offline analysis model is used with the sEMG sequence of the jump process, misclassification occurs only between adjacent phases. From the Pearson correlation coefficients (PCCs) of sEMG and IMU signals, a complex network of muscles and kinematics is built to analyze the coupling of muscle and motion in the jumping process. Taking the sequence composed of PCC matrices with fused sensor information as the input, the ConvLSTM model can acquire spatiotemporal features, and the accuracy of the online model reaches 98.13%. In this paper, the number and length of analysis windows that influence the model performance are studied. The synthesis method of Euler angle signals facilitates the recognition of human movement intention.
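A small sketch of the input-construction step this describes: one channel-by-channel Pearson correlation matrix per analysis window, stacked into the sequence a ConvLSTM can consume. The window length and step here are illustrative assumptions, not the paper's settings.

```python
# Build a PCC-matrix sequence from fused multi-channel sensor data.
import numpy as np

def pcc_matrix_sequence(signals, win_len=100, step=50):
    """signals: (time, channels) array of fused sEMG/IMU data."""
    mats = []
    for s in range(0, signals.shape[0] - win_len + 1, step):
        window = signals[s:s + win_len]        # (win_len, channels)
        mats.append(np.corrcoef(window.T))     # (channels, channels) PCC matrix
    return np.stack(mats)                      # (n_windows, C, C) sequence

seq = pcc_matrix_sequence(np.random.randn(1000, 12))
```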
Article
Nonlinear articular geometries of biological joints have contributed to highly agile and adaptable human-body motions. However, human–machine interaction could potentially distort natural human motions if the artificial mechanisms overload the articular surfaces and constrain biological joint kinematics. It is desired to better understand the deformable articular geometries of biological joints in vivo during movements for design and control of wearable robotics. An articular geometry reconstruction method is proposed to measure the effective articular profile with a wearable compliant device and illustrated with its application to knee-joint kinematic analysis. Regarding the joint articulation as boundary constraints for the compliant mechanism, the equivalent articular geometry is constructed from the beam deformations driven by knee motions, where the continuous deformations are estimated with strain data from the embedded sensors. Both simulated analysis and experimental validation are presented to justify the proposed method.
Article
In this paper, we present a passive lower-extremity exoskeleton with a simple structure and a light weight. The exoskeleton does not require any external energy source and achieves energy transfer using only the human body's own gravity. The exoskeleton is self-adaptive to human gait to achieve basic matching therewith. During walking, pulling forces are generated through Bowden cables by pressing plantar power output devices with the feet, and the forces are transmitted to the exoskeleton through a crank-slider mechanism so that the exoskeleton provides torques for the ankle and knee joints as required by the human body during the stance phase and the swing phase. Our self-developed gait detection system is used to perform experiments on kinematics, dynamics and metabolic cost while walking with the exoskeleton in different states. The experimental results show that the exoskeleton has the greatest influence on the motion of the ankle joint and the least influence on the hip joint. With an increase in the elastic coefficient of the spring, the torques generated at the joints by the exoskeleton increase. When walking with the k3EF exoskeleton at a speed of 0.5 m/s, the metabolic cost saving is greatest, reaching 13.63%.
Article
This paper presents a lower-limb exoskeleton that is actuated by pneumatic muscle actuators (PMAs). This exoskeleton system is composed of the mechanical structures, a treadmill, and a weight support system. With the cooperative work of the three parts, the system aims to assist either the elderly for muscle strengthening by conducting walking activities or the stroke patients during a rehabilitation training program. A mechanism is developed to separate the PMAs from the wearer’s legs to reduce the subject’s physical exertion. Furthermore, considering the difficulty in the modeling of proposed PMAs-driven exoskeleton, a safe and model-free control strategy called proxy-based sliding mode control (PSMC) is used to ensure proper control of the exoskeleton. However, the favorable performances are strongly dependent on the appropriate control parameters, which may be difficult to obtain with blind tuning. Therefore, we propose a global parameters optimization algorithm called switch-mode firefly algorithm (SMFA) to automatically calculate the pre-defined object function and attain the most applicable parameters. Experimental studies are conducted, and the results show the effectiveness of the proposed method.
Article
Low visual quality has prevented underwater robotic vision from a wide range of applications. Although several algorithms have been developed, real time and adaptive methods are deficient for real-world tasks. In this paper, we address this difficulty based on generative adversarial networks (GAN), and propose a GAN-based restoration scheme (GAN-RS). In particular, we develop a multibranch discriminator including an adversarial branch and a critic branch for the purpose of simultaneously preserving image content and removing underwater noise. In addition to adversarial learning, a novel dark channel prior loss also promotes the generator to produce realistic vision. More specifically, an underwater index is investigated to describe underwater properties, and a loss function based on the underwater index is designed to train the critic branch for underwater noise suppression. Through extensive comparisons on visual quality and feature restoration, we confirm the superiority of the proposed approach. Consequently, the GAN-RS can adaptively improve underwater visual quality in real time and induce an overall superior restoration performance. Finally, a real-world experiment is conducted on the seabed for grasping marine products, and the results are quite promising. The source code is publicly available at https://github.com/SeanChenxy/GAN_RS.
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
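The two-player game this abstract describes is the standard GAN value function:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],$$

with the unique equilibrium at $G$ reproducing $p_{\text{data}}$ and $D \equiv 1/2$ everywhere.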
Article
Human activity recognition in videos with CNN features has received increasing attention in multimedia understanding. Taking videos as a sequence of frames, a new record was recently set on several benchmark datasets by feeding frame-level CNN sequence features to a Long Short-Term Memory (LSTM) model for video activity recognition. This recurrent-model-based visual recognition pipeline is a natural choice for perceptual problems with time-varying visual input or sequential outputs. However, the above pipeline takes frame-level CNN sequence features as input for the LSTM, which may fail to capture the rich motion information from adjacent frames or multiple clips. Furthermore, an activity is conducted by a subject or multiple subjects. It is important to consider attention, which allows for salient features, instead of mapping an entire frame into a static representation. To tackle these issues, we propose a novel pipeline, Saliency-aware 3D CNN with LSTM (scLSTM), for video action recognition by integrating LSTM with saliency-aware deep 3D CNN features on video shots. Specifically, we first apply saliency-aware methods to generate saliency-aware videos. Then, we design an end-to-end pipeline by integrating 3D CNN with LSTM, followed by a time series pooling layer and a Softmax layer to predict the activities. Noticeably, we set a new record on two benchmark datasets, i.e., UCF101 with 13,320 videos and HMDB-51 with 6,766 videos. Our method outperforms the state-of-the-art end-to-end methods of action recognition by 3.8% and 3.2%, respectively, on the above two datasets.
Article
Human activities are inherently translation invariant and hierarchical. Human activity recognition (HAR), a field that has garnered a lot of attention in recent years due to its high demand in various application domains, makes use of time-series sensor data to infer activities. In this paper, a deep convolutional neural network (convnet) is proposed to perform efficient and effective HAR using smartphone sensors by exploiting the inherent characteristics of activities and 1D time-series signals, at the same time providing a way to automatically and data-adaptively extract robust features from raw data. Experiments show that convnets indeed derive relevant and more complex features with every additional layer, although the difference in feature complexity decreases with every additional layer. A wider time span of temporal local correlation can be exploited (1 × 9-1 × 14), and a low pooling size (1 × 2-1 × 3) is shown to be beneficial. Convnets also achieved an almost perfect classification on moving activities, especially very similar ones which were previously perceived to be very difficult to classify. Lastly, convnets outperform other state-of-the-art data mining techniques in HAR for the benchmark dataset collected from 30 volunteer subjects, achieving an overall performance of 94.79% on the test set with raw sensor data, and 95.75% with additional information of temporal fast Fourier transform of the HAR data set.
Article
A novel multi-sensor information fusion method combined with the support vector machine (SVM) was proposed for diagnosing three types of faults (collision, front collision and obstruction) as the robot's arm approaches the grasping place. After fusing the proper number of data from multiple sensors and searching for the optimal parameters C and γ of the SVM by grid search, the proposed method can successfully diagnose the faults of obstruction, front collision and collision. In addition, the selection of the number of data features to be fused by multi-sensor information fusion is discussed. The experimental results show that the selection of the proper number of fused features from the sampled data influences the number of fusion data obtained and the accuracy of classification.
Article
In this paper we introduce a generative parametric model capable of producing high quality samples of natural images. Our approach uses a cascade of convolutional networks within a Laplacian pyramid framework to generate images in a coarse-to-fine fashion. At each level of the pyramid, a separate generative convnet model is trained using the Generative Adversarial Nets (GAN) approach (Goodfellow et al.). Samples drawn from our model are of significantly higher quality than alternate approaches. In a quantitative assessment by human evaluators, our CIFAR10 samples were mistaken for real images around 40% of the time, compared to 10% for samples drawn from a GAN baseline model. We also show samples from models trained on the higher resolution images of the LSUN scene dataset.
Article
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions. The method is straightforward to implement and is based on adaptive estimates of lower-order moments of the gradients. The method is computationally efficient, has low memory requirements and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The method exhibits invariance to diagonal rescaling of the gradients by adapting to the geometry of the objective function. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. We demonstrate that Adam works well in practice when experimentally compared to other stochastic optimization methods.
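The moment estimates referred to are exponential moving averages of the gradient and its elementwise square. Per parameter, with gradient $g_t$ at step $t$:

$$m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2,$$
$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad \theta_t = \theta_{t-1} - \alpha\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}.$$

The bias corrections $\hat{m}_t, \hat{v}_t$ counteract the zero initialization of the moving averages during early steps.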
Article
Relative foot position estimation is important for rehabilitation, sports training and functional diagnostics. In this paper an extended Kalman filter fusing ultrasound range estimates and inertial sensor data is described. With this filter, several gait parameters can be estimated in an ambulatory setting. Step lengths and stride widths from 54 walking trials of three healthy subjects were estimated and compared to an optical reference. The mean (standard deviation) of the absolute difference was 1.7 cm (1.8 cm) and 1.2 cm (1.2 cm) for step length and stride width, respectively. Walking with a turn and walking around in a square area were also investigated and resulted in mean absolute differences of 1.7 cm (2.0 cm) and 1.5 cm (1.5 cm) for step lengths and stride widths. In addition to these relative positions, velocities, orientations and stance and swing times can also be estimated. We conclude that the presented system is low-cost and provides a complete description of footstep kinematics and timing.
Article
This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
Article
This paper presents a systematic design approach for constructing neural classifiers that are capable of classifying human activities using a triaxial accelerometer. The philosophy of our design approach is to apply a divide-and-conquer strategy that separates dynamic activities from static activities preliminarily and recognizes these two different types of activities separately. Since multilayer neural networks can generate complex discriminating surfaces for recognition problems, we adopt neural networks as the classifiers for activity recognition. An effective feature subset selection approach has been developed to determine significant feature subsets and compact classifier structures with satisfactory accuracy. Experimental results have successfully validated the effectiveness of the proposed recognition scheme.
Article
In this paper, a gesture recognition system based on a single tri-axis accelerometer mounted on a cell phone is proposed. We present a novel human-computer interaction for cell phones through recognizing seventeen complex gestures. A new feature fusion method for gesture recognition based on the time domain and frequency domain is proposed. First, we extract the time-domain feature from the acceleration data, namely short-time energy. Second, we extract hybrid features that combine Wavelet Packet Decomposition with the Fast Fourier Transform. Finally, we fuse these two categories of features together and employ principal component analysis to reduce the dimension of the fused features. The classifier used is a multi-class Support Vector Machine. The average recognition rate for the seventeen complex gestures using the proposed fusion features is 89.89%, which is better than previous works. The experimental results show that gesture-based interaction can be used as a novel human-computer interaction for mobile devices and consumer electronics.
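Of the fused features, short-time energy is the simplest to pin down; a minimal sketch of the usual frame-wise definition (frame length and hop size are illustrative, not the paper's settings):

```python
# Frame-wise short-time energy of an acceleration signal.
import numpy as np

def short_time_energy(x, frame_len=32, hop=16):
    """x: 1D signal. Returns one energy value per (possibly overlapping) frame."""
    return np.array([np.sum(x[s:s + frame_len] ** 2)
                     for s in range(0, len(x) - frame_len + 1, hop)])

energies = short_time_energy(np.random.randn(256))
```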
Article
Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.
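For reference, the "one-against-one" strategy the paper finds most practical trains k(k-1)/2 pairwise binary SVMs and combines them by voting; it is also what scikit-learn's SVC uses internally. A minimal example on a standard dataset:

```python
# One-against-one multiclass SVM via scikit-learn's SVC;
# decision_function_shape="ovo" exposes the raw pairwise scores.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                       # 3-class toy problem
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf", decision_function_shape="ovo").fit(X_tr, y_tr)
print(clf.score(X_te, y_te))                            # held-out accuracy
```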
Article
The objective of this paper is to apply Support Vector Machines to the problem of classifying emotion on images of human faces. This well-defined problem is complicated by the natural variation in people's faces, requiring the classification algorithm to distinguish the small number of relevant features from the large pool of input features. Recent experimentation using neural networks has achieved over 85% classification accuracy. These experiments provide a metric for evaluation of the Support Vector Machine technique, which was shown to have performance equivalent to neural networks.
EmoEcho: A tangible interface to convey and communicate emotions
  • W Kieran
  • K Eiman
Research on gait classification based on acceleration sensor
  • X Chen
  • H Liu
  • W Huang
  • X Y Xing
  • H Y Liu