Chapter

Real Time Hand Movement Trajectory Tracking for Enhancing Dementia Screening in Ageing Deaf Signers of British Sign Language


Abstract

Real time hand movement trajectory tracking based on machine learning approaches may assist the early identification of dementia in ageing Deaf individuals who are users of British Sign Language (BSL), since there are few clinicians with appropriate communication skills and a shortage of sign language interpreters. Unlike other computer vision systems used in dementia stage assessment, such as RGB-D video with the aid of a depth camera, activities of daily living (ADL) monitored through information and communication technologies (ICT) facilities, or X-ray, computed tomography (CT), and magnetic resonance imaging (MRI) images fed to machine learning algorithms, the system developed here focuses on analysing the sign language space envelope (sign trajectories/depth/speed) and facial expression of deaf individuals, using normal 2D videos. In this work, we are interested in providing a more accurate segmentation of objects of interest in relation to the background, so that accurate real-time hand trajectories (trajectory path and speed) can be achieved. The paper presents and evaluates two types of hand movement trajectory models. In the first model, the hand sign trajectory is tracked by implementing skin colour segmentation. In the second model, the hand sign trajectory is tracked using Part Affinity Fields based on the OpenPose Skeleton Model [1, 2]. Comparison of results between the two models demonstrates that the second provides clear improvements in both tracking accuracy and robustness. The pattern differences in facial and trajectory motion data obtained from the presented models will be beneficial not only for screening of deaf individuals for dementia, but also for assessment of other acquired neurological impairments associated with motor changes, for example, stroke and Parkinson's disease.
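
Since the full text is not reproduced on this page, a minimal sketch may help make the abstract's "trajectory path and speed" concrete. It assumes per-frame wrist positions have already been extracted by either model (skin-colour segmentation or OpenPose); the function name, coordinates, and frame rate are illustrative, not the chapter's code.

import numpy as np

def trajectory_stats(points, fps=25.0):
    """points: (N, 2) per-frame (x, y) wrist coordinates in pixels."""
    pts = np.asarray(points, dtype=float)
    steps = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # displacement per frame
    path_length = steps.sum()        # total length of the sign trajectory (px)
    speed = steps * fps              # instantaneous speed (px/s)
    return path_length, speed.mean(), speed.max()

# Toy usage: three frames of a tracked wrist
length, mean_speed, peak_speed = trajectory_stats([(0, 0), (3, 4), (6, 8)])
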


... Additionally, the rates of two-hand movements and those of dominant and non-dominant hand movements when performing a simple game, such as Tangram, were used [29]. In a 2019 study, real-time hand movement trajectory [28] was used to examine cognitive functions based on hand movements for sign language. The study used the speed and trajectory of hand movements as variables for the cognitive function correlation analysis. ...
Article
Full-text available
The prevalence of dementia, a condition associated with high social costs, is rising alongside the aging population. Early diagnosis of mild cognitive impairment (MCI), a precursor to dementia, is essential for effective intervention. Recent research has focused on diagnosing cognitive function in the elderly by analyzing behavioral data, such as gait and hand movements. Compared to traditional neuropsychological assessment methods, behavioral data-based assessments offer advantages, including reduced fatigue for patients and examiners, faster testing procedures, and more objective evaluation of results. This study reviews 15 research projects from the past five years (2018–2023) that have utilized behavioral data to assess cognitive function. It examines the specific gait and hand movement variables used, the technologies implemented, and user experiences reported in these studies. As these types of assessments require new technologies or environments, we analyzed usability issues that should be considered for accurate cognitive assessment. Based on this analysis, the paper proposes future directions for research in the field of behavioral data-based cognitive function assessment.
Article
Full-text available
Accurate 3D tracking of hand and finger movements poses significant challenges in computer vision. The potential applications span across multiple domains, including human–computer interaction, virtual reality, industry, and medicine. While gesture recognition has achieved remarkable accuracy, quantifying fine movements remains a hurdle, particularly in clinical applications where the assessment of hand dysfunctions and rehabilitation training outcomes necessitates precise measurements. Several novel and lightweight frameworks based on Deep Learning have emerged to address this issue; however, their performance in accurately and reliably measuring finger movements requires validation against well-established gold standard systems. In this paper, the aim is to validate the hand-tracking framework implemented by Google MediaPipe Hand (GMH) and an innovative enhanced version, GMH-D, that exploits the depth estimation of an RGB-Depth camera to achieve more accurate tracking of 3D movements. Three dynamic exercises commonly administered by clinicians to assess hand dysfunctions, namely hand opening–closing, single finger tapping and multiple finger tapping, are considered. Results demonstrate high temporal and spectral consistency of both frameworks with the gold standard. However, the enhanced GMH-D framework exhibits superior accuracy in spatial measurements compared to the baseline GMH, for both slow and fast movements. Overall, our study contributes to the advancement of hand tracking technology and the establishment of a validation procedure as a good practice to prove the efficacy of deep-learning-based hand-tracking. Moreover, it proves that GMH-D is a reliable framework for assessing 3D hand movements in clinical applications.
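
As a rough companion to the baseline GMH framework validated above, the sketch below uses Google MediaPipe's Hands solution to extract per-frame landmarks and a thumb-to-index distance signal for the finger-tapping exercise. The input file name is hypothetical, and the GMH-D depth fusion described in the paper is not reproduced here.

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
cap = cv2.VideoCapture("tapping_exercise.mp4")  # hypothetical clip
tap_signal = []
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            lm = result.multi_hand_landmarks[0].landmark
            thumb, index = lm[4], lm[8]  # thumb tip and index fingertip
            # Normalised 3D thumb-index distance as a tapping signal
            tap_signal.append(((thumb.x - index.x) ** 2 +
                               (thumb.y - index.y) ** 2 +
                               (thumb.z - index.z) ** 2) ** 0.5)
cap.release()
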
Article
Background & Objective With populations ageing, the number of people with dementia worldwide is expected to triple to 152 million by 2050. Seventy percent of cases are due to Alzheimer's disease (AD) pathology and there is a 10-20 year 'pre-clinical' period before significant cognitive decline occurs. We urgently need cost-effective, objective biomarkers to detect AD, and other dementias, at an early stage. Risk factor modification could prevent 40% of cases, and drug trials would have greater chances of success if participants were recruited at an earlier stage. Currently, detection of dementia is largely by pen-and-paper cognitive tests, but these are time consuming and insensitive to pre-clinical phases. Specialist brain scans and body fluid biomarkers can detect the earliest stages of dementia but are too invasive or expensive for widespread use. With the advancement of technology, Artificial Intelligence (AI) shows promising results in assisting with detection of early-stage dementia. This scoping review aims to summarise the current capabilities of AI-aided digital biomarkers to aid in early detection of dementia, and also discusses potential future research directions. Methods & Materials In this scoping review, we used PubMed and IEEE Xplore to identify relevant papers. The resulting records were further filtered to retrieve articles published within five years and written in English. Duplicates were removed, titles and abstracts were screened and full texts were reviewed. Results After an initial yield of 1,463 records, 1,444 records were screened after removal of duplicates. A further 771 records were excluded after screening titles and abstracts, and 496 were excluded after full text review. The final yield was 177 studies. Records were grouped into different artificial intelligence based tests: (a) computerized cognitive tests, (b) movement tests, (c) speech, conversation, and language tests, and (d) computer-assisted interpretation of brain scans. Conclusions In general, AI techniques enhance the performance of dementia screening tests because more features can be retrieved from a single test, there are fewer errors due to subjective judgements, and AI shifts the automation of dementia screening to a higher level. Compared with traditional cognitive tests, AI-based computerized cognitive tests improve discrimination sensitivity by around 4% and specificity by around 3%. In terms of speech, conversation, and language tests, combining both acoustic and linguistic features achieves the best result, with accuracy around 94%. Deep learning techniques applied in brain scan analysis achieve around 92% accuracy. Movement tests and smart environments that capture daily-life behaviours are two potential future directions that may help discriminate dementia from normal ageing. AI-based smart environments and multi-modal tests are promising future directions to improve detection of dementia in the earliest stages.
Chapter
The ageing population trend is correlated with an increased prevalence of acquired cognitive impairments such as dementia. Although there is no cure for dementia, a timely diagnosis helps in obtaining necessary support and appropriate medication. Researchers are working urgently to develop effective technological tools that can help doctors undertake early identification of cognitive disorders. In particular, screening for dementia in ageing Deaf signers of British Sign Language (BSL) poses additional challenges, as the diagnostic process is bound up with conditions such as the quality and availability of interpreters, as well as appropriate questionnaires and cognitive tests. On the other hand, deep learning based approaches for image and video analysis and understanding are promising, particularly the adoption of Convolutional Neural Networks (CNNs), which require large amounts of training data. In this paper, however, we demonstrate novelty in the following ways: a) a multi-modal machine learning based automatic recognition toolkit for early stages of dementia among BSL users, in which features from several parts of the body contributing to the sign envelope, e.g., hand-arm movements and facial expressions, are combined; b) universality, in that it is possible to apply our technique to users of any sign language, since it is language independent; c) given the trade-off between complexity and accuracy of machine learning (ML) prediction models, as well as the limited amount of training and testing data available, we show that our approach is not over-fitted and has the potential to scale up.
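
The multi-modal idea in a) can be pictured as feature-level fusion. The toy sketch below concatenates hand-trajectory and facial-expression feature vectors and scores a deliberately low-capacity classifier under cross-validation, one common guard against the over-fitting concern in c). All data, dimensions and the choice of classifier are placeholders, not the authors' model.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
hand_feats = rng.normal(size=(40, 6))    # e.g. speed/extent statistics per clip
face_feats = rng.normal(size=(40, 4))    # e.g. expression intensities per clip
labels = rng.integers(0, 2, size=40)     # toy healthy vs. early-dementia labels

X = np.hstack([hand_feats, face_feats])  # fusion by simple concatenation
scores = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=5)
print(scores.mean())                     # chance level on random data, as expected
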
Article
Full-text available
Physical traits such as the shape of the hand and face can be used for human recognition and identification in video surveillance systems and in biometric authentication smart card systems, as well as in personal health care. However, the accuracy of such systems suffers from illumination changes, unpredictability, and variability in appearance (e.g. occluded faces or hands, cluttered backgrounds, etc.). This work evaluates different statistical and chrominance models in different environments with increasingly cluttered backgrounds where changes in lighting are common and with no occlusions applied, in order to obtain a reliable neural network reconstruction of faces and hands, without taking into account the structural and temporal kinematics of the hands. First, a statistical model is used for skin colour segmentation to roughly locate hands and faces. Then a neural network is used to reconstruct the hands and faces in 3D. For the filtering and the reconstruction we have used the growing neural gas algorithm, which can preserve the topology of an object without restarting the learning process. Experiments were conducted on our own database, on four benchmark databases (Stirling's, Alicante, Essex, and Stegmann's), and on normal 2D videos of deaf individuals freely available in the BSL SignBank dataset. Results demonstrate the validity of our system in solving problems of face and hand segmentation and reconstruction under different environmental conditions.
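
The first stage described above, statistical skin-colour segmentation, is often realised as a single Gaussian over the chrominance plane. The sketch below is one generic way to do that in YCrCb, not the paper's exact formulation; the Mahalanobis threshold and function names are assumptions.

import cv2
import numpy as np

def fit_skin_model(skin_bgr_samples):
    """skin_bgr_samples: (N, 1, 3) uint8 BGR pixels of known skin."""
    crcb = cv2.cvtColor(skin_bgr_samples,
                        cv2.COLOR_BGR2YCrCb)[:, 0, 1:3].astype(float)
    return crcb.mean(axis=0), np.linalg.inv(np.cov(crcb, rowvar=False))

def skin_mask(frame_bgr, mean, inv_cov, thresh=9.0):
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb).astype(float)
    d = ycrcb[..., 1:3] - mean                          # chrominance deviation
    m2 = np.einsum('...i,ij,...j->...', d, inv_cov, d)  # squared Mahalanobis distance
    return (m2 < thresh).astype(np.uint8) * 255         # rough hand/face mask
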
Article
Full-text available
The heterogeneity of neurodegenerative diseases is a key confound to disease understanding and treatment development, as study cohorts typically include multiple phenotypes on distinct disease trajectories. Here we introduce a machine-learning technique—Subtype and Stage Inference (SuStaIn)—able to uncover data-driven disease phenotypes with distinct temporal progression patterns, from widely available cross-sectional patient studies. Results from imaging studies in two neurodegenerative diseases reveal subgroups and their distinct trajectories of regional neurodegeneration. In genetic frontotemporal dementia, SuStaIn identifies genotypes from imaging alone, validating its ability to identify subtypes; further the technique reveals within-genotype heterogeneity. In Alzheimer’s disease, SuStaIn uncovers three subtypes, uniquely characterising their temporal complexity. SuStaIn provides fine-grained patient stratification, which substantially enhances the ability to predict conversion between diagnostic categories over standard models that ignore subtype (p = 7.18 × 10⁻⁴) or temporal stage (p = 3.96 × 10⁻⁵). SuStaIn offers new promise for enabling disease subtype discovery and precision medicine.
Article
Full-text available
Introduction Advanced machine learning methods might help to identify dementia risk from neuroimaging, but their accuracy to date is unclear. Methods We systematically reviewed the literature, 2006 to late 2016, for machine learning studies differentiating healthy aging from dementia of various types, assessing study quality, and comparing accuracy at different disease boundaries. Results Of 111 relevant studies, most assessed Alzheimer's disease versus healthy controls, using AD Neuroimaging Initiative data, support vector machines, and only T1-weighted sequences. Accuracy was highest for differentiating Alzheimer's disease from healthy controls and poor for differentiating healthy controls versus mild cognitive impairment versus Alzheimer's disease or mild cognitive impairment converters versus nonconverters. Accuracy increased using combined data types, but not by data source, sample size, or machine learning method. Discussion Machine learning does not differentiate clinically relevant disease categories yet. More diverse data sets, combinations of different types of data, and close clinical integration of machine learning would help to advance the field.
Article
Full-text available
The number of people diagnosed with dementia is expected to rise in the coming years. Given that there is currently no definite cure for dementia and the cost of care for this condition is soaring, slowing the decline and maintaining independent living are important goals for supporting people with dementia. This paper discusses a study called Technology Integrated Health Management (TIHM). TIHM is a technology assisted monitoring system that uses Internet of Things (IoT) enabled solutions for continuous monitoring of people with dementia in their own homes. We have developed machine learning algorithms to analyse the correlation between environmental data collected by IoT technologies in TIHM, in order to monitor and facilitate the physical well-being of people with dementia. The algorithms are developed with different temporal granularity to process the data for long-term and short-term analysis. We extract higher-level activity patterns, which are then used to detect any change in patients' routines. We have also developed a hierarchical information fusion approach for detecting agitation, irritability and aggression. We have conducted evaluations using sensory data collected from the homes of people with dementia. The proposed techniques are able to recognise agitation and unusual patterns with an accuracy of up to 80%.
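
TIHM's actual algorithms are not reproduced in this listing, but the coarse-granularity, long-term side of the idea can be caricatured in a few lines: compare each day's sensor-derived activity count against a rolling baseline and flag large deviations as possible routine changes. Window and threshold are assumed values, not the project's.

import numpy as np

def flag_unusual_days(daily_counts, window=14, z_thresh=2.5):
    """daily_counts: one aggregate activity count per day from home sensors."""
    counts = np.asarray(daily_counts, dtype=float)
    flags = []
    for i in range(window, len(counts)):
        base = counts[i - window:i]                      # rolling baseline
        z = (counts[i] - base.mean()) / (base.std() + 1e-9)
        flags.append(abs(z) > z_thresh)                  # possible routine change
    return flags
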
Article
Full-text available
The Praxis test is a gesture-based diagnostic test which has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer's disease. Despite being simple, this test is oftentimes skipped by clinicians. In this paper, we propose a novel framework to investigate the potential of static and dynamic upper-body gestures based on the Praxis test, and their potential in a medical framework to automatize the test procedures for computer-assisted cognitive assessment of older adults. In order to carry out gesture recognition as well as correctness assessment of the performances, we have collected a novel, challenging RGB-D gesture video dataset recorded by Kinect v2, which contains 29 specific gestures suggested by clinicians and recorded from both experts and patients performing the gesture set. Moreover, we propose a framework to learn the dynamics of upper-body gestures, considering the videos as sequences of short-term clips of gestures. Our approach first uses body part detection to extract image patches surrounding the hands and then, by means of a fine-tuned convolutional neural network (CNN) model, it learns deep hand features which are then linked to a long short-term memory network to capture the temporal dependencies between video frames. We report the results of four developed methods using different modalities. The experiments show the effectiveness of our deep learning based approach in gesture recognition and performance assessment tasks. Clinicians' satisfaction with the assessment reports indicates the framework's potential impact on diagnosis.
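
A schematic of the CNN-plus-LSTM arrangement the paper describes, with every layer size invented for illustration: a small CNN embeds each per-frame hand patch, and an LSTM links the embeddings across the clip (29 gesture classes, as in the dataset above).

import tensorflow as tf
from tensorflow.keras import layers, models

frames, h, w, n_gestures = 16, 64, 64, 29   # clip length and patch size assumed

patch_cnn = models.Sequential([             # per-frame hand-patch encoder
    layers.Input(shape=(h, w, 3)),
    layers.Conv2D(16, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    layers.GlobalAveragePooling2D(),
])

model = models.Sequential([
    layers.Input(shape=(frames, h, w, 3)),
    layers.TimeDistributed(patch_cnn),       # apply the CNN to every frame
    layers.LSTM(64),                         # temporal dependencies across frames
    layers.Dense(n_gestures, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
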
Conference Paper
Full-text available
Human activity recognition using smart home sensors is one of the bases of ubiquitous computing in smart environments and a topic undergoing intense research in the field of ambient assisted living. The increasingly large amount of sensor data calls for machine learning methods. In this paper, we introduce a deep learning model that learns to classify human activities without using any prior knowledge. For this purpose, a Long Short-Term Memory (LSTM) recurrent neural network was applied to three real-world smart home datasets. The results of these experiments show that the proposed approach outperforms existing ones in terms of accuracy and performance.
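
A minimal sketch of that kind of model, with the dataset dimensions assumed rather than taken from the paper: sequences of raw sensor-event IDs pass through an embedding and an LSTM straight to activity classes, with no hand-crafted features.

import tensorflow as tf
from tensorflow.keras import layers, models

n_sensor_events, seq_len, n_activities = 40, 50, 10   # assumed dimensions

model = models.Sequential([
    layers.Input(shape=(seq_len,)),
    layers.Embedding(n_sensor_events, 16),   # learned sensor-event representation
    layers.LSTM(32),
    layers.Dense(n_activities, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(event_sequences, activity_labels, ...) on a smart-home dataset
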
Presentation
Full-text available
This presentation discussed how Information and Communications Technology (ICT) can be used for the health and well-being of older people with dementia and, thereby, address one of the major challenges of the world’s smart cities in meeting the needs of an ageing demographic.
Article
Full-text available
Cognitive impairment due to dementia decreases functionality in Activities of Daily Living (ADL). Its assessment is useful to identify care needs and risks, and to monitor disease progression. This study investigates differences in ADL pattern-performance between dementia patients and healthy controls using unobtrusive sensors. Around 9,600 person-hours of activity data were collected from the homes of ten dementia patients and ten healthy controls using wireless, unobtrusive sensors and analysed to detect ADL. Recognised ADL were visualised using activity maps, and their heterogeneity and accuracy in discriminating patients from healthy controls were analysed. Activity maps of dementia patients reveal unorganised behaviour patterns, and heterogeneity differed significantly between the healthy and the diseased. The discriminating accuracy increases with observation duration (0.95 for 20 days). Unobtrusive sensors quantify ADL-relevant behaviour, which is useful to uncover the effect of cognitive impairment, to quantify ADL-relevant changes in the course of dementia, and to measure outcomes of anti-dementia treatments.
Article
Full-text available
We describe a novel technique to combine motion data with scene information to capture activity characteristics of older adults using a single Microsoft Kinect depth sensor. Specifically, we describe a method to learn activities of daily living (ADLs) and instrumental ADLs (IADLs) in order to study the behavior patterns of older adults to detect health changes. To learn the ADLs, we incorporate scene information to provide contextual information to build our activity model. The strength of our algorithm lies in its generalizability to model different ADLs while adding more information to the model as we instantiate ADLs from learned activity states. We validate our results in a controlled environment and compare them with another widely accepted classifier, the Hidden Markov Model (HMM), and its variations. We also test our system on depth data collected in a dynamic unstructured environment at TigerPlace, an independent living facility for older adults. An in-home activity monitoring system would benefit from our algorithm to alert healthcare providers of significant temporal changes in the ADL behavior patterns of frail older adults, indicating fall risk, cognitive impairment, and other health changes.
Conference Paper
Full-text available
The present paper proposes a computer vision system to diagnose the stage of illness in patients affected by Alzheimer's disease. In the context of Ambient Assisted Living (AAL), the system monitors people in the home environment during daily personal care activities. The aim is to evaluate the dementia stage by observing actions listed in the Direct Assessment of Functional Status (DAFS) index and detecting anomalies during the performance, in order to assign a score explaining whether the action is correct or not. In this work, brushing teeth and grooming hair with a hairbrush are analysed. The technology consists of the application of a Recurrent Neural Network with Parametric Bias (RNNPB) that is able to learn movements connected with a specific action and recognize human activities by parametric biases that work like mirror neurons. This study has been conducted using Microsoft Kinect to collect data about the actions observed and to oversee user tracking and gesture recognition. Experiments prove that the proposed computer vision system can learn and recognize complex human activities and evaluate the DAFS score.
Conference Paper
Full-text available
Ambient Assisted Living is currently one of the important research and development areas, where accessibility, usability and learning play a major role and where future interfaces are an important concern for applied engineering. The general goal of ambient assisted living solutions is to apply ambient intelligence technology to enable people with specific demands, e.g. handicapped or elderly people, to live in their preferred environment longer. Due to the high potential for emergencies, sound emergency assistance is required; for instance, assisting elderly people with comprehensive ambient assisted living solutions sets high demands on the overall system quality and consequently on software and system engineering, so user acceptance and support by various user interfaces are an absolute necessity. In this article, we present an Assisted Living Laboratory that is used to train elderly people to handle modern interfaces for Assisted Living, and to evaluate the usability and suitability of these interfaces in specific situations, e.g., emergency cases.
Conference Paper
Full-text available
This paper describes a machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates. This work is distinguished by three key contributions. The first is the introduction of a new image representation called the "integral image" which allows the features used by our detector to be computed very quickly. The second is a learning algorithm, based on AdaBoost, which selects a small number of critical visual features from a larger set and yields extremely efficient classifiers. The third contribution is a method for combining increasingly more complex classifiers in a "cascade" which allows background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object specific focus-of-attention mechanism which unlike previous approaches provides statistical guarantees that discarded regions are unlikely to contain the object of interest. In the domain of face detection the system yields detection rates comparable to the best previous systems. Used in real-time applications, the detector runs at 15 frames per second without resorting to image differencing or skin color detection.
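
The integral-image idea is compact enough to state in code: one cumulative-sum pass, after which any rectangular (Haar-like) feature sum costs four array lookups. This is a generic sketch of the technique, not the paper's implementation; OpenCV ships trained detectors of this kind (cv2.CascadeClassifier), which the Haar face detection mentioned later in this listing builds on.

import numpy as np

def integral_image(img):
    """One pass; ii[r, c] = sum of img[:r+1, :c+1]."""
    return np.cumsum(np.cumsum(img.astype(np.int64), axis=0), axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) from the integral image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total
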
Article
Full-text available
This paper addresses our proposed method to automatically segment out a person's face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, reliable, and effective algorithm that exploits the spatial distribution characteristics of human skin color. A universal skin-color map is derived and used on the chrominance component of the input image to detect pixels with skin-color appearance. Then, based on the spatial distribution of the detected skin-color pixels and their corresponding luminance values, the algorithm employs a set of novel regularization processes to reinforce regions of skin-color pixels that are more likely to belong to the facial regions and eliminate those that are not. The performance of the face-segmentation algorithm is illustrated by some simulation results carried out on various head-and-shoulders test images. The use of face segmentation for video coding in applications such as videotelephony is then presented. We explain how the face-segmentation results can be used to improve the perceptual quality of a videophone sequence encoded by an H.261-compliant coder.
Article
Realtime multi-person 2D pose estimation is a key component in enabling machines to have an understanding of people in images and videos. In this work, we present a realtime approach to detect the 2D pose of multiple people in an image. The proposed method uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. This bottom-up system achieves high accuracy and realtime performance, regardless of the number of people in the image. In previous work, PAFs and body part location estimation were refined simultaneously across training stages. We demonstrate that using a PAF-only refinement is able to achieve a substantial increase in both runtime performance and accuracy. We also present the first combined body and foot keypoint detector, based on an annotated foot dataset that we have publicly released. We show that the combined detector not only reduces the inference time compared to running them sequentially, but also maintains the accuracy of each component individually. This work has culminated in the release of OpenPose, the first open-source realtime system for multi-person 2D pose detection, including body, foot, hand, and facial keypoints.
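
In practice OpenPose is commonly run as a command-line tool that writes one JSON file per frame; the sketch below (file layout and confidence threshold assumed) recovers a wrist trajectory from that output using BODY_25 keypoint indices (right wrist 4, left wrist 7).

import glob
import json

def wrist_track(json_dir, keypoint=4):
    """Collect (x, y) of one keypoint across OpenPose per-frame JSON files."""
    track = []
    for path in sorted(glob.glob(json_dir + "/*_keypoints.json")):
        with open(path) as f:
            people = json.load(f)["people"]
        if not people:
            track.append(None)                        # no signer detected
            continue
        k = people[0]["pose_keypoints_2d"]            # flat [x0, y0, c0, x1, ...]
        x, y, conf = k[3 * keypoint:3 * keypoint + 3]
        track.append((x, y) if conf > 0.1 else None)  # drop weak detections
    return track

# e.g. after: openpose.bin --video signer.mp4 --write_json out/ --display 0
right_wrist = wrist_track("out", keypoint=4)
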
Conference Paper
A new project is presented, in which a computer vision and deep learning based automated screening toolkit is being developed, with the aim of supporting screening for dementia in deaf signers of British Sign Language. Unlike other current computer vision systems used in dementia stage assessment, such as RGB-D video or monitoring using Information and Communication Technologies facilities, the proposed system focuses on analysing the sign space envelope (sign trajectories/depth/speed) and facial expressions of deaf individuals using standard 2D videos freely available on the BSL Signbank dataset. This approach is thus more economical, simpler, more flexible, and more adaptable. First-phase research work on hand/face detection, hand tracking, and feature extraction in an OpenCV Python environment is presented. Double hand sign trajectories are tracked by implementing skin colour filtering and K-Nearest Neighbour background subtraction, as well as morphology to de-noise, before using contour extraction to track hand blob trajectories based on contour centroids. Meanwhile, face detection is performed using a Haar cascade algorithm for facial analysis. The demonstrated experimental results will be beneficial not only for the screening of deaf individuals for dementia, but also for assessment of other acquired neurological impairments associated with motor changes, for example, stroke and Parkinson's disease.
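
A condensed sketch of that first-phase pipeline in OpenCV Python: KNN background subtraction, skin-colour filtering, morphological de-noising, contour-centroid tracking, and Haar face detection. The skin thresholds and file name are illustrative, not the paper's tuned values.

import cv2
import numpy as np

cap = cv2.VideoCapture("signer.mp4")                  # hypothetical BSL clip
bg = cv2.createBackgroundSubtractorKNN()              # K-Nearest Neighbour model
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
kernel = np.ones((5, 5), np.uint8)
trajectory = []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    motion = bg.apply(frame)                          # foreground (moving) pixels
    ycrcb = cv2.cvtColor(frame, cv2.COLOR_BGR2YCrCb)
    skin = cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135))  # rough skin band
    mask = cv2.morphologyEx(cv2.bitwise_and(motion, skin),
                            cv2.MORPH_OPEN, kernel)   # de-noise the hand blobs
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        m = cv2.moments(max(contours, key=cv2.contourArea))
        if m["m00"] > 0:                              # blob centroid = track point
            trajectory.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    faces = face_cascade.detectMultiScale(
        cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))      # Haar face boxes
cap.release()
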
Conference Paper
This paper presents an unsupervised approach for learning long-term human activities without requiring any user interaction (e.g., clipping long-term videos into short-term actions, or labeling huge amounts of short-term actions as in supervised approaches). First, important regions in the scene are learned via clustering trajectory points, and the global movement of people is represented as a sequence of primitive events. Then, using local action descriptors with a bag-of-words (BoW) approach, we represent the body motion of people inside each region. Incorporating global motion information with action descriptors, a comprehensive representation of human activities is obtained by creating models that contain both the global and body motion of people. Learning of zones and the construction of primitive events are performed automatically. Once models are learned, the approach provides an online recognition framework. We have tested the performance of our approach on recognizing activities of daily living and showed its efficiency over existing approaches.
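
The bag-of-words step can be shown in miniature (synthetic descriptors and an assumed codebook size; not the authors' full global-plus-local model): quantise local action descriptors against a learned codebook and represent each clip as a normalised histogram of visual words.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
train_descriptors = rng.normal(size=(500, 32))  # stand-in local action descriptors
codebook = KMeans(n_clusters=16, n_init=10).fit(train_descriptors)

def bow_histogram(clip_descriptors):
    """Map one clip's descriptors to a normalised visual-word histogram."""
    words = codebook.predict(clip_descriptors)
    hist = np.bincount(words, minlength=16).astype(float)
    return hist / hist.sum()

clip_repr = bow_histogram(rng.normal(size=(60, 32)))  # one clip's representation
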
Article
An electromagnetic tracking system was used to record arm motion in subjects with Parkinson's disease (n = 23), essential tremor (n = 28) or without neurological disease (n = 4). Tremor magnitude was calculated by averaging the three-dimensional displacement of individual tremor bursts. Tremor magnitude calculated in this manner was quite closely correlated with a clinician's estimate (r = 0.88 and 0.86 for Parkinsonian and essential tremors, respectively) and was reproducible (r = 0.93 for repeated recordings). The accuracy of the device and algorithm was confirmed by mechanically generating oscillations of known magnitudes and frequencies. This device is adaptable for quantifying different types of tremors and its accuracy is easy to verify. Because position rather than acceleration is tracked, tremor amplitude can be stated in readily comprehensible units. © 2001 Movement Disorder Society.
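
A back-of-envelope rendering of the core point, not the study's exact algorithm: because position rather than acceleration is tracked, tremor amplitude falls out directly in displacement units, and the tremor frequency can be read off a spectrum. Sample rate and units are assumptions.

import numpy as np

def tremor_stats(positions, fs=100.0):
    """positions: (N, 3) tracked arm coordinates in mm; fs: sample rate in Hz."""
    pos = np.asarray(positions, dtype=float)
    disp = pos - pos.mean(axis=0)                 # remove the static posture
    magnitude = np.linalg.norm(disp, axis=1)      # 3D displacement per sample
    spectrum = np.abs(np.fft.rfft(magnitude - magnitude.mean()))
    freqs = np.fft.rfftfreq(len(magnitude), d=1.0 / fs)
    return magnitude.mean(), freqs[spectrum.argmax()]  # mean amp (mm), peak Hz
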
Singh, D., et al.: Human activity recognition using recurrent neural networks. In: Holzinger, A., Kieseberg, P., Tjoa, A.M., Weippl, E. (eds.) Machine Learning and Knowledge Extraction. Springer (2017)
Angelopoulou, A., et al.: Evaluation of different chrominance models in the detection and reconstruction of faces and hands using the growing neural gas network. Pattern Anal. Appl. 22, 1–19 (2019). DOI: 10.1007/s10044-019-00819-x
OpenCV. https://opencv.org/
OpenPose. https://github.com/CMU-Perceptual-Computing-Lab/openpose
OpenPose in TensorFlow. https://github.com/ildoonet/tf-pose-estimation
Pellegrini, E., et al.: Machine learning of neuroimaging to diagnose cognitive impairment and dementia: a systematic review and comparative analysis. arXiv:1804.01961 (2018)