Hubert P. H. Shum

Hubert P. H. Shum
Durham University | DU · Department of Computer Science

SMIEEE, FHEA, PhD, MSc, BEng
Human-Centric Computer Vision and Graphics http://hubertshum.com

About

186
Publications
49,288
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
2,563
Citations
Introduction
I am an Associate Professor and the Deputy Director of Research in Computer Science at Durham University researching on spatio-temporal visual computing with over 150 research publications. I had worked as the Associate Professor at Northumbria University and a Postdoctoral Researcher at RIKEN Japan, after received my PhD from the University of Edinburgh. I led research projects as the Principal Investigator awarded by EPSRC, DASA, Royal Society and Innovate UK. http://hubertshum.com
Additional affiliations
September 2016 - September 2020
Northumbria University
Position
  • Associate Professor & Director of Research
July 2012 - September 2016
Northumbria University
Position
  • Senior Lecturer
January 2010 - April 2011
RIKEN
Position
  • Post-Doctoral Research Fellow
Education
June 2006 - January 2010
The University of Edinburgh
Field of study
  • Computer Science

Publications

Publications (186)
Article
Full-text available
As the most common idiopathic inflammatory myopathy in children, juvenile dermatomyositis (JDM) is characterized by skin rashes and muscle weakness. The childhood myositis assessment scale (CMAS) is commonly used to measure the degree of muscle involvement for diagnosis or rehabilitation monitoring. On the one hand, human diagnosis is not scalable...
Preprint
Full-text available
Whilst the availability of 3D LiDAR point cloud data has significantly grown in recent years, annotation remains expensive and time-consuming, leading to a demand for semi-supervised semantic segmentation methods with application domains such as autonomous driving. Existing work very often employs relatively large segmentation backbone networks to...
Preprint
Full-text available
Diffusion models are generative models, which gradually add and remove noise to learn the underlying distribution of training data for data generation. The components of diffusion models have gained significant attention with many design choices proposed. Existing reviews have primarily focused on higher-level solutions, thereby covering less on th...
Article
Full-text available
Action quality assessment (AQA) automatically evaluates how well humans perform actions in a given video, a technique widely used in fields such as rehabilitation medicine, athletic competitions, and specific skills assessment. However, existing works that uniformly divide the video sequence into small clips of equal length suffer from intra-clip c...
Article
Full-text available
The early diagnosis of cerebral palsy is an area which has recently seen significant multi-disciplinary research. Diagnostic tools such as the General Movements Assessment (GMA), have produced some very promising results. However, the prospect of automating these processes may improve accessibility of the assessment and also enhance the understandi...
Article
Full-text available
Interaction-aware autonomous driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task, as it requires the AV to be able to understand and predict the behaviour of human road users. In thi...
Article
In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is...
Chapter
Full-text available
Resting-state fMRI (rs-fMRI) functional connectivity (FC) analysis provides valuable insights into the relationships between different brain regions and their potential implications for neurological or psychiatric disorders. However, specific design efforts to predict treatment response from rs-fMRI remain limited due to difficulties in understandi...
Preprint
Full-text available
Generating 3D images of complex objects conditionally from a few 2D views is a difficult synthesis problem, compounded by issues such as domain gap and geometric misalignment. For instance, a unified framework such as Generative Adversarial Networks cannot achieve this unless they explicitly define both a domain-invariant and geometric-invariant jo...
Preprint
Pre-captured immersive environments using omnidirectional cameras provide a wide range of virtual reality applications. Previous research has shown that manipulating the eye height in egocentric virtual environments can significantly affect distance perception and immersion. However, the influence of eye height in pre-captured real environments has...
Preprint
Full-text available
Recently, methods for skeleton-based human activity recognition have been shown to be vulnerable to adversarial attacks. However, these attack methods require either the full knowledge of the victim (i.e. white-box attacks), access to training data (i.e. transfer-based attacks) or frequent model queries (i.e. black-box attacks). All their requireme...
Conference Paper
Full-text available
Whilst the availability of 3D LiDAR point cloud data has significantly grown in recent years, annotation remains expensive and time-consuming, leading to a demand for semi-supervised semantic segmentation methods with application domains such as autonomous driving. Existing work very often employs relatively large segmentation backbone networks to...
Preprint
Full-text available
We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients privacy, we design a software framework using image inpainting, which does not require cleft lip images for training, thereby mitigating the risk of mod...
Article
Full-text available
We present a software that predicts non-cleft facial images for patients with cleft lip, thereby facilitating the understanding, awareness and discussion of cleft lip surgeries. To protect patients’ privacy, we design a software framework using image inpainting, which does not require cleft lip images for training, thereby mitigating the risk of mo...
Preprint
Full-text available
Learning view-invariant representation is a key to improving feature discrimination power for skeleton-based action recognition. Existing approaches cannot effectively remove the impact of viewpoint due to the implicit view-dependent representations. In this work, we propose a self-supervised framework called Focalized Contrastive View-invariant Le...
Preprint
Full-text available
It is difficult to train classifiers on paintings collections due to model bias from domain gaps and data bias from the uneven distribution of artistic styles. Previous techniques like data distillation, traditional data augmentation and style transfer improve classifier training using task specific training datasets or domain adaptation. We propos...
Preprint
Full-text available
Generating realistic motions for digital humans is a core but challenging part of computer animations and games, as human motions are both diverse in content and rich in styles. While the latest deep learning approaches have made significant advancements in this domain, they mostly consider motion synthesis and style manipulation as two separate pr...
Preprint
Full-text available
Acquiring the virtual equivalent of exhibits, such as sculptures, in virtual reality (VR) museums, can be labour-intensive and sometimes infeasible. Deep learning based 3D reconstruction approaches allow us to recover 3D shapes from 2D observations, among which single-view-based approaches can reduce the need for human intervention and specialised...
Article
Full-text available
Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals and the evaluations may not always be objective. To facilitate early diagnosis, recent deep learning-based methods have...
Chapter
Full-text available
Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focusing on visual features usually suffer from occlusion in the real-world scenarios. Such a problem will be further complicated when multiple people and objects are involved in HOIs. Consider that geometric features such as human pos...
Preprint
Full-text available
Generating realistic motions for digital humans is time-consuming for many graphics applications. Data-driven motion synthesis approaches have seen solid progress in recent years through deep generative models. These results offer high-quality motions but typically suffer in motion style diversity. For the first time, we propose a framework using t...
Conference Paper
Full-text available
A Cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and Artificial Intelligence (AI) method has been proposed to guide surgeons in improving surgical outcomes. If AI can be used to predict what a repaired cleft lip would look like...
Chapter
Full-text available
Parkinson’s disease (PD) is a progressive neurodegenerative disorder that results in a variety of motor dysfunction symptoms, including tremors, bradykinesia, rigidity and postural instability. The diagnosis of PD mainly relies on clinical experience rather than a definite medical test, and the diagnostic accuracy is only about 73–84% since it is c...
Preprint
Full-text available
Early prediction is clinically considered one of the essential parts of cerebral palsy (CP) treatment. We propose to implement a low-cost and interpretable classification system for supporting CP prediction based on General Movement Assessment (GMA). We design a Pytorch-based attention-informed graph convolutional network to early identify infants...
Article
Full-text available
Early prediction is clinically considered one of the essential parts of cerebral palsy (CP) treatment. We propose to implement a low-cost and interpretable classification system for supporting CP prediction based on General Movement Assessment (GMA). We design a Pytorch-based attention-informed graph convolutional network to early identify infants...
Preprint
Full-text available
Musculoskeletal and neurological disorders are the most common causes of walking problems among older people, and they often lead to diminished quality of life. Analyzing walking motion data manually requires trained professionals and the evaluations may not always be objective. To facilitate early diagnosis, recent deep learning-based methods have...
Preprint
Surgical workflow anticipation can give predictions on what steps to conduct or what instruments to use next, which is an essential part of the computer-assisted intervention system for surgery, e.g. workflow reasoning in robotic surgery. However, current approaches are limited to their insufficient expressive power for relationships between instru...
Preprint
Full-text available
A Cleft lip is a congenital abnormality requiring surgical repair by a specialist. The surgeon must have extensive experience and theoretical knowledge to perform surgery, and Artificial Intelligence (AI) method has been proposed to guide surgeons in improving surgical outcomes. If AI can be used to predict what a repaired cleft lip would look like...
Preprint
Full-text available
Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focusing on visual features usually suffer from occlusion in the real-world scenarios. Such a problem will be further complicated when multiple people and objects are involved in HOIs. Consider that geometric features such as human pos...
Preprint
Full-text available
Parkinson's disease (PD) is a progressive neurodegenerative disorder that results in a variety of motor dysfunction symptoms, including tremors, bradykinesia, rigidity and postural instability. The diagnosis of PD mainly relies on clinical experience rather than a definite medical test, and the diagnostic accuracy is only about 73-84% since it is c...
Preprint
Full-text available
Motion control algorithms in the presence of pedestrians are critical for the development of safe and reliable Autonomous Vehicles (AVs). Traditional motion control algorithms rely on manually designed decision-making policies which neglect the mutual interactions between AVs and pedestrians. On the other hand, recent advances in Deep Reinforcement...
Preprint
Full-text available
Detecting human-object interactions is essential for comprehensive understanding of visual scenes. In particular, spatial connections between humans and objects are important cues for reasoning interactions. To this end, we propose a skeleton-aware graph convolutional network for human-object interaction detection, named SGCN4HOI. Our network explo...
Conference Paper
Full-text available
Early diagnosis and intervention are clinically con-sidered the paramount part of treating cerebral palsy (CP), so it is essential to design an efficient and interpretable automatic prediction system for CP. We highlight a significant difference between CP infants' frequency of human movement and that of the healthy group, which improves prediction...
Conference Paper
Full-text available
Computed tomography (CT) is an effective med-ical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, including generation of thin slice multi planar cross-sectional body imaging and 3D reconstructions. Howe...
Preprint
Full-text available
Trajectory prediction of road users in real-world scenarios is challenging because their movement patterns are stochastic and complex. Previous pedestrian-oriented works have been successful in modelling the complex interactions among pedestrians, but fail in predicting trajectories when other types of road users are involved (e.g., cars, cyclists,...
Article
Full-text available
Existing studies on formation control for unmanned aerial vehicles (UAV) have not considered encircling targets where an optimum coverage of the target is required at all times. Such coverage plays a critical role in many real-world applications such as tracking hostile UAVs. This paper proposes a new path planning approach called the Flux Guided (...
Preprint
Full-text available
Early diagnosis and intervention are clinically considered the paramount part of treating cerebral palsy (CP), so it is essential to design an efficient and interpretable automatic prediction system for CP. We highlight a significant difference between CP infants' frequency of human movement and that of the healthy group, which improves prediction...
Article
Full-text available
3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly more important. To tackle the challenge, we propose a novel system to reconstruct a 3D car using a single sketch image. The system learns from a synthetic database of 3D car mode...
Article
Full-text available
Schizophrenia is a severe mental health condition that requires a long and complicated diagnostic process. However, early diagnosis is vital to control symptoms. Deep learning has recently become a popular way to analyse and interpret medical data. Past attempts to use deep learning for schizophrenia diagnosis from brain-imaging data have shown pro...
Preprint
Full-text available
Schizophrenia is a severe mental health condition that requires a long and complicated diagnostic process. However, early diagnosis is vital to control symptoms. Deep learning has recently become a popular way to analyse and interpret medical data. Past attempts to use deep learning for schizophrenia diagnosis from brain-imaging data have shown pro...
Preprint
Full-text available
Single-view depth estimation from omnidirectional images has gained popularity with its wide range of applications such as autonomous driving and scene reconstruction. Although data-driven learning-based methods demonstrate significant potential in this field, scarce training data and ineffective 360 estimation algorithms are still two key limitati...
Preprint
Full-text available
Computed tomography (CT) is an effective medical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, including generation of thin slice multiplanar cross-sectional body imaging and 3D reconstructions. Howeve...
Article
Full-text available
Motion control algorithms in the presence of pedes- trians are critical for the development of safe and reliable Autonomous Vehicles (AVs). Traditional motion control algo- rithms rely on manually designed decision-making policies which neglect the mutual interactions between AVs and pedestrians. On the other hand, recent advances in Deep Reinforce...
Article
Full-text available
Trajectory Prediction under diverse patterns has attracted increasing attention in multiple real-world applications ranging from urban traffic analysis to human motion understanding, among which graph convolution network (GCN) is frequently adopted with its superior ability in modeling the complex trajectory interactions among multiple humans. In t...
Conference Paper
Full-text available
We present DurLAR, a high-fidelity 128-channel 3D LiDAR dataset with panoramic ambient (near infrared) and reflectivity imagery, as well as a sample benchmark task using depth estimation for autonomous driving applications. Our driving platform is equipped with a high resolution 128 channel LiDAR, a 2MPix stereo camera, a lux meter and a GNSS/INS s...
Preprint
Full-text available
Recent research on biometrics focuses on achieving a high success rate of authentication and addressing the concern of various spoofing attacks. Although hand geometry recognition provides adequate security over unauthorized access, it is susceptible to presentation attack. This paper presents an anti-spoofing method toward hand biometrics. A prese...
Preprint
Full-text available
Creating realistic characters that can react to the users' or another character's movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task due to the many different ways two humans can interact. While there are a number of successful resea...
Article
Full-text available
Creating realistic characters that can react to the users’ or another character’s movement can benefit computer graphics, games and virtual reality hugely. However, synthesizing such reactive motions in human-human interactions is a challenging task due to the many different ways two humans can interact. While there are a number of successful resea...
Conference Paper
Full-text available
Hand object interaction in mixed reality (MR) relies on the accurate tracking and estimation of human hands, which provide users with a sense of immersion. However, raw captured hand motion data always contains errors such as joints occlusion, dislocation, high-frequency noise, and involuntary jitter. Denoising and obtaining the hand motion data co...
Preprint
Full-text available
Predicting the movement trajectories of multiple classes of road users in real-world scenarios is a challenging task due to the diverse trajectory patterns. While recent works of pedestrian trajectory prediction successfully modelled the influence of surrounding neighbours based on the relative distances, they are ineffective on multi-class traject...
Article
Full-text available
Recent research on biometrics focuses on achieving a high success rate of authentication and addressing the concern of various spoofing attacks. Although hand geometry recognition provides adequate security over unauthorized access, it is susceptible to presentation attack. This paper presents an anti-spoofing method toward hand biometrics. A prese...
Preprint
Full-text available
Early prediction of cerebral palsy is essential as it leads to early treatment and monitoring. Deep learning has shown promising results in biomedical engineering thanks to its capacity of modelling complicated data with its non-linear architecture. However, due to their complex structure, deep learning models are generally not interpretable by hum...
Preprint
Full-text available
As unmanned aerial vehicles (UAVs) become more accessible with a growing range of applications, the potential risk of UAV disruption increases. Recent development in deep learning allows vision-based counter-UAV systems to detect and track UAVs with a single camera. However, the coverage of a single camera is limited, necessitating the need for mul...
Preprint
Full-text available
Unmanned Aerial Vehicles (UAV) can pose a major risk for aviation safety, due to both negligent and malicious use. For this reason, the automated detection and tracking of UAV is a fundamental task in aerial security systems. Common technologies for UAV detection include visible-band and thermal infrared imaging, radio frequency and radar. Recent a...
Preprint
Full-text available
While multiple studies have proposed methods for the formation control of unmanned aerial vehicles (UAV), the trajectories generated are generally unsuitable for tracking targets where the optimum coverage of the target by the formation is required at all times. We propose a path planning approach called the Flux Guided (FG) method, which generates...