Mohamed Daoudi
PhD, IMT Nord Europe
About
300 Publications
53,672 Reads
6,001 Citations
Introduction
Mohamed Daoudi is a Full Professor of Computer Science at IMT Nord Europe and the head of the Image group at the CRIStAL Laboratory. He received his Ph.D. degree in Computer Engineering from the University of Lille (France) in 1993. His research interests include artificial intelligence, computer vision, and pattern recognition. He is or has been an Associate Editor of Image and Vision Computing, IEEE Transactions on Multimedia, IEEE Transactions on Affective Computing, and Computers & Graphics. He is an IAPR Fellow and an IEEE Senior Member.
Additional affiliations
September 1991 - August 1992
October 1998 - present
September 2004 - August 2005
Publications (300)
This paper introduces a new framework for surface analysis derived from the general setting of elastic Riemannian metrics on shape spaces. Traditionally, those metrics are defined over the infinite dimensional manifold of immersed surfaces and satisfy specific invariance properties enabling the comparison of surfaces modulo shape preserving transfo...
Generating speech-driven 3D talking heads presents numerous challenges; among those is dealing with varying mesh topologies. Existing methods require a registered setting, where all meshes share a common topology: a point-wise correspondence across all meshes the model can animate. While simplifying the problem, it limits applicability as unseen me...
The execution of object-directed motor actions is known to be influenced by the intention to interact with others. In this study, we tested whether the effects of social intention on the kinematics of object-directed actions depended on whether the task was performed in the presence of a human or a virtual confederate. In two experiments, participa...
In this paper, we propose a virtual agent application. We develop a virtual agent that reacts to gestures and a virtual environment in which it can interact with the user. We capture motion with a Kinect V2 camera, predict the end of the motion and then classify it. The application also features a facial expression recognition module. In addition,...
Automatic surgical skill assessment has the capacity to bring a transformative shift in the assessment, development, and enhancement of surgical proficiency. It offers several advantages, including objectivity, precision, and real-time feedback. These benefits will greatly enhance the development of surgical skills for novice surgeons, enabling the...
3D generative modeling is accelerating as the technology allowing the capture of geometric data is developing. However, the acquired data is often inconsistent, resulting in unregistered meshes or point clouds. Many generative learning algorithms require correspondence between each point when comparing the predicted shape and the target shape. We p...
For decades, researchers of different areas, ranging from artificial intelligence to computer vision, have intensively investigated human-centered data, i [...]
Brain signals have recently been proposed as a strong biometric due to characteristics such as uniqueness, permanence, universality, and confidentiality. Many factors affect the stability of EEG signals as a biometric, for example using different recording devices, variation in participants' emotional states, performing differen...
Recently, wearable emotion recognition based on peripheral physiological signals has drawn massive attention due to its less invasive nature and its applicability in real-life scenarios. However, how to effectively fuse multimodal data remains a challenging problem. Moreover, traditional fully-supervised approaches suffer from overfitting giv...
The generation of natural human motion interactions is a hot topic in computer vision and computer animation. It is a challenging task due to the diversity of possible human motion interactions. Diffusion models, which have already shown remarkable generative capabilities in other domains, are a good candidate for this task. In this paper, we intro...
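As an illustration of the machinery such a diffusion-based motion generator rests on (not the paper's actual model), the sketch below shows a minimal DDPM-style noising schedule and noise-prediction loss on flattened two-person motion clips; the tiny MLP denoiser, tensor shapes and hyper-parameters are all placeholder assumptions.

    import torch
    import torch.nn as nn

    T = 100                                         # number of diffusion steps (placeholder)
    betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal-retention factors

    class Denoiser(nn.Module):
        """Toy noise predictor; a real model would be a Transformer over joints and frames."""
        def __init__(self, dim):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim + 1, 256), nn.SiLU(), nn.Linear(256, dim))

        def forward(self, x, t):
            t_feat = (t.float() / T).unsqueeze(-1)          # crude time-step conditioning
            return self.net(torch.cat([x, t_feat], dim=-1))

    def diffusion_loss(model, x0):
        """Sample a step t, corrupt the clean motion x0 with Gaussian noise, regress the noise."""
        t = torch.randint(0, T, (x0.shape[0],))
        noise = torch.randn_like(x0)
        a_bar = alphas_bar[t].unsqueeze(-1)
        x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise
        return nn.functional.mse_loss(model(x_t, t), noise)

    # One optimisation step on a random batch (2 people x 24 joints x 3D x 30 frames, flattened)
    motion_dim = 2 * 24 * 3 * 30
    model = Denoiser(motion_dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    opt.zero_grad()
    loss = diffusion_loss(model, torch.randn(8, motion_dim))
    loss.backward()
    opt.step()
    print(float(loss))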
We address the challenging task of human reaction generation, which aims to generate a corresponding reaction based on an input action. Most of the existing works do not focus on generating and predicting the reaction and cannot generate the motion when only the action is given as input. To address this limitation, we propose a novel interaction Tr...
In this work, we address the problem of 4D facial expression generation. This is usually addressed by animating a neutral 3D face to reach an expression peak and then returning to the neutral state. In the real world, though, people show more complex expressions and switch from one expression to another. We thus propose a new model that generates...
Shape analysis of landmarks is a fundamental problem in computer vision and multimedia. We propose a family of metrics called Fubini-Study distances defined in the complex projective space based on the seminal work of Kendall [11] for metric learning to measure the similarity between shape representations which are modeled directly by the equivalen...
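For readers unfamiliar with Kendall's shape space, here is a minimal numpy sketch of the classical Fubini-Study distance between two planar landmark configurations viewed as points in complex projective space; the landmark count and toy data are assumptions, and this is not the paper's metric-learning formulation.

    import numpy as np

    def fubini_study_distance(X, Y):
        """Fubini-Study distance between two planar landmark configurations X, Y of shape (k, 2).

        Each configuration is centred, encoded as a complex k-vector and normalised,
        so it becomes a point in complex projective space; the distance is invariant
        to translation, scale and rotation (a global complex phase).
        """
        z = X[:, 0] + 1j * X[:, 1]
        w = Y[:, 0] + 1j * Y[:, 1]
        z = (z - z.mean()) / np.linalg.norm(z - z.mean())
        w = (w - w.mean()) / np.linalg.norm(w - w.mean())
        return float(np.arccos(np.clip(np.abs(np.vdot(z, w)), 0.0, 1.0)))

    # Two slightly perturbed copies of the same 10-landmark configuration
    rng = np.random.default_rng(0)
    A = rng.standard_normal((10, 2))
    B = A + 0.01 * rng.standard_normal((10, 2))
    print(fubini_study_distance(A, B))   # close to 0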
We present BaRe-ESA, a novel Riemannian framework for human body scan representation, interpolation and extrapolation. BaRe-ESA operates directly on unregistered meshes, i.e., without the need to establish prior point-to-point correspondences or to assume a consistent mesh structure. Our method relies on a latent space representation, which is equi...
Human facial expressions change dynamically, so their recognition/analysis should be conducted by accounting for the temporal evolution of face deformations, either in 2D or 3D. While abundant 2D video data do exist, this is not the case in 3D, where few 3D dynamic (4D) datasets have been released for public use. The negative consequence of this scarci...
In this work we propose a novel solution for 3D skeleton-based human motion prediction. The objective of this task consists in forecasting future human poses based on a prior skeleton pose sequence. This involves solving two main challenges still present in recent literature: (1) discontinuity of the predicted motion, which results in unrealistic mo...
We propose an automatic method to estimate self-reported pain intensity based on facial landmarks extracted from videos. For each video sequence, we decompose the face into four different regions and pain intensity is measured by modeling the dynamics of facial movement using the landmarks of these regions. A formulation based on Gram matrices is u...
We propose an automatic method to estimate self-reported pain based on facial landmarks extracted from videos. For each video sequence, we decompose the face into four different regions and the pain intensity is measured by modeling the dynamics of facial movement using the landmarks of these regions. A formulation based on Gram matrices is used fo...
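The pain-estimation entries above share the same geometric core; the sketch below illustrates it in numpy, under the usual factor-based formula for the distance between positive semi-definite matrices of fixed rank (the landmark count, frame count and random trajectory are placeholders, not the papers' data or code).

    import numpy as np

    def gram(P):
        """Gram matrix G = P P^T of a (k, d) landmark configuration, after centring."""
        P = P - P.mean(axis=0)
        return P @ P.T

    def psd_fixed_rank_distance(P1, P2):
        """Distance between the Gram matrices of two configurations.

        Equals min over rotations Q of ||P1 - P2 Q||_F, i.e. a rotation-invariant
        Procrustes-type distance on positive semi-definite matrices of rank <= d.
        """
        P1 = P1 - P1.mean(axis=0)
        P2 = P2 - P2.mean(axis=0)
        s = np.linalg.svd(P2.T @ P1, compute_uv=False)        # singular values
        d2 = (P1 ** 2).sum() + (P2 ** 2).sum() - 2.0 * s.sum()
        return float(np.sqrt(max(d2, 0.0)))

    # A (hypothetical) trajectory of 66 facial points over 30 frames
    rng = np.random.default_rng(1)
    trajectory = rng.standard_normal((30, 66, 2))
    print(gram(trajectory[0]).shape)                           # (66, 66)
    print(psd_fixed_rank_distance(trajectory[0], trajectory[1]))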
In this paper we address the task of comparing and classifying 3D shape sequences of humans. The non-linear dynamics of human motion and the change of surface parametrization over time make this task very challenging. To tackle this issue, we propose to embed the 3D shape sequences in an infinite dimensional space, the s...
Existing multimodal stress/pain recognition approaches generally extract features from different modalities independently and thus ignore cross-modality correlations. This paper proposes a novel geometric framework for multimodal stress/pain detection utilizing Symmetric Positive Definite (SPD) matrices as a representation that incorporates the cor...
In this paper, we propose a solution to the task of generating dynamic 3D facial expressions from a neutral 3D face and an expression label. This involves solving two sub-problems: (i) modeling the temporal dynamics of expressions, and (ii) deforming the neutral mesh to obtain the expressive counterpart. We represent the temporal evolution of expre...
Emotion recognition plays an important role in human-computer interaction systems, as it helps the computer understand human behavior and decision-making. Using Electroencephalographic (EEG) signals in emotion recognition offers a direct assessment of the inner state of the human mind. This study aims to build a subject-dependent emo...
We propose a novel framework for comparing 3D human shapes under the change of shape and pose. This problem is challenging since 3D human shapes vary significantly across subjects and body postures. We solve this problem by using a Riemannian approach. Our core contribution is the mapping of the human body surface to the space of metrics and normal...
We analyze human poses and motion by introducing three sequences of easily calculated surface descriptors that are invariant under reparametrizations and Euclidean transformations. These descriptors are obtained by associating to each finitely-triangulated surface two functions on the unit sphere: for each unit vector u we compute the weighted area...
We analyze human poses and motion by introducing three sequences of easily calculated surface descriptors that are invariant under reparametrizations and Euclidean transformations. These descriptors are obtained by associating to each finitely-triangulated surface two functions on the unit sphere: for each unit vector u we compute the weighted ar...
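A rough reading of the descriptor above, under the assumption that the "weighted area" in direction u sums triangle areas weighted by the positive part of the dot product between each unit normal and u; the tetrahedron and the choice of weighting are illustrative guesses, not the paper's exact definition.

    import numpy as np

    def weighted_area(V, F, u):
        """Weighted area of a triangulated surface (V: (n, 3) vertices, F: (m, 3) faces)
        seen from direction u: triangle areas weighted by max(0, normal . u)."""
        e1 = V[F[:, 1]] - V[F[:, 0]]
        e2 = V[F[:, 2]] - V[F[:, 0]]
        cross = np.cross(e1, e2)
        norms = np.linalg.norm(cross, axis=1)
        areas = 0.5 * norms
        normals = cross / norms[:, None]
        return float(np.sum(areas * np.clip(normals @ u, 0.0, None)))

    # Evaluate the descriptor on a tetrahedron along the three coordinate axes
    V = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
    F = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
    for u in np.eye(3):
        print(u, weighted_area(V, F, u))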
Falls are one of the most critical health care risks for elderly people, being, in some adverse circumstances, an indirect cause of death. Furthermore, demographic forecasts for the future show a growing elderly population worldwide. In this context, models for automatic fall detection and prediction are of paramount relevance, especially AI applic...
Human motion prediction aims to forecast future human poses given a prior pose sequence. The discontinuity of the predicted motion and the performance deterioration in long-term horizons are still the main challenges encountered in current literature. In this work, we tackle these issues by using a compact manifold-valued representation of human mo...
While deep learning-based 3D face generation has made progress recently, the problem of dynamic 3D (4D) facial expression synthesis is less investigated. In this paper, we propose a novel solution to the following question: given one input 3D neutral face, can we generate dynamic 3D (4D) facial expressions from it? To tackle this problem, we firs...
In this paper, an effective pipeline for automatic 4D Facial Expression Recognition (4D FER) is proposed. It combines two growing but disparate ideas in Computer Vision -- computing the spatial facial deformations using tools from Riemannian geometry and magnifying them using temporal filtering. The flow of 3D faces is first analyzed to capture the...
Recovering the 3D geometric structure of a face from a single input image is a challenging active research area in computer vision. In this paper, we present a novel method for reconstructing 3D heads from a single or multiple image(s) using a hybrid approach based on deep learning and geometric techniques. We propose an encoder-decoder network bas...
In this paper we propose a new family of metrics on the manifold of oriented ellipses centered at the origin in Euclidean n-space, the double cover of the manifold of positive semi-definite matrices of rank two, in order to measure similarities between landmark representations. The metrics, whose distance functions are remarkably simple, are parame...
We propose an automatic method for pain intensity measurement from video. For each video, pain intensity is measured from the dynamics of facial movement using 66 facial points. A Gram matrix formulation is used to represent the facial point trajectories on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. Cur...
The goal of the Face and Gesture Analysis for Health Informatics workshop is to share and discuss the achievements as well as the challenges in using computer vision and machine learning for automatic human behavior analysis and modeling for clinical research and healthcare applications. The workshop aims to promote current research and support growt...
In this paper, a model is presented to extract statistical summaries to characterize the repetition of a cyclic body action, for instance a gym exercise, for the purpose of checking the compliance of the observed action to a template one and highlighting the parts of the action that are not correctly executed (if any). The proposed system relies on...
We propose to exploit the face geometry by modeling the facial landmarks motion as curves encoded as points on a hypersphere. By proposing a conditional version of manifold-valued Wasserstein generative adversarial network (GAN) for motion generation on the hypersphere, we learn the distribution of facial expression dynamics of different classes, f...
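To make the geometric encoding concrete, here is a small numpy sketch (not the GAN itself) in which each landmark frame is flattened and normalised onto a unit hypersphere, so a facial-expression sequence becomes a spherical curve that can be compared and interpolated with geodesics; the normalisation choice and toy data are assumptions.

    import numpy as np

    def to_sphere(x):
        """Flatten a landmark configuration and project it onto the unit hypersphere."""
        v = np.ravel(x).astype(float)
        return v / np.linalg.norm(v)

    def sphere_distance(p, q):
        """Geodesic (arc-length) distance between two points on the unit hypersphere."""
        return float(np.arccos(np.clip(np.dot(p, q), -1.0, 1.0)))

    def sphere_interpolate(p, q, t):
        """Point at fraction t along the geodesic from p to q (spherical linear interpolation)."""
        theta = sphere_distance(p, q)
        if theta < 1e-12:
            return p
        return (np.sin((1 - t) * theta) * p + np.sin(t * theta) * q) / np.sin(theta)

    # A 20-frame sequence of 68 facial landmarks becomes a discrete curve on the sphere
    rng = np.random.default_rng(2)
    frames = rng.standard_normal((20, 68, 2))
    curve = np.stack([to_sphere(f) for f in frames])
    print(sphere_distance(curve[0], curve[-1]))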
The papers in this special section were presented at the 14th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019) that was held in Lille, France, 14–18 May 2019.
In this article, we propose a new approach for facial expression recognition (FER) using deep covariance descriptors. The solution is based on the idea of encoding local and global deep convolutional neural network (DCNN) features extracted from still images, in compact local and global covariance descriptors. The space geometry of the covariance m...
Automatic analysis of emotions and affects from speech is an inherently challenging problem with a broad range of applications in Human-Computer Interaction (HCI), health informatics, assistive technologies and multimedia retrieval. Understanding humans' specific and basic emotions and reacting accordingly can improve HCI. Besides, giving machines...
In this paper, we tackle the problem of action recognition using body skeletons extracted from video sequences. Our approach builds on recent works representing video frames by Gramian matrices that describe a trajectory on the Riemannian manifold of positive-semidefinite matrices of fixed rank. In comparison with previous works, th...
In this work, we propose a novel approach for generating videos of the six basic facial expressions given a neutral face image. We propose to exploit the face geometry by modeling the facial landmarks motion as curves encoded as points on a hypersphere. By proposing a conditional version of manifold-valued Wasserstein generative adversarial network...
Experiential avoidance refers to attempts to control or suppress unwanted thoughts, feelings and emotions. We investigated whether experiential avoidance is associated with fewer facial expressions during autobiographical retrieval (i.e. retrieval of memory for personal information). We invited participants to retrieve autobiographical memories, an...
Lipreading, or visual speech recognition, is the process of decoding speech from a speaker's mouth movements. It is used by people with hearing impairment, to understand patients affected by laryngeal cancer or vocal cord paralysis, and in noisy environments. In this paper we aim to develop a visual-only speech recognition system based only...
In this paper, we propose a new approach for facial expression recognition. The solution is based on the idea of encoding local and global Deep Convolutional Neural Network (DCNN) features extracted from still images, in compact local and global covariance descriptors. The space geometry of the covariance matrices is that of Symmetric Positive Defi...
In this paper, we propose a novel space-time geometric representation of human landmark configurations and derive tools for comparison and classification. We model the temporal evolution of landmarks as parametrized trajectories on the Riemannian manifold of positive semidefinite matrices of fixed-rank. Our representation has the benefit of bringing n...
In this paper, covariance matrices are exploited to encode the deep convolutional neural networks (DCNN) features for facial expression recognition. The space geometry of the covariance matrices is that of Symmetric Positive Definite (SPD) matrices. By performing the classification of the facial expressions using Gaussian kernel on SPD manifold, we...
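The covariance-descriptor entries above all rely on the same SPD machinery; the sketch below shows one common instantiation, pooling feature vectors into a regularised covariance matrix and comparing two such matrices with a Gaussian kernel built on the log-Euclidean distance (the feature dimensions, regulariser and kernel width are placeholders, and the papers may use a different SPD metric).

    import numpy as np

    def covariance_descriptor(features, eps=1e-5):
        """Pool (n, d) feature vectors into a regularised d x d covariance (SPD) matrix."""
        C = np.cov(features, rowvar=False)
        return C + eps * np.eye(C.shape[0])

    def spd_logm(X):
        """Matrix logarithm of an SPD matrix via its eigendecomposition."""
        w, V = np.linalg.eigh(X)
        return (V * np.log(w)) @ V.T

    def spd_gaussian_kernel(X, Y, gamma=0.1):
        """Gaussian RBF kernel on SPD matrices using the log-Euclidean distance."""
        d = np.linalg.norm(spd_logm(X) - spd_logm(Y), 'fro')
        return float(np.exp(-gamma * d ** 2))

    # Two (hypothetical) sets of local DCNN features pooled into covariance descriptors
    rng = np.random.default_rng(3)
    C1 = covariance_descriptor(rng.standard_normal((200, 64)))
    C2 = covariance_descriptor(rng.standard_normal((200, 64)))
    print(spd_gaussian_kernel(C1, C2))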