
Lijun Yin - Binghamton University
About
179 Publications
44,771 Reads
9,139 Citations
Publications (179)
Fake portrait video generation techniques are posing a new threat to society as photorealistic deepfakes are being used for political propaganda, celebrity imitation, forged evidence, and other identity-related manipulations. Alongside these generation techniques, some detection approaches have also proven useful due to their high cla...
As a team studying the predictors of complications after lung surgery, we have encountered high missingness of data on one-lung ventilation (OLV) start and end times due to high clinical workload and cognitive overload during surgery. Such missing data limit the precision and clinical applicability of our findings. We hypothesized that available in...
Convolutional neural networks (CNN) have demonstrated good accuracy and speed in spatially registering high signal-to-noise ratio (SNR) structural magnetic resonance imaging (sMRI) images. However, some functional magnetic resonance imaging (fMRI) images, e.g., those acquired from arterial spin labeling (ASL) perfusion fMRI, are of intrinsically lo...
Contrastive learning has shown promising potential for learning robust representations by utilizing unlabeled data. However, constructing effective positive-negative pairs for contrastive learning on facial behavior datasets remains challenging. This is because such pairs inevitably encode the subject-ID information, and the randomly constructed pa...
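The teaser does not spell out the paper's pair-construction strategy, but the contrastive objective such work typically builds on is standard. Below is a minimal NT-Xent (SimCLR-style) loss sketch in PyTorch; the subject-ID-aware pairing the abstract alludes to is not reproduced here, and all names are illustrative.

```python
# Minimal NT-Xent (normalized temperature-scaled cross-entropy) sketch.
# Pairing here is the generic two-view SimCLR scheme, not the paper's
# subject-ID-aware construction, which the teaser does not detail.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z1, z2: (N, D) embeddings of two augmented views of the same N samples."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit norm
    sim = z @ z.t() / tau                                # scaled cosine similarities
    n = z1.shape[0]
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float('-inf'))                # exclude self-similarity
    # positives: the i-th view pairs with the (i+n)-th one and vice versa
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```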
This paper demonstrates the effectiveness of a diversification mechanism for building a more robust multi-attention system in generic facial action analysis. While previous multi-attention (e.g., visual attention and self-attention) research on facial expression recognition (FER) and Action Unit (AU) detection has been thoroughly studied to focus...
Ethical affective computing (AC) requires maximizing the benefits to users while minimizing harm, in order to earn users' trust. This requires responsible development and deployment to ensure fairness, bias mitigation, privacy preservation, and accountability. To achieve this, we require methodologies that can quantify, visualize, analyze, and mine...
Recent studies utilizing multi-modal data have aimed at building a robust model for facial Action Unit (AU) detection. However, due to the heterogeneity of multi-modal data, multi-modal representation learning becomes one of the main challenges. On one hand, it is difficult to extract the relevant features from multi-modalities by only one feature extra...
Recent studies on the automatic detection of facial action units (AUs) have extensively relied on large-sized annotations. However, manual AU labeling is difficult, time-consuming, and costly. Most existing semi-supervised works ignore the informative cues from the temporal domain and are highly dependent on densely annotated videos, making the le...
Emotion is an experience associated with a particular pattern of physiological activity along with different physiological, behavioral and cognitive changes. One behavioral change is facial expression, which has been studied extensively over the past few decades. Facial behavior varies with a person's emotion according to differences in terms of cu...
Visual attention has been extensively studied for learning fine-grained features in both facial expression recognition (FER) and Action Unit (AU) detection. A broad range of previous research has explored how to use attention modules to localize detailed facial parts (e.g., facial action units), learn discriminative features, and learn inter-class c...
Multi-modal learning has intensified in recent years, especially for applications in facial analysis and action unit detection, while two main challenges remain: 1) relevant feature learning for representation and 2) efficient fusion of multi-modalities. Recently, a number of works have shown the effectiveness...
Telehealth has the potential to offset the high demand for help during public health emergencies, such as the COVID-19 pandemic. Remote Photoplethysmography (rPPG) - the problem of non-invasively estimating blood volume variations in the microvascular tissue from video - would be well suited for these situations. Over the past few years a number of...
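As a rough illustration of the rPPG problem defined above, here is a minimal sketch that estimates heart rate from the mean green-channel signal of face crops via an FFT peak. Published pipelines, including those the abstract surveys, use far more robust skin segmentation, chrominance projections, and temporal filtering; the frames and fps arguments are assumptions.

```python
# Minimal rPPG sketch: mean green-channel trace -> FFT peak in the HR band.
import numpy as np

def estimate_hr(frames: np.ndarray, fps: float) -> float:
    """frames: (T, H, W, 3) uint8 array of RGB face crops. Returns HR in bpm."""
    g = frames[..., 1].reshape(len(frames), -1).mean(axis=1)  # mean green signal
    g = g - g.mean()                                          # remove DC component
    spectrum = np.abs(np.fft.rfft(g)) ** 2
    freqs = np.fft.rfftfreq(len(g), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)    # plausible HR range: 42-240 bpm
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0
```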
Facial action unit (AU) recognition is a multi-label classification problem, where regular spatial and temporal patterns exist in AU labels due to facial anatomy and humans' behavioral habits. Exploiting AU correlation is beneficial for obtaining a robust AU detector or reducing the dependency on a large amount of AU-labeled samples. Several related...
The common view of emotional expressions is that certain configurations of facial-muscle movements reliably reveal certain categories of emotion. The principal exemplar of this view is the Duchenne smile, a configuration of facial-muscle movements (i.e., smiling with eye constriction) that has been argued to reliably reveal genuine positive emotion...
Fake portrait video generation techniques have been posing a new threat to society with photorealistic deep fakes for political propaganda, celebrity imitation, forged evidence, and other identity-related manipulations. Following these generation techniques, some detection approaches have also proven useful due to their high classificatio...
The recent proliferation of fake portrait videos poses direct threats to society, law, and privacy [1]. Believing a fake video of a politician, distributing fake pornographic content of celebrities, and fabricating impersonated fake videos as evidence in court are just a few real-world consequences of deep fakes. We present a novel approach to detec...
It is commonly believed that genuine positive emotion is reliably revealed by the "Duchenne smile," a smile that includes movement of the orbicularis oculi muscle (i.e., the Duchenne marker, which raises the cheeks and narrows the eyes). However, reviews of the evidence concerning this view identified methodological issues in previous studies and...
Facial action unit (AU) detectors have performed well when trained and tested within the same domain. How well do AU detectors transfer to domains in which they have not been trained? We review literature on cross-domain transfer and conduct experiments to address limitations of prior research. We evaluate generalizability in four publicly availabl...
With the technological advancements in non-invasive heart rate (HR) detection, it becomes more feasible to estimate heart rate using commodity digital cameras. However, achieving high accuracy in HR estimation still remains a challenge. One of the bottlenecks is the lack of sufficient facial videos annotated with corresponding HR signals. In order...
The Duchenne smile hypothesis is that smiles that include eye constriction (AU6) are the product of genuine positive emotion, whereas smiles that do not are either falsified or related to negative emotion. This hypothesis has become very influential and is often used in scientific and applied settings to justify the inference that a smile is either...
Facial action unit (AU) detectors have performed well when trained and tested within the same domain. Do AU detectors transfer to new domains in which they have not been trained? To answer this question, we review literature on cross-domain transfer and conduct experiments to address limitations of prior research. We evaluate both deep and shallow...
In 1997 Rosalind Picard introduced fundamental concepts of affect recognition [1]. Since then, multimodal interfaces such as brain-computer interfaces (BCIs), RGB and depth cameras, physiological wearables, multimodal facial data, and physiological data have been used to study human emotion. Much of the work in this field focuses on a single mo...
Background:
Pain is a multidimensional condition of multiple origins. Determining both intensity and underlying cause is critical for effective management. Painkiller utilization does not follow any biomarker-based guidelines, which effectively precludes objective treatment. The aim of this study was to evaluate the use of serum cycloo...
In this paper, we propose a deep learning based approach for AU detection by enhancing and cropping regions of interest of face images. The approach is implemented by adding two novel nets, the enhancing layers and the cropping layers, to a pretrained convolutional neural network (CNN) model. For the enhancing layers (E-Net), we have designed an at...
The field of Automatic Facial Expression Analysis has grown rapidly in recent years. However, despite progress in new approaches as well as benchmarking efforts, most evaluations still focus on either posed expressions, near-frontal recordings, or both. This makes it hard to tell how existing expression recognition approaches perform under conditio...
In this paper, we propose a deep learning based approach for facial action unit detection by enhancing and cropping the regions of interest. The approach is implemented by adding two novel nets (layers): the enhancing layers and the cropping layers, to a pretrained CNN model. For the enhancing layers, we designed an attention map based on facial la...
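The attention-map construction is truncated in the teaser above, so the sketch below assumes one common realization: a Gaussian bump centered on each facial landmark, used to weight CNN feature maps toward AU regions. The function name, sigma value, and landmark format are illustrative, not the paper's formulation.

```python
# Hypothetical landmark-centered attention map: a Gaussian bump per landmark,
# combined by max, intended for element-wise weighting of feature maps.
import numpy as np

def attention_map(landmarks, h: int, w: int, sigma: float = 8.0) -> np.ndarray:
    """landmarks: iterable of (x, y) pixel coords. Returns an (h, w) map in [0, 1]."""
    ys, xs = np.mgrid[0:h, 0:w]
    amap = np.zeros((h, w), dtype=np.float32)
    for lx, ly in landmarks:
        bump = np.exp(-((xs - lx) ** 2 + (ys - ly) ** 2) / (2 * sigma ** 2))
        amap = np.maximum(amap, bump)   # keep the strongest response per pixel
    return amap  # multiply onto feature maps to emphasize landmark regions
```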
2D alignment of face images works well provided images are frontal or nearly so and pitch and yaw remain modest. In spontaneous facial behavior, these constraints often are violated by moderate to large head rotation. 3D alignment from 2D video has been proposed as a solution. A number of approaches have been explored, but comparisons among them ha...
Recent studies in computer vision have shown that, while practically invisible to a human observer, skin color changes due to blood flow can be captured on face videos and, surprisingly, be used to estimate the heart rate (HR). While considerable progress has been made in the last few years, still many issues remain open. In particular, state-of-th...
In this paper we propose a novel method for detecting and tracking facial landmark features on 3D static and 3D dynamic (a.k.a. 4D) range data. Our proposed method involves fitting a shape index-based statistical shape model (SI-SSM) with both global and local constraints to the input range data. Our proposed model makes use of the global shape of...
Virtual Reality (VR), as a visual immersion technique, provides users with realistic visualization and interactive experiences. However, most current VR systems work passively, incorporating human emotions neither as feedback nor as control. Understanding expressions in real time under the face occlusion caused by VR headsets would enha...
The existing approaches to automatic emotion analysis rely mostly on visible spectrum data, and very few works have been reported using thermal data for spontaneous facial expression analysis. In this paper, we present a novel infra-red thermal video descriptor in order to improve spontaneous emotion recognition. We first represent each thermal vid...
Research on hand gesture recognition has been intensified in the last decade. There has been successful work on gesture recognition by using shape and texture features from 2D images and videos. However, it is still a challenging task in handling various conditions with hand scale variations, hand rotations, and hand ambiguity due to finger occlusi...
Automatic pain expression recognition is a challenging task for pain assessment and diagnosis. Conventional 2D-based approaches to automatic pain detection lack robustness to the moderate to large head pose variation and changes in illumination that are common in real-world settings and with few exceptions omit potentially informative temporal info...
Despite efforts towards evaluation standards in facial expression analysis (e.g. FERA 2011), there is a need for up-to-date standardised evaluation procedures, focusing in particular on current challenges in the field. One of the challenges that is actively being addressed is the automatic estimation of expression intensities. To continue to provid...
It is our great pleasure to welcome you to the 2nd Facial Expression Recognition and Analysis challenge and workshop (FERA 2015), held in conjunction with the 11th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2015). It's been four years since the first facial expression recognition challenge (FERA 2011), and we're ex...
The purpose of this chapter is to explain and provide a conceptual understanding of the application and design of intelligent tutoring systems in education. The authors will examine potential applications of intelligent systems in the classroom. The first example illustrates a means to educate students within science-education teacher-preparation p...
In this paper, we applied a reverse correlation approach to study the features that humans use to categorize facial expressions. The well-known portrait of Mona Lisa was used as the base image to investigate the features differentiating happy and sad expressions. The base image was blended with sinusoidal noise masks to create the stimulus. Observe...
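A hedged sketch of the stimulus-generation step described above: a grayscale base image is blended with a sum of random sinusoidal gratings. The frequencies, blend weight, and normalization are illustrative assumptions, not the study's actual parameters.

```python
# Reverse-correlation stimulus sketch: base image blended with sinusoidal noise.
import numpy as np

def make_stimulus(base: np.ndarray, rng: np.random.Generator, blend: float = 0.5):
    """base: (H, W) float image in [0, 1]. Returns (stimulus, noise_mask)."""
    h, w = base.shape
    ys, xs = np.mgrid[0:h, 0:w]
    mask = np.zeros((h, w))
    for freq in (2, 4, 8, 16):                  # gratings at several spatial scales
        theta = rng.uniform(0, np.pi)           # random orientation
        phase = rng.uniform(0, 2 * np.pi)
        amp = rng.uniform(-1, 1)
        coord = (xs * np.cos(theta) + ys * np.sin(theta)) / w
        mask += amp * np.sin(2 * np.pi * freq * coord + phase)
    mask /= np.abs(mask).max()                  # normalize to [-1, 1]
    stimulus = (1 - blend) * base + blend * 0.5 * (mask + 1)
    return np.clip(stimulus, 0, 1), mask
```

Averaging the noise masks associated with each response category ("happy" vs. "sad") then yields a classification image of the diagnostic features.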
A gaze direction determining system and method is provided. A two-camera system may detect the face from a fixed, wide-angle camera, estimate a rough location for the eye region using an eye detector based on topographic features, and direct another active pan-tilt-zoom camera to focus in on this eye region. An eye gaze estimation approach employs...
Facial expression is central to human experience. Its efficient and valid measurement is a challenge that automated facial image analysis seeks to address. Most publicly available databases are limited to 2D static images or video of posed facial behavior. Because posed and un-posed (aka “spontaneous”) facial expressions differ along several dim...
Head pose is an important indicator of a person's attention, gestures, and communicative behavior with applications in human-computer interaction, multimedia, and vision systems. Robust head pose estimation is a prerequisite for spontaneous facial biometrics-related applications. However, most previous head pose estimation methods do not consider t...
In this paper, we propose a multi-scale topological feature representation for automatic analysis of hand posture. Such topological features have the advantage of being posture-dependent while being preserved under certain variations of illumination, rotation, personal dependency, etc. Our method studies the topology of the holes between the hand r...
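The descriptor itself is richer than this, but the core idea of scale-dependent topology can be illustrated by counting enclosed holes in a binary hand mask after morphological closing at several scales. The sketch below is a simplified stand-in using SciPy; the scale set and structuring elements are assumptions.

```python
# Multi-scale hole counting: close the mask at several scales, then count
# background components that are not connected to the image border.
import numpy as np
from scipy import ndimage

def hole_counts(binary_hand: np.ndarray, scales=(1, 3, 5, 7)):
    """binary_hand: (H, W) bool mask. Returns the hole count at each scale."""
    counts = []
    for s in scales:
        closed = ndimage.binary_closing(binary_hand, structure=np.ones((s, s)))
        bg_labels, n = ndimage.label(~closed)       # label background regions
        border = np.unique(np.concatenate([bg_labels[0], bg_labels[-1],
                                           bg_labels[:, 0], bg_labels[:, -1]]))
        # enclosed holes = background components that never touch the border
        counts.append(n - len(border[border > 0]))
    return counts
```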
True immersion of a player within a game can only occur when the world simulated looks and behaves as close to reality as possible. This implies that the game must correctly read and understand, among other things, the player's focus, attitude toward the objects/persons in focus, gestures, and speech. In this paper, we propose a novel system that...
In this paper, we propose a novel method for detecting and tracking landmark facial features on purely geometric 3D and 4D range models. Our proposed method involves fitting a new multi-frame constrained 3D temporal deformable shape model (TDSM) to range data sequences. We consider this a temporal based deformable model as we concatenate consecutiv...
Facial expression is central to human experience. Its efficient and valid measurement is a challenge that automated facial image analysis seeks to address. Most publicly available databases are limited to 2D static images or video of posed facial behavior. Because posed and un-posed (aka "spontaneous") facial expressions differ along several dime...
In this paper, we propose a new, compact, 4D spatio-temporal “Nebula” feature to improve expression and facial movement analysis performance. Given a spatio-temporal volume, the data is voxelized and fit to a cubic polynomial. A label is assigned based on the principal curvature values, and the polar angles of the direction of least curvature are c...
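As a loose illustration of the fit-then-curvature step: the sketch below fits a cubic surface z = f(x, y) to a depth patch by least squares and takes the Hessian eigenvalues at the patch center as stand-ins for principal curvature values. The actual Nebula feature operates on a voxelized spatio-temporal volume, so this 2D surface version only mirrors the idea.

```python
# Fit a cubic polynomial surface to a depth patch, then read off curvature
# proxies from the Hessian at the patch center.
import numpy as np

def cubic_fit_curvatures(z_patch: np.ndarray) -> np.ndarray:
    """z_patch: (H, W) depth values on a regular grid. Returns (k1, k2), k1 <= k2."""
    h, w = z_patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x, y = (xs - w / 2).ravel(), (ys - h / 2).ravel()   # center the coordinates
    terms = [(i, j) for i in range(4) for j in range(4 - i)]  # monomials, deg <= 3
    A = np.stack([x**i * y**j for i, j in terms], axis=1)
    coef, *_ = np.linalg.lstsq(A, z_patch.ravel(), rcond=None)
    c = {ij: coef[k] for k, ij in enumerate(terms)}
    # Hessian of f at the origin: f_xx = 2*c[2,0], f_yy = 2*c[0,2], f_xy = c[1,1]
    H = np.array([[2 * c[(2, 0)], c[(1, 1)]],
                  [c[(1, 1)], 2 * c[(0, 2)]]])
    return np.linalg.eigvalsh(H)
```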
Welcome to the 10th IEEE International Conference on Automatic Face and Gesture Recognition (FG13) in Shanghai, China. The conference is the premier world conference on vision-based facial and body gesture modeling, analysis, and recognition. Since its first meeting in Zurich, the conference has been held nine times throughout the world. At its ten...
Automatic facial expression recognition constitutes an active research field due to the latest advances in computing technology that make the user's experience a clear priority. The majority of work conducted in this area involves 2D imagery, despite the problems this presents due to inherent pose and illumination variations. In order to deal with...
Shapes with complex geometric and topological features such as tunnels, neighboring sheets, and cavities are susceptible to undersampling and continue to challenge existing reconstruction techniques. In this work we introduce a new measure for point clouds to determine the likely interior and exterior regions of an object. Specifically, we adapt th...
This book contains five survey papers written on the topics of the tutorials presented at the 25th SIBGRAPI - Conference on Graphics, Patterns and Images, held in Ouro Preto, Minas Gerais, Brazil from August 22-25, 2012. This is the fourth year that tutorial papers from SIBGRAPI are published by IEEE CPS. The authors of accepted tutorials are invit...
This paper presents an image construction tool for biological image visualization and education using image matching and stitching approaches. The image matching technique is based on the SURF algorithm (Speeded-Up Robust Features) [3, 4], a successor to the popular feature detection algorithm SIFT (Scale-Invariant Feature Transform) [1, 2]. Unlike...
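A minimal matching-and-stitching sketch with OpenCV is shown below. SIFT is substituted for SURF because SURF ships only in opencv-contrib "nonfree" builds; the ratio test, RANSAC homography, and warp follow a standard pipeline and should be read as an assumption-laden stand-in, not the tool's actual implementation.

```python
# Feature matching + homography estimation + perspective warp with OpenCV.
import cv2
import numpy as np

def stitch_pair(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
    src = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = img2.shape[:2]
    # img1 warped into img2's frame; a full stitch would then blend img2 on top
    return cv2.warpPerspective(img1, H, (w * 2, h))
```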
Head pose is an important indicator of a person's attention, gestures, and communicative behavior with applications in human computer interaction, multimedia and vision systems. In this paper, we present a novel head pose estimation system by performing head region detection using the Kinect [2], followed by face detection, feature tracking, and fi...
This paper presents a novel dynamic curvature based approach (dynamic shape-index based approach) for 3D face analysis. This method is inspired by the idea of 2D dynamic texture and 3D surface descriptors. The dynamic texture (DT) based approaches [30][31][32] encode and model the local texture features in the temporal axis, and have achieved great...
3D facial representations have been widely used for face recognition. There has been intensive research on geometric matching and similarity measurement on 3D range data and 3D geometric meshes of individual faces. However, little investigation has been done on geometric measurement for 3D sketch models. In this paper, we study the 3D face recognit...
Research on 3D face models relies on extraction of feature points for segmentation, registration, or recognition. Robust feature point extraction from pure geometric surface data is still a challenging issue. In this project, we attempt to automatically extract feature points from 3D range face models without texture information. Human facial surfa...
In this paper, we present a vision-based human-computer interaction system, which integrates control components using multiple gestures, including eye gaze, head pose, hand pointing, and mouth motions. To track head, eye, and mouth movements, we present a two-camera system that detects the face from a fixed, wide-angle camera, estimates a rough loc...
True immersion of a user within a game is only possible when the world simulated looks and behaves as close to reality as possible. This implies that the game must ascertain, among other things, the user’s focus and his/her attitude towards the object or person focused on. As part of the effort to achieve this goal, we propose an eye gaze, head pos...
In this work we propose a new method for estimating the normal orientation of unorganized point clouds. Consistent assignment of normal orientation is a challenging task in the presence of sharp features, nearby surface sheets, noise, undersampling, and missing data. Existing approaches, which consider local geometric properties, often fail when ope...
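For contrast with the new measure proposed here, the sketch below implements the standard local-PCA baseline (smallest-eigenvector normals, oriented toward a viewpoint), which is precisely the kind of purely local scheme the abstract says breaks near sharp features and nearby sheets. The k and viewpoint parameters are assumptions.

```python
# Local-PCA normal estimation: the normal is the direction of least variance
# of each point's k-nearest-neighbor patch, flipped toward a viewpoint.
import numpy as np
from scipy.spatial import cKDTree

def pca_normals(points: np.ndarray, k: int = 16, viewpoint=(0.0, 0.0, 0.0)):
    """points: (N, 3) float array. Returns (N, 3) unit normals."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    normals = np.empty((len(points), 3))
    vp = np.asarray(viewpoint, dtype=float)
    for i, nbrs in enumerate(idx):
        q = points[nbrs] - points[nbrs].mean(axis=0)   # centered neighborhood
        _, _, vt = np.linalg.svd(q, full_matrices=False)
        n = vt[-1]                                     # least-variance direction
        if np.dot(n, vp - points[i]) < 0:              # orient toward viewpoint
            n = -n
        normals[i] = n
    return normals
```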
Understanding how humans recognize face sketches drawn by artists is of significant value to both criminal investigators and researchers in computer vision, face biometrics and cognitive psychology. However, large scale experimental studies of hand-drawn face sketches are still very limited in terms of the number of artists, the number of sketches,...
3D face scans have been widely used for face modeling and face analysis. Due to the fact that face scans provide variable point clouds across frames, they may not capture complete facial data or miss point-to-point correspondences across various facial scans, thus making it difficult to use such data for analysis. This paper presents an efficient a...
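The registration machinery is not detailed in the teaser, but correspondence pipelines for scans typically iterate a rigid-alignment core (ICP-style). Here is a sketch of that Kabsch/Procrustes step in NumPy; real face-scan pipelines add nearest-neighbor matching per iteration and often non-rigid deformation on top.

```python
# One Kabsch/Procrustes step: the rigid transform (R, t) that best aligns
# corresponding point sets in the least-squares sense.
import numpy as np

def kabsch(src: np.ndarray, dst: np.ndarray):
    """src, dst: (N, 3) corresponding points. Returns R (3, 3) and t (3,)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs
```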
This Final Technical Report discusses the accomplishments of a research effort to develop a system for real time eye tracking and hand pointing tracking using regular cameras for human computer interaction. Several novel algorithms for eye detection and eyeball estimation were implemented, including a new model for eye gaze estimation and eye detec...
The ability to capture the direction the eyes point in while the subject is a distance away from the camera offers the potential for intuitive human-computer interfaces, allowing for a greater interactivity, more intelligent HCI behavior, and increased flexibility. In this paper, we present a two-camera system that detects the face from a fixed, wi...
Hand pointing has been an intuitive gesture for human interaction with computers. Big challenges are still posed for accurate estimation of finger pointing direction in a 3D space. In this paper, we present a novel hand pointing estimation system based on two regular cameras, which includes hand region detection, hand finger estimation, two views'...
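Assuming calibrated 3x4 projection matrices for the two cameras, the two-view triangulation step can be sketched with OpenCV as below. The hand and fingertip detection stages the paper describes are not shown, and all names are illustrative.

```python
# Two-view triangulation of a detected fingertip via cv2.triangulatePoints.
import cv2
import numpy as np

def triangulate_fingertip(P1, P2, pt1, pt2) -> np.ndarray:
    """P1, P2: (3, 4) projection matrices; pt1, pt2: (x, y) fingertip pixels.
    Returns the fingertip's (3,) XYZ position."""
    a = np.asarray(pt1, dtype=np.float64).reshape(2, 1)
    b = np.asarray(pt2, dtype=np.float64).reshape(2, 1)
    X_h = cv2.triangulatePoints(P1, P2, a, b)     # homogeneous (4, 1) result
    return (X_h[:3] / X_h[3]).ravel()             # dehomogenize

# Pointing direction can then be taken as the ray from a triangulated knuckle
# to the fingertip: direction = fingertip_xyz - knuckle_xyz.
```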
We propose an automatic deformation-driven correspondence algorithm for 3D point sets of non-rigid articulated shapes. Our approach uses simple geometric cages to embed the point set data and extract and match a coarse set of prominent features. We seek feature correspondences which lead to low-distortion deformations of the cages while satisfying...