
Junfeng Yao
- Ph.D
- Head of Department at Xiamen University
About
141
Publications
26,022
Reads
1,393
Citations
Additional affiliations
January 2004 - February 2021
School of Informatics, Xiamen University
Position
- Head of Faculty
Publications (141)
Aspect Sentiment Quad Prediction (ASQP) is the most complex subtask of Aspect-based Sentiment Analysis (ABSA), aiming to predict all sentiment quadruples within the given sentence. Due to the complexity of sentence syntaxes and the diversity of sentiment expressions, generative methods gradually become the mainstream approach in ASQP. However, exis...
RALMs (Retrieval-Augmented Language Models) broaden their knowledge scope by incorporating external textual resources. However, the multilingual nature of global knowledge requires RALMs to handle diverse languages, a topic that has received limited research focus. In this work, we propose Futurepedia, a carefully crafted benchmark con...
Gesture recognition based on surface electromyography (sEMG) has seen considerable improvements in performance across various tasks and metrics with the rapid development of deep learning. However, challenges still exist in current deep neural networks for sEMG recognition. For instance, convolutional neural networks exhibit poor capturing of globa...
Due to the continuous emergence of new data, version updates have become an indispensable requirement for Large Language Models (LLMs). The training paradigms for version updates of LLMs include pre-training from scratch (PTFS) and continual pre-training (CPT). Preliminary experiments demonstrate that PTFS achieves better pre-training performance,...
Conversational query generation aims at producing search queries from dialogue histories, which are then used to retrieve relevant knowledge from a search engine to help knowledge-based dialogue systems. Trained to maximize the likelihood of gold queries, previous models suffer from the data hunger issue, and they tend to both drop important concep...
Existing methods for audio-driven talking head video editing suffer from poor visual effects. This paper tackles this problem by seamlessly editing talking face images with different emotions based on two modules: (1) an audio-to-landmark module, consisting of the CrossReconstructed Emotion Disentanglement and an alignmen...
Dynamic reconstruction of deformable tissues in endoscopic video is a key technology for robot-assisted surgery. Recent reconstruction methods based on neural radiance fields (NeRFs) have achieved remarkable results in the reconstruction of surgical scenes. However, based on implicit representation, NeRFs struggle to capture the intricate details o...
Cross-document Relation Extraction aims to predict the relation between target entities located in different documents. In this regard, the dominant models commonly retain useful information for relation prediction via bridge entities, which allows the model to elaborately capture the intrinsic interdependence between target entities. However, thes...
Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have been widely employed in medical image segmentation. While CNNs excel in local feature encoding, their ability to capture long-range dependencies is limited. In contrast, ViTs have strong global modeling capabilities. However, existing attention-based ViT models face difficulti...
Medical image segmentation plays a crucial role in medical artificial intelligence. Recent advancements in computer vision have introduced multiscale ViT (Vision Transformer), revealing its robustness and superior feature extraction capabilities. However, the independent processing of data patches by ViT often leads to insufficient attention to fin...
Optical Music Recognition (OMR) is a research field aimed at exploring how computers can read sheet music in music documents. In this paper, we propose an end-to-end OMR model based on memory units optimization and attention mechanisms, named ATTML. Firstly, we replace the original LSTM memory unit with a better Mogrifier LSTM memory unit, which en...
Micro-expressions (MEs) are characterized by small motion amplitude and short duration. Learning discriminative ME features is a key issue in ME recognition. Motivated by the success of the PCB model in person retrieval, this paper proposes an ME recognition method called PCB-PCANet+. Considering that the important information of MEs is mainl...
Recent knowledge graph embedding models have shown promising results on link prediction by employing operations on quaternion space to capture correlations between entities. However, they use only three quaternion embeddings for rotation calculation, which fails to capture the interaction between entities and relations. The single relation quaterni...
Recently, pioneering work has improved segmentation performance by combining the self-attention (SA) mechanism with UNet. However, since SA can only model its own features in a single sample, it ignores the potential relevance of the whole dataset. Additionally, medical image datasets are typically small, making it crucial to obtain as many feature...
Molecular representation learning is a fundamental problem in the field of drug discovery and molecular science. Incorporating molecular 3D information into molecular representations seems beneficial; this is related to computational chemistry, whose basic task is predicting stable 3D structures (conformations) of molecules. Existing...
Detecting pneumonia, especially coronavirus disease 2019 (COVID-19), from chest X-ray (CXR) images is one of the most effective ways for disease diagnosis and patient triage. The application of deep neural networks (DNNs) for CXR image classification is limited due to the small sample size of the well-curated data. To tackle this problem, this arti...
In real-world systems, scaling has been critical for improving the translation quality in autoregressive translation (AT), which however has not been well studied for non-autoregressive translation (NAT). In this work, we bridge the gap by systematically studying the impact of scaling on NAT behaviors. Extensive experiments on six WMT benchmarks ov...
Recovering three-dimensional structure from images is an important research topic in computer vision. The quality of feature matching is one of the keys to obtaining more accurate results. However, as different objects or different surfaces of objects have similar images with the same elements and different typography, the camera pose estimatio...
Conversational discourse analysis aims to extract the interactions between dialogue turns, which is crucial for modeling complex multi-party dialogues. As the benchmarks are still limited in size and human annotations are costly, the current standard approaches apply pretrained language models, but they still require randomly initialized classifier...
Transformer benefits from the high parallelization of attention networks in fast training, but it still suffers from slow decoding partially due to the linear dependency O(m) of the decoder self-attention on previous target words at inference. In this paper, we propose a generalized average attention network (AAN+) aiming at speeding up decoding by...
The proposed method is to solve the robot navigation problem in a crowded environment. The complexity of navigation in the crowd increases with the number of people. With the increase of the number of pedestrians in the crowd, the success rate of the existing methods is very low. Recent research shows that deep reinforcement learning has good solving ab...
Recent works on fine-grained visual categorization rely on detecting discriminative regions that correspond to specific visual patterns. Promising progress has been obtained by constructing complicated network architecture, which either involves explicitly or implicitly capturing subtle differences to learn part-level representations. Instead of so...
Open-set recognition is an important topic in pattern recognition research. Unlike the closed-set recognition task, in the open-set recognition problem the test data contains unknown classes that do not appear in the training phase. Consequently, the recognition of open-set data is much more difficult than that of the clos...
Point cloud processing has received more attention in recent years. Due to the huge amount of data, using supervoxels to pre-segment the points can improve the performance of point cloud processing tasks. There are some supervoxel algorithms generating high-quality results, but their low efficiency hinders the wide application in point cloud proces...
Finding target molecules with specific chemical properties plays a decisive role in drug development. We proposed GEOM-CVAE, a constrained variational autoencoder based on geometric representation for molecular generation with specific properties, which is protein-context-dependent. In terms of machine learning, it includes continuous feature embed...
Motivation
Polypharmacy is the combined use of drugs for the treatment of diseases. However, it often shows a high risk of side effects. Due to unnecessary interactions of combined drugs, the side effects of polypharmacy increase the risk of disease and even lead to death. Thus, obtaining abundant and comprehensive information on the side effects o...
Deep learning has brought a rapid development in the aspect of molecular representation for various tasks, such as molecular property prediction. The prediction of molecular properties is a crucial task in the field of drug discovery for finding specific drugs with good pharmacological activity and pharmacokinetic properties. SMILES string is alway...
Despite the development of computer vision techniques, the micro-expression (ME) recognition task still remains a great challenge because MEs have very low intensity and short duration. However, the ME recognition is of great significance since it provides important clues for real affective states detection. This paper proposes a novel Block Divisi...
In the intelligent processing of tongue images, one of the most important tasks is to accurately segment the tongue body from a whole tongue image, and high-quality tongue body edge processing is of great significance for the relevant tongue feature extraction. To improve the performance of the segmentation model for tongue images, we p...
Currently, the most dominant neural code generation models are often equipped with a tree-structured LSTM decoder, which outputs a sequence of actions to construct an Abstract Syntax Tree (AST) via pre-order traversal. However, such a decoder has two obvious drawbacks. First, except for the parent action, other faraway and important history actions...
Event-Related Desynchronization (ERD) or Electroencephalogram (EEG) wavelet is essential for motor imagery (MI) classification and BMI (Brain–Machine Interface) application. However, it is difficult to recognize multiple tasks for non-trained subjects that are indispensable for the complexities of the task or the uncertainties in the environment. T...
Video over-segmentation into supervoxels is an important pre-processing technique for many computer vision tasks. Videos are an order of magnitude larger than images. Most existing methods for generating supervoxels are either memory- or time-inefficient, which limits their application in subsequent video processing tasks. In this paper, we present...
Dominant sentence ordering models can be classified into pairwise ordering models and set-to-sequence models. However, there is little attempt to combine these two types of models, which intuitively possess complementary advantages. In this paper, we propose a novel sentence ordering framework which introduces two classifiers to make better use of...
With the development of Electric Vehicle (EV) technology, the new generation of EVs combines the advantages of both Wireless Charging Technology (WCT) and Plug-in Charging Technology (PCT) to extend their transport distance. However, some difficulties emerge in the Vehicle Routing Problem (VRP) for EVs based on this hybrid charging strategy. First, th...
We propose a framework of efficient nonlinear deformable simulation with both fast continuous collision detection and robust collision resolution. We name this new framework Medial IPC as it integrates the merits from medial elastics, for an efficient and versatile reduced simulation, as well as incremental potential contact, for a robust collision...
Conversational discourse structures aim to describe how a dialogue is organized, thus they are helpful for dialogue understanding and response generation. This paper focuses on predicting discourse dependency structures for multi-party dialogues. Previous work adopts incremental methods that take the features from the already predicted discourse re...
Chunyan Li, Wei Wei, Jin Li, [...], Zhihan Lv
Studying deep learning-based molecular representation is of great significance for predicting molecular properties, promoting the development of drug screening and new drug discovery, and improving human well-being by avoiding illnesses. It is essential to learn the characterization of drugs for various downstream tasks, such as molecular property pr...
Existing machine learning methods for classification and recognition of EEG motor imagery usually suffer from reduced accuracy for limited training data. To address this problem, this paper proposes a multi-rhythm capsule network (FBCapsNet) that uses as little EEG information as possible with key features to classify motor imagery and further impr...
Code generation aims to automatically generate a piece of code given an input natural language utterance. Currently, among dominant models, it is treated as a sequence-to-tree task, where a decoder outputs a sequence of actions corresponding to the pre-order traversal of an Abstract Syntax Tree. However, such a decoder only exploits the preorder tr...
Motivation:
Geometry-based properties and characteristics of drug molecules play an important role in drug development for virtual screening in computational chemistry. The 3D characteristics of molecules largely determine the properties of the drug and the binding characteristics of the target. However, most of the previous studies focused on 1D...
As an important text coherence modeling task, sentence ordering aims to coherently organize a given set of unordered sentences. To achieve this goal, the most important step is to effectively capture and exploit global dependencies among these sentences. In this paper, we propose a novel and flexible external knowledge enhanced graph-based neural n...
The medial axis transform (MAT) of a 3D shape includes the set of centers and radii of the maximally inscribed spheres, and is a complete shape descriptor that can be used to reconstruct the original shape. It is a compact representation that jointly describes geometry, topology, and symmetry properties of a given shape. In this work, we present P2...
We propose a framework for the interactive simulation of nonlinear deformable objects. The primary feature of our system is the seamless integration of deformable simulation and collision culling, which are often independently handled in existing animation systems. The bridge connecting them is the medial axis transform (MAT), a high-fidelity volum...
Electroencephalography (EEG) topographical representation (ETR) can monitor regional brain activities and is emerging as a successful technique for causally exploring cortical mechanisms and connections. However, it is a challenge to find a robust method supporting high-dimensional EEG data with low signal-to-noise ratios from multiple objects and...
Based on a unified encoder-decoder framework with attentional mechanism, neural machine translation (NMT) models have attracted much attention and become the mainstream in the community of machine translation. Generally, the NMT decoders produce translation in a left-to-right way. As a result, only left-to-right target-side contexts from the genera...
In this paper, we propose a human-marionette interaction system to enhance the interactivity of marionette shows. The proposed system is composed of a mechanical arm, an L-shaped screen, a Kinect, a computer, and audio equipment. Using gesture recognition and voice recognition, this system is designed to recognize the audience's gestures and voice t...
Extracting a faithful and compact representation of an animated surface mesh is an important problem for computer graphics. However, the surface‐based methods have limited approximation power for volume preservation when the animated sequences are extremely simplified. In this paper, we introduce Deformable Medial Axis Transform (DMAT), which is de...
Superpixel segmentation is a popular image pre‐processing technique in many computer vision applications. In this paper, we present a novel superpixel generation algorithm by agglomerative clustering with quadratic error minimization. We use a quadratic error metric (QEM) to measure the difference of spatial compactness and colour homogeneity betwe...
The human eye's state of motion and content of interest can express people's cognitive and emotional status based on their situation. When observing their surroundings, human eyes make different movements according to the observed objects, which reflect human attention and interest. In this paper, we capture and analyze pattern...
Many appearance-based and geometry-based approaches have been proposed in facial expression recognition. In this paper, we propose a method of learning and combining spatiotemporal features and geometric features for video-based expression recognition. Specifically, we first adopt a multi-layer independent subspace analysis (ISA) network to learn s...
Purpose
In the process of robot shell design, it is necessary to match the shape of the input 3D original character mesh model to the robot endoskeleton so that the input model fits the robot and avoids collision. The purpose of this paper is therefore to find a reference object that can be used in the shape-matching process.
Design/metho...
The medial axis is a natural skeleton for shapes. However, it is rarely used in the existing skeleton-based shape deformation techniques. In this paper, we propose a novel medial-axis-driven skin surface deformation algorithm with volume preservation property. Specifically, an as-rigid-as-possible deformation scheme is used to deform the medial axi...
The translation model, which contains translation rules with probabilities, plays a crucial role in statistical machine translation. Conventional methods estimate translation probabilities by considering only the co-occurrence frequencies of bilingual translation units, while ignoring document-level context information. In this paper, we extend the con...
Designing a robot’s appearance is a challenging task because the design should be both aesthetically appealing and physically functional. Therefore, this task was previously limited to experts with professional knowledge and experiences. Given the increasing popularity of consumer-level robots, non-professional users are expecting tools that allow...
This poster presents a surgical training system for four medical punctures based on virtual reality and haptic feedback, including a client program developed in the Unity3D game engine and a server program developed in PHP. This system provides immersive surgery simulation for thoracentesis, lumbar puncture, bone marrow puncture and abdominal p...
Due to the high complexity of vascular system network, the geometry reconstruction of vasculatures from raw medical datasets remains a very challenging task. In this paper, we present a novel skeleton-based method for the geometry reconstruction of vascular structures from standard 3D medical datasets. With the proposed techniques, the geometry of...
In the plastic injection industry, workers manually adjust operations, relying on observation and hand touching, to prevent molds from sticking together, an approach characterized by extremely low efficiency and high labor cost. In this paper we developed an automatic mold monitoring system which reduces the mold repairing cost...
Researchers have proposed various techniques for cloth simulation in the last decades. The crucial problem in interactive animation for cloth is how to speed up simulation with a stable system. In this paper, we describe a realistic and stable scheme for cloth simulation based on a mass-spring system. This scheme modifies semi-implicit integration us...