Fuchun Sun

Tsinghua University | TH · Department of Computer Science and Technology

About

742
Publications
61,731
Reads
13,470
Citations

Publications (742)
Article
With the advance of smart material science, robotics is evolving from rigid robots to soft robots. Compared to rigid robots, soft robots can safely interact with the environment, easily navigate in unstructured fields, and be miniaturized to operate in narrow spaces, owing to the new actuation and sensing technologies developed by the smart materials...
Article
The stochastic Takagi-Sugeno (T-S) fuzzy discrete dynamic output feedback control is investigated for a class of stochastic T-S fuzzy discrete networked control systems. Firstly, the T-S fuzzy model and stochastic Bernoulli theory are employed to approximate the discrete system plant. Secondly, the T-S fuzzy model and stochastic Bernoulli theory ar...
Article
Full-text available
Tongue diagnosis is a convenient and noninvasive clinical practice of traditional Chinese medicine (TCM), having existed for thousands of years. Prickle, as an essential indicator in TCM, appears as a large number of red thorns protruding from the tongue. The term “prickly tongue” has been used to describe the flow of qi and blood in TCM and assess...
Article
Robots operating in human environments have long been expected to execute specified tasks by following language instructions. Most current methods rely only on visual perception to understand the language instruction, which may not be sufficient to fully interpret some language instructions when visually identical objects exist. In thi...
Preprint
Few-shot learning models learn representations with limited human annotations, and such a learning paradigm demonstrates practicability in various tasks, such as image classification and object detection. However, few-shot object detection methods suffer from an intrinsic defect: the limited training data prevents the model from sufficiently exp...
Article
Full-text available
Natural interaction between a prosthetic hand and an upper-limb amputee is important and directly affects the rehabilitation effect and operation ability. Most previous studies focused only on gesture interaction and ignored force. This paper proposes a method for the simultaneous recognition of gestures and forces for interac...
Preprint
Full-text available
Prevailing graph neural network models have achieved significant progress in graph representation learning. However, in this paper, we uncover a previously overlooked phenomenon: a pre-trained graph representation learning model tested with full graphs underperforms the same model tested with well-pruned graphs. This observation reveals that there exis...
Preprint
Full-text available
The ability to handle objects in cluttered environments has long been anticipated by the robotics community. However, most works focus merely on manipulation instead of rendering hidden semantic information in cluttered objects. In this work, we introduce the scene graph for embodied exploration in cluttered scenarios to solve this problem. To validat...
Conference Paper
Recent works explore learning graph representations in a self-supervised manner. In graph contrastive learning, benchmark methods apply various graph augmentation approaches. However, most of the augmentation methods are non-learnable, which causes the issue of generating unbeneficial augmented graphs. Such augmentation may degenerate the represent...
Article
Multifingered hand dexterous manipulation is quite challenging in the domain of robotics. One remaining issue is how to achieve compliant behaviors. In this work, we propose a human-in-the-loop learning-control approach for acquiring compliant grasping and manipulation skills of a multifinger robot hand. This approach takes the depth image of the h...
Article
Keypoint detection and description play a central role in computer vision. Most existing methods are in the form of scene-level prediction, without returning the object classes of different keypoints. In this paper, we propose the object-centric formulation, which, beyond the conventional setting, requires further identifying which object each inte...
Preprint
Full-text available
Graph instance contrastive learning has proven to be an effective task for Graph Neural Network (GNN) pre-training. However, one key issue may seriously impede the representative power in existing works: positive instances created by current methods often miss crucial information of graphs or even yield illegal instances (such as non-chemically-a...
Article
We propose a deep fine-grained multi-level fusion architecture for monocular 3D object detection, with an additionally designed anti-occlusion optimization process. Conventional monocular 3D object detection methods usually leverage geometry constraints such as keypoints, object shape relationships, and 3D to 2D optimizations to offset the lack of...
Preprint
Full-text available
Detecting 3D keypoints from point clouds is important for shape reconstruction, while this work investigates the dual question: can shape reconstruction benefit 3D keypoint detection? Existing methods either seek salient features according to statistics of different orders or learn to predict keypoints that are invariant to transformation. Neverthe...
Preprint
Full-text available
Many adaptations of transformers have emerged to address the single-modal vision tasks, where self-attention modules are stacked to handle input sources like images. Intuitively, feeding multiple modalities of data to vision transformers could improve the performance, yet the inner-modal attentive weights may also be diluted, which could thus under...
Article
Purpose: The purpose of this paper is to present a novel tactile sensor and a visual-tactile recognition framework to reduce the uncertainty of the visual recognition of transparent objects. Design/methodology/approach: A multitask learning model is used to recognize intuitive appearance attributes except texture in the visual mode. Tactile mode ado...
Preprint
After the great success of Vision Transformer variants (ViTs) in computer vision, they have also demonstrated great potential in domain adaptive semantic segmentation. Unfortunately, straightforwardly applying local ViTs in domain adaptive semantic segmentation does not bring the expected improvement. We find that the pitfall of local ViTs is due to th...
Preprint
Full-text available
Learning to reason about relations and dynamics over multiple interacting objects is a challenging topic in machine learning. The challenges mainly stem from the fact that interacting systems are exponentially compositional, symmetrical, and commonly geometrically constrained. Current methods, particularly the ones based on equivariant Graph Neural Netw...
Article
Full-text available
Currently, robotic grasping methods based on sparse partial point clouds have attained excellent grasping performance on various objects. However, they often generate wrong grasping candidates due to the lack of geometric information on the object. In this work, we propose a novel and robust sparse shape completion model (TransSC). This model has a...
Article
Sensory perception for dexterous robotic hands is an active research area in robotics that has seen recent progress. Effective dexterous manipulation requires robotic hands to accurately feed back their state or perceive the surrounding environment. This article reviews the state of the art in sensory perception for dexterous robotic manipulation. Two types o...
Preprint
Grasping has long been considered an important and practical task in robotic manipulation. Yet achieving robust and efficient grasps of diverse objects is challenging, since it involves gripper design, perception, control and learning, etc. Recent learning-based approaches have shown excellent performance in grasping a variety of novel objects. How...
Preprint
Full-text available
The audio-visual navigation task requires an agent to find a sound source in a realistic, unmapped 3D environment by utilizing egocentric audio-visual observations. Existing audio-visual navigation works assume a clean environment that contains only the target sound, an assumption that would not hold in most real-world applications due to the une...
Preprint
Full-text available
Equivariant Graph Neural Networks (EGNs) are powerful in characterizing the dynamics of multi-body physical systems. Existing EGNs conduct flat message passing, which is unable to capture the spatial/dynamical hierarchy of complex systems in particular, limiting substructure discovery and global information fusion. In this paper, we propose E...
Preprint
Full-text available
Keypoint detection and description play a central role in computer vision. Most existing methods are in the form of scene-level prediction, without returning the object classes of different keypoints. In this paper, we propose the object-centric formulation, which, beyond the conventional setting, requires further identifying which object each inte...
Article
The rich content in various real-world networks such as social networks, biological networks, and communication networks provides unprecedented opportunities for unsupervised machine learning on graphs. This paper investigates the fundamental problem of preserving and extracting abundant information from graph-structured data into embedding space w...
Article
In this article, the problem of exponential mean-square stability analysis is discussed for uncertain networked control systems expressed by a stochastic T-S fuzzy model. In general, the characteristics of random occurrence for multipath packet dropouts often exist in the signal transmission network. For dealing with this difficult point, a dynamic...
Preprint
Full-text available
In the Vision-and-Language Navigation task, the embodied agent follows linguistic instructions and navigates to a specific goal. It is important in many practical scenarios and has attracted extensive attention from both computer vision and robotics communities. However, most existing works only use RGB images but neglect the 3D semantic informatio...
Chapter
The Vision Transformer (ViT) [6] directly applies a Transformer architecture to image classification and achieves an impressive result compared with convolutional networks. This paper presents a new ViT-based camouflaged object segmentation method, called COS Transformer, which aims to identify and segment objects concealed in a complex environment....
Preprint
Full-text available
Recent works explore learning graph representations in a self-supervised manner. In graph contrastive learning, benchmark methods apply various graph augmentation approaches. However, most of the augmentation methods are non-learnable, which causes the issue of generating unbeneficial augmented graphs. Such augmentation may degenerate the represent...
Chapter
This article introduces pedestrian trajectory prediction, which is a crucial step in the perception of autonomous driving. The controller system should predict the person’s motion before making a decision. Pedestrian trajectory prediction can be divided into two sub-problems: modeling historical trajectories and modeling pedestrian social relations...
Article
It has always been a great challenge for a robot to navigate in the visual world following natural language instructions. Recently, several tasks such as Vision-and-Language Navigation (VLN) and Remote Embodied Visual Referring Expression in Real Indoor Environments (REVERIE) have been proposed to address this challenge. And the most significan...
Article
Full-text available
Embodiment is an important characteristic for all intelligent agents, while existing scene description tasks mainly focus on analyzing images passively and the semantic understanding of the scenario is separated from the interaction between the agent and the environment. In this work, we propose the Embodied Scene Description, which exploits the em...
Article
This paper addresses the delay-dependent Takagi-Sugeno (T-S) fuzzy state feedback control and exponential admissibility analysis for a class of T-S fuzzy singular uncertain systems. Firstly, the T-S fuzzy model is employed to approximate the singular uncertain system with time-varying delay, saturation input and unmatched disturbance. Secondly, the...
Article
Typically, the fingertips of a dexterous robotic hand are designed to be rigid and equipped with a variety of sensors to provide tactile perception. However, this design scheme renders the grasping of small objects difficult because rigid fingertips usually have a small contact area with the objects. In this paper, we propose a novel fingertip desi...
Article
This article studies robust intelligent control for the longitudinal dynamics of a flexible hypersonic flight vehicle with an input dead zone. Considering the different time-scale characteristics among the system states, the singular perturbation decomposition is employed to transform the rigid-elastic coupling model into the slow dynamics and the f...
Preprint
Full-text available
Multimodal fusion and multitask learning are two vital topics in machine learning. Despite the fruitful progress, existing methods for both problems are still brittle to the same challenge: it remains difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. t...
Article
Full-text available
Convolutional neural networks (CNNs) have gradually been applied to steady-state visual evoked potentials (SSVEPs) in brain-computer interfaces (BCIs). Frequency-domain features extracted by the fast Fourier transform (FFT) or time-domain signals are used as network input. In the frequency-domain diagram, the features at short time windows are not obvi...
Preprint
Full-text available
In the low-bit quantization field, training Binary Neural Networks (BNNs) is the extreme solution to ease the deployment of deep models on resource-constrained devices, having the lowest storage cost and significantly cheaper bit-wise operations compared to 32-bit floating-point counterparts. In this paper, we introduce Sub-bit Neural Networks (SNN...
Article
Full-text available
The high positioning accuracy of robotic systems is fundamental and critical for dexterous and precise manipulation. In this paper, a novel accurate positioning method based on vision and tactile sensors for object pose estimation in robotic manipulation is proposed to improve the positioning accuracy of robotic systems. The proposed methodology ma...
Preprint
Referring expressions are commonly used when referring to a specific target in people's daily dialogue. In this paper, we develop a novel task of audio-visual grounding referring expression for robotic manipulation. The robot leverages both the audio and visual information to understand the referring expression in the given manipulation instruction...
Preprint
In visual semantic navigation, the robot navigates to a target object with egocentric visual observations and the class label of the target is given. It is a meaningful task inspiring a surge of relevant research. However, most of the existing models are only effective for single-agent navigation, and a single agent has low efficiency and poor faul...
Preprint
Full-text available
In this paper, we propose a novel Knowledge-based Embodied Question Answering (K-EQA) task, in which the agent intelligently explores the environment to answer various questions using knowledge. Unlike existing EQA work, which explicitly specifies the target object in the question, the agent can resort to external knowledge to understand mo...
Article
The interval type-2 Takagi-Sugeno (T-S) fuzzy dynamic output feedback and H-infinity stability analysis are studied for a class of networked control systems with multiple time-varying additive uncertainties, time-varying signal communication delay and external disturbance. Firstly, the interval type-2 T-S fuzzy model is employed to denote the system plant...
Preprint
Full-text available
Tactile sensing plays an important role in robotic perception and manipulation tasks. To overcome the real-world limitations of data collection, simulating tactile response in a virtual environment comes as a desirable direction of robotic research. In this paper, we propose Elastic Interaction of Particles (EIP) for tactile simulation. Most existi...
Preprint
Full-text available
We propose a compact and effective framework to fuse multimodal features at multiple layers in a single network. The framework consists of two innovative fusion schemes. Firstly, unlike existing multimodal methods that necessitate individual encoders for different modalities, we verify that multimodal features can be learnt within a shared single n...
Article
Full-text available
In several epidemic diseases, one of the main symptoms exhibited by people is abnormal body temperature. Therefore, monitoring body temperature is crucial for preventing the spread of infectious diseases and facilitating timely responses. This study presents a wearable bracelet that can be used as a temperature monitoring and trajectory analysis sy...
Article
Full-text available
Gesture recognition based on surface electromyography (sEMG) has been widely used for human-computer interaction. However, few studies have addressed how to overcome the influence of physiological factors that vary among individuals. In this paper, a cross-individual gesture recognition method based on long short-term memory (LSTM) networks is p...
Article
The H∞ stability analysis and interval type‐2 Takagi‐Sugeno (T‐S) fuzzy control are studied for a class of interval type‐2 T‐S fuzzy systems. Firstly, the interval type‐2 T‐S fuzzy model is employed to approximate the system plant. Secondly, the delay‐dependent interval type‐2 T‐S fuzzy dynamic output feedback controller is designed. Thirdly, two cl...
Preprint
Currently, robotic grasping methods based on sparse partial point clouds have attained a great grasping performance on various objects while they often generate wrong grasping candidates due to the lack of geometric information on the object. In this work, we propose a novel and robust shape completion model (TransSC). This model has a transformer-...