About
128
Publications
23,864
Reads
1,730
Citations
Publications (128)
The reconstruction of low-textured areas is a prominent research focus in multi-view stereo (MVS). In recent years, traditional MVS methods have performed exceptionally well in reconstructing low-textured areas by constructing plane models. However, these methods often encounter issues such as crossing object boundaries and limited perception range...
Recently, patch deformation-based methods have demonstrated significant strength in multi-view stereo by adaptively expanding the reception field of patches to help reconstruct textureless areas. However, such methods mainly concentrate on searching for pixels without matching ambiguity (i.e., reliable pixels) when constructing deformed patches, wh...
Patch deformation-based methods have recently exhibited substantial effectiveness in multi-view stereo, due to the incorporation of deformable and expandable perception to reconstruct textureless areas. However, such approaches typically focus on exploring correlative reliable pixels to alleviate match ambiguity during patch deformation, but ignore...
Recently, patch-deformation methods have exhibited significant effectiveness in multi-view stereo owing to the deformable and expandable patches in reconstructing textureless areas. However, such methods primarily emphasize broadening the receptive field in textureless areas, while neglecting deformation instability caused by easily overlooked edge...
Simulating vehicle trajectories at intersections is one of the challenging tasks in traffic simulation. Existing methods are often ineffective due to the complexity and diversity of lane topologies at intersections, as well as the numerous interactions affecting vehicle motion. To address this issue, we propose a deep learning based vehicle traject...
Reconstructing textureless areas in MVS poses challenges due to the absence of reliable pixel correspondences within a fixed patch. Although certain methods employ patch deformation to expand the receptive field, their patches mistakenly skip depth edges to calculate areas with depth discontinuity, thereby causing ambiguity. Consequently, we introduc...
Predicting the trajectories of pedestrians is an important and difficult task for many applications, such as robot navigation and autonomous driving. Most of the existing methods assume that an accurate prediction of the pedestrian intention can improve the prediction quality. These works tend to predict a fixed destination coordinate as the agent...
In this paper, we introduce Segmentation-Driven Deformation Multi-View Stereo (SD-MVS), a method that can effectively tackle challenges in 3D reconstruction of textureless areas. We are the first to adopt the Segment Anything Model (SAM) to distinguish semantic instances in scenes and further leverage these constraints for pixelwise patch deformati...
Trajectory prediction with uncertainty is a critical and challenging task for autonomous driving. Nowadays, we can easily access sensor data represented in multiple views. However, cross-view consistency has not been evaluated by the existing models, which might lead to divergences between the multimodal predictions from different views. It is not...
This paper proposes a novel crowd simulation method which integrates not only modelling ideas but also advantages from both data‐driven methods and crowd dynamics methods. To seamlessly integrate these two different modelling ideas, first, a fusion crowd motion model is developed. In this model, the motion of the crowd is driven dynamically by differen...
Continuous sign language recognition (CSLR) is an essential task for communication between hearing-impaired people and people without hearing impairments, which aims at aligning low-density video sequences with high-density text sequences. The current methods for CSLR are mainly based on convolutional neural networks. However, these methods perform poorly in bal...
Multi-view stereo is an important research task in computer vision that remains challenging. In recent years, deep learning-based methods have shown superior performance on this task. Cost volume pyramid network-based methods, which progressively refine the depth map in a coarse-to-fine manner, have yielded promising results while consuming less me...
As a typical sequence to sequence task, sign language production (SLP) aims to automatically translate spoken language sentences into the corresponding sign language sequences. The existing SLP methods can be classified into two categories: autoregressive and non-autoregressive SLP. The autoregressive methods suffer from high latency and error accu...
This paper explores the evacuation behavior of crowds during terrorist attacks. We extend a floor field model for a simulation of dual-role crowds in a three-dimensional (3D) space. In this model, pedestrians can bypass obstacles and move to target positions when avoiding attackers. An attacker can bypass obstacles to pursue target pedestrians. In...
Bicycle motion simulation is fundamental to urban transportation planning, virtual reality and other areas. This article proposes a data‐driven based double‐layer bicycle simulation model to consider the cyclist's decision‐making process and the bicycle's kinematic structure. This proposed model consists of two layers, the decision‐making layer and...
Fish morphology is an essential basis for fishery management, as it can reflect the growth status of fishes. Noncontact 3D reconstruction of underwater fish is a new way to obtain fish morphology. However, it is difficult to reconstruct fish because of the inadequate information caused by fish swimming and poor underwater imaging. This article intr...
Green screen keying has always been an essential and fundamental part of film and television special effects. In the actual shooting process, captured green screen images vary significantly due to the comprehensive influence of lighting, shooting angle, green cloth material, characters, etc. In order to obtain visually pleasing effects, traditional...
There is a well‐known trade‐off between computational efficiency and computational accuracy in the field of traffic simulation. In this article, we propose a novel recurrent neural network based model with an integrated attention mechanism, called R‐CTM, to simulate heterogeneous traffic flow with multiple types of vehicles. It can effectively extr...
Crowd simulation is a challenging problem, aiming to generate realistic pedestrian motions in virtual environments. Nowadays, ORCA is a widely used simulation algorithm in practice because of its stable and efficient performance. However, this algorithm cannot regenerate the continuity and diversity of pedestrian motions in real data, leading to defect...
Small motion can be induced from burst video clips captured by a handheld camera when the shutter button is pressed. Although uncalibrated burst video clip conveys valuable parallax information, it generally has small baseline between frames, making it difficult to reconstruct 3D scenes. Existing methods usually employ a simplified camera parameter...
Sign language production aims to automatically generate coordinated sign language videos from spoken language. Treating it as a typical sequence-to-sequence task, existing methods mostly regard the skeletons as a whole sequence; however, they do not take the rich graph information among both joints and edges into consideration. In this paper, we pr...
Navigating safely and efficiently without knowing surrounding agents’ intent is a hard problem. It is challenging for two reasons. First, the agent needs to face high environmental uncertainty because it cannot control other agents in the environment. Second, the navigation algorithm needs to be resilient to various scenes. Recently reinfor...
The confidence prediction task, which attempts to infer the correctness of estimated depth hypotheses, has recently gained popularity in stereo matching and boosts the accuracy of disparity estimation. However, less attention has been paid to confidence prediction in multi-view stereo (MVS), where multi-view depth estimation is a key step for high-quality recon...
Synthesizing indoor scene layouts is challenging and critical, especially for digital design and gaming entertainment. Although there has been significant research on the indoor layout synthesis of rectangular-shaped or L-shaped architecture, there is little known about synthesizing plausible layouts for more complicated indoor architecture with bo...
Garment transfer from a source mannequin to a shape-varying individual is a vital technique in computer graphics. Existing garment transfer methods are either time consuming or lack designed details especially for clothing with complex styles. In this paper, we propose a data-driven approach to efficiently transfer garments between two distinctive...
This work presents a novel First-person View based Trajectory predicting model (FvTraj) to estimate the future trajectories of pedestrians in a scene given their observed trajectories and the corresponding first-person view images. First, we render first-person view images using our in-house built First-person View Simulator (FvSim), given the grou...
We propose a method for simulating cloth with meshes dynamically refined according to visual saliency. It is a common belief that it is preferable for the regions of an image being viewed to have more details than others. For a certain scene, a low-resolution cloth mesh is first simulated and rendered into images in the preview stage. Pixel salienc...
Trajectory prediction for objects is challenging and critical for various applications (e.g., autonomous driving, and anomaly detection). Most of the existing methods focus on homogeneous pedestrian trajectory prediction, where pedestrians are treated as particles without size. However, they fall short of handling crowded vehicle-pedestrian-mixe...
Human trajectory prediction is challenging and critical in various applications (e.g., autonomous vehicles and social robots). Because of the continuity and foresight of the pedestrian movements, the moving pedestrians in crowded spaces will consider both spatial and temporal interactions to avoid future collisions. However, most of the existing me...
Recently, mobile devices such as the iPhone X have started to be equipped with depth cameras, and more applications based on captured depth maps are emerging. Among many depth cameras on the market, Intel RealSense has the ability to capture depth information and is expected to be widely used in mobile devices and laptops. However, depth maps captured by Real...
Multivariate visualization for atmospheric pollution is a challenging research topic. Appropriate algorithms and data structures based on modern graphics hardware are used to obtain high performance. 3D visualization of the atmospheric wind field and pollutant concentrations can easily result in visual perception problems such as occlusion and clut...
Virtualized traffic via various simulation models and real‐world traffic data are promising approaches to reconstruct detailed traffic flows. A variety of applications can benefit from the virtual traffic, including, but not limited to, video games, virtual reality, traffic engineering and autonomous driving. In this survey, we provide a comprehens...
Fluid animation has great value in study and application in many fields, such as video special effects, virtual reality and so on. However, due to the complexity and irregularity of fluid's motion, the existing simulation methods cannot make a good tradeoff between computational efficiency and realism. In this paper, a method of fluid animation syn...
Most of existing traffic simulation methods have been focused on simulating vehicles on freeways or city-scale urban networks. However, relatively little research has been done to simulate intersectional traffic to date despite its broad potential applications. In this paper we propose a novel deep learning-based framework to simulate and edit inte...
We introduce a novel approach for flame volume reconstruction from videos using inexpensive charge-coupled device (CCD) consumer cameras. The approach includes an economical data capture technique using inexpensive CCD cameras. Leveraging the smear feature of the CCD chip, we present a technique for synchronizing CCD cameras while capturing flame v...
Current advances in crowd simulation still fall short of simulating many real-world collective behaviors. Arguably, one of the main reasons is that some essential qualities of human beings such as emotion have not been effectively modeled and incorporated into crowd simulation algorithms. In this paper, we propose a novel computational model for emo...
The rising complexity of large-scale machines means higher skill requirements for maintenance workers. Hence, training more workers with practiced maintenance skills is important to manufacturers and service providers. This article proposes a method to simulate a virtual human’s motion synthesis in virtual maintenance through researching on t...
In this paper, we present a control technique for editing fire motion with a geometric goal shape, which is designed without connection to physical parameters and physical equation solving. To fulfill this, controlling elements are extracted from the input curves conveying the target shape of fire animation. Then, a force field is obtained acco...
The key obstacle to the use of consumer cameras in computer vision and computer graphics applications is the lack of synchronization hardware. We present a stroboscope based synchronization approach for the charge-coupled device (CCD) consumer cameras. The synchronization is realized by first aligning the frames from different video sequences based...
Volumetric path tracing relies on importance sampling to stochastically construct light transport paths from an emitter to the sensor. Existing techniques incrementally sample path vertices or segments with respect to the local scattering property incorporating the geometry and scattering terms. Thus the joint probability density for drawing a path...
We propose a technique named progressive light volume to support advanced volumetric illumination effects, such as single scattering and multi scattering. The light volume stores direct lighting information for sample points of the volume data. Using the light volume, we are able to compute the direct lighting for any point in the volume data with...
Because of intensive inter-node communications, image compositing has always been a bottleneck in parallel visualization systems. In a heterogeneous networking environment, the variation of link bandwidth and latency adds more uncertainty to the system performance. In this paper, we present a pipelining image compositing algorithm in heterogeneous...
Numerous depth image-based rendering algorithms have been proposed to synthesize the virtual view for the free viewpoint television. However, inaccuracies in the depth map cause visual artifacts in the virtual view. In this paper, we propose a novel virtual view synthesis framework to create the virtual view of the scene. Here, we incorporate a tri...
In real-time rendering transparency is an important multi-fragment effect to visualize the structure of three-dimensional models. The per-pixel transmittance implicitly describes how the light is attenuated by traveling through several transparent fragments. We present a hybrid approach to fit the transmittance using the Heaviside step function and...
In this paper, we propose a new data-driven model to simulate the process of lane-changing in traffic simulation. Specifically, we first extract the features from surrounding vehicles that are relevant to the lane-changing of the subject vehicle. Then, we learn the lane-changing characteristics from the ground-truth vehicle trajectory data using ra...
Human crowd simulation is a new technology in the virtual reality field. Since it can simulate evacuation, it is in strong demand for risk assessment of public buildings. In this paper we discuss the development of the main related research topics, including semantic description for virtual environments and crowd models which generate continuum hu...
Depth estimation is a classical problem in computer vision, which typically relies on either a depth sensor or stereo matching alone. The depth sensor provides real-time estimates in repetitive and textureless regions where stereo matching is not effective. However, stereo matching can obtain more accurate results in rich texture regions and object...
Traffic simulation relies heavily on the lane model. This paper presents a novel method to model lanes based on the road axis under the Frenet frame. The road axis is generated from the geographic information system data after curve approximation, discretization, and compression. This lane model couples mileage information with three-dimensional geomet...
In recent years, the technology for crowd simulation has been applied in many fields. However, collision avoidance that considers multiple individuals and moving obstacles simultaneously is still a challenging task in this research area. In this paper, we present a novel technique for multi-agent navigation in dynamic scenarios. By coupling unified...
Disparity estimation for a scene with complex geometric characteristics such as slanted or highly curved surfaces is a basic and important issue in stereo matching. Traditional methods often use first-order smoothness priors that always lead to low-curvature frontal-parallel disparity maps. We propose a stereo framework that views the scene as a se...
Modeling lane changes realistically plays an important role in traffic animations. Existing models for traffic simulations mostly focus on lane-changing decision-making. They cannot describe how lane-changing processes go on. Though some methods in motion planning can be further used to describe the processes, they are time-consuming. In this paper,...
It is a big challenge to generate the traffic scenarios with frequent lane changes in flow-based continuum traffic simulations. In this paper, we present a novel macroscopic method, named interactable cooperative driving lattice hydrodynamic model (Interactable CDL-H model). We describe traffic flow along lanes and flow interactions between lanes i...
Large-scale group performance animation has been an important research topic because of its diverse range of applications including virtual rehearsal and film production. Animating hundreds of virtual actors as the director wishes is a tough task. In this paper, we address this challenge by introducing an optimization method that generates lar...
Motion planning is an important problem in character animation and interactive simulation. However, few planning methods have considered domain-specific knowledge that governs the agent's behaviors, and none of them is capable of planning the interactive task in which the agent interacts with the objects in the virtual environment. This paper prese...
Polluted water is very common in our world. Vividly rendering polluted water can bring people real, different, and fancy feelings. Especially in underwater imagery, taking polluted water into consideration will produce more plausible results. Polluted water consists of many kinds of pollutants, which interact with light differently and make water...
The human face is a complex biomechanical system and non‐linearity is a remarkable feature of facial expressions. However, in blendshape animation, facial expression space is linearized by assuming a linear relationship between blending weights and deformed face geometry. This results in the loss of realism in facial animation. To synthesize more realis...
In recent years, motion control has become one of the hot topics in virtual assembly, and it is an indispensable part of maintenance process simulation. However, motion control still remains at a low level based on key-frame or inverse kinematics, which in turn leads to overcomplicated modeling. The paper proposes a new method to control the virtual h...
Rendering large-scale virtual crowds at high speed is still a challenge for the computer graphics community. In this paper, we introduce a parallel rendering method for scenes with large-scale virtual crowds, which has been successfully applied to the rendering of a scene of 140 thousand square meters with several hundred thousand people on a PC...
Physically based rendering of scenes with volumetric illumination of flames remains a challenging problem due to the complexity of their heterogeneous radiative properties. Current bidirectional importance sampling strategies have been focusing on emissive light sources without anisotropic extinction. In this paper, we present an efficient importan...
We present a skeleton-based control method for fluid animation. Our method is designed to provide an easy and intuitive control approach while producing visually plausible fluid behavior. In our method, users are allowed to control animated fluid with skeleton keyframes. Expected results are then obtained by driving fluid towards a sequence of targ...
To exhibit panic phenomena in crowd simulation, special rules or parameter settings are needed for a given scene. In this paper, we present a panic model, named PPIB (Panic, Propagation and Influence on Behavior), which can evoke panic automatically in dangerous situations without manual intervention. PPIB describes panic behavior...
In this paper, we present a novel parallelizing method for crowd simulators constructed with a continuum model rather than an agent-based model. The basic idea is to partition a crowded virtual environment into some districts, each of which keeps its own dynamic continuum fields and has several transitional blocks to make individuals keep continuum...
In recent years, virtual assembly has become an effective technique for large equipment's maintainability analysis and verification. As an indispensable part of maintenance process simulation, motion control for virtual humans still remains at a low level based on key-frame or inverse kinematics, which in turn leads to an overcomplicated modeling pro...
This paper presents a novel approach for crowd simulation in complex environments. Our method is based on the continuum model proposed by Treuille et al. [13]. Compared to the original method, our solution is well-suited for complex environments. First, we present an environmental structure and a corresponding discretization scheme that helps us to...
The estimation of human body segment properties (BSPs), including mass, centroid and moments of inertia, is required in the kinetic analysis of human motion. Nowadays, with the development of motion capture technology, motion capture data plays an important role in the kinetic analysis of human motion. An interesting problem is whether BSPs can be...
High-level control of fire is very attractive to artists, as it facilitates a detail-free user interface to make desirable flame effects. In this paper, a unified framework is proposed for modeling and animating fire under general geometric constraints and evolving rules. To capture the fire projection on user’s model animation, we develop a modif...
In virtual reality applications, it's difficult to tune the parameters of the cloth model to present vivid features of different fabric materials. In this paper, a method of learning parameters from real data is proposed. First, real data of the fabric motion are captured by motion capture devices. Then, parameters of the motion model are optimized...
This paper presents an automatic image-based modelling method based on shape from silhouettes that does not need any user interactions of camera calibration or image segmentation. Under circular motion constraints, using an iterative optimisation of graph cuts and conjugate direction minimisation, we can label an object's visual hull and minimise s...
This paper presents an efficient method for reconstructing 3D building models from Electronic Architectural Drawings (EADs). EADs are matched via recognising axes and elevation. After recognising Candidate Contours (CCs) of Architectural Components (ACs) in one floor plan, the method incorporates the results into recognising the neighbour floors, w...
Style and variation are two vital components of human motion: style differentiates between examples of the same behavior (slow walk vs. fast walk) while variation differentiates between examples of the same style (vigorous vs. lackadaisical arm swing). This paper presents a novel method to simultaneously model style and variation of motion data cap...
For the convenient reuse of large-scale 3D motion capture data, browsing and searching methods for the data should be explored. In this paper, an efficient indexing and retrieval approach for human motion data is presented based on a novel similarity metric. We divide the human character model into three partitions to reduce the spatial complexity...
Simulating crowds in complex environments is fascinating and challenging. However, modeling of the environment, which is one of the essential problems in crowd simulation, especially for multilayered complex environments, has always been neglected in the past. This paper presents a semantic model for representing the complex environment, where the semantic i...
We present a novel approach to tracking 2D human motion in uncalibrated monocular videos. Human motion usually exhibits time-varying patterns, and we propose to use locally learnt prior models to capture these characteristics. For each input image, our method automatically learns a local probability density model and a local dynamical model f...
This paper presents a hybrid approach based on the continuum model proposed by Treuille et al. Compared to the original method, our solution is well suited for complex environments. We first present an environment structure and a corresponding discretization scheme that help us to organize and simulate crowds in large-scale scenarios. Second, addit...
In this paper, we present a multi-view stereo based shaped modeling method. Using images captured from different viewpoints, our approach can provide objects' 3D models with high fidelity details automatically and efficiently. We first use a strict plane based sweep stereo method via GPU to compute quasi-dense depth maps which usually have many h...
Graphics recognition aims to recover the geometric information and semantic information in documents. The recognized primitives, which are grouped to stand for particular meanings, could be called symbols. Research in symbol description plays an important role in graphics recognition. In this paper, we propose a complete descriptor for geometric r...
In this paper, we present an automatic and efficient image based modeling system which can create objects' 3D models directly from images captured from different viewpoints. The system first uses structure from motion to generate camera parameters and sparse 3D patches. Then, a conservative plane based sweep stereo method on GPU is used to comput...
Widely used in data-driven computer animation, motion capture data exhibits its complexity both spatially and temporally. The indexing and retrieval of motion data is a hard task that is not totally solved. In this paper, we present an efficient motion data indexing and retrieval method based on self-organizing map and Smith-Waterman string s...
A virtual human is a digital representation of the geometric and behavioral properties of human beings in a computer-generated virtual environment. The research goal of virtual human synthesis is to generate realistic human body models and natural human motion behavior. This paper introduces the development of related research on these t...
It is still an open problem to reuse the motion capture data in an intuitive way. In this paper, we present a novel technology to synthesize animations from the low-dimensional semantic signals. The semantic signals are defined as the meanings which are visible to the animators, such as the angles of joints rotation around axis, the trajectories of...
This paper presents a labor-saving method to construct optimal facial animation blendshapes from given blendshape sketches and facial motion capture data. At first, a mapping function is established between target ''Marker Face'' and performer's face by RBF interpolating selected feature points. Sketched blendshapes are transferred to performer's '...