102
Publications
35,001
Reads
11,759
Citations
Publications (102)
We introduce VOODOO XP: a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions driven by an input video from a single 2D portrait. Our approach is real-time, view-consistent, and can be instantly used without calibration or fine-tuning. We demonstrate our solution in a monocular video setting and an end-t...
We introduce VOODOO XP: a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions from any input driver video and a single 2D portrait. Our solution is real-time, view-consistent, and can be instantly used without calibration or fine-tuning. We demonstrate our solution on a monocular video setting and an end...
The amidoxime group has received widespread attention owing to its high affinity for uranium. Since this property was discovered, various amidoxime-based adsorbents have been developed for uranium adsorption. However, the effect of chain conformation on the uranium adsorption performance (in terms of adsorption capacity, stability, selectivity, etc...
High-fidelity face digitization solutions often combine multi-view stereo (MVS) techniques for 3D reconstruction and a non-rigid registration step to establish dense correspondence across identities and expressions. A common problem is the need for manual clean-up after the MVS step, as 3D scans are typically affected by noise and outliers and cont...
Disentangling data into interpretable and independent factors is critical for controllable generation tasks. With the availability of labeled data, supervision can help enforce the separation of specific factors as expected. However, it is often expensive or even impossible to label every single factor to achieve fully-supervised disentanglement. I...
We introduce a highly robust GAN-based framework for digitizing a normalized 3D avatar of a person from a single unconstrained photo. While the input image can be of a smiling person or taken in extreme lighting conditions, our method can reliably produce a high-quality textured model of a person's face in neutral expression and skin textures under...
A deep generative model that describes human motions can benefit a wide range of fundamental computer vision and graphics tasks, such as providing robustness to video-based human pose estimation, predicting complete body movements for motion capture systems during occlusions, and assisting key frame animation with plausible movements. In this paper...
The marine applicability of adsorbents intended for recovering uranium from seawater is crucial. For such applicability, the materials must exhibit anti-biofouling properties, seawater pH adaptability (pH~8), and salt tolerance. Extracting uranium from seawater is a long-term project; hence, biofouling, high salt concentrations, and weak alkaline e...
We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation which supports view-dependent effects. Our method can render 800x800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality while preserving the ability...
Features that are equivariant to a larger group of symmetries have been shown to be more discriminative and powerful in recent studies. However, higher-order equivariant features often come with an exponentially-growing computational cost. Furthermore, it remains relatively less explored how rotation-equivariant features can be leveraged to tackle...
The creation of high-fidelity computer-generated (CG) characters for films and games is tied to intensive manual labor, which involves the creation of comprehensive facial assets that are often captured using complex hardware. To simplify and accelerate this digitization process, we propose a framework for the automatic generation of high-quality...
We present the first approach to volumetric performance capture and novel-view rendering at real-time speed from monocular video, eliminating the need for expensive multi-view systems or cumbersome pre-acquisition of a personalized template model. Our system reconstructs a fully textured 3D human from each frame by leveraging Pixel-Aligned Implicit...
The creation of high-fidelity computer-generated (CG) characters used in film and gaming requires intensive manual labor and a comprehensive set of facial assets to be captured with complex hardware, resulting in high cost and long production cycles. In order to simplify and accelerate this digitization process, we propose a framework for the autom...
Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental step called rasterization, which prevents rendering from being differentiable....
In this paper, we provide some careful analysis of certain pathological behavior of Euler angles and unit quaternions encountered in previous works related to rotation representation in neural networks. In particular, we show that for certain problems, these two representations will provably produce completely wrong results for some inputs, and tha...
Learning latent representations of registered meshes is useful for many 3D tasks. Techniques have recently shifted to neural mesh autoencoders. Although they demonstrate higher precision than traditional methods, they remain unable to capture fine-grained deformations. Furthermore, these methods can only be applied to a template-specific surface me...
The ability to generate complex and realistic human body animations at scale, while following specific artistic constraints, has been a fundamental goal for the game and animation industry for decades. Popular techniques include key-framing, physics-based simulation, and database methods via motion graphs. Recently, motion generators based on deep...
We present a deep learning-based framework for portrait reenactment from a single picture of a target (one-shot) and a video of a driving subject. Existing facial reenactment methods suffer from identity mismatch and produce inconsistent identities when a target and a driving subject are different (cross-subject), especially in one-shot settings. I...
We present an interactive approach to synthesizing realistic variations in facial hair in images, ranging from subtle edits to existing hair to the addition of complex and challenging hair in images of clean-shaven subjects. To circumvent the tedious and computationally expensive tasks of modeling, rendering and compositing the 3D geometry of the t...
In this paper, we propose ARCH (Animatable Reconstruction of Clothed Humans), a novel end-to-end framework for accurate reconstruction of animation-ready 3D clothed humans from a monocular image. Existing approaches to digitize 3D humans struggle to handle pose variations and recover details. Also, they do not produce models that are animation read...
Based on a combined data set of 4000 high resolution facial scans, we introduce a non-linear morphable face model, capable of producing multifarious face geometry of pore-level resolution, coupled with material attributes for use in physically-based rendering. We aim to maximize the variety of identities, while increasing the robustness of correspo...
Uranium extraction from seawater has attracted attention owing to the growing demand for nuclear energy. However, it remains challenging because of the complexities of the marine environment, such as competing metal ions, high salinity, alkaline pH (8.3), and continuous biofouling. In this study, a new hyperbranched poly(amido amine)-modified adsor...
The recovery of uranium from seawater is of great concern owing to the growing demand for nuclear energy. Though amidoxime-functionalized adsorbents as the most promising adsorbents have been widely used for this purpose, their low selectivity and vulnerability to biofouling have limited their application in real marine environments. Herein, a new...
From angling smiles to duck faces, all kinds of facial expressions can be seen in selfies, portraits, and Internet pictures. These photos are taken from various camera types, and under a vast range of angles and lighting conditions. We present a deep learning framework that can fully normalize unconstrained face images, i.e., remove perspective dis...
Recent advances in 3D deep learning have shown that it is possible to train highly effective deep models for 3D shape generation, directly from 2D images. This is particularly interesting since the availability of 3D models is still limited compared to the massive amount of accessible 2D images, which is invaluable for training. The representation...
While hair is an essential component of virtual humans, it is also one of the most challenging digital assets to create. Existing automatic techniques lack the generality and flexibility to create rich hair variations, while manual authoring interfaces often require considerable artistic skill and effort, especially for intricate 3D hair structur...
While hair is an essential component of virtual humans, it is also one of the most challenging and time-consuming digital assets to create. Existing automatic techniques lack the generality and flexibility for users to create the exact intended hairstyles. Meanwhile, manual authoring interfaces often require considerable skill and experience from...
Existing methods for AI-generated artworks still struggle with generating high-quality stylized content, where high-level semantics are preserved, or separating fine-grained styles from various artists. We propose a novel Generative Adversarial Disentanglement Network which can fully decompose complex anime illustrations into style and content. Tra...
Near-range portrait photographs often contain perspective distortion artifacts that bias human perception and challenge both facial recognition and reconstruction techniques. We present the first deep learning based approach to remove such artifacts from unconstrained portraits. In contrast to the previous state-of-the-art approach, our method hand...
We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a si...
We propose a novel approach to performing fine-grained 3D manipulation of image content via a convolutional neural network, which we call the Transformable Bottleneck Network (TBN). It applies given spatial transformations directly to a volumetric bottleneck within our encoder-bottleneck-decoder architecture. Multi-view supervision encourages the n...
Rendering bridges the gap between 2D vision and 3D scenes by simulating the physical process of image formation. By inverting such a renderer, one can think of a learning approach to infer 3D information from 2D images. However, standard graphics renderers involve a fundamental discretization step called rasterization, which prevents the rendering pr...
Rendering is the process of generating 2D images from 3D assets, simulated in a virtual environment, typically with a graphics pipeline. By inverting such a renderer, one can think of a learning approach to predict a 3D shape from an input image. However, standard rendering pipelines involve a fundamental discretization step called rasterization, whi...
We introduce a new silhouette-based representation for modeling clothed human bodies using deep generative models. Our method can reconstruct a complete and textured 3D model of a person wearing clothes from a single input picture. Inspired by the visual hull algorithm, our implicit representation uses 2D silhouettes and 3D joints of a body pose to...
Three-dimensional object recognition has recently achieved great progress thanks to the development of effective point cloud-based learning frameworks, such as PointNet and its extensions. However, existing methods rely heavily on fully connected layers, which introduce a significant number of parameters, making the network harder to train and pron...
Recent advances in single-view 3D hair digitization have made the creation of high-quality CG characters scalable and accessible to end-users, enabling new forms of personalized VR and gaming experiences. To handle the complexity and variety of hair structures, most cutting-edge techniques rely on the successful retrieval of a particular hair model...
With the rising interest in personalized VR and gaming experiences comes the need to create high quality 3D avatars that are both low-cost and variegated. Due to this, building dynamic avatars from a single unconstrained input image is becoming a popular application. While previous techniques that attempt this require multiple input images or rely...
We present a deep learning based volumetric approach for performance capture using a passive and highly sparse multi-view capture system. State-of-the-art performance capture systems require either pre-scanned actors, a large number of cameras or active sensors. In this work, we focus on the task of template-free, per-frame 3D surface reconstruction...
We introduce a deep learning-based method to generate full 3D hair geometry from an unconstrained image. Our method can recover local strand details and has real-time performance. State-of-the-art hair modeling techniques rely on large hairstyle collections for nearest neighbor retrieval and then perform ad-hoc refinement. Our deep learning approac...
We present an adversarial network for rendering photorealistic hair as an alternative to conventional computer graphics pipelines. Our deep learning approach does not require low-level parameter tuning nor ad-hoc asset design. Our method simply takes a strand-based 3D hair model as input and provides intuitive user-control for color and lighting th...
A deep learning-based technology for generating photo-realistic 3D avatars with dynamic facial textures from a single input image is presented. Real-time performance-driven animations and renderings are demonstrated on an iPhone X and we show how these avatars can be integrated into compelling virtual worlds and used for 3D chats.
We present a deep learning-based technique to infer high-quality facial reflectance and geometry given a single unconstrained image of the subject, which may contain partial occlusions and arbitrary illumination conditions. The reconstructed high-resolution textures, which are generated in only a few seconds, include high-resolution skin surface re...
We present a learning-based approach for synthesizing facial geometry at medium and fine scales from diffusely-lit facial texture maps. When applied to an image sequence, the synthesized detail is temporally coherent. Unlike current state-of-the-art methods [17, 5], which assume "dark is deep", our model is trained with measured facial detail colle...
Deep learning-based style transfer between images has recently become a popular area of research. A common way of encoding "style" is through a feature representation based on the Gram matrix of features extracted by some pre-trained neural network or some other form of feature statistics. Such a definition is based on an arbitrary human decision a...
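The Gram-matrix encoding of "style" mentioned in the entry above can be illustrated with a minimal sketch. This is not the authors' code: the function names `gram_matrix` and `style_distance`, the toy feature values, and the normalization by the number of spatial positions are all assumptions made for illustration (conventions vary between implementations). Features are given as C channel rows, each flattened over N spatial positions.

```python
from typing import List

Matrix = List[List[float]]


def gram_matrix(features: Matrix) -> Matrix:
    """Channel-by-channel inner products of a C x N feature map.

    Dividing by N (number of spatial positions) is an assumed convention.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
        for fi in features
    ]


def style_distance(f1: Matrix, f2: Matrix) -> float:
    """Squared Frobenius distance between the two Gram matrices."""
    g1, g2 = gram_matrix(f1), gram_matrix(f2)
    return sum(
        (a - b) ** 2
        for row1, row2 in zip(g1, g2)
        for a, b in zip(row1, row2)
    )


# Toy feature maps (hypothetical values): two channels, four positions.
original = [[1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0]]
# Same activations with the spatial positions permuted.
shuffled = [[0.0, 1.0, 1.0, 0.0],
            [1.0, 0.0, 0.0, 1.0]]

print(gram_matrix(original))
print(style_distance(original, shuffled))
```

Because the Gram matrix sums over spatial positions, permuting those positions leaves it unchanged, so the two toy inputs have zero style distance. This is one way to see that the statistic captures channel co-occurrence rather than spatial layout.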
We present a fully automatic framework that digitizes a complete 3D head with hair from a single unconstrained image. Our system offers a practical and consumer-friendly end-to-end solution for avatar personalization in gaming and social VR applications. The reconstructed models include secondary components (eyes, teeth, tongue, and gums) and provi...
With this fully automatic framework for creating a complete 3D avatar from a single unconstrained image, users can upload any photograph to build a high-quality head model within seconds. The model can be immediately animated via performance capture using a webcam. It digitizes the entire model using a textured-mesh representation for the head and...
We present a real-time deep learning framework for video-based facial performance capture, the dense 3D tracking of an actor's face given a monocular video. Our pipeline begins with accurately capturing a subject using a high-end production facial capture pipeline based on multi-view stereo tracking and artist-enhanced animations. With 5-10 minut...
We present a data-driven inference method that can synthesize a photorealistic texture map of a complete 3D face model given a partial 2D view of a person in the wild. After an initial estimation of shape and low-frequency albedo, we compute a high-frequency partial texture map, without the shading component, of the visible face area. To extract th...
Registration algorithms are an essential component of many computer graphics and computer vision systems. With recent technological advances in RGBD sensors (color plus depth), an active area of research is in techniques combining color, geometry, and learnt priors for robust real-time registration. The goal of this course is to introduce the mathe...
The age of social media and immersive technologies has created a growing need for processing detailed visual representations of ourselves as virtual and augmented reality are growing into the next-generation platform for online communication, connecting hundreds of millions of users. A realistic simulation of our presence in a mixed reality environm...
Significant challenges currently prohibit expressive interaction in virtual reality (VR). Occlusions introduced by head-mounted displays (HMDs) make existing facial tracking techniques intractable, and even state-of-the-art techniques used for real-time facial tracking in unconstrained environments fail to capture subtle details of the user's facia...
We present an end-to-end system for reconstructing complete watertight and textured models of moving subjects such as clothed humans and animals, using only three or four handheld sensors. The heart of our framework is a new pairwise registration algorithm that minimizes, using a particle swarm strategy, an alignment error metric based on mutual vi...
Modeling the human body is of special interest in computer graphics to create "virtual humans", but material and optical properties of biological tissues are complex and not easily captured. This course will cover the major topics and challenges in using image acquisition to model the human body.
Creating and animating realistic 3D human faces is an important element of virtual reality, video games, and other areas that involve interactive 3D graphics. In this paper, we propose a system to generate photorealistic 3D blendshape-based face models automatically using only a single consumer RGB-D sensor. The capture and processing requires no a...
We propose a deep learning approach for finding dense correspondences between 3D scans of people. Our method requires only partial geometric information in the form of two depth maps or partial reconstructed surfaces, works for humans in arbitrary poses and wearing any clothing, does not require the two people to be scanned from similar viewpoints,...
There are currently no solutions for enabling direct face-to-face interaction between virtual reality (VR) users wearing head-mounted displays (HMDs). The main challenge is that the headset obstructs a significant portion of a user's face, preventing effective facial capture with traditional techniques. To advance virtual reality as a next-generati...
Human hair presents highly convoluted structures and spans an extraordinarily wide range of hairstyles, which is essential for the digitization of compelling virtual avatars but also one of the most challenging to create. Cutting-edge hair modeling techniques typically rely on expensive capture devices and significant manual labor. We introduce a n...
We introduce a data-driven hair capture framework based on example strands generated through hair simulation. Our method can robustly reconstruct faithful 3D hair models from unprocessed input point clouds with large amounts of outliers. Current state-of-the-art techniques use geometrically-inspired heuristics to derive global hair strand structure...
Figure 1 (panels: 3D scanning, pose change, output reconstruction, textured reconstruction, large variety of examples, 3D print): With our system, users can scan themselves with a single 3D sensor by rotating the same pose for a few different views (typically eight, ~45 degrees apart) to cover the full body. Our method robustly registers and merges different scan...
Existing hair capture systems fail to produce strands that reflect the structures of real-world hairstyles. We introduce a system that reconstructs coherent and plausible wisps aware of the underlying hair structures from a set of still images without any special lighting. Our system first discovers locally coherent wisp structures in the reconstru...
Reconstructing realistic 3D hair geometry is challenging due to omnipresent occlusions, complex discontinuities and specular appearance. To address these challenges, we propose a multi-view hair reconstruction algorithm based on orientation fields with structure-aware aggregation. Our key insight is that while hair's color appearance is view-depend...
We present a novel shape completion technique for creating temporally coherent watertight surfaces from real-time captured dynamic performances. Because of occlusions and low surface albedo, scanned mesh sequences typically exhibit large holes that persist over extended periods of time. Most conventional dynamic shape reconstruction techniques rely...
We introduce a novel framework for image-based 3D reconstruction of urban buildings based on symmetry priors. Starting from image-level edges, we generate a sparse and approximate set of consistent 3D lines. These lines are then used to simultaneously detect symmetric line arrangements while refining the estimated 3D model. Operating both on 2D ima...
In this demo we present our system for performance-based character animation that enables any user to control the facial expressions of a digital avatar in realtime. Compared to existing technologies, our system is easy to deploy and does not require any face markers, intrusive lighting, or complex scanning hardware. Instead, the user is recorded i...
This paper presents a system for performance-based character animation that enables any user to control the facial expressions of a digital avatar in realtime. The user is recorded in a natural environment using a non-intrusive, commercially available 3D sensor. The simplicity of this acquisition device comes at the cost of high noise levels in the...
The realistic reconstruction of hair motion is challenging because of hair's complex occlusion, lack of a well-defined surface, and non-Lambertian material. We present a system for passive capture of dynamic hair performances using a set of high-speed video cameras. Our key insight is that, while hair color is unlikely to match across multiple view...
We introduce a method for generating facial blendshape rigs from a set of example poses of a CG character. Our system transfers controller semantics and expression dynamics from a generic template to the target blendshape model, while solving for an optimal reproduction of the training poses. This enables a scalable design process, where the user c...
We present a framework and algorithms for robust geometry and motion reconstruction of complex deforming shapes. Our method makes use of a smooth template that provides a crude approximation of the scanned object and serves as a geometric and topological prior for reconstruction. Large-scale motion of the acquired object is recovered using a novel...