Article

DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling

Authors:
  • Chinese University of Hong Kong, Shenzhen

Abstract

Face modeling has attracted much attention in the field of visual computing. There exist many scenarios, including cartoon characters, avatars for social media, 3D face caricatures as well as face-related art and design, where low-cost interactive face modeling is a popular approach especially among amateur users. In this paper, we propose a deep learning based sketching system for 3D face and caricature modeling. This system has a labor-efficient sketching interface that allows the user to draw freehand imprecise yet expressive 2D lines representing the contours of facial features. A novel CNN based deep regression network is designed for inferring 3D face models from 2D sketches. Our network fuses both CNN and shape based features of the input sketch, and has two independent branches of fully connected layers generating independent subsets of coefficients for a bilinear face representation. Our system also supports gesture based interactions for users to further manipulate initial face models. Both user studies and numerical results indicate that our sketching system can help users create face models quickly and effectively. A significantly expanded face database with diverse identities, expressions and levels of exaggeration is constructed to promote further research and evaluation of face modeling techniques.
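The abstract describes a regression network with a CNN branch over the sketch image, a shape-feature branch, and two independent fully connected heads producing the coefficients of a bilinear face representation. The PyTorch sketch below only illustrates that two-branch idea; layer sizes, feature dimensions, and module names are assumptions, not the authors' implementation.

# Hypothetical sketch of the two-branch regression idea (all dimensions are assumptions).
import torch
import torch.nn as nn

class SketchToFaceNet(nn.Module):
    def __init__(self, n_id=50, n_exp=16, n_shape_feat=2 * 66):
        super().__init__()
        # CNN branch over the rasterized sketch image
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Shape branch over hand-crafted 2D contour-point features
        self.shape_mlp = nn.Sequential(nn.Linear(n_shape_feat, 128), nn.ReLU())
        fused = 128 + 128
        # Two independent fully connected heads: identity and expression coefficients
        self.id_head = nn.Sequential(nn.Linear(fused, 256), nn.ReLU(), nn.Linear(256, n_id))
        self.exp_head = nn.Sequential(nn.Linear(fused, 256), nn.ReLU(), nn.Linear(256, n_exp))

    def forward(self, sketch_img, shape_feat):
        f = torch.cat([self.cnn(sketch_img), self.shape_mlp(shape_feat)], dim=1)
        return self.id_head(f), self.exp_head(f)

def bilinear_face(core, w_id, w_exp):
    # core: (3V, n_id, n_exp) bilinear tensor; returns flattened vertex coordinates (3V,)
    return torch.einsum('vij,i,j->v', core, w_id, w_exp)

The two coefficient vectors returned by the network would then be contracted against a bilinear core tensor (as in bilinear_face above) to obtain the face mesh vertices.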


... Based on these large-scale datasets, several powerful parametric models [5,12,37,41] have been developed to facilitate the reconstruction and analysis of human shapes, actions, and interactions. With the help of parametric models, deep learning techniques have shown the potential to efficiently infer accurate 3D digital humans from single-view images [28,41] or even sparse sketches [9,25,52]. Most recently, some works [38,44] have been devoted to exploring the intelligent generation of cartoon-like character heads. ...
... Although these real-captured datasets are widely used in realistic human digitalization, they are unsuitable for imaginary character generation. For designing character data with computers, researchers try to perform deformation on real 3D human faces or bodies to construct exaggerated shapes programmatically [11,25,50,55]. Their results lack diversity and are far from satisfactory. ...
... Based on the above parametric models, researchers have made remarkable progress in human digitization, such as reconstruction from simple inputs (e.g., a single image or sparse sketches) [6,15,25,28,42] and real-time pose retargeting [14,31,54]. For instance, SMPLify [6] estimates 3D body shape and pose parameters automatically from 2D joints with multiple ellipsoids. ...
Preprint
Full-text available
Assisting people in efficiently producing visually plausible 3D characters has always been a fundamental research topic in computer vision and computer graphics. Recent learning-based approaches have achieved unprecedented accuracy and efficiency in the area of 3D real human digitization. However, none of the prior works focus on modeling 3D biped cartoon characters, which are also in great demand in gaming and filming. In this paper, we introduce 3DBiCar, the first large-scale dataset of 3D biped cartoon characters, and RaBit, the corresponding parametric model. Our dataset contains 1,500 topologically consistent high-quality 3D textured models which are manually crafted by professional artists. Built upon the data, RaBit is thus designed with a SMPL-like linear blend shape model and a StyleGAN-based neural UV-texture generator, simultaneously expressing the shape, pose, and texture. To demonstrate the practicality of 3DBiCar and RaBit, various applications are conducted, including single-view reconstruction, sketch-based modeling, and 3D cartoon animation. For the single-view reconstruction setting, we find a straightforward global mapping from input images to the output UV-based texture maps tends to lose detailed appearances of some local parts (e.g., nose, ears). Thus, a part-sensitive texture reasoner is adopted to make all important local areas perceived. Experiments further demonstrate the effectiveness of our method both qualitatively and quantitatively. 3DBiCar and RaBit are available at gaplab.cuhk.edu.cn/projects/RaBit.
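RaBit's shape component is described as a SMPL-like linear blend shape model: a template mesh plus a weighted sum of learned shape directions. A minimal NumPy sketch of such a model (pose and texture omitted; array names and sizes are placeholders):

# Minimal NumPy sketch of a SMPL-like linear blend-shape model (pose and texture omitted).
import numpy as np

def blend_shape_model(template, shape_basis, beta):
    """template: (V, 3) mean mesh; shape_basis: (K, V, 3) blend-shape directions;
    beta: (K,) shape coefficients. Returns the deformed (V, 3) mesh."""
    return template + np.tensordot(beta, shape_basis, axes=1)

# Usage with random placeholders standing in for learned data:
V, K = 1000, 100
mesh = blend_shape_model(np.zeros((V, 3)), np.random.randn(K, V, 3) * 0.01, np.random.randn(K))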
... There are 13 studies [6,32,44,53-62] that provide user interfaces. The user interface application serves as a way to show the effectiveness of the proposed deep learning approach, which can also better facilitate human-AI interaction for creative designs. ...
... [44,62] provide user interfaces in VR and AR settings, respectively, which can further improve the user experience of human-computer interaction in immersive design. Additionally, 12 studies [35,36,44,53,54,56-61,63] conducted user studies to further validate their methods and user applications. User studies can serve as a way to hear from human users so that researchers can improve the proposed methods from users' feedback. ...
... We also excluded some methods that apply deep learning techniques, but require predefined geometric models to guide 3D reconstruction, such as the methods presented in Refs. [58,110]. We focus on reviewing deep learning-based methods without using predefined geometric models that require the design of rules. ...
Article
Conceptual design is the foundational stage of a design process that translates ill-defined design problems into low-fidelity design concepts and prototypes. While deep learning approaches are widely applied in later design stages for design automation, we see fewer attempts in conceptual design for three reasons: 1) the data in this stage exhibit multiple modalities: natural language, sketches, and 3D shapes, and these modalities are challenging to represent in deep learning methods; 2) it requires knowledge from a larger source of inspiration instead of focusing on a single design task; and 3) it requires translating designers' intent and feedback, and hence needs more interaction with humans, either designers or users. With recent advances in deep learning of cross-modal tasks (DLCMT) and the availability of large cross-modal datasets, we see opportunities to apply these learning-based methods to the conceptual design of product shapes. In this paper, we conduct a systematic review on the methods for DLCMT that involve three design modalities: natural language, sketches, and 3D shapes, which revealed 50 articles in the fields of computer graphics, computer vision, and engineering design. This review work identifies the key challenges and opportunities in applying DLCMT in the conceptual design of engineered products. The authors also propose a list of five research questions that point to future directions and call on the community to devote itself to principled research investigations that help translate knowledge from computer science to engineering design.
... The distinguishing feature of this technology is the need for an initial mesh. It usually generates a target mesh based on geometric information from images [19], [20], [23], [29], [31], [35], [40], [42], [44], [47], [49], [50], [56], [57], [62], [64], [71], [72], [75], [86], [89], [90], [97], [99], [101], [110], [117], point clouds [30], [60], [67], [68], [92], or voxels [48], [111], etc. Spherical or ellipsoidal meshes are often chosen as templates, as shown in Fig. 4. Due to the presence of an initial mesh, this type of method reduces the difficulty of mesh generation to some extent. ...
... With the development of deep learning technology, a large number of practical methods have emerged in recent years. Among them, most early deformation-based IMG works deform the same basic model to obtain the target mesh, e.g., for clothing [19], faces [20], and bodies [35]. They can only generate deformed meshes of specific target objects. ...
... Sketches depict rich geometric features of 3D shapes, such as silhouettes, occluding and suggestive contours, ridges and valleys, and hatching lines, and thus provide a succinct and intuitive way for mesh generation. Among the IMG methods we collected, boundary-based mesh generation methods [7], [8], [13], [57], [106], [108], [109] are usually for 2D planar meshes, while sketch-based methods [20], [21], [25], [54], [71], [104], [111] are generally for 3D object surface meshes. ...
Preprint
Intelligent mesh generation (IMG) refers to a technique to generate mesh by machine learning, which is a relatively new and promising research field. Within its short life span, IMG has greatly expanded the generalizability and practicality of mesh generation techniques and brought many breakthroughs and potential possibilities for mesh generation. However, there is a lack of surveys focusing on IMG methods covering recent works. In this paper, we are committed to a systematic and comprehensive survey describing the contemporary IMG landscape. Focusing on 110 preliminary IMG methods, we conducted an in-depth analysis and evaluation from multiple perspectives, including the core technique and application scope of the algorithm, agent learning goals, data types, targeting challenges, advantages and limitations. With the aim of literature collection and classification based on content extraction, we propose three different taxonomies from three views of key technique, output mesh unit element, and applicable input data types. Finally, we highlight some promising future research directions and challenges in IMG. To maximize the convenience of readers, a project page of IMG is provided at \url{https://github.com/xzb030/IMG_Survey}.
... Deep learning techniques have more recently shown substantial progress in inferring 3D models from sketches: a deep regression network was used to infer the parameters of a bilinear morph over a face mesh [HGY17], trained on the FaceWarehouse dataset [CWZ+14]; a generative encoder-decoder framework was used to infer a fixed-size point cloud [FSG17] from a real-world photograph using a distance metric, trained on the ShapeNet dataset [CFG+15]; U-Net-like networks [RFB15] were proven effective at predicting a volumetric model from single and multiple views [DAI+18] using a diverse dataset. ...
... 2D-to-3D is an ill-posed problem in the general sense, but most methods aim to solve a subset of the problem inspired by human visual intelligence, e.g., point-cloud representations for cars [FSG17], signed distance functions for faces [HGY17], and so forth. Deep volumetric inference [DAI+18] attempted a more general-purpose inference, but suffers from low-resolution artefacts in geometry space, which it partially mitigates with normal estimation [DCLB19]. ...
... A number of tools have been developed to ease the workflow for artists. These include impressive work on non-photorealistic rendering (NPR) that obtains sketches given 3D models [HZ00; GI13], work that obtains 3D models given a sketch as input using interaction [HGY17; DAI+18], and several others that aim to ease the animation task [BCK+13; JFA+15; FJS+17]. However, these works still rely on the input sketch being fully generated by the artist. ...
Preprint
Full-text available
Sketching is used as a ubiquitous tool of expression by novices and experts alike. In this thesis I explore two methods that help a system provide a geometric machine-understanding of sketches, and in-turn help a user accomplish a downstream task. The first work deals with interpretation of a 2D-line drawing as a graph structure, and also illustrates its effectiveness through its physical reconstruction by a robot. We setup a two-step pipeline to solve the problem. Formerly, we estimate the vertices of the graph with sub-pixel level accuracy. We achieve this using a combination of deep convolutional neural networks learned under a supervised setting for pixel-level estimation followed by the connected component analysis for clustering. Later we follow it up with a feedback-loop-based edge estimation method. To complement the graph-interpretation, we further perform data-interchange to a robot legible ASCII format, and thus teach a robot to replicate a line drawing. In the second work, we test the 3D-geometric understanding of a sketch-based system without explicit access to the information about 3D-geometry. The objective is to complete a contour-like sketch of a 3D-object, with illumination and texture information. We propose a data-driven approach to learn a conditional distribution modelled as deep convolutional neural networks to be trained under an adversarial setting; and we validate it against a human-in-the-loop. The method itself is further supported by synthetic data generation using constructive solid geometry following a standard graphics pipeline. In order to validate the efficacy of our method, we design a user-interface plugged into a popular sketch-based workflow, and setup a simple task-based exercise, for an artist. Thereafter, we also discover that form-exploration is an additional utility of our application.
... However, they might produce over-smooth or unrealistic results due to the lack of domain knowledge of character heads. To address this issue, DeepSketch2Face [12] proposed an intuitive sketch-based modeling system for human faces with diversified shapes and exaggerated facial expressions, but their system is confined to the parametric human face space. Since parametrizing animalmorphic heads into a low-dimensional parametric space is inherently difficult, SAniHead [8] proposes a view-surface collaborative network to model animalmorphic heads via 3D template mesh deformation guided by 2D sketches. ...
... Although SAniHead can generate detailed animalmorphic heads from a few strokes, it does not support accurate control of the global shape. Recently, coarse-to-fine strategies have been widely adopted for 3D reconstruction and sketch-based modeling [12,29,30,36]. For instance, DeepSketch2Face [12] consists of an Initial Sketching Mode to create a coarse face and a Gesture-based Refinement Mode to refine the coarse shape. ...
... Recently, coarse-to-fine strategies have been widely adopted for 3D reconstruction and sketch-based modeling [12,29,30,36]. For instance, DeepSketch2Face [12] consists of an Initial Sketching Mode to create a coarse face and a Gesture-based Refinement Mode to refine the coarse shape. Since animalmorphic heads are highly diversified in global shape and contain dedicated surface details, existing systems either require excessive human effort or struggle to generate controllable outputs. ...
Preprint
Full-text available
Head shapes play an important role in 3D character design. In this work, we propose SimpModeling, a novel sketch-based system for helping users, especially amateur users, easily model 3D animalmorphic heads - a prevalent kind of heads in character design. Although sketching provides an easy way to depict desired shapes, it is challenging to infer dense geometric information from sparse line drawings. Recently, deepnet-based approaches have been taken to address this challenge and try to produce rich geometric details from very few strokes. However, while such methods reduce users' workload, they would cause less controllability of target shapes. This is mainly due to the uncertainty of the neural prediction. Our system tackles this issue and provides good controllability from three aspects: 1) we separate coarse shape design and geometric detail specification into two stages and respectively provide different sketching means; 2) in coarse shape designing, sketches are used for both shape inference and geometric constraints to determine global geometry, and in geometric detail crafting, sketches are used for carving surface details; 3) in both stages, we use the advanced implicit-based shape inference methods, which have strong ability to handle the domain gap between freehand sketches and synthetic ones used for training. Experimental results confirm the effectiveness of our method and the usability of our interactive system. We also contribute to a dataset of high-quality 3D animal heads, which are manually created by artists.
... Motivated by the success of deep learning in other 3D tasks [Arsalan Soltani et al. 2017; Han et al. 2017; Omran et al. 2018; Qi et al. 2017; Richardson et al. 2016; Socher et al. 2012] and its fast inference properties, the community has recently shown interest in neural networks as a suitable alternative for fast garment animation. ...
... One of their main advantages is a fast inference time. Then, given their success in challenging 3D problems [Arsalan Soltani et al. 2017; Han et al. 2017; Omran et al. 2018; Qi et al. 2017; Richardson et al. 2016; Socher et al. 2012], researchers have already turned to deep-based solutions for garment animation. Most of the current literature in the domain relies on supervised learning [Bertiche et al. 2021b; Gundogdu et al. 2019; Patel et al. 2020; Pfaff et al. 2020; Santesteban et al. 2019, 2021; Wang et al. 2019; Zhang et al. 2021]. ...
Preprint
Full-text available
We present a general framework for the garment animation problem through unsupervised deep learning inspired in physically based simulation. Existing trends in the literature already explore this possibility. Nonetheless, these approaches do not handle cloth dynamics. Here, we propose the first methodology able to learn realistic cloth dynamics unsupervisedly, and henceforth, a general formulation for neural cloth simulation. The key to achieve this is to adapt an existing optimization scheme for motion from simulation based methodologies to deep learning. Then, analyzing the nature of the problem, we devise an architecture able to automatically disentangle static and dynamic cloth subspaces by design. We will show how this improves model performance. Additionally, this opens the possibility of a novel motion augmentation technique that greatly improves generalization. Finally, we show it also allows to control the level of motion in the predictions. This is a useful, never seen before, tool for artists. We provide of detailed analysis of the problem to establish the bases of neural cloth simulation and guide future research into the specifics of this domain.
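For intuition, unsupervised cloth methods of this kind typically replace ground-truth supervision with a physics-style energy evaluated on the network's predictions. The sketch below shows an illustrative loss with only edge-stretching and gravity terms; the terms and constants are assumptions for illustration, not this paper's exact formulation.

# Illustrative physics-style training loss for unsupervised cloth learning (assumed terms).
import torch

def cloth_loss(pred_verts, rest_verts, edges, mass=0.1, k_stretch=10.0, g=9.81):
    """pred_verts: (V, 3) predicted cloth vertices; rest_verts: (V, 3) rest shape;
    edges: (E, 2) long tensor of vertex indices."""
    rest_len = (rest_verts[edges[:, 0]] - rest_verts[edges[:, 1]]).norm(dim=1)
    cur_len = (pred_verts[edges[:, 0]] - pred_verts[edges[:, 1]]).norm(dim=1)
    stretch = k_stretch * ((cur_len - rest_len) ** 2).sum()    # spring energy on mesh edges
    gravity = mass * g * pred_verts[:, 1].sum()                # potential energy (y is up)
    return stretch + gravity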
... Using sketches to depict shape and form is a natural and intuitive choice for humans. Consequently, numerous sketch-based interfaces and systems have been proposed for 3D modeling [Han et al. 2017;Li et al. 2017a;Lun et al. 2017] and other content creation tasks. Many of them carry the claim of being friendly even for novice users, i.e., users with limited drawing training/experience. ...
... In other words, such methods treat the inputs as hard constraints and generate shapes strictly corresponding to the inputs. Another group of methods adopt indirect reconstruction strategies: they generate shapes by deforming or refining intermediate shape proxies [Du et al. 2020;Guillard et al. 2021;Han et al. 2017;Zhang et al. 2021] to better approximate the geometric features contained in the input drawings. As mentioned earlier, the algorithm-generated line drawings strictly resemble the geometry features of source images or models. ...
Preprint
Full-text available
Multiple sketch datasets have been proposed to understand how people draw 3D objects. However, such datasets are often of small scale and cover a small set of objects or categories. In addition, these datasets contain freehand sketches mostly from expert users, making it difficult to compare the drawings by expert and novice users, while such comparisons are critical in informing more effective sketch-based interfaces for either user groups. These observations motivate us to analyze how differently people with and without adequate drawing skills sketch 3D objects. We invited 70 novice users and 38 expert users to sketch 136 3D objects, which were presented as 362 images rendered from multiple views. This leads to a new dataset of 3,620 freehand multi-view sketches, which are registered with their corresponding 3D objects under certain views. Our dataset is an order of magnitude larger than the existing datasets. We analyze the collected data at three levels, i.e., sketch-level, stroke-level, and pixel-level, under both spatial and temporal characteristics, and within and across groups of creators. We found that the drawings by professionals and novices show significant differences at stroke-level, both intrinsically and extrinsically. We demonstrate the usefulness of our dataset in two applications: (i) freehand-style sketch synthesis, and (ii) posing it as a potential benchmark for sketch-based 3D reconstruction. Our dataset and code are available at https://chufengxiao.github.io/DifferSketching/.
... One approach for building a deformable 3D caricature model is to expand a regular face dataset with computational exaggeration. DeepSketch2Face [Han et al. 2017] extends a regular face dataset by exaggerating faces synthetically using [Sela et al. 2015] to build a bilinear model of 3D caricatures. CaricatureShop [Han et al. 2018] uses synthetic 3D caricatures as training data. ...
... DeepSketch2Face [Han et al. 2017] extends a regular face dataset by exaggerating faces synthetically using [Sela et al. 2015] to build a bilinear model of 3D caricatures. CaricatureShop [Han et al. 2018] uses synthetic 3D caricatures as training data. DeepSketch2Face and CaricatureShop focus only on sketch-based 3D caricature editing. ...
Preprint
A 3D caricature is an exaggerated 3D depiction of a human face. The goal of this paper is to model the variations of 3D caricatures in a compact parameter space so that we can provide a useful data-driven toolkit for handling 3D caricature deformations. To achieve the goal, we propose an MLP-based framework for building a deformable surface model, which takes a latent code and produces a 3D surface. In the framework, a SIREN MLP models a function that takes a 3D position on a fixed template surface and returns a 3D displacement vector for the input position. We create variations of 3D surfaces by learning a hypernetwork that takes a latent code and produces the parameters of the MLP. Once learned, our deformable model provides a nice editing space for 3D caricatures, supporting label-based semantic editing and point-handle-based deformation, both of which produce highly exaggerated and natural 3D caricature shapes. We also demonstrate other applications of our deformable model, such as automatic 3D caricature creation.
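A toy version of the setup described above, in which a hypernetwork maps a latent code to the weights of a small SIREN MLP that displaces points on a fixed template surface, could look as follows; layer sizes and the omega_0 value are assumptions.

# Hypernetwork -> SIREN-style MLP -> per-point displacement (illustrative, not the authors' code).
import torch
import torch.nn as nn

class TinySiren(nn.Module):
    """3D position -> 3D displacement, with sine activations; weights are supplied externally."""
    def forward(self, x, params):
        w1, b1, w2, b2 = params
        h = torch.sin(30.0 * (x @ w1.T + b1))   # first SIREN layer (omega_0 = 30)
        return h @ w2.T + b2

class HyperNet(nn.Module):
    def __init__(self, latent_dim=64, hidden=128):
        super().__init__()
        self.hidden = hidden
        n_params = hidden * 3 + hidden + 3 * hidden + 3
        self.net = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, n_params))

    def forward(self, z):
        p, h = self.net(z), self.hidden
        w1 = p[: h * 3].view(h, 3)              # weights of the first SIREN layer
        b1 = p[h * 3 : h * 3 + h]
        w2 = p[h * 3 + h : h * 3 + h + 3 * h].view(3, h)
        b2 = p[-3:]
        return w1, b1, w2, b2

siren, hyper = TinySiren(), HyperNet()
z = torch.randn(64)                                      # latent code for one caricature
displacements = siren(torch.rand(5000, 3), hyper(z))     # (5000, 3) offsets on the template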
... However, the proposed approach suffers from a dependency on synthetic data. Han et al. [68] proposed a sketching system for 3D face and caricature modelling using CNN-based deep learning. Generally, the rich facial expressions were generated through MAYA and ZBrush. ...
Article
Full-text available
3D face reconstruction is the most captivating topic in biometrics with the advent of deep learning and readily available graphical processing units. This paper explores the various aspects of 3D face reconstruction techniques. Five techniques have been discussed, namely, deep learning, epipolar geometry, one-shot learning, 3D morphable model, and shape from shading methods. This paper provides an in-depth analysis of 3D face reconstruction using deep learning techniques. The performance analysis of different face reconstruction techniques has been discussed in terms of software, hardware, pros and cons. The challenges and future scope of 3d face reconstruction techniques have also been discussed.
... Moran et al. [28]: UI element assembling after detection; Dou et al. [84]: quantification of website aesthetics; Halter et al. [85]: annotation tool for films; Bao et al. [86]: programming code extraction; Han et al. [87]: 3D sketching system; Bell et al. [88]: product visual similarity; Nishida et al. [89]: sketching of urban models; Shao et al. [90]: semantic modeling of indoor scenes; Schlattner et al. [66]: prediction of an element's property value; Bylinskii et al. [91]: prediction of users' focus areas; Liu et al. [61]: semantic annotations for the Rico dataset; Yeo et al. [92]: pose recognition using a wearable device; Kong et al. [93]: smart glass UI for the selection of home appliances; Mairittha et al. [94]: mobile UI personalization detection and prediction; Stiehl et al. [95]: UI for sign writing (hand gesture) detection; Tensmeyer et al. [96]: font recognition and classification ...
Preprint
In this paper, we present a review of the recent work in deep learning methods for user interface design. The survey encompasses well known deep learning techniques (deep neural networks, convolutional neural networks, recurrent neural networks, autoencoders, and generative adversarial networks) and datasets widely used to design user interface applications. We highlight important problems and emerging research frontiers in this field. We believe that the use of deep learning for user interface design automation tasks could be one of the high potential fields for the advancement of the software development industry.
... Stylized avatar systems also exist. Some methods utilize sketches as the prior condition for generation [12,13]. Other methods are guided by position and landmarks, extracting human facial features used to deform textures and meshes [40,5,20,38]. ...
Preprint
Full-text available
Avatar creation from human images allows users to customize their digital figures in different styles. Existing rendering systems like Bitmoji, MetaHuman, and Google Cartoonset are expressive and serve as excellent design tools for users. However, twenty-plus parameters, some including hundreds of options, must be tuned to achieve ideal results, so it is challenging for users to create the perfect avatar. A machine learning model could be trained to predict avatars from images; however, the annotators who label pairwise training data have the same difficulty as users, causing high label noise. In addition, each new rendering system or version update requires thousands of new training pairs. In this paper, we propose a tag-based annotation method for avatar creation. Compared to direct annotation of labels, the proposed method produces higher annotator agreement, enables machine learning to generate more consistent predictions, and only requires a marginal cost to add new rendering systems.
... Some works generate caricature meshes by exaggerating or deforming real face meshes, with [Wu et al. 2018] or without [Lewiner et al. 2011; Vieira et al. 2013] caricature image input. Sketches can be used to guide the creation [Han et al. 2017, 2018]. Recent works [Ye et al. 2021] use GANs to generate 3D caricatures given real images. ...
Preprint
Full-text available
Stylized 3D avatars have become increasingly prominent in our modern life. Creating these avatars manually usually involves laborious selection and adjustment of continuous and discrete parameters and is time-consuming for average users. Self-supervised approaches to automatically create 3D avatars from user selfies promise high quality with little annotation cost but fall short in application to stylized avatars due to a large style domain gap. We propose a novel self-supervised learning framework to create high-quality stylized 3D avatars with a mix of continuous and discrete parameters. Our cascaded domain bridging framework first leverages a modified portrait stylization approach to translate input selfies into stylized avatar renderings as the targets for desired 3D avatars. Next, we find the best parameters of the avatars to match the stylized avatar renderings through a differentiable imitator we train to mimic the avatar graphics engine. To ensure we can effectively optimize the discrete parameters, we adopt a cascaded relaxation-and-search pipeline. We use a human preference study to evaluate how well our method preserves user identity compared to previous work as well as manual creation. Our results achieve much higher preference scores than previous work and close to those of manual creation. We also provide an ablation study to justify the design choices in our pipeline.
... Sketch-based interfaces for modeling (SBIM) is the major branch of geometric-based methods [44] and we do not review this line of work in light of the review scope. We also excluded some methods that apply deep learning techniques but require predefined geometric models to guide 3D reconstruction, such as the methods presented in [45,46]. We focus on reviewing the end-to-end deep learning-based methods. ...
Conference Paper
Full-text available
Conceptual design is the foundational stage of a design process, translating ill-defined design problems to low-fidelity design concepts and prototypes. While deep learning approaches are widely applied in later design stages for design automation, we see fewer attempts in conceptual design for three reasons: 1) the data in this stage exhibit multiple modalities: natural language, sketches, and 3D shapes, and these modalities are challenging to represent in deep learning methods; 2) it requires knowledge from a larger source of inspiration instead of focusing on a single design task; and 3) it requires translating designers’ intent and feedback, and hence needs more interaction with designers and/or users. With recent advances in deep learning of cross-modal tasks (DLCMT) and the availability of large cross-modal datasets, we see opportunities to apply these learning methods to the conceptual design of product shapes. In this paper, we review 30 recent journal articles and conference papers across computer graphics, computer vision, and engineering design fields that involve DLCMT of three modalities: natural language, sketches, and 3D shapes. Based on the review, we identify the challenges and opportunities of utilizing DLCMT in 3D shape concepts generation, from which we propose a list of research questions pointing to future research directions.
... Some research has suggested designing and improving the learning ability of an active animation learning system and designing a random forest model for automatic animation production, so as to draw lessons from the accumulated animation data and guide animation production [7]. The content structure characteristics of online animation learning resources are comprehensively analyzed, a feature description model of the content structure is developed, the description of the animation content structure is improved, and the infrastructure of the animation content structure is established [8]. ...
Article
Full-text available
With the development of computer technology, animation is increasingly used because it is simple, effective, and performant. Machine learning has become the core of artificial intelligence, and intelligent learning algorithms are widely used in practical problems such as evaluation. Knowledge-based automatic animation production systems face two challenges: (1) a lack of learning ability and the waste of data on the website; (2) the quality of the produced animation depends on the level of the system designer, and system users cannot participate in animation production. To solve these two problems, an active animation learning system is designed and implemented, for the first time, so that the animation system can continuously learn from experience and produce the most popular animation. Image retrieval technology is a research focus in the field of image applications and is widely used in many fields, such as electronic commerce. Animation design will use dynamic images and machine learning to innovate.
... A deep learning based sketching system has also been used for 3D face and caricature modeling [HGY17]. 2D lines representing the contours of facial features serve as the input to a CNN. ...
Article
Creative processes of artists often start with hand‐drawn sketches illustrating an object. Pre‐visualizing these keyframes is especially challenging when applied to volumetric materials such as smoke. The authored 3D density volumes must capture realistic flow details and turbulent structures, which is highly non‐trivial and remains a manual and time‐consuming process. We therefore present a method to compute a 3D smoke density field directly from 2D artist sketches, bridging the gap between early‐stage prototyping of smoke keyframes and pre‐visualization. From the sketch inputs, we compute an initial volume estimate and optimize the density iteratively with an updater CNN. Our differentiable sketcher is embedded into the end‐to‐end training, which results in robust reconstructions. Our training data set and sketch augmentation strategy are designed such that it enables general applicability. We evaluate the method on synthetic inputs and sketches from artists depicting both realistic smoke volumes and highly non‐physical smoke shapes. The high computational performance and robustness of our method at test time allows interactive authoring sessions of volumetric density fields for rapid prototyping of ideas by novice users.
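The iterative density optimization described above can be pictured as a simple refinement loop: an updater network corrects the volume so that its re-sketched projection matches the input drawing. The schematic below is an assumption-laden illustration; updater_cnn and differentiable_sketcher are placeholders, not the authors' components.

# Schematic refinement loop in the spirit of the described method (placeholder callables).
def refine_density(init_density, target_sketch, updater_cnn, differentiable_sketcher, steps=5):
    density = init_density                                    # (D, H, W) volume estimate
    for _ in range(steps):
        rendered = differentiable_sketcher(density)           # 2D sketch of the current volume
        residual = target_sketch - rendered                   # where the sketch disagrees
        density = density + updater_cnn(density, residual)    # network proposes a correction
    return density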
... Huang et al. [24] and Nishida et al. [25] utilized deep convolutional neural networks (CNN) to predict procedural modeling parameters from the input sketch. Han et al. [26] proposed a sketching system for 3D face and caricature modeling, using a CNN-based deep regression network to infer the coefficients for a bilinear face representation. Lun et al. [7] proposed an encoder-decoder network to infer multi-view depth and normal maps from the input sketch, and consolidated them into a point cloud via optimization. ...
Article
Full-text available
The freeform architectural modeling process often involves two important stages: concept design and digital modeling. In the first stage, architects usually sketch the overall 3D shape and the panel layout on a physical or digital paper briefly. In the second stage, a digital 3D model is created using the sketch as a reference. The digital model needs to incorporate geometric requirements for its components, such as the planarity of panels due to consideration of construction costs, which can make the modeling process more challenging. In this work, we present a novel sketch-based system to bridge the concept design and digital modeling of freeform roof-like shapes represented as planar quadrilateral (PQ) meshes. Our system allows the user to sketch the surface boundary and contour lines under axonometric projection and supports the sketching of occluded regions. In addition, the user can sketch feature lines to provide directional guidance to the PQ mesh layout. Given the 2D sketch input, we propose a deep neural network to infer in real-time the underlying surface shape along with a dense conjugate direction field, both of which are used to extract the final PQ mesh. To train and validate our network, we generate a large synthetic dataset that mimics architect sketching of freeform quadrilateral patches. The effectiveness and usability of our system are demonstrated with quantitative and qualitative evaluation as well as user studies.
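The planarity requirement on panels mentioned above has a simple geometric test: a quad is planar exactly when its four corners are coplanar, i.e., when the scalar triple product of the edge vectors from one corner vanishes (the product is proportional to the volume of the tetrahedron spanned by the corners). A small NumPy check:

# Planarity check for a quad panel via the scalar triple product.
import numpy as np

def quad_planarity(p0, p1, p2, p3):
    """Returns 0 for a perfectly planar quad; larger values mean stronger non-planarity."""
    e1, e2, e3 = p1 - p0, p2 - p0, p3 - p0
    return abs(np.dot(np.cross(e1, e2), e3))

print(quad_planarity(np.array([0., 0, 0]), np.array([1., 0, 0]),
                     np.array([1., 1, 0]), np.array([0., 1, 0])))   # -> 0.0 (planar)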
... Literature [24] points out that computer animation creation software is a complex system that requires integrating the knowledge of computer experts, artists, and other producers with completely different knowledge structures. Literature [25] applied evolutionary computing to the innovative design of animation imagery, proposed a new method for computer-aided generation of three-dimensional animation character images, and established an intelligent CAD system for animation character image design based on NURBS. In this paper, an art design and modeling method for animation imagery based on 3D modeling technology is proposed. ...
Article
Full-text available
This paper starts from the external visual appearance of animation characters, discusses the design style of three-dimensional animation characters, integrates it with traditional art, and makes a new attempt at combining art and technology so that nonprofessionals can easily design three-dimensional animation characters. To address the low recognition level of behavior-control data points in traditional 3D virtual animation model methods, a method for character modeling and behavior control in 3D virtual animation design is designed. Based on a physics engine, a dynamic model of the character skeleton is established, and joint motion trajectories are simulated to complete real-time rendering of the effect. The approach is discussed through case analysis from the aspects of animation character modeling, user experience, and so on. Experimental results show that, compared with traditional methods, the data points collected by the proposed method in the process of character behavior control are denser, the animation effect is more realistic, and the method is highly effective and superior.
... Huang et al. [22] and Nishida et al. [23] utilized deep convolutional neural networks (CNN) to predict the procedural model parameters from the input sketch for procedural modeling. Han et al. [24] proposed a sketching system for 3D face and caricature modeling, using a CNN-based deep regression network to infer the coefficients for a bilinear face representation. Lun et al. [7] proposed an encoder-decoder network to infer multi-view depth and normal maps from the input sketch, which are then consolidated into a 3D point cloud via optimization. ...
Preprint
The freeform architectural modeling process often involves two important stages: concept design and digital modeling. In the first stage, architects usually sketch the overall 3D shape and the panel layout briefly on physical or digital paper. In the second stage, a digital 3D model is created using the sketch as the reference. The digital model needs to incorporate geometric requirements for its components, such as the planarity of panels due to consideration of construction costs, which can make the modeling process more challenging. In this work, we present a novel sketch-based system to bridge the concept design and digital modeling of freeform roof-like shapes represented as planar quadrilateral (PQ) meshes. Our system allows the user to sketch the surface boundary and contour lines under axonometric projection and supports the sketching of occluded regions. In addition, the user can sketch feature lines to provide directional guidance to the PQ mesh layout. Given the 2D sketch input, we propose a deep neural network to infer in real-time the underlying surface shape along with a dense conjugate direction field, both of which are used to extract the final PQ mesh. To train and validate our network, we generate a large synthetic dataset that mimics architect sketching of freeform quadrilateral patches. The effectiveness and usability of our system are demonstrated with quantitative and qualitative evaluation as well as user studies.
... This enables non-artists to transform simple black-and-white sketches into more abstract, intricate paintings. Also, with the widespread use of touch screens, new scenarios for sketch-based applications are emerging, such as sketch-based photo editing [4], sketch-based image retrieval for 2D [5] and 3D shapes [6], and 3D modelling from sketches. ...
Article
Full-text available
This paper aims to demonstrate the efficiency of the Adversarial Open Domain Adaption framework for sketch-to-photo synthesis. Unsupervised open domain adaption for generating realistic photos from a hand-drawn sketch is challenging, as there is no sketch of that class in the training data. The absence of learning supervision and the huge domain gap between the freehand drawing and picture domains make it hard. We present an approach that learns both sketch-to-photo and photo-to-sketch generation to synthesise the missing freehand drawings from pictures. Due to the domain gap between synthetic sketches and genuine ones, a generator trained on false drawings may produce unsatisfactory results when dealing with drawings of the missing classes. To address this problem, we offer a simple but effective open-domain sampling and optimization method that "tricks" the generator into considering false drawings as genuine. Our approach generalises the learnt sketch-to-photo and photo-to-sketch mappings from in-domain input to open-domain categories. On the Scribble and SketchyCOCO datasets, we compared our technique to the most recent competing methods. For many types of open-domain drawings, our model achieves impressive results in synthesising accurate colour and substance and in retaining the structural layout.
... To produce high-quality faces from rough or incomplete sketches, DeepFaceDrawing takes a local-to-global approach and leverages manifold projection to enhance the generation quality and robustness for freehand sketches. Besides 2D image generation, Han et al. [2017] develop a sketching system for 3D face and caricature modeling. The above works use sketches to depict target geometry, and have little or no control of appearance during editing. ...
Chapter
Diffusion probabilistic models have been proven effective in generative tasks. However, their variants have not yet delivered this effectiveness in practice on cross-dimensional multimodal generation tasks. Generating 3D models from single free-hand sketches is a typically tricky cross-domain problem that has grown even more important and urgent with the widespread emergence of VR/AR technologies and the use of portable touch screens. In this paper, we introduce a novel Sketch-to-Point Diffusion-ReFinement model to tackle this problem. By injecting a new conditional reconstruction network and a refinement network, we overcome the barrier of multimodal generation between the two dimensions. By explicitly conditioning the generation process on a given sketch image, our method can generate plausible point clouds restoring the sharp details and topology of 3D shapes, also matching the input sketches. Extensive experiments on various datasets show that our model achieves highly competitive performance in the sketch-to-point generation task. The code is available at https://github.com/Walterkd/diffusion-refine-sketch2point.
Article
Catching a criminal on the basis of eyewitness description sketches, whether generated by software or drawn by hand, becomes useful when there is a deficiency of other evidence; the sketches are matched against face photos to find results. Facial recognition is a technology capable of verifying a person; implemented with deep learning, a part of data science, it is gaining importance in law enforcement agencies. Deep learning is generally used for recognition applications such as voice recognition, face recognition, and audio and video recognition. In this system, we work with a larger dataset because deep learning is not useful for small amounts of data. We first learn feature embeddings of key face components, and push corresponding parts of input sketches towards underlying component manifolds defined by the feature vectors of face component samples. We also propose another deep neural network to learn the mapping from the embedded component features to realistic images with multi-channel feature maps as intermediate results to improve the information flow. Our method essentially uses input sketches as soft constraints and is thus able to produce high-quality face images even from rough and/or incomplete sketches. Our tool is easy to use even for non-artists, while still supporting fine-grained control of shape details. Both qualitative and quantitative evaluations show the superior generation ability of our system to existing and alternative solutions. The usability and expressiveness of our system are confirmed by a user study.
Article
Generating 3D models from 2D images or sketches is a widely studied important problem in computer graphics. We describe the first method to generate a 3D human model from a single sketched stick figure. In contrast to the existing human modeling techniques, our method does not require a statistical body shape model. We exploit Variational Autoencoders to develop a novel framework capable of transitioning from a simple 2D stick figure sketch, to a corresponding 3D human model. Our network learns the mapping between the input sketch and the output 3D model. Furthermore, our model learns the embedding space around these models. We demonstrate that our network can generate not only 3D models, but also 3D animations through interpolation and extrapolation in the learned embedding space. In addition to 3D human models, we produce 3D horse models in order to show the generalization ability of our framework. Extensive experiments show that our model learns to generate compatible 3D models and animations with 2D sketches.
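The animation-by-interpolation idea in this abstract reduces to walking linearly between two latent codes and decoding each intermediate code into a mesh. The sketch below assumes hypothetical encode_sketch and decode_mesh functions standing in for the trained VAE components.

# Generating in-between 3D models by interpolating in a learned embedding space.
import numpy as np

def interpolate_models(sketch_a, sketch_b, encode_sketch, decode_mesh, n_frames=10):
    za, zb = encode_sketch(sketch_a), encode_sketch(sketch_b)   # latent codes of two sketches
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        z = (1.0 - t) * za + t * zb                             # linear walk through latent space
        frames.append(decode_mesh(z))                           # 3D model for the in-between code
    return frames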
Article
In this paper, we present a predictive and generative design approach to supporting the conceptual design of product shapes in 3D meshes. We develop a target-embedding variational autoencoder (TEVAE) neural network architecture, which consists of two modules: 1) a training module with two encoders and one decoder (E^2D network); and 2) an application module performing the generative design of new 3D shapes and the prediction of a 3D shape from its silhouette. We demonstrate the utility and effectiveness of the proposed approach in the design of 3D car bodies and mugs. The results show that our approach can generate a large number of novel 3D shapes and successfully predict a 3D shape based on a single silhouette sketch. The resulting 3D shapes are watertight polygon meshes with high-quality surface details, which have better visualization than voxels and point clouds, and are ready for downstream engineering evaluation (e.g., drag coefficient) and prototyping (e.g., 3D printing).
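As a rough illustration of the E^2D idea (two encoders, one decoder), the sketch below trains a shape encoder and a silhouette encoder to share a latent space that a single decoder maps back to a 3D shape. Dimensions are arbitrary and the variational terms of the actual TEVAE are omitted.

# Schematic two-encoder / one-decoder target-embedding setup (sizes are assumptions).
import torch
import torch.nn as nn

latent = 128
shape_encoder = nn.Sequential(nn.Linear(3000, 512), nn.ReLU(), nn.Linear(512, latent))
silhouette_encoder = nn.Sequential(nn.Linear(64 * 64, 512), nn.ReLU(), nn.Linear(512, latent))
decoder = nn.Sequential(nn.Linear(latent, 512), nn.ReLU(), nn.Linear(512, 3000))

def training_loss(shape_flat, silhouette_flat):
    z_shape = shape_encoder(shape_flat)
    z_sil = silhouette_encoder(silhouette_flat)
    recon = decoder(z_shape)
    return (nn.functional.mse_loss(recon, shape_flat)           # reconstruct the 3D shape
            + nn.functional.mse_loss(z_sil, z_shape.detach()))  # pull silhouette code to shape code

At test time only the silhouette encoder and the decoder are needed to predict a 3D shape from a single silhouette.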
Article
2.5D cartoon models are popular methods used for simulating three-dimensional (3D) movements, such as out-of-plane rotation, from two-dimensional (2D) shapes in different views without 3D models. However, cartoon objects and characters have several exaggerations that do not correspond to any real 3D positions (e.g., Mickey Mouse’s ears), which implies that existing methods are unsuitable for designing such exaggerations. Hence, we incorporated view-dependent deformation (VDD) techniques, which have been proposed in the field of 3D character animation, into 2.5D cartoon models. The exaggerations in an arbitrary viewpoint are automatically obtained by blending the user-specified 2D shapes of key views. Several examples demonstrated the robustness of our method over previous methods. In addition, we conducted a user study and confirmed that the proposed method is effective for animating classic cartoon characters.
Article
Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation a much easier task than ever and consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications. This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable. The main contents of this survey include: (i) A discussion of the intrinsic traits and unique challenges of free-hand sketch, to highlight the essential differences between sketch data and other data modalities, e.g., natural photos. (ii) A review of the developments of free-hand sketch research in the deep learning era, by surveying existing datasets, research topics, and the state-of-the-art methods through a detailed taxonomy and experimental evaluation. (iii) Promotion of future work via a discussion of bottlenecks, open problems, and potential research directions for the community.
Article
We present a methodology to automatically obtain Pose Space Deformation (PSD) basis for rigged garments through deep learning. Classical approaches rely on Physically Based Simulations (PBS) to animate clothes. These are general solutions that, given a sufficiently fine-grained discretization of space and time, can achieve highly realistic results. However, they are computationally expensive and any scene modification prompts the need of re-simulation. Linear Blend Skinning (LBS) with PSD offers a lightweight alternative to PBS, though, it needs huge volumes of data to learn proper PSD. We propose using deep learning, formulated as an implicit PBS, to un-supervisedly learn realistic cloth Pose Space Deformations in a constrained scenario: dressed humans. Furthermore, we show it is possible to train these models in an amount of time comparable to a PBS of a few sequences. To the best of our knowledge, we are the first to propose a neural simulator for cloth. While deep-based approaches in the domain are becoming a trend, these are data-hungry models. Moreover, authors often propose complex formulations to better learn wrinkles from PBS data. Supervised learning leads to physically inconsistent predictions that require collision solving to be used. Also, dependency on PBS data limits the scalability of these solutions, while their formulation hinders its applicability and compatibility. By proposing an unsupervised methodology to learn PSD for LBS models (3D animation standard), we overcome both of these drawbacks. Results obtained show cloth-consistency in the animated garments and meaningful pose-dependant folds and wrinkles. Our solution is extremely efficient, handles multiple layers of cloth, allows unsupervised outfit resizing and can be easily applied to any custom 3D avatar.
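The underlying mechanism, linear blend skinning applied to a rest-pose mesh plus a pose-space corrective, can be written compactly. In the sketch below the corrective is just an input array; in the described method it would be produced by the learned model. Array names and shapes are assumptions.

# Linear blend skinning with a pose-space-deformation corrective added in the rest pose.
import numpy as np

def lbs_with_psd(rest_verts, psd_corrective, skin_weights, bone_transforms):
    """rest_verts: (V, 3); psd_corrective: (V, 3); skin_weights: (V, B);
    bone_transforms: (B, 4, 4) homogeneous bone matrices. Returns posed (V, 3) vertices."""
    v = rest_verts + psd_corrective                              # corrected rest-pose cloth
    v_h = np.concatenate([v, np.ones((v.shape[0], 1))], 1)       # homogeneous coordinates (V, 4)
    per_bone = np.einsum('bij,vj->bvi', bone_transforms, v_h)    # each bone's transform of v
    posed = np.einsum('vb,bvi->vi', skin_weights, per_bone)      # blend by skinning weights
    return posed[:, :3]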
Article
We introduce TM-NET, a novel deep generative model for synthesizing textured meshes in a part-aware manner. Once trained, the network can generate novel textured meshes from scratch or predict textures for a given 3D mesh, without image guidance. Plausible and diverse textures can be generated for the same mesh part, while texture compatibility between parts in the same shape is achieved via conditional generation. Specifically, our method produces texture maps for individual shape parts, each as a deformable box, leading to a natural UV map with limited distortion. The network separately embeds part geometry (via a PartVAE) and part texture (via a TextureVAE) into their respective latent spaces, so as to facilitate learning texture probability distributions conditioned on geometry. We introduce a conditional autoregressive model for texture generation, which can be conditioned on both part geometry and textures already generated for other parts to achieve texture compatibility. To produce high-frequency texture details, our TextureVAE operates in a high-dimensional latent space via dictionary-based vector quantization. We also exploit transparencies in the texture as an effective means to model complex shape structures including topological details. Extensive experiments demonstrate the plausibility, quality, and diversity of the textures and geometries generated by our network, while avoiding inconsistency issues that are common to novel view synthesis methods.
Article
Caricature is an artistic style of depicting human faces that attracts considerable attention in the entertainment industry. So far only a few 3D caricature generation methods exist, and all of them require some caricature information (e.g., a caricature sketch or 2D caricature) as input. This kind of input, however, is difficult for non-professional users to provide. In this paper, we propose an end-to-end deep neural network model that generates high-quality 3D caricature directly from a simple normal face photo. The most challenging issue in our system is that the source domain of face photos (characterized by 2D normal faces) is significantly different from the target domain of 3D caricatures (characterized by 3D exaggerated face shapes and texture). To address this challenge, we (1) build a large dataset of 6,100 3D caricature meshes and use it to establish a PCA model in the 3D caricature shape space, (2) reconstruct a 3D normal full head from the input face photo and use its PCA representation in the 3D caricature shape space to set up correspondence between the input photo and 3D caricature shape, and (3) propose a novel character loss and a novel caricature loss based on previous psychological studies on caricatures. Experiments including a novel two-level user study show that our system can generate high-quality 3D caricatures directly from normal face photos.
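Step (2) of the pipeline, expressing a reconstructed head in a PCA caricature shape space, amounts to projecting onto a linear basis and reconstructing from the coefficients. A minimal NumPy sketch (the basis, mean shape, and the optional exaggeration factor are placeholders, not the authors' data):

# Projecting a face mesh into a PCA shape space and reconstructing (optionally exaggerating) it.
import numpy as np

def project_to_pca(verts, mean_shape, basis):
    """verts, mean_shape: (3V,) flattened meshes; basis: (K, 3V) orthonormal PCA directions."""
    return basis @ (verts - mean_shape)                      # (K,) coefficients

def reconstruct_from_pca(coeffs, mean_shape, basis, exaggeration=1.0):
    return mean_shape + exaggeration * (coeffs @ basis)      # scale coefficients to exaggerate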
Article
How to quickly and accurately retrieve, and effectively reuse, 3D CAD models that conform to the user's design intention has become an urgent problem in product design. However, existing retrieval methods suffer from several problems, such as slow speed, low accuracy, or poor usability, and therefore struggle to meet the actual needs of the industry. In this paper, we propose a 3D CAD model retrieval approach based on sketches and unsupervised learning that balances speed, accuracy, and ease of use. Firstly, the loop is used as the fundamental element of a sketch/view, and automatic structural-semantics capture algorithms are proposed to extract and construct an attributed loop relation tree. Secondly, a recursive-neural-network-based deep variational autoencoder is constructed and optimized to transform loop relation trees of arbitrary shape and size into fixed-length descriptors. Finally, based on this fixed-length vector descriptor, the sketches and views of 3D CAD models are embedded into the same target feature space, and the k-nearest neighbors algorithm is adopted to perform fast CAD model matching in that space. In this manner, a prototype 3D CAD model retrieval system is developed. Experiments on a dataset containing about two thousand 3D CAD models validate the feasibility and effectiveness of the proposed approach.
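The final matching stage can be pictured with a small scikit-learn sketch: once every CAD view and the query sketch are encoded as fixed-length descriptors, retrieval reduces to k-nearest-neighbor search in that space. The descriptor dimensionality and variable names below are assumptions for illustration only.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical fixed-length descriptors produced by the tree encoder:
# one row per 3D CAD model view, plus one query row for the user sketch.
model_descriptors = np.random.rand(2000, 128)
sketch_descriptor = np.random.rand(1, 128)

knn = NearestNeighbors(n_neighbors=5, metric='euclidean')
knn.fit(model_descriptors)                        # index the model collection once
dists, ids = knn.kneighbors(sketch_descriptor)    # retrieve the 5 closest CAD models
print(ids[0], dists[0])
```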
Article
Exemplar-based portrait stylization is widely attractive and highly desired. Despite recent successes, it remains challenging, especially when considering both texture and geometric styles. In this paper, we present the first framework for one-shot 3D portrait style transfer, which can generate 3D face models with both the geometry exaggerated and the texture stylized while preserving the identity from the original content. It requires only one arbitrary style image instead of a large set of training examples for a particular style, provides geometry and texture outputs that are fully parameterized and disentangled, and enables further graphics applications with the 3D representations. The framework consists of two stages. In the first geometric style transfer stage, we use facial landmark translation to capture the coarse geometry style and guide the deformation of the dense 3D face geometry. In the second texture style transfer stage, we focus on performing style transfer on the canonical texture by adopting a differentiable renderer to optimize the texture in a multi-view framework. Experiments show that our method achieves robustly good results on different artistic styles and outperforms existing methods. We also demonstrate the advantages of our method via various 2D and 3D graphics applications.
Article
Recent facial image synthesis methods have been mainly based on conditional generative models. Sketch-based conditions can effectively describe the geometry of faces, including the contours of facial components, hair structures, as well as salient edges (e.g., wrinkles) on face surfaces but lack effective control of appearance, which is influenced by color, material, lighting condition, etc. To have more control of generated results, one possible approach is to apply existing disentangling works to disentangle face images into geometry and appearance representations. However, existing disentangling methods are not optimized for human face editing, and cannot achieve fine control of facial details such as wrinkles. To address this issue, we propose DeepFaceEditing, a structured disentanglement framework specifically designed for face images to support face generation and editing with disentangled control of geometry and appearance. We adopt a local-to-global approach to incorporate the face domain knowledge: local component images are decomposed into geometry and appearance representations, which are fused consistently using a global fusion module to improve generation quality. We exploit sketches to assist in extracting a better geometry representation, which also supports intuitive geometry editing via sketching. The resulting method can either extract the geometry and appearance representations from face images, or directly extract the geometry representation from face sketches. Such representations allow users to easily edit and synthesize face images, with decoupled control of their geometry and appearance. Both qualitative and quantitative evaluations show the superior detail and appearance control abilities of our method compared to state-of-the-art methods.
Article
Caricature generation aims to translate real photos into caricatures with artistic styles and shape exaggerations while maintaining the identity of the subject. Unlike generic image-to-image translation, drawing caricatures automatically is a more challenging task due to the existence of various spatial deformations. Previous caricature generation methods focus on predicting a single, deterministic image warping for a given photo while ignoring the intrinsic representation and distribution of geometric exaggerations in caricatures, which limits their ability to generate diverse exaggerations. In this paper, we generalize the caricature generation problem from instance-level warping prediction to distribution-level deformation modeling. Based on this assumption, we present the first exploration of unpaired CARIcature generation with Multiple Exaggerations (CariMe). Technically, we propose a Multi-exaggeration Warper network to learn a distribution-level mapping from photos to facial exaggerations. This makes it possible to generate diverse and reasonable exaggerations from randomly sampled warp codes given one input photo. To better represent facial exaggeration and produce fine-grained warping, a deformation-field-based warping method is also proposed, which captures more detailed exaggerations than previous point-based warping methods. Experiments and two perceptual studies demonstrate the superiority of our method compared with other state-of-the-art methods on caricature generation. The source code is available at https://github.com/edward3862/CariMe-pytorch .
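A deformation-field-based warp, in its simplest form, resamples the photo through a per-pixel displacement field. The PyTorch sketch below shows that mechanism with bilinear sampling via `grid_sample`; it is a generic illustration, not the CariMe Warper network or its learned warp codes.

```python
import torch
import torch.nn.functional as F

def warp_image(image, flow):
    """Warp an image by a dense deformation field using bilinear sampling.

    image: (N, C, H, W) input photo
    flow:  (N, 2, H, W) per-pixel displacement in pixels (dx, dy)
    """
    n, _, h, w = image.shape
    # Base sampling grid, later normalized to [-1, 1] as grid_sample expects.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    base = torch.stack([xs, ys], dim=-1).float()              # (H, W, 2) in (x, y) order
    grid = base.unsqueeze(0) + flow.permute(0, 2, 3, 1)       # add the displacement field
    grid[..., 0] = 2.0 * grid[..., 0] / (w - 1) - 1.0
    grid[..., 1] = 2.0 * grid[..., 1] / (h - 1) - 1.0
    return F.grid_sample(image, grid, mode='bilinear', align_corners=True)

img = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)          # a zero field returns the image unchanged
print(torch.allclose(warp_image(img, flow), img, atol=1e-5))
```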
Conference Paper
Full-text available
We introduce the concept of unconstrained real-time 3D facial performance capture through explicit semantic segmentation in the RGB input. To ensure robustness, cutting edge supervised learning approaches rely on large training datasets of face images captured in the wild. While impressive tracking quality has been demonstrated for faces that are largely visible, any occlusion due to hair, accessories, or hand-to-face gestures would result in significant visual artifacts and loss of tracking accuracy. The modeling of occlusions has been mostly avoided due to its immense space of appearance variability. To address this curse of high dimensionality, we perform tracking in unconstrained images assuming non-face regions can be fully masked out. Along with recent breakthroughs in deep learning, we demonstrate that pixel-level facial segmentation is possible in real-time by repurposing convolutional neural networks designed originally for general semantic segmentation. We develop an efficient architecture based on a two-stream deconvolution network with complementary characteristics, and introduce carefully designed training samples and data augmentation strategies for improved segmentation accuracy and robustness. We adopt a state-of-the-art regression-based facial tracking framework with segmented face images as training, and demonstrate accurate and uninterrupted facial performance capture in the presence of extreme occlusion and even side views. Furthermore, the resulting segmentation can be directly used to composite partial 3D face models on the input images and enable seamless facial manipulation tasks, such as virtual make-up or face replacement.
Article
Full-text available
3D modeling remains a notoriously difficult task for novices despite significant research effort to provide intuitive and automated systems. We tackle this problem by combining the strengths of two popular domains: sketch-based modeling and procedural modeling. On the one hand, sketch-based modeling exploits our ability to draw but requires detailed, unambiguous drawings to achieve complex models. On the other hand, procedural modeling automates the creation of precise and detailed geometry but requires the tedious definition and parameterization of procedural models. Our system uses a collection of simple procedural grammars, called snippets, as building blocks to turn sketches into realistic 3D models. We use a machine learning approach to solve the inverse problem of finding the procedural model that best explains a user sketch. We use non-photorealistic rendering to generate artificial data for training convolutional neural networks capable of quickly recognizing the procedural rule intended by a sketch and estimating its parameters. We integrate our algorithm in a coarse-to-fine urban modeling system that allows users to create rich buildings by successively sketching the building mass, roof, facades, windows, and ornaments. A user study shows that by using our approach non-expert users can generate complex buildings in just a few minutes.
Article
Full-text available
We develop a system for 3D object retrieval based on sketched feature lines as input. For objective evaluation, we collect a large number of query sketches from human users that are related to an existing data base of objects. The sketches turn out to be generally quite abstract with large local and global deviations from the original shape. Based on this observation, we decide to use a bag-of-features approach over computer generated line drawings of the objects. We develop a targeted feature transform based on Gabor filters for this system. We can show objectively that this transform is better suited than other approaches from the literature developed for similar tasks. Moreover, we demonstrate how to optimize the parameters of our, as well as other approaches, based on the gathered sketches. In the resulting comparison, our approach is significantly better than any other system described so far.
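A bare-bones version of the filter-bank idea might look like the OpenCV sketch below: convolve the line drawing with Gabor kernels at several orientations and pool the responses into a small feature vector. The kernel parameters and the crude pooling are placeholders, far simpler than the paper's tuned feature transform.

```python
import cv2
import numpy as np

def gabor_feature(line_drawing, n_orientations=4, ksize=21):
    """Pool responses of an oriented Gabor filter bank over a line drawing."""
    features = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        kernel = cv2.getGaborKernel((ksize, ksize), sigma=4.0, theta=theta,
                                    lambd=10.0, gamma=0.5, psi=0, ktype=cv2.CV_32F)
        response = cv2.filter2D(line_drawing.astype(np.float32), -1, kernel)
        features.append(np.abs(response).mean())   # one pooled value per orientation
    return np.array(features)

sketch = np.zeros((128, 128), np.uint8)
cv2.line(sketch, (10, 10), (120, 100), 255, 2)     # a synthetic stroke
print(gabor_feature(sketch))
```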
Article
Full-text available
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (approx 2 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.
Article
Full-text available
User interfaces in modeling have traditionally followed the WIMP (Window, Icon, Menu, Pointer) paradigm. Though functional and very powerful, they can also be cumbersome and daunting to a novice user, and creating a complex model requires considerable expertise and effort. A recent trend is toward more accessible and natural interfaces, which has led to sketch-based interfaces for modeling (SBIM). The goal is to allow sketches (hasty freehand drawings) to be used in the modeling process, from rough model creation through to fine detail construction. Mapping a 2D sketch to a 3D modeling operation is a difficult task, rife with ambiguity. To wit, we present a categorization based on how an SBIM application chooses to interpret a sketch, of which there are three primary methods: to create a 3D model, to add details to an existing model, or to deform and manipulate a model. Additionally, in this paper we introduce a survey of sketch-based interfaces focused on 3D geometric modeling applications. The canonical and recent works are presented and classified, including techniques for sketch acquisition, filtering, and interpretation. The survey also provides an overview of some specific applications of SBIM and a discussion of important challenges and open problems for researchers to tackle in the coming years.
Conference Paper
Full-text available
Caricatures are models of persons or things in which certain striking characteristics are exaggerated in order to create a comic or grotesque effect. This paper is concerned with a strategy for automatically generating caricatures of three-dimensional models based on anthropometric measures and geometric manipulations by influence zones. In the proposed strategy, measures from a reference model serve as a means of comparison with the corresponding measures in the model to be caricatured. Deformations are applied to the features that differ most from the corresponding features of the reference model. This method is independent of mesh topology. Unlike other techniques, it is possible to generate variations of caricatures by adopting different sequences of deformations and applying asymmetry and expressions. Our method also does not need 3D model databases to be used as a basis for combinations or comparisons of models.
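The core exaggeration rule can be written in a few lines: measures that deviate most from the reference are pushed further in the same direction. The NumPy sketch below illustrates this with a made-up measure vector; the factor `k` and the selection of the top deviating measures are illustrative assumptions, not the paper's influence-zone deformation.

```python
import numpy as np

def exaggerate_measures(subject, reference, k=2.0, top=3):
    """Exaggerate only the anthropometric measures that deviate most from a reference.

    subject, reference: arrays of corresponding face measures (e.g. nose length, jaw width)
    k:   exaggeration factor (>1 amplifies the deviation)
    top: number of most deviating measures to caricature
    """
    deviation = subject - reference
    order = np.argsort(-np.abs(deviation))          # most distinctive traits first
    exaggerated = subject.copy()
    exaggerated[order[:top]] = reference[order[:top]] + k * deviation[order[:top]]
    return exaggerated

subject = np.array([5.2, 3.1, 4.0, 6.5])
reference = np.array([5.0, 3.0, 4.5, 5.0])
print(exaggerate_measures(subject, reference))      # the largest deviations are amplified
```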
Article
Full-text available
We present FaceWarehouse, a database of 3D facial expressions for visual computing applications. We use Kinect, an off-the-shelf RGBD camera, to capture 150 individuals aged 7-80 from various ethnic backgrounds. For each person, we captured the RGBD data of her different expressions, including the neutral expression and 19 other expressions such as mouth-opening, smile, kiss, etc. For every RGBD raw data record, a set of facial feature points on the color image such as eye corners, mouth contour, and the nose tip are automatically localized, and manually adjusted if better accuracy is required. We then deform a template facial mesh to fit the depth data as closely as possible while matching the feature points on the color image to their corresponding points on the mesh. Starting from these fitted face meshes, we construct a set of individual-specific expression blendshapes for each person. These meshes with consistent topology are assembled as a rank-3 tensor to build a bilinear face model with two attributes: identity and expression. Compared with previous 3D facial databases, for every person in our database, there is a much richer matching collection of expressions, enabling depiction of most human facial actions. We demonstrate the potential of FaceWarehouse for visual computing with four applications: facial image manipulation, face component transfer, real-time performance-based facial image animation, and facial animation retargeting from video to image.
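Synthesizing a face from such a bilinear model amounts to contracting the rank-3 core tensor with an identity-coefficient vector and an expression-coefficient vector. The NumPy sketch below shows that contraction with made-up dimensions; it does not reflect FaceWarehouse's actual tensor sizes or data.

```python
import numpy as np

# Illustrative sizes: 3V mesh coordinates, 50 identity and 25 expression modes.
core = np.random.rand(3 * 1000, 50, 25)   # reduced core tensor (vertices x identity x expression)
w_id = np.random.rand(50)                 # identity coefficients
w_exp = np.random.rand(25)                # expression coefficients

# Contract the core tensor with both attribute vectors to synthesize a face mesh.
mesh = np.einsum('vie,i,e->v', core, w_id, w_exp).reshape(-1, 3)
print(mesh.shape)                         # (1000, 3)
```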
Article
Full-text available
We facilitate the creation of 3D-looking shaded production drawings from concept sketches. The key to our approach is a class of commonly used construction curves known as cross-sections, that function as an aid to both sketch creation and viewer understanding of the depicted 3D shape. In particular, intersections of these curves, or cross-hairs, convey valuable 3D information, that viewers compose into a mental model of the overall sketch. We use the artist-drawn cross-sections to automatically infer the 3D normals across the sketch, enabling 3D-like rendering. The technical contribution of our work is twofold. First, we distill artistic guidelines for drawing cross-sections and insights from perception literature to introduce an explicit mathematical formulation of the relationships between cross-section curves and the geometry they aim to convey. We then use these relationships to develop an algorithm for estimating a normal field from cross-section curve networks and other curves present in concept sketches. We validate our formulation and algorithm through a user study and a ground truth normal comparison. As demonstrated by the examples throughout the paper, these contributions enable us to shade a wide range of concept sketches with a variety of rendering styles.
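The geometric fact the shading pipeline builds on is that, at a cross-hair, the surface normal is perpendicular to both cross-section tangents, so their cross product recovers it. The toy sketch below assumes the 3D tangents are already known; estimating them from the 2D strokes is the paper's actual contribution and is omitted here.

```python
import numpy as np

def normal_from_cross_sections(tangent_u, tangent_v):
    """Estimate a surface normal at a cross-hair from the two cross-section tangents.

    tangent_u, tangent_v: 3D tangent directions of the two intersecting
    cross-section curves at their intersection point.
    """
    n = np.cross(tangent_u, tangent_v)
    return n / np.linalg.norm(n)

# Two orthogonal tangents in the XY plane give a Z-aligned normal.
print(normal_from_cross_sections(np.array([1.0, 0.0, 0.0]),
                                 np.array([0.0, 1.0, 0.0])))   # [0. 0. 1.]
```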
Conference Paper
Full-text available
We present a sketching interface for quickly and easily designing freeform models such as stuffed animals and other rotund objects. The user draws several 2D freeform strokes interactively on the screen and the system automatically constructs plausible 3D polygonal surfaces. Our system supports several modeling operations, including the operation to construct a 3D polygonal surface from a 2D silhouette drawn by the user: it inflates the region surrounded by the silhouette making wide areas fat, and narrow areas thin. Teddy, our prototype system, is implemented as a Java™ program, and the mesh construction is done in real-time on a standard PC. Our informal user study showed that a first-time user typically masters the operations within 10 minutes, and can construct interesting 3D models within minutes.
Article
Full-text available
We address the problem of correcting an undesirable expression on a face photo by transferring local facial components, such as a smiling mouth, from another face photo of the same person which has the desired expression. Direct copying and blending using existing compositing tools results in semantically unnatural composites, since expression is a global effect and the local component in one expression is often incompatible with the shape and other components of the face in another expression. To solve this problem we present Expression Flow, a 2D flow field which can warp the target face globally in a natural way, so that the warped face is compatible with the new facial component to be copied over. To do this, starting with the two input face photos, we jointly construct a pair of 3D face shapes with the same identity but different expressions. The expression flow is computed by projecting the difference between the two 3D shapes back to 2D. It describes how to warp the target face photo to match the expression of the reference photo. User studies suggest that our system is able to generate face composites with much higher fidelity than existing methods.
Article
Full-text available
We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it does not require substantial manual operation or complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We showcase the use of our method on a variety of examples and present the result of a user study that suggests our results are difficult to distinguish from real video footage.
Article
Full-text available
Achieving intuitive control of animated surface deformation while observing a specific style is an important but challenging task in computer graphics. Solutions to this task can find many applications in data-driven skin animation, computer puppetry, and computer games. In this paper, we present an intuitive and powerful animation interface to simultaneously control the deformation of a large number of local regions on a deformable surface with a minimal number of control points. Our method learns suitable deformation subspaces from training examples, and generates new deformations on the fly according to the movements of the control points. Our contributions include a novel deformation regression method based on kernel Canonical Correlation Analysis (CCA) and a Poisson-based translation solving technique for easy and fast deformation control based on examples. Our run-time algorithm can be implemented on GPUs and can achieve a few hundred frames per second even for large datasets with hundreds of training examples.
Article
Full-text available
This paper presents a system for performance-based character animation that enables any user to control the facial expressions of a digital avatar in realtime. The user is recorded in a natural environment using a non-intrusive, commercially available 3D sensor. The simplicity of this acquisition device comes at the cost of high noise levels in the acquired data. To effectively map low-quality 2D images and 3D depth maps to realistic facial expressions, we introduce a novel face tracking algorithm that combines geometry and texture registration with pre-recorded animation priors in a single optimization. Formulated as a maximum a posteriori estimation in a reduced parameter space, our method implicitly exploits temporal coherence to stabilize the tracking. We demonstrate that compelling 3D facial dynamics can be reconstructed in realtime without the use of face markers, intrusive lighting, or complex scanning hardware. This makes our system easy to deploy and facilitates a range of new applications, e.g. in digital gameplay or social interactions.
Article
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
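A minimal PyTorch rendition of this two-player training loop is sketched below on a toy 2D "data distribution": the discriminator is trained to separate real from generated samples, and the generator is trained (in the common non-saturating form) to fool it. Network sizes and hyperparameters are arbitrary illustrations.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(200):
    real = torch.randn(64, 2) * 0.5 + 2.0            # stand-in "data distribution"
    noise = torch.randn(64, 8)

    # Discriminator step: push D(real) -> 1 and D(G(z)) -> 0.
    fake = G(noise).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: maximize the probability of D making a mistake (non-saturating form).
    loss_g = bce(D(G(noise)), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

print(float(loss_d), float(loss_g))
```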
Article
Procedural modeling techniques can produce high quality visual content through complex rule sets. However, controlling the outputs of these techniques for design purposes is often notoriously difficult for users due to the large number of parameters involved in these rule sets and also their non-linear relationship to the resulting content. To circumvent this problem, we present a sketch-based approach to procedural modeling. Given an approximate and abstract hand-drawn 2D sketch provided by a user, our algorithm automatically computes a set of procedural model parameters, which in turn yield multiple, detailed output shapes that resemble the user's input sketch. The user can then select an output shape, or further modify the sketch to explore alternative ones. At the heart of our approach is a deep Convolutional Neural Network (CNN) that is trained to map sketches to procedural model parameters. The network is trained by large amounts of automatically generated synthetic line drawings. By using an intuitive medium, i.e., freehand sketching as input, users are set free from manually adjusting procedural model parameters, yet they are still able to create high quality content. We demonstrate the accuracy and efficacy of our method in a variety of procedural modeling scenarios including design of man-made and organic shapes.
Article
We present a novel image-based representation for dynamic 3D avatars, which allows effective handling of various hairstyles and headwear, and can generate expressive facial animations with fine-scale details in real-time. We develop algorithms for creating an image-based avatar from a set of sparsely captured images of a user, using an off-the-shelf web camera at home. An optimization method is proposed to construct a topologically consistent morphable model that approximates the dynamic hair geometry in the captured images. We also design a real-time algorithm for synthesizing novel views of an image-based avatar, so that the avatar follows the facial motions of an arbitrary actor. Compelling results from our pipeline are demonstrated on a variety of cases.
Article
We present the Sketchy database, the first large-scale collection of sketch-photo pairs. We ask crowd workers to sketch particular photographic objects sampled from 125 categories and acquire 75,471 sketches of 12,500 objects. The Sketchy database gives us fine-grained associations between particular photos and sketches, and we use this to train cross-domain convolutional networks which embed sketches and photographs in a common feature space. We use our database as a benchmark for fine-grained retrieval and show that our learned representation significantly outperforms both hand-crafted features as well as deep features trained for sketch or photo classification. Beyond image retrieval, we believe the Sketchy database opens up new opportunities for sketch and image understanding and synthesis.
Conference Paper
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
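The architecture can be miniaturized to show its essential pattern: a contracting path, a bottleneck, an expanding path with learned upsampling, and a skip connection concatenated at matching resolution. The two-level PyTorch sketch below is a toy stand-in, not the full network used in the paper.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level U-Net: contracting path, expanding path, one skip connection."""
    def __init__(self, c_in=1, c_out=2):
        super().__init__()
        self.enc = conv_block(c_in, 16)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = conv_block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = conv_block(32, 16)                  # 32 = upsampled 16 + skip 16
        self.head = nn.Conv2d(16, c_out, 1)            # per-pixel class scores

    def forward(self, x):
        skip = self.enc(x)
        x = self.bottleneck(self.down(skip))
        x = self.up(x)
        x = self.dec(torch.cat([x, skip], dim=1))      # concatenate the skip connection
        return self.head(x)

net = TinyUNet()
print(net(torch.rand(1, 1, 64, 64)).shape)             # torch.Size([1, 2, 64, 64])
```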
Article
The question whether a caricature of a 2D sketch, or an object in 3D can be generated automatically is probably as old as the attempt to answer the question of what defines art. In an attempt to provide a partial answer, we propose a computational approach for automatic caricaturization. The idea is to rely on intrinsic geometric properties of a given model that are invariant to poses, articulations, and gestures. A property of a surface that is preserved while it undergoes such deformations is self-isometry. In other words, while smiling, running, and posing, we do not change much the intrinsic geometry of our facial surface, the area of our body, or the size of our hands. The proposed method locally amplifies the area of a given surface based on its Gaussian curvature. It is shown to produce a natural comic exaggeration effect which can be efficiently computed as a solution of a Poisson equation. We demonstrate the power of the proposed method by applying it to a variety of meshes such as human faces, statues, and animals. The results demonstrate enhancement and exaggeration of the shape's features into an artistic caricature. As most poses and postures are almost isometries, the use of the Gaussian curvature as the scaling factor allows the proposed method to handle animated sequences while preserving the identity of the animated creature.
Conference Paper
We propose a new approach for automatic surfacing of 3D curve networks, a long standing computer graphics problem which has garnered new attention with the emergence of sketch based modeling systems capable of producing such networks. Our approach is motivated by recent studies suggesting that artist-designed curve networks consist of descriptive curves that convey intrinsic shape properties, and are dominated by representative flow lines designed to convey the principal curvature lines on the surface. Studies indicate that viewers complete the intended surface shape by envisioning a surface whose curvature lines smoothly blend these flow-line curves. Following these observations we design a surfacing framework that automatically aligns the curvature lines of the constructed surface with the representative flow lines and smoothly interpolates these representative flow, or curvature directions while minimizing undesired curvature variation. Starting with an initial triangle mesh of the network, we dynamically adapt the mesh to maximize the agreement between the principal curvature direction field on the surface and a smooth flow field suggested by the representative flow-line curves. Our main technical contribution is a framework for curvature-based surface modeling, that facilitates the creation of surfaces with prescribed curvature characteristics. We validate our method via visual inspection, via comparison to artist created and ground truth surfaces, as well as comparison to prior art, and confirm that our results are well aligned with the computed flow fields and with viewer perception of the input networks.
Article
Deformation transfer applies the deformation exhibited by a source triangle mesh onto a different target triangle mesh. Our approach is general and does not require the source and target to share the same number of vertices or triangles, or to have identical connectivity. The user builds a correspondence map between the triangles of the source and those of the target by specifying a small set of vertex markers. Deformation transfer computes the set of transformations induced by the deformation of the source mesh, maps the transformations through the correspondence from the source to the target, and solves an optimization problem to consistently apply the transformations to the target shape. The resulting system of linear equations can be factored once, after which transferring a new deformation to the target mesh requires only a back-substitution step. Global properties such as foot placement can be achieved by constraining vertex positions. We demonstrate our method by retargeting full body key poses, applying scanned facial deformations onto a digital character, and remapping rigid and non-rigid animation sequences from one mesh onto another.
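The first step, computing the transformation each source triangle undergoes, can be sketched compactly: build a frame from two edge vectors plus a normal-scaled fourth vertex, then relate the deformed frame to the undeformed one. The NumPy snippet below shows only this local step under those assumptions; the correspondence map and the global least-squares solve on the target are omitted.

```python
import numpy as np

def frame(v1, v2, v3):
    """Edge-vector frame of a triangle, with a fourth vertex offset along the normal."""
    n = np.cross(v2 - v1, v3 - v1)
    v4 = v1 + n / np.sqrt(np.linalg.norm(n))
    return np.column_stack([v2 - v1, v3 - v1, v4 - v1])

def deformation_gradient(src, src_def):
    """3x3 affine transform taking an undeformed source triangle to its deformed pose."""
    return frame(*src_def) @ np.linalg.inv(frame(*src))

tri = [np.array([0., 0., 0.]), np.array([1., 0., 0.]), np.array([0., 1., 0.])]
tri_def = [2.0 * v for v in tri]                       # the triangle is uniformly scaled by 2
print(np.round(deformation_gradient(tri, tri_def), 3)) # recovers a uniform scaling of 2
```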
Article
True2Form is a sketch-based modeling system that reconstructs 3D curves from typical design sketches. Our approach to infer 3D form from 2D drawings is a novel mathematical framework of insights derived from perception and design literature. We note that designers favor viewpoints that maximally reveal 3D shape information, and strategically sketch descriptive curves that convey intrinsic shape properties, such as curvature, symmetry, or parallelism. Studies indicate that viewers apply these properties selectively to envision a globally consistent 3D shape. We mimic this selective regularization algorithmically, by progressively detecting and enforcing applicable properties, accounting for their global impact on an evolving 3D curve network. Balancing regularity enforcement against sketch fidelity at each step allows us to correct for inaccuracy inherent in free-hand sketching. We perceptually validate our approach by showing agreement between our algorithm and viewers in selecting applicable regularities. We further evaluate our solution by: reconstructing a range of 3D models from diversely sourced sketches; comparisons to prior art; and visual comparison to both ground-truth and 3D reconstructions by designers.
Article
This article presents an intuitive and easy-to-use system for interactively posing 3D facial expressions. The user can model and edit facial expressions by drawing freeform strokes, by specifying distances between facial points, by incrementally editing curves on the face, or by directly dragging facial points in 2D screen space. Designing such an interface for 3D facial modeling and editing is challenging because many unnatural facial expressions might be consistent with the user's input. We formulate the problem in a maximum a posteriori framework by combining the user's input with priors embedded in a large set of facial expression data. Maximizing the posteriori allows us to generate an optimal and natural facial expression that achieves the goal specified by the user. We evaluate the performance of our system by conducting a thorough comparison of our method with alternative facial modeling techniques. To demonstrate the usability of our system, we also perform a user study of our system and compare with state-of-the-art facial expression modeling software (Poser 7).
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
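A heavily scaled-down network in the same spirit (stacked convolution and max-pooling stages, ReLU non-linearities, and dropout in the fully connected layers) is sketched below in PyTorch; the layer sizes are illustrative and much smaller than the original model.

```python
import torch
import torch.nn as nn

# Toy classifier: conv + max-pool stages, ReLU units, dropout before the
# fully connected layers, and 1000-way class scores at the end.
net = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(128 * 14 * 14, 512), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(512, 1000),
)
x = torch.rand(2, 3, 224, 224)
print(net(x).shape)                                   # torch.Size([2, 1000])
```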
Conference Paper
Creating faces is important in a number of application areas. Faces can be constructed using commercial modelling tools, existing faces can be transferred to a digital form using equipment such as laser scanners, and law enforcement agencies use sketch artists and photo-fit software to produce faces of suspects. We present a technique that can create a 3-dimensional head using intuitive, artistic 2-dimensional sketching techniques. Our work involves bringing together two types of graphics applications: sketching interfaces and systems used to create 3-dimensional faces, through the mediation of a statistical model. We present our results where we sketch a nose and search for a geometric face model in a database whose nose best matches the sketched nose.
Article
Modeling 3D objects is difficult, especially for the user who lacks the knowledge on 3D geometry or even on 2D sketching. In this paper, we present a novel sketch‐based modeling system which allows novice users to create 3D custom models by assembling parts based on a database of pre‐segmented 3D models. Different from previous systems, our system supports the user with visualized and meaningful shadow guidance under his strokes dynamically to guide the user to convey his design concept easily and quickly. Our system interprets the user's strokes as similarity queries into database to generate the shadow image for guiding the user's further drawing and returns the 3D candidate parts for modeling simultaneously. Moreover, our system preserves the high‐level structure in generated models based on prior knowledge pre‐analyzed from the database, and allows the user to create custom parts with geometric variations. We demonstrate the applicability and effectiveness of our modeling system with human subjects and present various models designed using our system.
Article
This work presents Sketch2Scene, a framework that automatically turns a freehand sketch depicting multiple scene objects into semantically valid, well-arranged scenes of 3D models. Unlike existing works on sketch-based search and composition of 3D models, which typically process individual sketched objects one by one, our technique performs co-retrieval and co-placement of relevant 3D models by jointly processing the sketched objects. This is enabled by summarizing functional and spatial relationships among models in a large collection of 3D scenes as structural groups. Our technique greatly reduces the amount of user intervention needed for sketch-based modeling of 3D scenes and fits well into the traditional production pipeline involving concept design followed by 3D modeling. A pilot study indicates that it is promising to use our technique as an alternative but more efficient tool than standard 3D modeling for 3D scene construction.
Article
We present a new algorithm for realtime face tracking on commodity RGB-D sensing devices. Our method requires no user-specific training or calibration, or any other form of manual assistance, thus enabling a range of new applications in performance-based facial animation and virtual interaction at the consumer level. The key novelty of our approach is an optimization algorithm that jointly solves for a detailed 3D expression model of the user and the corresponding dynamic tracking parameters. Realtime performance and robust computations are facilitated by a novel subspace parameterization of the dynamic facial expression space. We provide a detailed evaluation that shows that our approach significantly simplifies the performance capture workflow, while achieving accurate facial tracking for realtime applications.
Article
A common variant of caricature relies on exaggerating characteristics of a shape that differs from a reference template, usually the distinctive traits of a human portrait. This work introduces a caricature tool that interactively emphasizes the differences between two three-dimensional meshes. They are represented in the manifold harmonic basis of the shape to be caricatured, providing intrinsic controls on the deformation and its scales. It further provides a smooth localization scheme for the deformation. This lets the user edit the caricature part by part, combining different settings and models of exaggeration, all expressed in terms of harmonic filter. This formulation also allows for interactivity, rendering the resulting 3d shape in real time.
Article
We present a real-time performance-driven facial animation system based on 3D shape regression. In this system, the 3D positions of facial landmark points are inferred by a regressor from 2D video frames of an ordinary web camera. From these 3D points, the pose and expressions of the face are recovered by fitting a user-specific blendshape model to them. The main technical contribution of this work is the 3D regression algorithm that learns an accurate, user-specific face alignment model from an easily acquired set of training data, generated from images of the user performing a sequence of predefined facial poses and expressions. Experiments show that our system can accurately recover 3D face shapes even for fast motions, non-frontal faces, and exaggerated expressions. In addition, some capacity to handle partial occlusions and changing lighting conditions is demonstrated.
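The blendshape-fitting step can be posed as a small non-negative least-squares problem: find weights so that the neutral landmarks plus a weighted sum of blendshape offsets match the regressed 3D landmarks. The sketch below uses SciPy's `nnls` on synthetic data; the shapes, the non-negativity choice, and the omission of the rigid-pose solve are simplifying assumptions rather than the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import nnls

def fit_blendshape_weights(landmarks, neutral, blendshapes):
    """Fit non-negative blendshape weights so that neutral + B*w matches the landmarks.

    landmarks:   (L, 3) regressed 3D landmark positions
    neutral:     (L, 3) neutral-face positions of the same landmarks
    blendshapes: (K, L, 3) per-blendshape landmark offsets from the neutral face
    """
    A = blendshapes.reshape(len(blendshapes), -1).T     # (3L, K) design matrix
    b = (landmarks - neutral).reshape(-1)               # (3L,) residual to explain
    w, _ = nnls(A, b)                                   # weights kept >= 0
    return w

K, L = 4, 10
neutral = np.random.rand(L, 3)
blendshapes = np.random.rand(K, L, 3)
true_w = np.array([0.5, 0.0, 0.2, 0.0])
landmarks = neutral + np.tensordot(true_w, blendshapes, axes=1)
print(np.round(fit_blendshape_weights(landmarks, neutral, blendshapes), 3))  # ~[0.5 0. 0.2 0.]
```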
Article
Designing 3D objects from scratch is difficult, especially when the user intent is fuzzy and lacks a clear target form. We facilitate design by providing reference and inspiration from existing model contexts. We rethink model design as navigating through different possible combinations of part assemblies based on a large collection of pre-segmented 3D models. We propose an interactive sketch-to-design system, where the user sketches prominent features of parts to combine. The sketched strokes are analysed individually and, more importantly, in context with the other parts to generate relevant shape suggestions via a design gallery interface. As a modelling session progresses and more parts get selected, contextual cues become increasingly dominant, and the model quickly converges to a final form. As a key enabler, we use pre-learned part-based contextual information to allow the user to quickly explore different combinations of parts. Our experiments demonstrate the effectiveness of our approach for efficiently designing new variations from existing shape collections.
Article
Caricatures are a form of humorous visual art, usually created by skilled artists for the intention of amusement and entertainment. In this paper, we present a novel approach for automatic generation of digital caricatures from facial photographs, which capture artistic deformation styles from hand-drawn caricatures. We introduced a pseudo stress-strain model to encode the parameters of an artistic deformation style using “virtual” physical and material properties. We have also developed a software system for performing the caricaturistic deformation in 3D which eliminates the undesirable artifacts in 2D caricaturization. We employed a Multilevel Free-Form Deformation (MFFD) technique to optimize a 3D head model reconstructed from an input facial photograph, and for controlling the caricaturistic deformation. Our results demonstrated the effectiveness and usability of the proposed approach, which allows ordinary users to apply the captured and stored deformation styles to a variety of facial photographs.
Conference Paper
Recently, 3D caricature generation and its applications have attracted wide attention from both the research community and the entertainment industry. This paper proposes a novel interactive approach for generating diverse and interesting 3D caricatures based on double sampling. Firstly, according to the user's operations, we obtain a coarse 3D caricature with local feature transformations by sampling in a well-built principal component analysis (PCA) subspace. Secondly, to utilize information from the 2D caricature dataset, we sample in the local linear embedding (LLE) manifold subspace. Finally, we use the learned 2D caricature information to further refine the coarse caricature by applying Kriging interpolation. The experiments show that the 3D caricatures generated by our method can preserve highly artistic styles and also reflect the user's intention.
Conference Paper
Surface editing operations commonly require geometric details of the surface to be preserved as much as possible. We argue that geometric detail is an intrinsic property of a surface and that, consequently, surface editing is best performed by operating over an intrinsic surface representation. We provide such a representation of a surface, based on the Laplacian of the mesh, by encoding each vertex relative to its neighborhood. The Laplacian of the mesh is enhanced to be invariant to locally linearized rigid transformations and scaling. Based on this Laplacian representation, we develop useful editing operations: interactive free-form deformation in a region of interest based on the transformation of a handle, transfer and mixing of geometric details between two surfaces, and transplanting of a partial surface mesh onto another surface. The main computation involved in all operations is the solution of a sparse linear system, which can be done at interactive rates. We demonstrate the effectiveness of our approach in several examples, showing that the editing operations change the shape while respecting the structural geometric detail.
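A toy version of this pipeline: encode each vertex by its (here uniform) Laplacian coordinates, add soft positional constraints for user-moved handles, and solve a sparse least-squares system that keeps the detail coordinates while meeting the constraints. The uniform Laplacian and the tiny chain mesh below are simplifications of the paper's transformation-invariant formulation.

```python
import numpy as np
from scipy.sparse import lil_matrix, vstack
from scipy.sparse.linalg import lsqr

def laplacian_edit(verts, neighbors, handles, handle_pos, w=10.0):
    """Deform a mesh while preserving its (uniform) Laplacian detail coordinates.

    verts:      (V, 3) original vertex positions
    neighbors:  list of neighbor-index lists per vertex
    handles:    indices of constrained vertices
    handle_pos: (len(handles), 3) target positions of the handles
    """
    V = len(verts)
    L = lil_matrix((V, V))
    for i, nbrs in enumerate(neighbors):
        L[i, i] = 1.0
        for j in nbrs:
            L[i, j] = -1.0 / len(nbrs)
    delta = L @ verts                                    # per-vertex detail coordinates

    # Soft positional constraints on the handles, stacked under the Laplacian rows.
    C = lil_matrix((len(handles), V))
    for row, i in enumerate(handles):
        C[row, i] = w
    A = vstack([L.tocsr(), C.tocsr()])
    new_verts = np.zeros_like(verts)
    for axis in range(3):
        b = np.concatenate([delta[:, axis], w * handle_pos[:, axis]])
        new_verts[:, axis] = lsqr(A, b)[0]               # least-squares solve per coordinate
    return new_verts

# Toy chain of 4 vertices; move the last vertex and keep the first fixed.
verts = np.array([[0., 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]])
neighbors = [[1], [0, 2], [1, 3], [2]]
out = laplacian_edit(verts, neighbors, handles=[0, 3],
                     handle_pos=np.array([[0., 0, 0], [3, 1, 0]]))
print(np.round(out, 2))
```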
Conference Paper
We present ILoveSketch, a 3D curve sketching system that captures some of the affordances of pen and paper for professional designers, allowing them to iterate directly on concept 3D curve models. The system coherently integrates existing techniques of sketch-based interaction with a number of novel and enhanced features. Novel contributions of the system include automatic view rotation to improve curve sketchability, an axis widget for sketch surface selection, and implicitly inferred changes between sketching techniques. We also improve on a number of existing ideas such as a virtual sketchbook, simplified 2D and 3D view navigation, multi-stroke NURBS curve creation, and a cohesive gesture vocabulary. An evaluation by a professional designer shows the potential of our system for deployment within a real design process.
Conference Paper
Finding effective interactive deformation techniques for complex geometric objects continues to be a challenging problem in modeling and animation. We present an approach that is inspired by armatures used by sculptors, in which wire curves give definition to an object and shape its deformable features. We also introduce domain curves that define the domain of deformation about an object. A wire together with a collection of domain curves provides a new basis for an implicit modeling primitive. Wires directly reflect object geometry, and as such they provide a coarse geometric representation of an object that can be created through sketching. Furthermore, the aggregate deformation from several wires is easy to define. We show that a single wire is an appealing direct manipulation deformation technique; we demonstrate that the combination of wires and domain curves provides a new way to outline the shape of an implicit volume in space; and we describe techniques for the aggregation of deformations resulting from multiple wires, domain curves, and their interaction with each other and with other deformation techniques. The power of our approach is illustrated using applications of animating figures with flexible articulations, modeling wrinkled surfaces, and stitching geometry together.
Article
Recently, automatic 3D caricature generation has attracted much attention from both the research community and the game industry. Machine learning has been proven effective in the automatic generation of caricatures. However, the lack of 3D caricature samples makes it challenging to train a good model. This paper addresses the problem in two steps. First, the training set is enlarged by reconstructing 3D caricatures: we reconstruct 3D caricatures from 2D caricature samples with a Principal Component Analysis (PCA)-based method. Second, between the 2D real faces and the enlarged set of 3D caricatures, a regression model is learned with the semi-supervised manifold regularization (MR) method. We then predict 3D caricatures for 2D real faces with the learned model. The experiments show that our approach synthesizes 3D caricatures more effectively than traditional methods. Moreover, our system has been applied successfully in a massive multi-user educational game to provide human-like avatars.