Conference Paper
PDF Available

Non-Photorealistic Rendering of Portraits

Authors: Paul L. Rosin and Yu-Kun Lai

Abstract and Figures

We describe an image-based non-photorealistic rendering pipeline for creating portraits in two styles: the first is a somewhat "puppet"-like rendering that treats the face as a relatively uniform smooth surface, with the geometry emphasised by shading; the second style is inspired by the artist Julian Opie, in which the human face is reduced to its essentials, i.e. homogeneous skin, thick black lines, and facial features such as the eyes and nose represented in a cartoon manner. Our method is able to generate these stylisations automatically without requiring the input images to be tightly cropped or in a direct frontal view, and moreover performs abstraction while maintaining the distinctiveness of the portraits (i.e. they remain recognisable).
... This section demonstrates the use of NPRportrait 1.0 to evaluate 11 NPR algorithms which cover a wide range of styles and methods: neural style transfer [37], XDoG [41], oil painting [44], pebble mosaic [45], artistic sketch method [38], APDrawingGAN [39], puppet style [40], engraving [42], hedcut [43], Julian Opie style [40], watercolour [15]. In addition, the results from analysing these stylisations allow us to confirm the requirement (detailed in Section 3) that the benchmark provides a clear range of difficulty across the three levels. ...
... On the other hand, the Julian Opie style [40] tends to make people look more masculine and a little younger. Under the signed EMD distance, movements (differences) of opposite sign cancel out, so it is useful to examine the unsigned EMD distances to check the overall discrepancy. ...
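The signed/unsigned distinction above can be made concrete for 1-D histograms over an ordinal rating scale, where the EMD reduces to a sum over cumulative-distribution differences. The sketch below is illustrative only; the function name and toy data are ours, not from the paper:

```python
import numpy as np

def emd_1d(p, q, signed=True):
    """Earth mover's distance between two 1-D histograms on a common
    ordinal scale, computed from cumulative distributions.  With
    signed=True, mass moving up the scale and mass moving down the
    scale cancel; with signed=False the per-bin transport is
    accumulated in absolute value."""
    p = np.asarray(p, dtype=float); p = p / p.sum()
    q = np.asarray(q, dtype=float); q = q / q.sum()
    diff = np.cumsum(p - q)[:-1]   # final entry of the cumsum is always 0
    return diff.sum() if signed else np.abs(diff).sum()

# Toy example: half the mass moves down the scale, half moves up
a = [0, 1, 1, 0]
b = [1, 0, 0, 1]
print(emd_1d(a, b, signed=True))   # 0.0 -- opposite movements cancel
print(emd_1d(a, b, signed=False))  # 1.0 -- overall discrepancy remains
```

This mirrors the observation in the excerpt: a style that shifts some faces in one direction and others in the opposite direction can show a near-zero signed EMD while the unsigned EMD still reveals a substantial discrepancy.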
Article
Full-text available
Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset.
... Rosin and Lai's algorithm [44] first stylises the image with abstracted regions of flat colours plus black and white lines [45], then fits a partial face model to the input image and attempts to detect the skin region. Shading and line rendering is stylised in the skin region, and in addition, the face model helps inform portrait-specific enhancements: reducing line clutter; improving eye detail; colouring the lips and teeth; and inserting synthesised highlights. ...
... We conducted Experiment 1 described in section 3.6 and applied it to 11 NPR algorithms which cover a wide range of styles and methods: neural style transfer [40], XDoG [42], oil painting [48], pebble mosaic [49], artistic sketch method [41], APDrawingGAN [43], puppet style [44], engraving [46], hedcut [47], Julian Opie style [44], watercolour [18]. The 11 NPR algorithms were run on the full 60-image benchmark, so the first user study to collect the four face characteristics contained 660 stylised photos. ...
Preprint
Full-text available
Despite the recent upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer, the state of performance evaluation in this field is limited, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and neural style transfer) using the new benchmark dataset.
... However, errors in the estimated 3D geometry could lead to a mismatch between the orientation and thickness cues present in the engraving lines. To avoid an "uncanny valley" effect, we instead use the simple and rough proxy geometry previously described in Rosin and Lai's portrait stylisation algorithm [18] for creating a shading effect. ...
... This is done by applying anisotropic Gaussian filtering in the direction of the head orientation estimated by OpenFace [7]. Figure 4 demonstrates these steps. Whereas in Rosin and Lai's algorithm [18] the shading image was applied to the stylised image to create a shading effect, for engraving we use it instead to warp the dither matrix so that the lines curve around the face, providing a pseudo-3D effect. For its current application the portrait engraving has been applied to faces that are roughly frontal facing and vertically aligned. ...
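As a rough illustration of the dither-matrix warping idea described above, the sketch below performs ordered dithering with a Bayer matrix and lets a per-pixel offset field displace the matrix lookup, so the dither pattern bends across the image. The offset field here is a toy sinusoid standing in for the shading image derived from the cylinder proxy; all names and the offset construction are our assumptions, not the paper's implementation:

```python
import numpy as np

# 4x4 Bayer matrix, normalised to thresholds in (0, 1)
BAYER4 = (np.array([[ 0,  8,  2, 10],
                    [12,  4, 14,  6],
                    [ 3, 11,  1,  9],
                    [15,  7, 13,  5]]) + 0.5) / 16.0

def ordered_dither(img, offset=None):
    """Ordered dithering of a greyscale image with values in [0, 1].
    `offset` (same shape as img, integer) shifts the dither-matrix
    lookup per pixel -- a crude stand-in for warping the matrix so
    that the engraving lines curve around the face."""
    h, w = img.shape
    ys, xs = np.indices((h, w))
    if offset is not None:
        ys = ys + offset              # displace the matrix vertically
    thresh = BAYER4[ys % 4, xs % 4]   # tile (and warp) the matrix
    return (img > thresh).astype(np.uint8)

# Toy usage: a horizontal intensity ramp, warped by a sinusoidal offset
img = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
warp = np.round(2 * np.sin(np.linspace(0, np.pi, 32))).astype(int)
out = ordered_dither(img, offset=warp[None, :].repeat(32, axis=0))
print(out.shape)  # (32, 32) binary image
```

In the paper's setting the offset field would come from the shading image of the proxy cylinder rather than a sinusoid, producing the pseudo-3D curving of the engraving lines around the face.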
Preprint
Full-text available
This paper describes a simple image-based method that applies engraving stylisation to portraits using ordered dithering. Face detection is used to estimate a rough proxy geometry of the head consisting of a cylinder, which is used to warp the dither matrix, causing the engraving lines to curve around the face for better stylisation. Finally, an application of the approach to colour engraving is demonstrated.
... Because of the importance of portraits, there are many NPR methods developed specifically for dealing with portraits [13]. Similar to general NPR methods ([14], [15], [16]), NPR portrait methods can be categorized into stroke-based methods [17], [18], [19], region-based methods [20], [21], [22], [23], and texture-transfer-based methods [24], [25]. Stroke-based methods render portraits by simulating the stroke placement of artists. ...
... Some methods [17] warp strokes from a training artistic image, while others [18], [19] learn a stroke style model by observing stroke characteristics from artists' strokes. Region-based methods decompose the input image into components, and then either match them with templates from training artistic images and recombine the matched templates into a portrait [20], [21], [22], or render different facial components using different specialized algorithms [23]. Texture-transfer-based methods transfer textures from the exemplar artistic image to the output rendering, using parametric texture synthesis [24] or non-parametric texture synthesis [25]. ...
Article
Full-text available
Despite significant effort and notable success of neural style transfer, it remains challenging for highly abstract styles, in particular line drawings. In this paper, we propose APDrawingGAN++, a generative adversarial network (GAN) for transforming portrait photos to artistic portrait drawings (APDrawings), which addresses substantial challenges including highly abstract style, different drawing techniques for different facial features, and high perceptual sensitivity to artifacts. To address these, we propose a composite GAN architecture that consists of local networks (to learn effective representations for specific facial features) and a global network (to capture the overall content). We provide a theoretical explanation for the necessity of this composite GAN structure by proving that any GAN with a single generator cannot generate artistic styles like APDrawings. We further introduce a classification-and-synthesis approach for lips and hair where different drawing styles are used by artists, which applies suitable styles for a given input. To better cope with lines with small misalignments while penalizing large discrepancy, a novel distance transform loss with nonlinear mapping is introduced, which improves the line quality. We also develop dedicated data augmentation and pre-training to further improve results. Experiments show that our method outperforms state-of-the-art methods, both qualitatively and quantitatively.
... In the field of NPR, many methods have been developed for generating portraits [29]. Rosin and Lai [28] proposed a method to stylize portraits using highly abstracted flat color regions. Wang et al. [38] proposed a learningbased method to stylize images into portraits which are composed of curved brush strokes. ...
... Facial components are matched with avatar template components, drawn by an artist, based solely on a comparison of contours between the sketch and the selfie using a normalized Hausdorff distance variant. Rosin and Lai (2015) described an image-based non-photorealistic rendering pipeline for creating portraits in two styles: one a somewhat "puppet"-like rendering, and the other inspired by the artist Julian Opie, in which the human face is reduced to its essentials. Diego-Mas and Alcaide-Marzal (2015) created a system to generate avatars using genetic algorithms and artificial neural networks. ...
Chapter
This chapter reports on research activities originally reported at the 2015 WIPTTE conference, the last one held prior to the conference name change. The chapter, entitled “A Model and Research Agenda for Teacher and Student Collaboration Using Tablets in Digital Media Making: Results from Sub-Saharan Workshops” reported on a series of science, technology, engineering, and mathematics (STEM) projects funded by the Kenyan Ministry of Education, Microsoft Research, and the National Science Foundation. The research has continued since then, with additional support from the Namibian Ministry of Education, Arts and Culture, the US State Department’s Fulbright Program, and new, multiyear funding from NSF. The chapter describes the evolution of the project and five themes emerging from it, in addition to describing a new four-year effort related to the original paper. The original paper focused on activities in Kenya, Ghana, Uganda, and Namibia. This chapter centers on follow-up activities in Namibia.
... A wide variety of methods for the NPR of portraits have been proposed recently, some of which include algorithmic methods [26,27]. This kind of technique uses information such as the Difference of Gaussians (DoG), tangent flow, and edge detection extracted from the original image for the stylization. ...
Article
Full-text available
With the advent of deep learning methods, portrait video stylization has become more popular. In this paper, we present a robust method for automatically stylizing portrait videos that contain small human faces. By extending the Mask Regions with Convolutional Neural Network features (R-CNN) with a CNN branch which detects the contour landmarks of the face, we divided the input frame into three regions: the region of facial features, the region of the inner face surrounded by 36 face contour landmarks, and the region of the outer face. Besides keeping the facial features region as it is, we used two different stroke models to render the other two regions. During the non-photorealistic rendering (NPR) of the animation video, we combined the deformable strokes and optical flow estimation between adjacent frames to follow the underlying motion coherently. The experimental results demonstrated that our method could not only effectively preserve the small and distinct facial features, but also follow the underlying motion coherently.
... To create the faces' artificial analogues, modifications were applied to the facial texture in each image or video frame (25 fps; see Rosin & Lai, 2015). This produced realistic cartoons of the same identities. ...
Article
Full-text available
Recent research has linked facial expressions to mind perception. Specifically, Bowling and Banissy (2017) found that ambiguous doll-human morphs were judged as more likely to have a mind when smiling. Herein, we investigate three key potential boundary conditions of this “expression-to-mind” effect. First, we demonstrate that face inversion impairs the ability of happy expressions to signal mindful states in static faces; however, inversion does not disrupt this effect for dynamic displays of emotion. Finally, we demonstrate that not all emotions have equivalent effects. Whereas happy faces generate more mind ascription compared to neutral faces, we find that expressions of disgust actually generate less mind ascription than those of happiness.
Chapter
This chapter introduces a novel system that allows users to generate customized cartoon avatars through a sketching interface. The rise of social media and personalized gaming has given a need for personalized virtual appearances. Avatars, self-curated and customized images to represent oneself, have become a common means of expressing oneself in these new media. Avatar creation platforms face the challenges of granting user significant control over the avatar creation and of encumbering the user with too many choices in their avatar customization. This chapter demonstrates a sketch-guided avatar customization system and its potential to simplify the avatar creation process.
Conference Paper
Full-text available
Figure 1: Given a portrait image, we decompose the face into separated facial components and search for the corresponding cartoon components in a dataset by feature matching. The cartoon components are composed together to construct a cartoon face. We can easily generate cartoon faces of different styles and perform artistic beautification by using our framework.
Abstract: This paper presents a data-driven framework for generating cartoon-like facial representations from a given portrait image. We solve our problem by an optimization that simultaneously considers a desired artistic style, image-cartoon relationships of facial components, as well as automatic adjustment of the image composition. The stylization operation consists of two steps: a face parsing step to localize and extract facial components from the input image, and a cartoon generation step to cartoonize the face according to the extracted information. The components of the cartoon are assembled from a database of stylized facial components. Quantifying the similarity between facial components of the input and the cartoon is done by image feature matching. We incorporate prior knowledge about photo-cartoon relationships and the optimal composition of cartoon facial components, extracted from a set of cartoon faces, to maintain a natural and attractive look of the results.
Article
Full-text available
Skilled artists, using traditional media or modern computer painting tools, can create a variety of expressive styles that are very appealing in still images, but have been unsuitable for animation. The key difficulty is that existing techniques lack adequate temporal coherence to animate these styles effectively. Here we augment the range of practical animation styles by extending the guided texture synthesis method of Image Analogies [Hertzmann et al. 2001] to create temporally coherent animation sequences. To make the method art directable, we allow artists to paint portions of keyframes that are used as constraints. The in-betweens calculated by our method maintain stylistic continuity and yet change no more than necessary over time.
Conference Paper
Full-text available
We address the problem of interactive facial feature localization from a single image. Our goal is to obtain an accurate segmentation of facial features on high-resolution images under a variety of pose, expression, and lighting conditions. Although there has been significant work in facial feature localization, we are addressing a new application area, namely to facilitate intelligent high-quality editing of portraits, that brings requirements not met by existing methods. We propose an improvement to the Active Shape Model that allows for greater independence among the facial components and improves on the appearance fitting step by introducing a Viterbi optimization process that operates along the facial contours. Despite the improvements, we do not expect perfect results in all cases. We therefore introduce an interaction model whereby a user can efficiently guide the algorithm towards a precise solution. We introduce the Helen Facial Feature Dataset consisting of annotated portrait images gathered from Flickr that are more diverse and challenging than currently existing datasets. We present experiments that compare our automatic method to published results, and also a quantitative evaluation of the effectiveness of our interactive method.
Article
Nowadays, avatars are widely used in games and Internet environments. In particular, video game consoles such as the Wii (Nintendo) use avatars to represent the user's alter ego. There are several ways to generate avatars. Most existing games or Internet services provide manual systems for generating avatars. Many researchers have suggested automatic avatar generation methods, most of which generate avatars by simplifying images using non-photorealistic rendering techniques. In this research, we suggest an example-based method that generates avatars by first matching the most similar avatar component for each facial feature and then compositing them. We built a system that generates an avatar similar to the input front-view photograph by matching each facial feature to the corresponding feature of the avatar. The system first extracts six facial features from the input image: two eyes, two eyebrows, a mouth, and a face outline. It then matches them to the corresponding avatar components using graph similarity and the Hausdorff distance. Finally, the system generates an avatar by compositing the most similar components. We have also evaluated the effectiveness of our approach, and the results show that our system generates avatars which successfully represent their corresponding photographs.
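The Hausdorff matching step mentioned in the abstract above can be sketched generically as follows: the symmetric Hausdorff distance between two 2-D point sets is the largest distance from a point in one set to its nearest neighbour in the other. The function name and toy contour data below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two 2-D point sets
    (arrays of shape (n, 2) and (m, 2))."""
    # pairwise Euclidean distances, shape (n, m)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    # max over each set of the nearest-neighbour distance to the other
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Toy contours: a unit square vs the same square shifted by (0.5, 0)
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
shifted = square + np.array([0.5, 0.0])
print(hausdorff(square, shifted))  # 0.5
```

In a component-matching system like the one above, this distance (or a normalized variant of it) would be evaluated between an extracted facial-feature contour and each candidate avatar component, and the component with the smallest distance selected.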
Conference Paper
We present PortraitSketch, an interactive drawing system that helps novices create pleasing, recognizable face sketches without requiring prior artistic training. As the user traces over a source portrait photograph, PortraitSketch automatically adjusts the geometry and stroke parameters (thickness, opacity, etc.) to improve the aesthetic quality of the sketch. We present algorithms for adjusting both outlines and shading strokes based on important features of the underlying source image. In contrast to automatic stylization systems, PortraitSketch is designed to encourage a sense of ownership and accomplishment in the user. To this end, all adjustments are performed in real-time, and the user ends up directly drawing all strokes on the canvas. The findings from our user study suggest that users prefer drawing with some automatic assistance, thereby producing better drawings, and that assistance does not decrease the perceived level of involvement in the creative process.
Book
Non-photorealistic rendering (NPR) is a combination of computer graphics and computer vision that produces renderings in various artistic, expressive or stylized ways such as painting and drawing. This book focuses on image and video based NPR, where the input is a 2D photograph or a video rather than a 3D model. 2D NPR techniques have application in areas as diverse as consumer and professional digital photography and visual effects for TV and film production. The book covers the full range of the state of the art of NPR with every chapter authored by internationally renowned experts in the field, covering both classical and contemporary techniques. It will enable both graduate students in computer graphics, computer vision or image processing and professional developers alike to quickly become familiar with contemporary techniques, enabling them to apply 2D NPR algorithms in their own projects.
Conference Paper
We present a novel algorithm for stylizing photographs into portrait paintings comprised of curved brush strokes. Rather than drawing upon a prescribed set of heuristics to place strokes, our system learns a flexible model of artistic style by analyzing training data from a human artist. Given a training pair - A source image and painting of that image-a non-parametric model of style is learned by observing the geometry and tone of brush strokes local to image features. A Markov Random Field (MRF) enforces spatial coherence of style parameters. Style models local to facial features are learned using a semantic segmentation of the input face image, driven by a combination of an Active Shape Model and Graph-cut. We evaluate style transfer between a variety of training and test images, demonstrating a wide gamut of learned brush and shading styles.
Article
We use a data-driven approach to study both style and abstraction in sketching of a human face. We gather and analyze data from a number of artists as they sketch a human face from a reference photograph. To achieve different levels of abstraction in the sketches, decreasing time limits were imposed -- from four and a half minutes to fifteen seconds. We analyzed the data at two levels: strokes and geometric shape. In each, we create a model that captures both the style of the different artists and the process of abstraction. These models are then used for a portrait sketch synthesis application. Starting from a novel face photograph, we can synthesize a sketch in the various artistic styles and in different levels of abstraction.
We present a novel discriminative regression based approach for the Constrained Local Models (CLMs) framework, referred to as the Discriminative Response Map Fitting (DRMF) method, which shows impressive performance in the generic face fitting scenario. The motivation behind this approach is that, unlike the holistic texture based features used in the discriminative AAM approaches, the response map can be represented by a small set of parameters and these parameters can be very efficiently used for reconstructing unseen response maps. Furthermore, we show that by adopting very simple off-the-shelf regression techniques, it is possible to learn robust functions from response maps to the shape parameters updates. The experiments, conducted on Multi-PIE, XM2VTS and LFPW database, show that the proposed DRMF method outperforms state-of-the-art algorithms for the task of generic face fitting. Moreover, the DRMF method is computationally very efficient and is real-time capable. The current MATLAB implementation takes 1 second per image. To facilitate future comparisons, we release the MATLAB code and the pre-trained models for research purposes.
Article
Automatic bas-relief generation from 2D photographs potentially has applications to coinage, commemorative medals and souvenirs. However, current methods are not yet ready for real use in industry due to insufficient artistic effect, noticeable distortion, and unbalanced contrast. We previously proposed a shape-from-shading (SFS) based method to automatically generate bas-reliefs from single frontal photographs of human faces; however, suppression of unwanted details remained a problem. Here, we give experimental results showing how incorporating non-photorealistic rendering (NPR) into our previous framework enables us both to suppress unwanted detail and to emphasize important features. We have also considered an alternative approach to recovering relief shape, based on photometric stereo instead of SFS for surface orientation estimation. This can effectively reduce the computational time.