ProsumerFX: Mobile Design of Image Stylization Components
Tobias Dürschmid, Maximilian Söchting, Amir Semmo, Matthias Trapp, and Jürgen Döllner
Hasso Plattner Institute, University of Potsdam, Germany
Figure 1: Use case of the presented app for creating and sharing new visual styles: (A) Original photo; (B) adding a neural style transfer effect with corresponding preset; (C) adding a watercolor filter with adjusted parameters; (D) adding a color filter with corresponding preset; (E) sharing the created parameterizable visual effect and using it on another device.
ABSTRACT
With the continuous advances of mobile graphics hardware, high-quality image stylization—e.g., based on image filtering, stroke-based rendering, and neural style transfer—is becoming feasible and increasingly used in casual creativity apps. The creative expression facilitated by these mobile apps, however, is typically limited to the usage and application of pre-defined visual styles, which ultimately excludes their design and composition—an inherent requirement of prosumers. We present ProsumerFX, a GPU-based app that enables users to interactively design parameterizable image stylization components on-device by reusing building blocks of image processing effects and pipelines. Furthermore, the presentation of the effects can be customized by modifying the icons, names, and order of parameters and presets. The customized visual styles are defined as platform-independent effects and can be shared with other users via a web-based platform and database. Together with the presented mobile app, this system approach supports collaborative work on designing visual styles, including their rapid prototyping, A/B testing, publishing, and distribution. Thus, it satisfies the needs for creative expression of both professionals and the general public.
SA '17 Symposium on Mobile Graphics & Interactive Applications, November 27-30, 2017, Bangkok, Thailand
© 2017 Copyright held by the owner/author(s).
This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of SA '17 Symposium on Mobile Graphics & Interactive Applications, https://doi.org/10.1145/3132787.3139208.
CCS CONCEPTS
• Human-centered computing → Collaborative content creation; • Computing methodologies → Image manipulation;
KEYWORDS
Image stylization, design, effect composition, mobile devices, interaction, rapid prototyping
ACM Reference Format:
Tobias Dürschmid, Maximilian Söchting, Amir Semmo, Matthias Trapp, and Jürgen Döllner. 2017. ProsumerFX: Mobile Design of Image Stylization Components. In Proceedings of SA '17 Symposium on Mobile Graphics & Interactive Applications. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3132787.3139208
1 INTRODUCTION
Interactive image stylization enjoys a growing popularity in mobile expressive rendering [Dev 2013]. Prominent mobile apps (e.g., Instagram, Snapchat, or Prisma) attract millions of users each day—a popularity that may also be attributed to the increased likelihood of stylized photos being viewed and commented on [Bakhshi et al. 2015]. Typically, these apps provide stylization components with predefined parameter values to synthesize artistic renditions of user-generated contents, and thus are primarily directed towards "consuming" rather than "producing" custom parameterizable visual styles to facilitate creative expression.
Providing users with interactive tools for low-level and high-level parameterization of image stylization components is a desired
goal in three respects: (1) to facilitate creative expression at multiple levels of control—e.g., for interactive non-photorealistic rendering [Isenberg 2016; Semmo et al. 2016b]; (2) for the prosumption of image and video processing operations (IVOs), i.e., by combining aspects of consumption and production [Ritzer et al. 2012; Ritzer and Jurgenson 2010]; and (3) for the creation and manipulation of visual styles and their user interface (UI) to enable rapid prototyping of IVOs. In particular, these are important aspects of app development to shorten the time-to-market, facilitate user-centered design, and thus increase customer satisfaction. However, the development of a system that considers these aspects faces multiple technical challenges:
Interactivity: The modifications of the IVOs should be applied with interactive performance to provide immediate visual feedback, e.g., when adding or removing an effect, or reordering the effect pipeline. In this respect, however, mobile devices typically provide very constrained processing capabilities and memory [Capin et al. 2008].
Device Heterogeneity: Mobile graphics hardware and APIs often vary considerably. On the one hand, IVOs should be hardware-optimized, but on the other hand, the IVOs need to be platform-independent to be used on different devices.
Technology Transparency: The adaptation of image processing operations to new technology (e.g., Vulkan or newer OpenGL versions) and new hardware features, as well as APIs, must be as simple as possible.
Reusable Building Blocks: IVOs should be handled as modular units that can be referenced by others and reused for the design of new image stylization components.
In previous works [Dürschmid et al. 2017; Semmo et al. 2016b], we described a framework for interactive parameterization of image filtering operations on three levels of control: (1) convenience presets; (2) global parameters; and (3) local parameter adjustments using on-screen painting metaphors [Semmo et al. 2016a]. Based on that framework, this paper presents a system that enables its users to easily modify and recombine IVOs, i.e., to create and share new image stylization components that support these three interaction concepts. To summarize, the contributions of this paper are as follows: (1) a concept for mobile prosumption of image stylization components is proposed, which enables rapid prototyping for professionals and casual creativity for the general public; (2) a document format for platform-independent persistence of IVOs is provided that allows image stylization components to be exchanged across different devices and platforms; and (3) a web-based platform demonstrates how created IVOs can be shared among users, thus supporting collaboration.
2 RELATED WORK
This section describes the basic background of the prosumer culture and discusses previous work on the interactive composition and design of image stylization components on desktop and mobile platforms.
2.1 Prosumer Culture
A prosumer is a person "who is both producer and consumer" [Ritzer et al. 2012]. The term traces back to Alvin Toffler in 1980 [Toffler 1980]. However, the 21st century gives rise to the prosumer, especially because of the Web 2.0, which popularizes user-generated content [Fuchs 2010; Ritzer et al. 2012]. Prominent examples are Wikipedia, Facebook, YouTube, and Flickr. Nowadays, consumers want to become prosumers, co-create value, and be creative on their own [Prahalad and Ramaswamy 2004].
2.2 Desktop Applications
Artistic stylization of images and video has been explored particularly on desktop systems for the task of non-photorealistic rendering (NPR) [Kyprianidis et al. 2013; Rosin and Collomosse 2013]. Applications using NPR techniques, however, primarily implement stylization techniques as invariable rendering pipelines with pre-defined parameter sets that cannot be individually designed or composed. First interactive tools such as Gratin [Vergne and Barla 2015] or ImagePlay (http://imageplay.io/) enable interactive effect specification and editing of such pipelines, whereas contemporary 3D modeling software (e.g., Autodesk® Maya and 3ds Max) traditionally supports the professional creation of rendering effects, which, however, cannot be shared and applied to images or video on mobile platforms.
2.3 Mobile Applications
In recent years, mobile expressive rendering [Dev 2013] has gained increasing interest to simulate popular media and effects such as cartoon [Fischer et al. 2008], watercolor [DiVerdi et al. 2013; Oh et al. 2012], and oil paint [Kang and Yoon 2015; Wexler and Dezeustre 2012] for casual creativity and image abstraction [Winnemöller 2013]. For instance, the popular image filtering app Instagram uses vignettes and color look-up tables [Selan 2004] to create retro looks through color transformations. Thereby, users are able to adjust global parameters, such as contrast and brightness. More complex stylization effects are provided by neural style transfer apps [Gatys et al. 2016; Johnson et al. 2016] such as Prisma, which simulate artistic styles of famous artists and epochs, but typically do not provide explicit parameterizations of style components or phenomena. Conversely, this can be achieved by mask-based parameter painting apps such as BeCasso [Pasewaldt et al. 2016; Semmo et al. 2016a] and PaintCan [Benedetti et al. 2014], where image filtering techniques that simulate watercolor, oil paint, and cartoon styles can be adjusted with a high, medium, and low level-of-control [Isenberg 2016]. The combination of neural style transfer with post-processing via image filtering can be done with Pictory [Semmo et al. 2017b].
While these applications reflect the state of the art in semi-automatic image and video stylization, mobile users are only able to consume effects. By contrast, this work seeks to enable the modification and platform-independent sharing of user-defined visual styles. To achieve this, a system approach is proposed for collaboratively designing visual styles and applying them across different platforms by providing reusable IVOs. Sharing reusable aspects of IVOs has already been addressed in previous apps, but these are typically limited to sharing parameter configurations for pre-defined effects, e.g., the app Snapseed uses QR codes. By contrast, the proposed system and app provide a more generalized approach by also enabling users to share user-defined parameterizable effects.
Figure 2: Creativity space with regard to the level of control of the interactions. Low-level control is the adjustment of image details, e.g., local parameter painting. High-level control is the convenient modification of more abstract image properties, e.g., using presets or by adding/removing effects. The other dimension indicates whether the interaction just modifies the image by consuming an effect, or rather produces a new effect with a custom presentation.
3 DESIGNING EFFECTS ON MOBILE DEVICES
In contrast to existing image manipulation apps, the proposed approach enables users to modify image stylization components, save the new components, and share them online. These components consist of image and video processing operations (IVOs) (Sec. 3.1). The client application covers two major IVO aspects: (1) the processing function, i.e., the visual transformation function applied to the input image to create an output image (Sec. 3.2), and (2) the presentation, i.e., the offered user experience (Sec. 3.3). The presented mobile app bundles modifications of both aspects in a single view so that both can be adjusted concurrently.
The proposed concept enables its users to parameterize IVOs on three levels of control [Isenberg 2016]: convenience presets, global parameters, and local parameter adjustments using on-screen painting [Semmo et al. 2016b]. Furthermore, prosumers can modify and recombine IVOs to create a new visual style that supports these three levels of control. A classification of the presented interaction concepts is given in Fig. 2.
3.1 Image and Video Processing Operations
To enable manipulation and combination of IVOs, a modular representation is required. To provide such a modular structure, a hierarchical decomposition into effect pipelines, effects, and passes is used [Semmo et al. 2016b] and briefly described in the following.
Effect pipeline. An effect pipeline represents the highest-level abstraction of IVOs. It has a single input image, which can originate from different sources (e.g., camera or gallery), and produces a single output image to be displayed to the user. A pipeline mainly consists of an ordered list of effects that defines the processing steps of the component. By recombining effects, a variety of visual styles can be created.
Effect. An effect is a parameterizable IVO that receives a single image as input, and outputs one resulting image. It constitutes one atomic, sequenceable element of a visual style. Effects can delegate recurring processing steps (e.g., the computation of contours using a difference-of-Gaussians filter [Winnemöller et al. 2012]) to effect fragments. An effect fragment is a reusable part that can be shared among many effects. It can have multiple inputs and outputs and can delegate steps to other effect fragments. Effects and effect fragments can be parameterized by users via presets and parameters that map to technical inputs of rendering passes.
Rendering pass. A rendering pass is the lowest-level IVO. In the context of OpenGL ES, a rendering pass is basically a shader program having multiple inputs (textures or parameter values), and it usually renders one output texture or buffer. More complex passes (e.g., based on convolutional neural networks, ping-pong rendering, compute passes, or passes that execute Java code) can be used in combination with the standard render-to-texture passes.
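To make this decomposition more tangible, a minimal Java sketch is given below; the class and method names, and the use of integer texture handles, are illustrative assumptions rather than the actual API of the presented framework.

```java
import java.util.ArrayList;
import java.util.List;

// Lowest level: a pass consumes an input texture (plus parameters) and renders one output.
interface RenderingPass {
    int render(int inputTexture);          // returns the handle of the rendered output texture
}

// An effect is one atomic, sequenceable element of a visual style.
class Effect {
    final String name;
    final List<RenderingPass> passes = new ArrayList<>();

    Effect(String name) { this.name = name; }

    int apply(int inputTexture) {
        int current = inputTexture;
        for (RenderingPass pass : passes) {   // passes run in their defined order
            current = pass.render(current);
        }
        return current;
    }
}

// Highest level: an ordered list of effects applied to a single input image.
class EffectPipeline {
    final List<Effect> effects = new ArrayList<>();

    int process(int inputTexture) {
        int current = inputTexture;
        for (Effect effect : effects) {       // recombining effects yields new visual styles
            current = effect.apply(current);
        }
        return current;
    }
}
```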
3.2 Modications of Processing Function
Modications of the processing function (i.e., the visual transfor-
mation of an input image into an output image) of IVOs are the
most crucial modications, users can perform. Users get instant
feedback, because these modications inuence the output image
directly and immediately. Hence, this enables rapid prototyping of
IVOs.
To extend the visual style, users can add an eect to the eects
list. Subsequently, they can interactively explore the eect data-
base of the web-based platform and choose one eect to add. The
eect is downloaded, parsed, and appended to the current pipeline.
To reduce the network trac and to provide support for oine
situations, the web pages and the downloaded eects are cached
locally. Moreover, to remove an element of the visual style, eects
can be removed from the current pipeline using a swipe gesture.
To test new combinations, the execution order of the eects in the
current pipeline can be changed by using a Drag & Drop interaction
technique. The real-time rendering performance provides immedi-
ate feedback of the current eect order, even before releasing the
currently dragged eect.
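These pipeline edits can be pictured as simple list operations followed by an immediate re-render request, as in the following sketch, which builds on the illustrative classes from the sketch in Sec. 3.1; all method names are assumptions rather than the app's actual interface.

```java
// Hypothetical editing operations on the illustrative EffectPipeline from Sec. 3.1.
class PipelineEditor {
    private final EffectPipeline pipeline;
    private final Runnable requestRender;   // triggers an immediate re-render for visual feedback

    PipelineEditor(EffectPipeline pipeline, Runnable requestRender) {
        this.pipeline = pipeline;
        this.requestRender = requestRender;
    }

    void addEffect(Effect downloadedEffect) {        // e.g., after download and parsing
        pipeline.effects.add(downloadedEffect);
        requestRender.run();
    }

    void removeEffect(int position) {                // e.g., triggered by a swipe gesture
        pipeline.effects.remove(position);
        requestRender.run();
    }

    void moveEffect(int from, int to) {              // e.g., triggered by drag and drop
        Effect moved = pipeline.effects.remove(from);
        pipeline.effects.add(to, moved);
        requestRender.run();                         // feedback even while still dragging
    }
}
```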
3.3 Modications of the Presentation
An IVO can be presented in dierent forms tailored to dierent
target groups, e.g., technical parameterization for experts or nu-
merous convenient presets for novice users. The proposed app
encourages the users’ creativity by enabling them to adjust the
user interface presentation of the IVO components. To provide a
consistent set of convenient parameter congurations, users can
add, remove, and reorder presets similar to the eect interactions
described previously. Furthermore, users can remove and reorder
parameters similar to presets. To customize the appearance of the
parameters, presets or eect, users can rename them and change
their representative images, e.g., icons, teaser images.
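Since these presentation edits only touch descriptive metadata rather than the processing function, they can be pictured as plain setter and reorder operations; the following minimal sketch uses hypothetical names and is not the app's actual data model.

```java
// Hypothetical presentation metadata attached to an effect.
class EffectPresentation {
    String displayName;
    String iconPath;
    final java.util.List<String> presetOrder = new java.util.ArrayList<>();

    void rename(String newName)      { displayName = newName; }
    void setIcon(String newIconPath) { iconPath = newIconPath; }

    void movePreset(int from, int to) {                 // reorder the presets shown to the user
        presetOrder.add(to, presetOrder.remove(from));
    }
}
```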
Figure 3: Overview of the system structure: Direct manipulation of image and video processing operations (IVOs) is provided by a Model-View-Controller (MVC) [Apple Inc. 2015]. The IVOs are persisted to extensible markup language (XML) files by a serializer [Riehle et al. 1997]. To exchange these effect XML files in conjunction with corresponding meta data, the user server offers a REST API. The data server stores these in the form of assets (i.e., reusable effect modules) and the associated meta data.
4 IMPLEMENTATION
An overview of the proposed client-server system, which enables the design of visual effects on mobile devices, is given in Fig. 3. To handle the complexity of the IVO implementations, a domain model is used, which represents the IVOs as classes and objects (Sec. 4.1). The direct manipulation of the domain model for IVOs is implemented using a Model-View-Controller [Apple Inc. 2015] (Sec. 4.2). To persist the state of the model, a serializer [Riehle et al. 1997] is used (Sec. 4.3). Based on this persistence concept, the server components implement the management and provisioning of IVO assets (Sec. 4.4) according to [Dürschmid 2017].
4.1 Domain Model of IVOs
The states and associations of IVOs can become very complex. Therefore, the IVOs described in Sec. 3.1 are implemented as corresponding classes and objects with their relationships (Fig. 4). Since effects and effect fragments share common responsibilities, such as maintaining passes, parameters, and presets, a common super class is extracted. This object-oriented layer generalizes the graphics APIs by encapsulating calls to them in convenience methods. By abstracting a common interface for image processing, the domain model facilitates technology transparency.
Figure 4: Overview of the architectural domain model of IVOs (blue) and related classes (EffectPipeline, Effect, EffectFragment, EffectBase, Parameter, Preset, and the IRenderable interface). Notation: UML 2.5 class diagram.
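The following compact Java skeleton refines the sketch from Sec. 3.1 along the lines of Fig. 4, illustrating the extracted common super class and the graphics-API abstraction; all names and fields are simplified assumptions about the model, not its actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Common abstraction that hides concrete graphics-API calls (e.g., OpenGL ES) behind one interface.
interface IRenderable {
    void render();
}

class Parameter { String name; float value; }
class Preset    { String name; }

// Common super class extracted for effects and effect fragments: both maintain
// passes, parameters, and presets, and both are renderable units.
abstract class EffectBase implements IRenderable {
    final List<IRenderable> passes = new ArrayList<>();
    final List<Parameter> parameters = new ArrayList<>();
    final List<Preset> presets = new ArrayList<>();

    @Override
    public void render() {
        for (IRenderable pass : passes) {
            pass.render();          // convenience wrapper around the underlying graphics API
        }
    }
}

// An effect may delegate recurring steps (e.g., contour computation) to reusable fragments.
class Effect extends EffectBase {
    final List<EffectFragment> fragments = new ArrayList<>();
}

class EffectFragment extends EffectBase { }
```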
4.2 Direct Manipulation of IVOs
IVO changes performed by the user should directly influence the rendering state of the IVOs. Addressing usability, the user-perceived latency of modifications has to remain interactive while keeping the memory consumption low. Furthermore, persisting the current state of the model needs to be as easy as possible.
To provide an architecture that supports all of these requirements, a Model-View-Controller (MVC) [Apple Inc. 2015] is used. Hence, the domain model has the role of the model that represents the domain logic of IVOs, stores the data, and defines respective operations. The user interface has the role of the view that directs input events, such as drag-and-drop gestures or clicks on buttons, to the controller, which transforms the events into model changes. The controller calls the corresponding methods to manipulate the state of the model. Afterwards, the model notifies the registered views of changes to the model (e.g., the name of a parameter) using the Observer design pattern [Gamma et al. 1995].
By offering information hiding [Parnas 1972], this separation supports domain model evolution. Since all information is kept in the model, the implementation of persistence is simplified. Removing an effect or preset immediately removes the underlying model. Thereby, memory consumption is minimized. The MVC provides interactive manipulation performance, because it immediately updates GPU resources.
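The described notification path follows the classic Observer pattern; a minimal Java sketch of a model, a registered view interface, and a controller is given below, with all class and method names assumed for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// View side of the Observer pattern: registered views react to model changes.
interface ModelObserver {
    void onModelChanged(String property);
}

// Model: stores IVO state and notifies registered views after every mutation.
class ParameterModel {
    private final List<ModelObserver> observers = new ArrayList<>();
    private String name;

    void register(ModelObserver observer) { observers.add(observer); }

    // Called by the controller in response to a UI event (e.g., a rename dialog).
    void setName(String newName) {
        this.name = newName;
        notifyObservers("name");
    }

    private void notifyObservers(String property) {
        for (ModelObserver observer : observers) {
            observer.onModelChanged(property);
        }
    }
}

// Controller: translates input events into model operations.
class ParameterController {
    private final ParameterModel model;
    ParameterController(ParameterModel model) { this.model = model; }

    void onRenameRequested(String newName) { model.setName(newName); }
}
```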
4.3 Persistence of IVOs
Users should be able to share modified effects. Therefore, performed modifications should be durable in a platform-independent form. To reuse common building blocks (e.g., common algorithms, textures, or other resources) without duplication, the document format should support modularity.
IVOs are persisted using multiple XML files (Fig. 6) that define their content in a high-level, human-readable form. To provide platform independence, each IVO is separated into a platform-independent part (the definition) and possibly multiple platform-specific parts (the implementations). This separation is designed to modularize the abstract, user-modifiable parts of an IVO in a single file. To support platform independence, the app keeps implementations transparent, i.e., users cannot modify implementations from within the app. Furthermore, to avoid duplication, reusable parts of implementations can be modularized in effect fragments.
The effect definition basically specifies the name of the IVO, its parameters (icon, name, type, and range), and its presets (icon, name, and values for each parameter). Furthermore, it can contain textures or geometry that define the visual appearance of the IVOs (e.g., canvas textures or color look-up tables) to be shared among all implementations. The effect implementation specifies the processing algorithm by defining rendering passes, corresponding shader programs, and a control flow that defines the execution order of the passes. An implementation set file defines the mapping of a definition to a corresponding implementation for each device specification.
4.4 Server-based Provisioning of IVOs
In order for effect authors to effectively share their work and profit from each others' creativity, a web-based platform with an effect database has been developed and set up. It enables users of the presented app and future apps to browse, download, and upload IVOs. In addition, the server stores meta data such as metrics, example images, and textual descriptions of the IVOs. Instead of simply uploading and managing complete IVOs, a modular asset format has been developed that reduces redundancy in uploaded effect files, optimizes bandwidth, and enables versioning of assets.
Asset Modularization. An asset can contain either an effect definition, an effect implementation, an implementation set, or an effect fragment, in combination with corresponding resources, e.g., textures, icons, shaders, and feed-forward convolutional neural networks. To reduce duplication, reusable resources can be stored in separate, common assets and then be referenced (cf. Fig. 6).
Figure 5: Four views of the web store interface: (A) a scrollable gallery; (B) a full-screen carousel; (C) a basic list; (D) a detailed list. The two former views are based on asset listings and therefore include an example image and a short description for each asset. The latter two views display all available assets and are intended for development purposes only.
Figure 6: Two example IVOs, separated into appropriately categorized assets. IVO 1 is implemented for the two platforms OpenGL ES and Vulkan. The specific render pipelines are described in the respective effect implementation files. Due to the asset separation, all implementations of IVO 1 and IVO 2 can reference and reuse the same set of canvas textures.
Since assets are usually small-sized, atomic units, dependencies can be set up between different assets belonging to the same IVO. These references are resolved by the user server once a client requests an asset with dependencies, delivering all dependent assets using the asset bundler module. The separation into assets also enables versioning of atomic assets, i.e., users can submit updated versions of their assets and iteratively improve them. Older versions of assets can still be referenced and downloaded, but are not visible for consumption on the database web site. Since device heterogeneity is addressed using implementation sets and implementations, the server can utilize this concept to deliver only compatible implementations. When requesting IVOs from the server, clients can submit a device specification describing their hardware and supported graphics APIs within the query. The server then resolves the optimal implementation that is compatible with the requesting device and delivers it as part of the IVO archive. This reduces the required bandwidth and client storage space.
Server Architecture. In order to fulfill the three basic use cases (browsing, downloading, and uploading IVOs), a modularization over two different servers has been considered appropriate during the design stage (cf. Fig. 3).
The Data Server provides a REST (Representational State Transfer) API that allows for the creation and retrieval of asset meta data, asset files, and user accounts. Authentication with a user account is required for the creation and deletion of assets. The data server REST API is designed for developers, as all responses are delivered as machine-readable JSON objects. The Data Server contains minimal domain knowledge, as it knows the properties of the assets but does not know any semantic meaning of these properties.
The User Server enables a more user-friendly interaction with the data provided by the Data Server. First, the User Server offers rendered HTML pages that allow for easy asset exploration through a web-store-like interface (cf. Fig. 5). Furthermore, the server allows users to request asset bundles, which are executable sets of assets put into one single archive [Söchting 2017].
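A client request for a device-compatible IVO could then look roughly like the following Java sketch; the host, path, and query parameter names are hypothetical and merely illustrate submitting a device specification with the query.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: ask the user server for an effect bundle that matches the device's capabilities.
// The host, path, and parameter names are illustrative assumptions, not the real API.
public class AssetBundleClient {

    public static void main(String[] args) throws Exception {
        String query = "https://example.org/assets/watercolor/bundle"
                + "?api=opengles31"          // supported graphics API
                + "&gpu=adreno330"           // hardware specification
                + "&resolution=1920x1080";

        HttpRequest request = HttpRequest.newBuilder(URI.create(query)).GET().build();
        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofByteArray());

        // The archive would contain the effect definition plus the single compatible
        // implementation resolved by the server, reducing bandwidth and storage.
        System.out.println("Received bundle of " + response.body().length + " bytes");
    }
}
```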
5 USE CASES AND STYLIZATION EFFECTS
The proposed concept and implementation provide manifold use cases and a flexible design of stylization effects, which are outlined in the following.
5.1 Use Cases
Use cases and scenarios are directed to two main target groups: (1) novice users that utilize the system to share results of their casual creativity, and (2) technical artists that use the system for rapid prototyping and A/B testing of new image stylization components.
Consumption of IVOs. Components of IVOs can be consumed and applied to user-generated contents by the users of both target groups. In addition, the individual parameters and presets of the respective IVOs can be edited and stored locally on the device for reuse. This functionality also represents the basis for both target groups to create variants of IVOs. Furthermore, it can be coupled with notification or subscription models to keep users up-to-date.
Production of IVOs. By manipulating the processing function and the presentation of IVOs, users can produce a new IVO. This use case is common to both target groups. Rapid prototyping enables the easy and quick creation of IVOs. In the context of casual creativity, this is the foundation for providing a tool that can be used by novice users. Technical artists can apply the rapid prototyping features to quickly modify the presentation of IVOs between A/B testing sessions. Therefore, they can simply conduct multiple iterations and instantly respond to collected feedback by adjusting the parameter order, exchanging icons, or using more descriptive names. Afterwards, the created visual style can be integrated into a tailored stand-alone app using a product line approach [Dürschmid et al. 2017].
Community as Channel for Prosumers. The web-based platform offers possibilities for downloading (consuming) and uploading (producing) IVOs. Therefore, it provides a channel for prosumers of IVOs to share their created components with different users and groups. Furthermore, they can rate IVOs and share them via links or push notifications. This use case supports casual creativity by providing an open social media channel that enables sharing of the created results. Technical artists can use the community to distribute their work and to sell it to a wide range of customers.
5.2 Stylization Eects
By oering a modular document format that supports to reference
common eects, eect fragments, shaders, or textures, the con-
cept supports the creation and usage of re-usable building blocks.
Thereby, two main objectives are of particular interest: the imple-
mentation of state-of-the-art stylization eects that were initially
dedicated to desktop platforms; and the convenient combination of
popular eects or buildings blocks within a single eect pipeline.
The following examples are described for the paradigms of image
ltering and example-based rendering [Kyprianidis et al. 2013].
Recreating state-of-the-art eects. We used generalized building
blocks, such as bilateral ltering and ow-based smoothing for
color abstraction as well as dierence-of-Gaussians ltering for
edge enhancement, together with eect-specic fragments to simu-
late popular media and styles. In particular, this comprises a Car-
toon eect that additionally uses color quantization [Winnemöller
et al
.
2006]; oil paint ltering using an a specialized paint texture
synthesis [Semmo et al
.
2016c]; watercolor rendering using a dis-
tinguished wet-in-wet pass as described in [Wang et al
.
2014] and
a composition pass that blends multiple texture assets to simulate
phenomena such as pigment dispersion [Bousseau et al
.
2006]; and
a pencil hatching eect that uses orientation information to align
tonal art maps [Praun et al. 2001].
Designing new effect compositions. We used our system to combine each of the previously described filtering effects with a neural style transfer (NST, e.g., known from Prisma) and a color transformation (e.g., known from Instagram), as shown in Fig. 1. This combined approach enables images to be transformed into high-quality artistic renditions on a global and local scale, as proposed in [Semmo et al. 2017a]. Thereby, shader-based implementations of Johnson et al.'s feed-forward NST [Johnson et al. 2016] are used in a first processing stage to yield an intermediate result. Subsequently, joint bilateral upsampling [Kopf et al. 2007] of a low-resolution NST result is used with the original input image to filter high-resolution images and maintain interactivity. The result is then processed using the mentioned image filtering effects—e.g., watercolor rendering or oil paint filtering—to reduce fine-scale visual noise and locally inject characteristics of artistic media, which may be interactively refined by the user. Finally, color moods may be adjusted by using fast color transfer functions based on color look-up tables [Selan 2004].
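Expressed with the illustrative pipeline classes from the sketch after Sec. 3.1, such a composition might be assembled as follows; the effect names are placeholders standing in for the actual NST, upsampling, filtering, and look-up-table stages.

```java
// Sketch of the combined NST + filtering + color grading pipeline described above.
// Reuses the illustrative EffectPipeline and Effect classes from the sketch after Sec. 3.1.
public class CombinedStyleExample {

    public static void main(String[] args) {
        EffectPipeline pipeline = new EffectPipeline();

        pipeline.effects.add(new Effect("Feed-forward NST (low resolution)"));
        pipeline.effects.add(new Effect("Joint bilateral upsampling"));   // restores full resolution
        pipeline.effects.add(new Effect("Watercolor filtering"));         // injects media characteristics
        pipeline.effects.add(new Effect("Color look-up table"));          // adjusts the color mood

        int inputTexture = 0;                      // placeholder texture handle
        int result = pipeline.process(inputTexture);
        System.out.println("Output texture handle: " + result);
    }
}
```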
6 RESULTS AND DISCUSSION
The effects used for the measurements in this section are: Toon (bilateral filtering, local luminance thresholding for color quantization, and artistic contours, comprising 14 render-to-texture passes, five half-float textures, and 16 byte textures); Oil Paint (flow-based painting strokes painted on a canvas texture, consisting of 18 render-to-texture passes, 9 half-float textures, and 16 byte textures); Pencil Hatching (flow-oriented example-based hatching with contours, consisting of 14 render-to-texture passes, five half-float textures, 17 byte textures, and 18 small tonal art map textures); Watercolor (simulated wobbling, edge darkening, pigment density variation, and wet-in-wet, comprising 22 render-to-texture passes, 13 half-float textures, and 25 byte textures); and Color Transformation (adjusting saturation, brightness, exposure, contrast, gamma, lights, and shadows of the image, consisting of only one render-to-texture pass and one byte texture). Example images of the complex effects are shown in Fig. 7.
The devices used are: the Sony Xperia Z3, a medium-class smartphone with a Qualcomm Adreno 330 GPU (578 MHz), a 2.5 GHz quad-core Krait 400 CPU, and an HD 720×1280 display; the Samsung Galaxy S7, a high-end smartphone with an ARM Mali-T880 MP12 GPU (650 MHz), an octa-core CPU (4×2.3 GHz Mongoose & 4×1.6 GHz Cortex-A53), and a Wide Quad High Definition (WQHD) 2560×1440 display that can also be switched to Full High Definition (FHD) and High Definition (HD); and the Google Pixel C, a high-end tablet with an Nvidia Maxwell GPU (850 MHz), a 1.9 GHz quad-core "big.LITTLE" ARMv8-A CPU, and a WQHD 2560×1800 display.

Figure 7: Overview of the effects used for evaluating the approach (original, toon, oil paint, pencil hatching, watercolor).

Table 1: Frame rate of the filter effects on different devices.
Effect            Z3 (HD)    S7 (HD)    S7 (FHD)   S7 (WQHD)  Pixel C (WQHD)
Pencil Hatching   14.2 fps   30.0 fps   14.8 fps   7.8 fps    9.8 fps
Oil Paint         4.3 fps    11.8 fps   3.9 fps    1.6 fps    1.1 fps
Toon              13.1 fps   30.0 fps   15.0 fps   7.8 fps    7.8 fps
Watercolor        7.3 fps    21.6 fps   8.3 fps    4.2 fps    3.6 fps
Color Transform   29.8 fps   30.0 fps   30.0 fps   30.0 fps   30.0 fps

Table 2: Time for parsing effects.
Effect            Z3       S7       Pixel C
Pencil Hatching   1.68 s   0.87 s   0.45 s
Oil Paint         1.33 s   0.41 s   0.88 s
Toon              1.09 s   1.97 s   0.42 s
Interactivity. The presented app enables its users to modify high-quality effects at run-time. It immediately updates the rendering pipeline and displays the resulting image at interactive frame rates. Recombining or removing effects takes less than a millisecond, even with more than ten effects in the pipeline. Downloading, extracting, parsing, and adding an effect to the pipeline using a 100 MBit/s connection takes 1 s to 3 s in total. About half of this duration is used for parsing the effect files (cf. Table 2).
However, the time to display the result image depends on the performance of the IVOs and their parameterization. Complex artistic effects such as pencil hatching achieve a frame rate of 10 fps to 20 fps in FHD on the Samsung S7 and similar devices (cf. Table 1). Simpler effects achieve higher frame rates. The color transformation effect even reaches the devices' limit of 30 fps. However, neural style transfer targeting an output resolution of 1024² pixels takes 3 s to 15 s, depending on the device. To improve the rendering performance, the system re-renders only those passes and effects whose inputs have changed. Additionally, the app automatically reduces the preview quality while painting as well as in the camera live view.
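This selective re-rendering can be pictured as a dirty flag per pass, as in the following sketch; the flag handling and caching are assumptions about the general mechanism rather than the framework's actual scheduling code.

```java
// Sketch of selective re-rendering: only passes whose inputs changed are executed again.
class CachedPass {
    private boolean dirty = true;        // inputs changed since the last render
    private int cachedOutput = -1;       // placeholder texture handle

    void markDirty() { dirty = true; }   // called when a parameter or input texture changes

    int renderIfNeeded(int inputTexture) {
        if (dirty) {
            cachedOutput = renderToTexture(inputTexture);
            dirty = false;
        }
        return cachedOutput;             // unchanged passes reuse their cached result
    }

    private int renderToTexture(int inputTexture) {
        // Placeholder for the actual shader invocation.
        return inputTexture + 1;
    }
}
```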
Furthermore, the number of effects is limited with respect to their memory consumption. If the rendering has a target resolution of WQHD, the S7 is able to hold nine pencil hatching effects in a pipeline. With a resolution of FHD, the limit grows to 12 pencil effects. With HD, even 14 pencil hatching effects can be rendered in one pipeline. Furthermore, the complexity of the effects influences the limit of effects in the pipeline. Six oil paint effects or six watercolor effects can be held in one pipeline targeting WQHD on the S7. In contrast, more than 100 of the small color transformation effects can be rendered together.
Device Heterogeneity. The platform-independent document format and the separation of IVOs into definition and implementation address the device heterogeneity of the created effects. However, if an IVO supports only one target API, e.g., one pass uses OpenGL ES 3.1 features and does not provide a fallback for OpenGL ES 2.0, all components that contain it cannot be executed or manipulated on such a device. Hence, the concept enables platform independence but does not solve it completely on its own.
Technology Transparency. The proposed concept provides support for technology transparency of IVOs by defining a document format that abstracts from concrete technologies. Therefore, adjustments to new rendering APIs do not affect existing assets. However, if a change to the document structure (e.g., an XML schema update) is required, it can easily be performed by the data server using migration scripts.
Reusable Building Blocks. The presented framework and XML-based document format have been successfully used for teaching image processing techniques with OpenGL ES in undergraduate and graduate courses. In total, 190 effects have been designed and developed using the presented document format. Since the students were able to create new high-quality IVOs, the document format can be considered simple enough for use by less experienced developers.
7 CONCLUSIONS AND FUTURE WORK
We presented a system that enables novice users and professionals to create and manipulate individual, platform-independent, parameterizable image stylization components based on existing components, customize their presentation, and share them in a web-based community. This kind of prosumer culture goes beyond user-generated content, because users generate tools that enable other users to generate content. Our scenarios represent examples of how this concept provides a higher level of creativity for users and supports rapid prototyping and rapid A/B testing of visual styles for professionals. The presented concept is not limited to mobile devices only, and can be used as a blueprint for other implementation platforms such as desktop PCs or server-based rendering approaches.
7.1 Future Work
Low-Level Effect Modifications. To extend the control that users have over their effects, the app might support interactions for low-level effect modifications, especially on tablets. Possible extensions could be: exchanging processing steps of an effect (e.g., choosing another contour extraction algorithm) to optimize visual quality or performance; exchanging textures (e.g., canvas textures or color look-up tables) to adjust the visual attributes of the effect's processing function; or collapsing similar parameters into a single one to simplify the presentation of the effect.
Hinting Rendering Performance. When a user combines many effects that have low rendering performance, the total rendering time of the style might increase substantially. To give users advice on their created effects, the system might highlight slow effects in the pipeline and give the user control over the rendering quality. Thereby, users could tailor the quality/performance ratio to their purpose.
ACKNOWLEDGMENTS
We thank Erik Griese, Moritz Hilscher, Alexander Riese, and Hendrik Tjabben for their contributions to the design and implementation of the presented system. This work was partly funded by the Federal Ministry of Education and Research (BMBF), Germany, for the AVA project 01IS15041B and within the InnoProfile Transfer research group "4DnD-Vis" (www.4dndvis.de).
REFERENCES
Apple Inc. 2015. Model-View-Controller. (21 Oct. 2015). https://developer.apple.com/library/content/documentation/General/Conceptual/DevPedia-CocoaCore/MVC.html
Saeideh Bakhshi, David A. Shamma, Lyndon Kennedy, and Eric Gilbert. 2015. Why We Filter Our Photos and How It Impacts Engagement. In Proc. ICWSM. AAAI Press, 12–21. http://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10573/10484
Luca Benedetti, Holger Winnemöller, Massimiliano Corsini, and Roberto Scopigno. 2014. Painting with Bob: Assisted Creativity for Novices. In Proc. ACM Symposium on User Interface Software and Technology. ACM, New York, 419–428. https://doi.org/10.1145/2642918.2647415
Adrien Bousseau, Matt Kaplan, Joëlle Thollot, and François X. Sillion. 2006. Interactive Watercolor Rendering with Temporal Coherence and Abstraction. In Proc. NPAR. ACM, New York, 141–149. https://doi.org/10.1145/1124728.1124751
Tolga Capin, Kari Pulli, and Tomas Akenine-Möller. 2008. The State of the Art in Mobile Graphics Research. IEEE Computer Graphics and Applications 28, 4 (2008), 74–84. https://doi.org/10.1109/MCG.2008.83
Kapil Dev. 2013. Mobile Expressive Renderings: The State of the Art. IEEE Computer Graphics and Applications 33, 3 (2013), 22–31. https://doi.org/10.1109/MCG.2013.20
Stephen DiVerdi, Aravind Krishnaswamy, Radomír Měch, and Daichi Ito. 2013. Painting with Polygons: A Procedural Watercolor Engine. IEEE Transactions on Visualization and Computer Graphics 19, 5 (2013), 723–735. https://doi.org/10.1109/TVCG.2012.295
Tobias Dürschmid. 2017. A Framework for Editing and Execution of Image and Video Processing Techniques on Mobile Devices. (26 July 2017). https://doi.org/10.13140/RG.2.2.13252.32648
Tobias Dürschmid, Matthias Trapp, and Jürgen Döllner. 2017. Towards Architectural Styles for Android App Software Product Lines. In Proc. International Conference on Mobile Software Engineering and Systems. IEEE Press, Piscataway, NJ, USA, 58–62. https://doi.org/10.1109/MOBILESoft.2017.12
Jan Fischer, Michael Haller, and Bruce H. Thomas. 2008. Stylized Depiction in Mixed Reality. International Journal of Virtual Reality 7, 4 (Dec. 2008), 71–79. http://mi-lab.org/files/publications2008/fischer2008-ijvr.pdf
Christian Fuchs. 2010. Web 2.0, Prosumption, and Surveillance. Surveillance & Society 8, 3 (2 Sept. 2010), 288–309. https://ojs.library.queensu.ca/index.php/surveillance-and-society/article/view/4165
Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. 1995. Design Patterns – Elements of Reusable Object-Oriented Software. Addison-Wesley.
Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. 2016. Image Style Transfer Using Convolutional Neural Networks. In Proc. CVPR. IEEE Computer Society, Los Alamitos, 2414–2423. https://doi.org/10.1109/CVPR.2016.265
Tobias Isenberg. 2016. Interactive NPAR: What Type of Tools Should We Create? In Proc. NPAR. Eurographics Association, Goslar, Germany, 89–96. https://doi.org/10.2312/exp.20161067
Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. In Proc. ECCV. Springer International, Cham, Switzerland, 694–711. https://doi.org/10.1007/978-3-319-46475-6_43
Dongwann Kang and Kyunghyun Yoon. 2015. Interactive Painterly Rendering for Mobile Devices. In Proc. International Conference on Entertainment Computing. Springer International Publishing, Cham, Switzerland, 445–450. https://doi.org/10.1007/978-3-319-24589-8_38
Johannes Kopf, Michael F. Cohen, Dani Lischinski, and Matt Uyttendaele. 2007. Joint Bilateral Upsampling. ACM Transactions on Graphics 26, 3 (July 2007). https://doi.org/10.1145/1276377.1276497
Jan Eric Kyprianidis, John Collomosse, Tinghuai Wang, and Tobias Isenberg. 2013. State of the "Art": A Taxonomy of Artistic Stylization Techniques for Images and Video. IEEE Transactions on Visualization and Computer Graphics 19, 5 (May 2013), 866–885. https://doi.org/10.1109/TVCG.2012.160
Junkyu Oh, SeungRol Maeng, and Jinho Park. 2012. Efficient Watercolor Painting on Mobile Devices. International Journal of Contents 8, 4 (2012), 36–41. https://doi.org/10.5392/IJoC.2012.8.4.036
David Lorge Parnas. 1972. On the Criteria to Be Used in Decomposing Systems into Modules. Commun. ACM 15, 12 (Dec. 1972), 1053–1058. https://doi.org/10.1145/361598.361623
Sebastian Pasewaldt, Amir Semmo, Jürgen Döllner, and Frank Schlegel. 2016. BeCasso: Artistic Image Processing and Editing on Mobile Devices. In Proc. SIGGRAPH ASIA Mobile Graphics and Interactive Applications. ACM, New York, 14:1–14:1. https://doi.org/10.1145/2999508.2999518
Coimbatore Krishna Prahalad and Venkat Ramaswamy. 2004. The Future of Competition: Co-Creating Unique Value with Customers. Harvard Business Press.
Emil Praun, Hugues Hoppe, Matthew Webb, and Adam Finkelstein. 2001. Real-Time Hatching. In Proc. SIGGRAPH. ACM, New York, 581–586. https://doi.org/10.1145/383259.383328
Dirk Riehle, Wolf Siberski, Dirk Bäumer, Daniel Megert, and Heinz Züllighoven. 1997. Pattern Languages of Program Design 3. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, Chapter Serializer, 293–312.
George Ritzer, Paul Dean, and Nathan Jurgenson. 2012. The Coming of Age of the Prosumer. American Behavioral Scientist 56, 4 (2012), 379–398. https://doi.org/10.1177/0002764211429368
George Ritzer and Nathan Jurgenson. 2010. Production, Consumption, Prosumption. Journal of Consumer Culture 10, 1 (2010), 13–36. https://doi.org/10.1177/1469540509354673
Paul Rosin and John Collomosse (Eds.). 2013. Image and Video based Artistic Stylisation. Computational Imaging and Vision, Vol. 42. Springer, London/Heidelberg. https://doi.org/10.1007/978-1-4471-4519-6
Jeremy Selan. 2004. Using Lookup Tables to Accelerate Color Transformations. In GPU Gems. Addison-Wesley, 381–392. http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter24.html
Amir Semmo, Jürgen Döllner, and Frank Schlegel. 2016a. BeCasso: Image Stylization by Interactive Oil Paint Filtering on Mobile Devices. In ACM SIGGRAPH 2016 Appy Hour. ACM, New York, 6:1–6:1. https://doi.org/10.1145/2936744.2936750
Amir Semmo, Tobias Dürschmid, Matthias Trapp, Mandy Klingbeil, Jürgen Döllner, and Sebastian Pasewaldt. 2016b. Interactive Image Filtering with Multiple Levels-of-Control on Mobile Devices. In Proc. SIGGRAPH ASIA Mobile Graphics and Interactive Applications. ACM, New York, 2:1–2:8. https://doi.org/10.1145/2999508.2999521
Amir Semmo, Tobias Isenberg, and Jürgen Döllner. 2017a. Neural Style Transfer: A Paradigm Shift for Image-based Artistic Rendering? In Proc. International Symposium on Non-Photorealistic Animation and Rendering. ACM, 5:1–5:13.
Amir Semmo, Daniel Limberger, Jan Eric Kyprianidis, and Jürgen Döllner. 2016c. Image Stylization by Interactive Oil Paint Filtering. Computers & Graphics 55, C (April 2016), 157–171. https://doi.org/10.1016/j.cag.2015.12.001
Amir Semmo, Matthias Trapp, Jürgen Döllner, and Mandy Klingbeil. 2017b. Pictory: Combining Neural Style Transfer and Image Filtering. In ACM SIGGRAPH 2017 Appy Hour (SIGGRAPH '17). ACM. https://doi.org/10.1145/3098900.3098906
Maximilian Söchting. 2017. Design, Implementation and Web-based Provisioning of a Database for Image Processing Operations. (26 July 2017). https://doi.org/10.13140/RG.2.2.23550.48964
Alvin Toffler. 1980. The Third Wave. Bantam Books.
Romain Vergne and Pascal Barla. 2015. Designing Gratin, A GPU-Tailored Node-Based System. Journal of Computer Graphics Techniques (JCGT) 4, 4 (Nov. 2015), 54–71. http://jcgt.org/published/0004/04/03/
Miaoyi Wang, Bin Wang, Yun Fei, Kanglai Qian, Wenping Wang, Jiating Chen, and Jun-Hai Yong. 2014. Towards Photo Watercolorization with Artistic Verisimilitude. IEEE Transactions on Visualization and Computer Graphics 20, 10 (Feb. 2014), 1451–1460. https://doi.org/10.1109/TVCG.2014.2303984
Daniel Wexler and Gilles Dezeustre. 2012. Intelligent Brush Strokes. In Proc. ACM SIGGRAPH Talks. ACM, New York, NY, USA, 50:1–50:1. https://doi.org/10.1145/2343045.2343112
Holger Winnemöller. 2013. NPR in the Wild. In Image and Video based Artistic Stylisation, Paul Rosin and John Collomosse (Eds.). Computational Imaging and Vision, Vol. 42. Springer, London/Heidelberg, Chapter 17, 353–374. https://doi.org/10.1007/978-1-4471-4519-6_17
Holger Winnemöller, Jan Eric Kyprianidis, and Sven Olsen. 2012. XDoG: An eXtended Difference-of-Gaussians Compendium including Advanced Image Stylization. Computers & Graphics 36, 6 (Oct. 2012), 740–753. https://doi.org/10.1016/j.cag.2012.03.004
Holger Winnemöller, Sven C. Olsen, and Bruce Gooch. 2006. Real-Time Video Abstraction. ACM Transactions on Graphics 25, 3 (July 2006), 1221–1226. https://doi.org/10.1145/1141911.1142018
... With the continuous advances in mobile hardware, many state-of-the-art image editing tools have been transferred to mobile platforms, raising new questions about how such applications should be designed (Isenberg, 2016). These tools include image recoloring, neural style transfer and filtering methods (Ryffel et al., 2017, Dürschmid et al., 2017, Reimann et al., 2019, Semmo et al., 2017. Ryffel et al. extended their soft segmentation approach to an augmented reality application, allowing users to virtually recolor a select number of paintings in a museum (Ryffel et al., 2017). ...
... A recent neural style transfer approach proposed using a network with a reduced number of layers to reduce computation time, relying on upsampling to create high resolution results from the low resolution network output (Semmo et al., 2017). Other neural style transfer mobile applications consider the usability of the application, devising methods to increase the learnability of the application through user driven redesigns (Reimann et al., 2019) and allowing users to manipulate different style parameters and presets, and even share styles between users (Dürschmid et al., 2017). In this paper, we propose a pilot study to formatively assess user interactions with an example-based mobile video recoloring application, and to explore potential differences in user opinion and usability in terms of both the recoloring interface and the video results. ...
Conference Paper
Full-text available
Multimedia software products can be used to create and edit various aspects of online media. Recently, the affordances of mobile devices and high-speed mobile data networks mean that these editing capabilities are more readily available for mobile devices enabling a broader consumer-base. However, the precise role of the user in creative practice is often neglected in favor of reporting faster, more streamlined device functionality. In this paper, we seek to identify high-level human-computer interaction issues concerning video recoloring interfaces that are driven by the needs of different user-types via a methodological and explorative process. By conducting a pilot study, we have captured both quantitative and qualitative responses that formatively explore the role of the user in video recoloring tasks carried out on mobile devices. This research presents a variety of user responses to a video recoloring application, identifying areas of future investigation for explorative practices in user interface design for video recoloring visualization. These findings present important information for researchers exploring the use of state-of-art video recoloring processes and contribute to dialogues surrounding the study of mobile technology in use.
... This is technically achieved by parameterizing image filters at three levels of control, i. e., using presets, global parameter adjustments and complementary on-screen painting that operates within the filters' parameter spaces for local adjustments [39]. The advancement of this concept further enables to interactively design parameterizable image stylization components on-device by reusing building blocks of image processing effects and pipelines [8], which forms a particular requirement for a rapid software product line development of mobile apps [9], service-based image processing and provisioning of processing techniques [34,57,55], as well as novel interaction techniques [50]. While image and video processing techniques are traditionally implemented by following an engineering approach, recent advancements in deep learning and convolutional neural networks showed how image style transfer can be provided in a more generalized way to ease "one-click solutions" for casual creativity apps [40,32,33,30,31]. ...
... In addition to the Styles Suite, there are plug-ins for image stylization in Microsoft Office and the G Suite from Google. Furthermore, there is an Android app called ProsumerFX [8], which allows interactive image stylization and high-level modifications of the effects used. In contrast to the aforementioned clients, the ProsumerFX app performs rendering on client-side. ...
Thesis
Full-text available
With the improvement of cameras and smartphones, more and more people can now take high-resolution pictures. Especially in the field of advertising and marketing, images with extremely high resolution are needed, e. g., for high quality print results. Also, in other fields such as art or medicine, images with several billion pixels are created. Due to their size, such gigapixel images cannot be processed or displayed similar to conventional images. Processing methods for such images have high performance requirements. Especially for mobile devices, which are even more limited in screen size and memory than computers, processing such images is hardly possible. In this thesis, a service-based approach for processing gigapixel images is presented to approach this problem. Cloud-based processing using different microservices enables a hardware-independent way to process gigapixel images. Therefore, the concept and implementation of such an integration into an existing service-based architecture is presented. To enable the exploration of gigapixel images, the integration of a gigapixel image viewer into a web application is presented. Furthermore, the design and implementation will be evaluated with regard to advantages, limitations, and runtime.
... Therefore, we favor the integration of client-side processing for our approach by developing a WebGL-based image processor. With respect to modeling image processing techniques, Dürschmid et al. [6] present an approach that supports collaborative design of stylization Fig. 2 Structural overview of VCAs by example. VCA 1 is implemented for the two platforms OpenGL ES and Vulkan. ...
Article
Full-text available
Various web-based image-editing tools and web-based collaborative tools exist in isolation. Research focusing to bridge the gap between these two domains is sparse. We respond to the above and develop prototype groupware for real-time collaborative editing of raster and vector images in a web browser. To better understand the requirements, we conduct a preliminary user study and establish communication and synchronization as key elements. The existing groupware for text documents or presentations handles the above through well-established techniques. However, those cannot be extended as it is for raster or vector graphics manipulation. To this end, we develop a document model that is maintained by a server and is delivered and synchronized to multiple clients. Our prototypical implementation is based on a scalable client–server architecture: using WebGL for interactive browser-based rendering and WebSocket connections to maintain synchronization. We evaluate our work qualitatively through a post-deployment user study for three different scenarios. For quantitative evaluation, we perform a thorough performance measure on both client and server side, thereby identifying design recommendations for future concurrent image-editing software(s).
... The GPU-Processor is responsible for processing GPI tiles with hardware-accelerated graphics APIs such as OpenGL or Vulkan. For it, (1) tiles are loaded as textures into the Video Random Access Memory (VRAM), (2) the processing operations defined as Visual Computing Assets (VCAs) are applied as shader programs [5], and (3) the results are read back into RAM and returned as a response. ...
Conference Paper
Full-text available
With the ongoing improvement of digital cameras and smartphones, more and more people can acquire high- resolution digital images. Due to their size and high performance requirements, such Gigapixel Images (GPIs) are often challenging to process and explore compared to conventional low resolution images. To address this problem, this paper presents a service-based approach for GPI processing in a device-independent way using cloud-based processing. For it, the concept, design, and implementation of GPI processing functionality into service-based architectures is presented and evaluated with respect to advantages, limitations, and runtime performance.
... 2. It reports on a feasibility study regarding the Unity GE and different Visual Computing Assets (VCAs) (Dürschmid et al., 2017). ...
Conference Paper
Full-text available
This paper describes an approach for using the Unity game engine for image processing by integrating a custom GPU-based image processor. It describes different application levels and integration approaches for extending the Unity game engine. It further documents the respective software components and implementation details required, and demonstrates use cases such as scene post-processing and material-map processing.
... However, it is possible to transfer concepts from the OGC standard, such as unified data models. These data models are implemented using a platform-independent operation format [4]. In the future, it is possible to transfer even more concepts set by the OGC to the general image-processing domain, such as the standardized self-description of services. ...
... In order to access different intermediate steps of a stylization effect to be used for vectorization, a system is required where a stylization effect is divided into individual sub-steps, which can be combined as desired. Our approach is based on the work of Semmo et al. [12], which provides a GPU-based framework for complex image stylization effects and was also used for ProsumerFX [4]. It consists of three components, which are explained below: ...
Conference Paper
Full-text available
This paper presents a new approach for the vectorization of stylized images using intermediate data representations to interface image stylization and vectorization techniques. It enables the combination of efficient GPU-based implementations of interactive image stylization techniques and the advantages of vectorized image representations. We demonstrate the capabilities of our approach using half-toning and toon stylization techniques.
Chapter
Acquisition and consumption of visual media such as digital images and videos is becoming one of the most important forms of modern communication. However, since the creation and sharing of images is increasing exponentially, images as a media form suffer from being devalued: the quality of single images is becoming less and less important, while the frequency of the shared content turns into the focus. In this work, an interactive system is presented that allows users to interact with volatile and diverting artwork using only their eye movements. The system uses real-time image-abstraction techniques to create an artwork unique to each situation. It supports multiple, distinct interaction modes, which share common design principles, enabling users to experience game-like interactions focusing on eye movement and the diverting image content itself. This approach hints at possible future research in the field of relaxation exercises and casual art consumption and creation.
Thesis
Full-text available
Photo editing and image stylization applications are popular tools to improve the aesthetics of images and even create works of art. The underlying image processing operations of these applications enable users to transform their images. The resulting stylized and filtered images are often shared on social media or photo sharing communities such as Flickr, allowing artists to gather feedback and gain recognition for their photos. However, such communities only allow sharing the resulting image, not the image processing operation itself, thereby limiting creativity and collaboration. Furthermore, sharing image processing operations is conceptually challenging due to their strong platform dependence. This thesis presents the concept of a web-based platform that enables the publishing and provisioning of image processing operations. Additionally, a document format for defining platform-independent image processing operations is proposed that enables the cross-platform execution and editing of image effects. The presented platform complements client applications that support on-device modification of image effects with features for collaboration. The system was evaluated with regard to the implementation complexity and the interactivity of the web interface in order to prove the feasibility of the concept. In the future, the proposed system could be extended to a web-based social community for collaboratively designing, modifying, and sharing image processing operations, with features such as ratings, comments, and content feeds.
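As a rough illustration of what a platform-independent, declarative effect document could look like, the following sketch defines a hypothetical JSON pipeline and parses it in Python; the actual document format of the cited thesis is not reproduced here, and all operation names and parameters are invented.

```python
import json

# Hypothetical effect document: a declarative pipeline of operations with parameters
# that any client can parse and execute with its own rendering backend.
effect_document = json.loads("""
{
  "name": "Warm Watercolor",
  "pipeline": [
    {"operation": "watercolor", "parameters": {"wetness": 0.7}},
    {"operation": "color_grade", "parameters": {"temperature": 0.3}}
  ]
}
""")

for step in effect_document["pipeline"]:
    print("would execute", step["operation"], "with", step["parameters"])
```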
Thesis
Full-text available
With the continuous advances of mobile graphics hardware, high-quality image stylization, e.g., based on image filtering, stroke-based rendering, and neural style transfer, is becoming feasible and increasingly used in casual creativity apps. Nowadays, users want to create and distribute their own works and become prosumers, i.e., both consumers and producers. However, the creativity facilitated by contemporary mobile apps is typically limited with respect to the usage and application of pre-defined visual styles, which ultimately does not include their design and composition, an inherent requirement of prosumers. This thesis presents the concept and implementation of a GPU-based mobile application that enables users to interactively design parameterizable image stylization effects on-device by reusing building blocks of image processing effects and pipelines. The parameterization is supported by three levels of control: (1) convenience presets, (2) global parameters, and (3) local parameter adjustments using on-screen painting. Furthermore, the created visual styles are defined using a platform-independent document format and can be shared with other users via a web-based community platform. The presented app is evaluated with regard to the variety and visual quality of the styles, run-time performance, memory consumption, and implementation complexity metrics to demonstrate the feasibility of the concept. The results show that the app supports the interactive combination of complex effects such as neural style transfer, watercolor filtering, oil paint filtering, and pencil hatching filtering to create unique high-quality effects. This approach supports collaborative work on designing visual styles, including their rapid prototyping, A/B testing, publishing, and distribution. Hence, it satisfies the needs for creative expression of both professionals and novice users, i.e., the general public.
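The reuse of effect building blocks described above can be sketched as simple function composition; the block names and operations below are illustrative NumPy stand-ins, not the thesis' on-device GPU implementation.

```python
import numpy as np

# Illustrative effect building blocks operating on float images in [0, 1].
def brighten(image, amount=0.1):
    return np.clip(image + amount, 0.0, 1.0)

def posterize(image, levels=4):
    return np.round(image * (levels - 1)) / (levels - 1)

# Compose reusable blocks into one new, parameterizable effect.
def compose(*blocks):
    def combined(image):
        for block in blocks:
            image = block(image)
        return image
    return combined

unique_style = compose(lambda img: brighten(img, 0.05), lambda img: posterize(img, 6))
stylized = unique_style(np.random.rand(64, 64, 3))
```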
Conference Paper
Full-text available
This work presents Pictory, a mobile app that empowers users to transform photos into artistic renditions by using a combination of neural style transfer with user-controlled state-of-the-art nonlinear image filtering. The combined approach features merits of both artistic rendering paradigms: deep convolutional neural networks can be used to transfer style characteristics at a global scale, while image filtering is able to simulate phenomena of artistic media at a local scale. Thereby, the proposed app implements an interactive two-stage process: first, style presets based on pre-trained feed-forward neural networks are applied using GPU-accelerated compute shaders to obtain initial results. Second, the intermediate output is stylized via oil paint, watercolor, or toon filtering to inject characteristics of traditional painting media such as pigment dispersion (watercolor) and soft color blendings (oil paint), and to filter artifacts such as fine-scale noise. Finally, on-screen painting facilitates pixel-precise creative control over the filtering stage, e.g., to vary the brush and color transfer, while joint bilateral upsampling enables outputs at full image resolution suited for printing on real canvas.
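A minimal sketch of the two-stage process described above, assuming trivial NumPy placeholders for the neural style stage and the filtering stage, with an on-screen painting mask controlling where the second stage applies; this is not the app's actual shader implementation.

```python
import numpy as np

def style_stage(image):
    return image[::-1, :, :].copy()           # placeholder for the neural style preset

def filter_stage(image):
    return np.clip(image ** 0.8, 0.0, 1.0)    # placeholder for oil paint/watercolor/toon filtering

def stylize(image, paint_mask):
    intermediate = style_stage(image)                      # stage 1: global style transfer
    filtered = filter_stage(intermediate)                  # stage 2: local media simulation
    mask = paint_mask[..., None]                           # painting mask steers stage 2 per pixel
    return mask * filtered + (1.0 - mask) * intermediate

out = stylize(np.random.rand(32, 32, 3), np.ones((32, 32)))
```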
Conference Paper
Full-text available
In this meta paper we discuss image-based artistic rendering (IB-AR) based on neural style transfer (NST) and argue that, while NST may represent a paradigm shift for IB-AR, it also has to evolve as an interactive tool that considers the design aspects and mechanisms of artwork production. IB-AR received significant attention in the past decades for visual communication, covering a plethora of techniques to mimic the appeal of artistic media. Example-based rendering represents one of the most promising paradigms in IB-AR to (semi-)automatically simulate artistic media with high fidelity, but has so far been limited because it relies on pre-defined image pairs for training or informs only low-level image features for texture transfers. Advancements in deep learning have shown to alleviate these limitations by matching content and style statistics via activations of neural network layers, thus making a generalized style transfer practicable. We categorize style transfers within the taxonomy of IB-AR, then propose a semiotic structure to derive a technical research agenda for NSTs with respect to the grand challenges of NPAR. We finally discuss the potentials of NSTs, thereby identifying applications such as casual creativity and art production.
Conference Paper
Full-text available
Software product line development for Android apps is difficult due to the inflexible design of the Android framework. However, since mobile applications are becoming more and more complex, increased code reuse and thus reduced time-to-market play an important role, which can be improved by software product lines. We propose five architectural styles for developing software product lines of Android apps: (1) activity extensions, (2) activity connectors, (3) dynamic preference entries, (4) decoupled definition of domain-specific behavior via configuration files, and (5) feature models using Android resources. We demonstrate the benefits in an early case study using an image processing product line that enables more than 90% of code reuse.
Conference Paper
Full-text available
BeCasso is a mobile app that enables users to transform photos into high-quality, high-resolution non-photorealistic renditions, such as oil and watercolor paintings, cartoons, and colored pencil drawings, which are inspired by real-world paintings or drawing techniques. In contrast to neural network and physically-based approaches, the app employs state-of-the-art nonlinear image filtering. For example, oil paint and cartoon effects are based on smoothed structure information to interactively synthesize renderings with soft color transitions. BeCasso empowers users to easily create aesthetic renderings by implementing a two-fold strategy: First, it provides parameter presets that may serve as a starting point for a custom stylization based on global parameter adjustments. Thereby, users can obtain initial renditions that may be fine-tuned afterwards. Second, it enables local style adjustments: using on-screen painting metaphors, users are able to locally adjust different stylization features, e.g., to vary the level of abstraction, pen, brush and stroke direction, or the contour lines. In this way, the app provides tools for both higher-level interaction and low-level control to serve the different needs of non-experts and digital artists.
Conference Paper
Full-text available
With the continuous development of mobile graphics hardware, interactive high-quality image stylization based on nonlinear filtering is becoming feasible and increasingly used in casual creativity apps. However, these apps often only provide high-level controls to parameterize image filters and generally lack support for low-level (artistic) control, thus automating art creation rather than assisting it. This work presents a GPU-based framework that enables image filters to be parameterized at three levels of control: (1) presets, followed by (2) global parameter adjustments, can be interactively refined by (3) complementary on-screen painting that operates within the filters' parameter spaces for local adjustments. The framework provides a modular XML-based effect scheme to effectively build complex image processing chains, using these interactive filters as building blocks, that can be efficiently processed on mobile devices. Thereby, global and local parameterizations are directed with higher-level algorithmic support to ease the interactive editing process, which is demonstrated by state-of-the-art stylization effects such as oil paint filtering and watercolor rendering.
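The three levels of control named above can be sketched as follows, using an invented "abstraction" parameter: a preset supplies defaults, a global adjustment overrides them, and a painted mask varies the value locally per pixel. This is a schematic illustration, not the framework's actual parameter handling.

```python
import numpy as np

PRESETS = {"subtle": {"abstraction": 0.2}, "painterly": {"abstraction": 0.8}}

def parameter_field(preset_name, global_overrides, paint_mask):
    params = dict(PRESETS[preset_name])             # level 1: preset
    params.update(global_overrides)                 # level 2: global parameter adjustment
    field = np.full(paint_mask.shape, params["abstraction"], dtype=np.float32)
    return np.clip(field + paint_mask, 0.0, 1.0)    # level 3: local, painted offsets

mask = np.zeros((64, 64), dtype=np.float32)
mask[16:48, 16:48] = 0.3                            # user painted more abstraction in the center
field = parameter_field("subtle", {"abstraction": 0.4}, mask)
```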
Conference Paper
We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real-time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
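For illustration, the structure of such a perceptual loss can be sketched in NumPy on random arrays standing in for activations of a pretrained network; note that in the actual method the content and style terms are computed at different layers and against different target images, which this sketch collapses for brevity.

```python
import numpy as np

# Gram matrix of a feature map of shape (channels, height, width).
def gram_matrix(features):
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def perceptual_loss(output_feats, target_feats, content_weight=1.0, style_weight=10.0):
    content = np.mean((output_feats - target_feats) ** 2)                         # feature (content) term
    style = np.sum((gram_matrix(output_feats) - gram_matrix(target_feats)) ** 2)  # style (Gram) term
    return content_weight * content + style_weight * style

loss = perceptual_loss(np.random.rand(64, 32, 32), np.random.rand(64, 32, 32))
```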
Article
We present a novel watercolor painting drawing system that works even on low-powered computing machines such as tablet PCs. Most digital watercolor systems are designed to run on desktops rather than on low-powered mobile computing systems such as the iPad. Our system can be utilized for art education besides professional painters. Our system is not a naïve imitation of real watercolor painting, but handles properties of watercolor such as diffusion, boundary salience, and the mixing of water and pigment.
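As a toy illustration of the diffusion property mentioned above, the following NumPy sketch spreads pigment toward its local neighborhood over a few iterations; it is not the cited system's watercolor model, and all names are invented.

```python
import numpy as np

# Naive pigment diffusion: each pixel blends toward the average of its four neighbors.
def diffuse(pigment, iterations=5, rate=0.2):
    p = pigment.astype(np.float32)
    for _ in range(iterations):
        neighbors = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
                     np.roll(p, 1, 1) + np.roll(p, -1, 1)) / 4.0
        p = (1.0 - rate) * p + rate * neighbors
    return p

blob = np.zeros((64, 64))
blob[30:34, 30:34] = 1.0     # a drop of pigment
spread = diffuse(blob)
```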