Using artificially generated pictures in customer-facing systems: an evaluation study with data-driven personas

Taylor & Francis
Behaviour & Information Technology
To cite this article: Joni Salminen, Soon-gyo Jung, Ahmed Mohamed Sayed Kamel, João M. Santos & Bernard J. Jansen (2020): Using artificially generated pictures in customer-facing systems: an evaluation study with data-driven personas, Behaviour & Information Technology, DOI: 10.1080/0144929X.2020.1838610

To link to this article: https://doi.org/10.1080/0144929X.2020.1838610

© 2020 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group

Published online: 06 Nov 2020.
Joni Salminen (a,b), Soon-gyo Jung (a), Ahmed Mohamed Sayed Kamel (c), João M. Santos (d) and Bernard J. Jansen (a)

(a) Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar; (b) Turku School of Economics at the University of Turku, Turku, Finland; (c) Department of Clinical Pharmacy, Cairo University, Giza, Egypt; (d) Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal
ABSTRACT
We conduct two studies to evaluate the suitability of artificially generated facial pictures for use in a customer-facing system using data-driven personas. STUDY 1 investigates the quality of a sample of 1,000 artificially generated facial pictures. Obtaining 6,812 crowd judgments, we find that 90% of the images are rated medium quality or better. STUDY 2 examines the application of artificially generated facial pictures in data-driven personas using an experimental setting where the high-quality pictures are implemented in persona profiles. Based on 496 participants using 4 persona treatments (2 × 2 research design), findings of Bayesian analysis show that using the artificial pictures in persona profiles did not decrease the scores for Authenticity, Clarity, Empathy, and Willingness to Use of the data-driven personas.

ARTICLE HISTORY
Received 16 April 2020
Accepted 13 October 2020

KEYWORDS
Evaluation; human-computer interaction; user behaviour; human factors; artificially generated facial pictures
1. Introduction

There is tremendous research interest concerning artificial image generation (AIG). The state-of-the-art studies in this field use Generative Adversarial Networks (GANs) (Goodfellow et al. 2014) and Conditional GANs (Lu, Tai, and Tang 2017) to generate images that are promised to be photorealistic and easily deployable. GANs have been applied, for example, to automatically create art (Tan et al. 2017), cartoons (Liu et al. 2018), medical images (Nie et al. 2017), and facial pictures (Karras, Laine, and Aila 2019), the latter including transformations such as increasing/decreasing a person's age or altering their gender (Antipov, Baccouche, and Dugelay 2017; Choi et al. 2018; Isola et al. 2017).

Due to its low cost, AIG provides novel opportunities for a wide range of applications, including health care (Nie et al. 2017), advertising (Neumann, Pyromallis, and Alexander 2018), and user analytics for human-computer interaction (HCI) and design purposes (Salminen et al. 2019a). However, despite the far-reaching interest in AIG among academia and across industries, there is scant research on evaluating the suitability of the generated images for practical use in deployed systems. This means that the quality and impact of the artificial images on user perceptions are often neglected, lacking user studies of their deployment in real systems. This area of evaluation is an overlooked but critical area of research, as it is the final step of deployment that actually determines if the quality of the AIG is good enough, as prior work has shown the impact that pictures can have on real systems (King, Lazard, and White 2020). Therefore, the impact of AIG on user experience (UX) and design applications is a largely unaddressed field of study, although with work in related areas of empathy (Weiss and Cohen 2019). For example, Weiss and Cohen (2019) found that aspects of empathy with subjects in videos are complex in terms of encouraging or discouraging engagement with the content.
Most typically, artificial pictures are evaluated using technical metrics (Yuan et al. 2020) that are abstract and do not reflect user perceptions or UX. An example is the Fréchet inception distance (FID) (Heusel et al. 2017), which measures the similarity of two image distributions (i.e. the generated set and the training set). While metrics such as FID are without question necessary for measuring the technical quality of the generated images (Zhao et al. 2020), we argue there is also a substantial need for evaluating the user experience of the pictures for real-world systems and applications.

In this regard, the user study tradition from HCI is helpful: in addition to technical metrics, user-centric metrics gauging UX and user perceptions (Ashraf, Jaafar, and Sulaiman 2019; Brauner et al. 2019) can be deployed. The potential impact of AIG is transformational, including the domains of public relations, marketing, advertising, e-commerce sites, retail brochures, chatbots, virtual agents, design, and others.

© 2020 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

CONTACT Bernard J. Jansen, jjansen@acm.org, Education City Research Complex, Doha, Qatar
The use of artificially generated facial images is generally free of copyright restrictions and can allow for a wide range of demographic diversity (age, gender, ethnicity). Nonetheless, these benefits only hold if the pictures are 'good enough' for real applications. Given the multiple application areas of AIG, the results of an evaluation study measuring the impact of artificial facial pictures on UX are of immediate interest for researchers and practitioners alike.
To address the call for user studies concerning AIG, we carry out two evaluation studies: (a) one addressing the overall perceived quality of artificial pictures among crowd workers, and (b) another addressing user perceptions when implementing the pictures for data-driven personas (DDPs). Our research question is: Are artificially generated facial pictures 'good enough' for a system requiring substantial images of people?

DDPs are personas, i.e. imaginary people representing real user segments as traditionally defined in HCI (Cooper 2004), created from social media and web analytics data (An et al. 2018a, 2018b). Although the DDP process may vary from system to system, most will have the six major steps shown in Figure 1.
The advantage of DDPs, relative to traditional personas (Brangier and Bornet 2011) that are manually created and typically include 3–7 personas per set (Hong et al. 2018), is that one can create hundreds of DDPs from the data to reflect different behavioural and demographic nuances in the underlying user population (Salminen et al. 2018b). For example, a news organisation distributing its contents on social media platforms to audiences originating from dozens of countries can have dozens of audience segments relevant for different decision-making scenarios in other geographic areas (Salminen et al. 2018e). DDPs summarise these segments into easily approachable human profiles (see Figure 2 for an example) that can be used within the organisation to understand the personas' needs (Nielsen 2019) and communicate (Salminen et al. 2018d) about these needs as a part of user-centric decision making (Idoughi, Seffah, and Kolski 2012).
One of the important issues for automatically creating DDPs from data is the availability of persona pictures: since DDP systems can create dozens of personas in near real time, there is a need for an inventory of pictures to use when rendering the personas for end users to view and interact with. Conceptually, this leads to a need for an AIG module that creates suitable persona pictures on demand (Salminen et al. 2019a). However, prior to engaging in system development, there is a need to ensure that artificially generated pictures are not detrimental to user perceptions of the personas; otherwise, one risks futile efforts with immature technology. In a sense, therefore, the question of picture quality for DDPs is also a question of a feasibility study (of implementation).
Note that pictures constitute an essential element of the persona profile (Baxter, Courage, and Caine 2015; Nielsen et al. 2015). Pictures are instrumental for the persona to appear believable, and they have been found impactful for central persona perceptions, such as empathy (Pröbster, Haque, and Marsden 2018). Therefore, DDPs require these pictures in order to realise the many benefits associated with the use of personas in the HCI literature (Long 2009; Nielsen and Storgaard Hansen 2014).
To evaluate the quality, we first generate a sample of 1,000 artificial facial pictures using a state-of-the-art generator. To evaluate this sample, we then obtain 6,812 judgments from crowdworkers. To evaluate user perceptions, we conduct a 2 × 2 experiment with DDPs featuring a real or an artificial picture. For the measurement of user perceptions, we deploy the Persona Perception Scale (PPS) instrument (Salminen et al. 2018c) to gauge the impact of artificial pictures on the DDPs' authenticity and clarity, as well as the sense of empathy and willingness to use among the online pool of respondents.

Thus, our research goal is to evaluate artificially generated pictures across multiple dimensions for deployment in DDPs. Note that our goal is not to make a technical AIG contribution. Rather, we apply a pre-existing method for persona profiles and then evaluate the results for user perceptions. So, our contribution is in the area of practical design and implementation of AIG.
Note also that even though we focus on DDPs in this research, many other domains and use cases have similar needs in terms of requiring large collections of diverse facial images, including HCI and human-robot interaction such as avatars (Ablanedo et al. 2018; Şengün 2014; Sengün 2015), robots (dos Santos et al. 2014; Duffy 2003; Edwards et al. 2016; Holz, Dragone, and O'Hare 2009), and chatbots (Araujo 2018; Go and Shyam Sundar 2019; Shmueli-Scheuer et al. 2018; Zhou et al. 2019a). Thus, our evaluation study has cross-sectional value for other design purposes where artificial facial pictures would be useful.
2. Related literature

2.1. Lack of evaluation studies for artificial pictures

To quantify the need for evaluation studies of AIG in real systems, we carried out a scoping review (Bazzano et al. 2017) by extracting information from 20 research articles that generate artificial facial pictures. The articles were retrieved via Google Scholar using relevant search phrases ('automatic image generation + faces', 'facial image creation', 'artificial picture generation + face', etc.) and focusing on peer-reviewed conference/journal articles published between 2015 and 2019. The list of articles, along with the extracted evaluation methods, is provided in the Supplementary Material.
Results show that the evaluation methods in these articles almost always contain one or more technical metrics (90%, N = 18) and always a short, subjective evaluation by the authors (100%, N = 20), in the line of 'manual inspection revealed some errors but generally good quality' (not an actual quote). Among the 20 articles, less than half (45%, N = 9) measured actual human perceptions (typically using crowdsourced ratings). More importantly, none of the articles provided an evaluation study that would implement the generated pictures into a real system or application. The results of this scoping review thus show a general lack of user studies for the practical evaluation of AIG in real systems or use cases (0% of the research we could locate did so).
As stated, the evaluation of AIG focuses on technical metrics (Gao et al. 2020) of image generation, e.g. inception score (Dey et al. 2019; Di and Patel 2017; Salimans et al. 2016; Yin et al. 2017), FID (Dey et al. 2019; Karras, Laine, and Aila 2019; Lin et al. 2019), Euclidean distance (Gecer et al. 2018), cosine similarity (Dey et al. 2019), reconstruction errors (Chen et al. 2019; Lee et al. 2018), or the accuracy of face recognition (Di, Sindagi, and Patel 2018; Liu et al. 2017). Many of the technical metrics are said to have various strengths and weaknesses (Barratt and Sharma 2018; Karras, Laine, and Aila 2019; Shmelkov, Schmid, and Alahari 2018; Zhang et al. 2018). The main weakness is that they do not capture user perceptions or the UX ramifications of the pictures in real applications. This is because the technical metrics are not directly related to the end-user experience when the user is observing the pictures within the context of their intended use (e.g. as part of DDPs).
Human evaluation studies, on the other hand, tend to focus on comparing the outputs of different algorithms, again ignoring the importance of context on the evaluation results. Typically, participants are asked to rank pictures produced using different algorithms from best to worst (Li et al. 2018; Liu et al. 2017; Zhou et al. 2019b) or to rate the pictures by user perception metrics, such as realism, overall quality, and identity (Yin et al. 2017; Zhou et al. 2019b). For example, Li et al. (2018) recruited 84 volunteers to rank three generated images out of 10 non-makeup and 20 makeup test images based on quality, realism, and makeup style similarity. Lee et al. (2018) employed a similar approach by asking users which image is more realistic out of samples created using different generation methods. Similarly, Choi et al. (2018) asked crowd workers in Amazon Mechanical Turk (AMT) to rank the generated images based on realism, quality of attribute transfer (hair colour, gender, or age), and preservation of the person's original identity. The participants were shown four images at a time, generated using different methods. Zhang et al. (2018) conducted a two-alternative forced choice (2AFC) test by asking AMT participants which of the provided pictures is more similar to a reference picture.
In rarer instances, user perception metrics, such as realism, overall quality, and identity, have been deployed. For example, Zhou et al. (2019) evaluated the quality of their generated results by asking if the participants consider the generated faces realistic ('yes' or 'no'). In their study, 88.4% of the pictures were considered realistic. Iizuka, Simo-Serra, and Ishikawa (2017) recruited ten volunteers to evaluate the 'naturalness' of the generated pictures; the volunteers were asked to guess if a picture was real or generated. Overall, 77% of the generated pictures were deemed to be real. Yin et al. (2017) asked students to compare 100 generated pictures with original pictures along three criteria: (1) saliency (the degree to which the attributes have been changed in the picture), (2) quality (the overall quality of the picture), and (3) identity (whether the generated and the original picture show the same person). Their AIG method achieved an average quality rating of 4.20 out of 5. While these studies are closer to the realm of UX, we could not locate previous research that would (a) investigate the effect of artificial pictures on the UX of a real system, or (b) evaluate the impact of using artificially generated pictures on user perceptions. However, evaluating AIG approaches for user perceptions and UX in real systems is crucial for determining the success of AIG in real usage contexts for design, HCI, and various other areas of application (Özmen and Yucel 2019).

Figure 1. Data-driven persona development approach. Six-step process common for most DDP methods.
2.2. Data-driven persona development

A persona is a fictive person that describes a user or customer segment (Cooper 1999). Originating from HCI, personas are used in various domains, such as user experience/design (Matthews, Judge, and Whittaker 2012), marketing (Jenkinson 1994), and online analytics (Salminen et al. 2018b), to increase the empathy of designers, software developers, marketers, etc. toward the users or customers of a product (Dong, Kelkar, and Braun 2007). Personas make it possible for decision makers to see use cases through the eyes of the user (Goodwin 2009) and facilitate communication between team members through shared mental models (Pruitt and Adlin 2006). Researchers are increasingly developing methodologies for DDPs (McGinn and Kotamraju 2008; Zhang, Brown, and Shankar 2016) and automatic persona generation (An et al. 2018a; An et al. 2018b), mainly due to the increase in the availability of online user data and to increase the robustness of personas given the alternative forms of user understanding (Jansen, Salminen, and Jung 2020). DDPs typically leverage quantitative social media and online analytics data to create personas that represent the users or customers of a specific channel[1] (Salminen et al. 2017). Regarding the development of DDPs, for the generated pictures to be useful for personas, they need to be taken for real, meaning that they do not hinder the user perceptions of the personas (e.g. not reduce the personas' authenticity).
2.3. Persona user perceptions

Evaluation of user perceptions has been noted as a major concern of personas. Scholars have observed that personas need justification, mainly for their accuracy and usefulness in real organisations and usage scenarios (Chapman and Milham 2006; Friess 2012; Matthews, Judge, and Whittaker 2012). Prior research typically examines persona user perceptions via case studies (Faily and Flechais 2011; Jansen, Van Mechelen, and Slegers 2017; Nielsen and Storgaard Hansen 2014), ethnography (Friess 2012), usability standards (Long 2009), or statistical evaluation (An et al. 2018b; Brickey, Walczak, and Burgess 2012; Zhang, Brown, and Shankar 2016). For example, Friess (2012) investigated the adoption of personas among designers. Long (2009) measured the effectiveness of using personas as a design tool, using Nielsen's usability heuristics. Nielsen et al. (2017) analyse the match between journalists' preconceptions and personas created from audience data, whereas Chapman et al. (2008) evaluate personas as quantitative information. While these evaluation approaches are interesting, survey methods provide a lucrative alternative for understanding how end users perceive personas. Survey research typically measures perceptions as latent constructs, apt for the measurement of attitudes and perceptions that cannot be directly observed (Barrett 2007). This approach seems intuitively compatible with personas, as researchers have reported several attitudinal perceptions concerning personas (Salminen et al. 2019c). These are captured in the PPS survey instrument (Salminen et al. 2018c; Salminen et al. 2019f; Salminen et al. 2019g), which includes eight constructs and twenty-eight items to measure user perceptions of personas. We deploy this instrument in this research, as it covers essential user perceptions in the persona context.

Figure 2. Example of DDP. The persona has a picture (a stock photo in this example), name, age, text description, topics of interest, quotes, most viewed contents, and audience size. The picture is purchased and downloaded manually from an online photo bank; the practical goal of this research is to replace manual photo curation with automatic image generation.
2.4. Hypotheses

Following prior persona research, we formulate the following hypotheses to test persona user perceptions.

- H01: Using artificial pictures does not decrease the authenticity of the persona. HCI research has shown that authenticity (or credibility, believability) is a crucial issue for persona acceptance in real organisations: if the personas come across as 'fake', decision makers are unlikely to adopt them for use (Chapman and Milham 2006; Matthews, Judge, and Whittaker 2012). This is especially relevant for our context because personas already are fictitious people describing real user groups (An et al. 2018b), so we need to ensure that enhancing these fictitious people with artificially generated pictures does not further risk the perception of realism.

- H02: Using artificial pictures does not decrease the clarity of the persona profile. For personas to be useful, they should not be abstract or misleading (Matthews, Judge, and Whittaker 2012). HCI researchers have found that personas with inconsistent information make end users of personas confused (Salminen et al. 2018d; Salminen et al. 2019b). Again, we need to ensure that artificial pictures do not make persona profiles more 'messy' or unclear for the end users.

- H03: Using artificial pictures does not decrease empathy towards the persona. Empathy is considered, among HCI scholars, a key advantage of personas compared to other forms of presenting user data (Cooper 1999; Nielsen 2019). The generated personas need to 'resonate' with end users to make a real impact. Therefore, to be successful, artificial pictures should not reduce the sense of empathy towards the persona.

- H04: Using artificial pictures does not decrease the willingness to use the persona. Willingness to use (WTU) is a crucial construct for the adoption of personas for practical decision making (Rönkkö 2005; Rönkkö et al. 2004). HCI research has shown that if persona users do not show a willingness to learn more about the persona for their task at hand, persona creation risks remaining a futile exercise (Rönkkö et al. 2004).

Overall, ranking high on these perceptions is considered positive (desirable) within the HCI literature. This leads to defining the 'good enough' quality of artificial pictures in the DDP context such that a 'good enough' picture quality does not decrease (a) the authenticity (i.e. the persona is still considered as 'real' as with real photographs), (b) the clarity of the persona profile, (c) the sense of empathy felt toward the persona, or (d) the willingness to learn more about the persona. In other words, this is the design goal of replacing real photographs with artificial pictures in the context of personas, with the concept being transferrable to other domains.
3. Methodology

3.1. Overview of evaluation steps

Our evaluation of picture quality consists of two separate studies: (1) a crowdsourced evaluation study of AIG quality, and (2) a user study measuring the perceptions of an online panel concerning personas with artificially generated pictures. The latter study tests if DDPs are perceived differently when using artificial pictures, while addressing the hypotheses presented in the previous section.
3.2. Research context

Our research context is a DDP system: Automatic Persona Generation (APG[2]). As a DDP system, APG requires thousands of realistic facial pictures to produce a wide range of believable persona profiles for client organisations (Pruitt and Adlin 2006), covering a wide range of ages and ethnicities. An overview of the typical DDP development process is presented in Figure 3.

A practical limitation of APG is the need for manually acquiring facial pictures for the persona profiles (Salminen et al. 2019a). Because the pictures for APG are acquired from online stock photo banks (e.g. iStockPhoto, 123rf.com, etc.), manual effort is required to curate a large number of pictures. A large number of pictures is needed because APG can generate thousands of personas for client organisations; for each persona, a unique facial picture is required. Organisations over a lengthy period can have dozens of unique personas. Using stock photo banks also involves a financial cost (ranging from $1 to $20 USD per picture), making picture curation both time-consuming and costly. Given the goal of fully automated persona generation (Salminen et al. 2019a), there is a practical need for automatic image generation.
Thus, we evaluate the automatically generated facial pictures for use in APG (Jung et al. 2018a; Jung et al. 2018b). APG generates personas from online analytics and social media data. Figure 2 shows an example of a persona generated using the system. The practical purpose of automatically generated images is to replace the manual curation of persona profile pictures, saving time and money. Note that the cost and effort are not unique problems of APG, but generalise to all similar image systems, as pictures need to be provided for each new persona generated.
3.3. Deploying StyleGAN for persona pictures

For AIG, we utilise a pre-trained version of StyleGAN (Karras, Laine, and Aila 2019), a state-of-the-art generator that represents a leap towards photorealistic facial pictures and can be freely accessed on GitHub[3]. StyleGAN was chosen for this research because (a) it is a leap toward generating photorealistic facial images, especially relative to the previous state of the art, (b) the trained model is publicly available, and (c) its deployment is robust for possible use in real systems. StyleGAN generated the images, so this is a back-end process.

We use a pretrained model from the creators of StyleGAN (Karras, Laine, and Aila 2019). This model was trained on the CelebA-HQ and FFHQ datasets using eight Tesla V100 GPUs. It is implemented in TensorFlow[4], an open-source machine learning library, and is available in a GitHub repository[5]. We access this pre-trained model via the GitHub repository that contains the model and the required source code to run it.

Our goal is to use this pre-trained model to generate a sample of 1,000 realistic facial pictures. The method of applying the published code to generate the pictures is straightforward. We provide the exact steps below to facilitate replication studies:
Step 1: Import the required Python packages (os, pickle, numpy; from PIL: Image; dnnlib).

Step 2: Define the parameters and paths.

Step 3: Initialize the environment and load the pre-trained StyleGAN model.

Step 4: Set random states and generate new random input. Randomization is needed because the model always generates the same face for a particular input vector. To generate unique images, a unique set of input arrays should be provided. This is done by setting a random state equal to the current number of iterations, which allows us to have unique images and reproducible results at the same time.

Step 5: Generate images using the random input array created in the previous step.

Step 6: Save the generated images as files to the output folder. We use the resolution of 1024 × 1024 pixels. Other available resolutions are 512 × 512 px and 256 × 256 px.

Figure 3. APG data and processing flowchart from server configuration to data collection and persona generation.
The above steps with the mentioned parameters enable us to generate artificial pictures with similar quality to those in the StyleGAN research paper (Karras, Laine, and Aila 2019). For replicability, we are sharing the Python code we used for AIG in the Supplementary Material.
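As an illustration of these steps, the following Python sketch shows the shape of such a generation loop. The seeding helper is runnable as-is; the model-loading and `Gs.run` calls follow the TensorFlow API of the official StyleGAN repository, and the pickle file name shown is illustrative rather than the exact file we used.

```python
import numpy as np

LATENT_DIM = 512  # input vector size of the pre-trained StyleGAN generator

def make_latents(iteration, dim=LATENT_DIM):
    """Step 4: seed a RandomState with the iteration number, so each
    picture is unique while the whole run stays reproducible."""
    rnd = np.random.RandomState(iteration)
    return rnd.randn(1, dim)

def generate_sample(n_pictures=1000, out_dir="generated_faces"):
    """Steps 1-3, 5 and 6 in sketch form: load the pre-trained model
    once, then generate and save one 1024 x 1024 image per iteration.
    Requires TensorFlow plus dnnlib from the StyleGAN repository and
    the authors' pre-trained pickle (file name below is illustrative)."""
    import os
    import pickle
    import PIL.Image
    import dnnlib.tflib as tflib  # shipped with the StyleGAN repository

    os.makedirs(out_dir, exist_ok=True)
    tflib.init_tf()
    with open("karras2019stylegan-ffhq-1024x1024.pkl", "rb") as f:
        _G, _D, Gs = pickle.load(f)  # Gs = long-term average generator

    fmt = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
    for i in range(n_pictures):
        images = Gs.run(make_latents(i), None, truncation_psi=0.7,
                        randomize_noise=False, output_transform=fmt)
        PIL.Image.fromarray(images[0], "RGB").save(
            os.path.join(out_dir, "face_%04d.png" % i))
```

Because the random state equals the iteration number, re-running the loop reproduces the same 1,000 faces.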
4. STUDY 1: crowdsourced evaluation

4.1. Method

We evaluate the human-perceived quality of 1,000 generated facial pictures. To facilitate comparison with prior work using human evaluation for artificial pictures (Choi et al. 2018; Song et al. 2019; Zhang et al. 2018), we opt for crowdsourcing, using Figure Eight to collect the ratings. This platform has been widely used for gathering manually annotated training data (Alam, Ofli, and Imran 2018) and ratings (Salminen et al. 2018a) in various subdomains of computer science. The pictures were shown in the full 1024 × 1024 pixel format to provide the crowd raters enough detail for a valid evaluation. The following task description was provided to the crowd raters, including the quality criteria and examples:

You are shown a facial picture of a person. Look at the picture and choose how well it represents a real person. The options:

- 5: Perfect: the picture is indistinguishable from a real person.
- 4: High quality: the picture has minor defects, but overall it's pretty close to a real person.
- 3: Medium quality: the picture has some flaws that suggest it's not a real person.
- 2: Low quality: the picture has severe malformations or defects that instantly show it's a fake picture.
- 1: Unusable: the picture does not represent a person at all.

We also clarified to the participants that the use case is to find realistic pictures specifically for persona profiles, explaining that these are descriptive people of some user segment. Additionally, we indicated in the title that the task is to evaluate artificial pictures of people, to manage the expectations of the crowd raters accordingly (Pitkänen and Salminen 2013). Other than the persona aspect, these are similar to guidelines used in prior work to facilitate image comparisons.
Following the quality control guidelines for crowdsourcing by Huang, Weber, and Vieweg (2014) and Alonso (2015), we implemented suitable parameters in the Figure Eight platform. We also enabled Dynamic Judgments, meaning the platform automatically collects more ratings when there is higher disagreement among the raters. Based on the results of a pilot study with 100 pictures, not used in the final research, we set the maximum number of ratings to 5 and the confidence goal to 0.65. The default number of raters was three, so the platform only went to 5 raters if a 0.65 confidence was not achieved[6].
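The dynamic-judgments behaviour can be approximated with the following sketch. This is our own reading of how such a stopping rule operates, not Figure Eight's actual implementation; the function and parameter names are hypothetical.

```python
def more_ratings_needed(ratings, goal=0.65, min_raters=3, max_raters=5):
    """ratings: (label, trust) pairs collected so far for one picture.
    Returns True while another judgment should be requested: start with
    three raters and, if the leading label's trust-weighted confidence
    is still below the goal, add raters up to the maximum of five."""
    if len(ratings) < min_raters:
        return True
    if len(ratings) >= max_raters:
        return False
    total = sum(trust for _, trust in ratings)
    labels = {label for label, _ in ratings}
    best = max(sum(t for l, t in ratings if l == lbl) / total
               for lbl in labels)
    return best < goal
```

Under this rule, three agreeing raters reach a confidence of 1.0 and stop the collection, whereas a three-way split (confidence of roughly 0.33) triggers a fourth judgment.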
4.2. Results
We spent $266.98 USD to obtain 6,812 crowdsourced
image ratings. This was the number of evaluations
from trusted contributors, not including the test ques-
tions. Note that if the accuracy of a crowd raters ratings
relative to the test questions falls below the minimum
accuracy threshold (in our case, 80%), the rater is dis-
qualied, and the evaluations become untrusted.
There were 423 untrusted judgments (6% of the total
submitted ratings), i.e. ratings coming from contribu-
tors that continuously fail to correctly rate the test pic-
tures. Thus, 94% of the total ratings were deemed
trustworthy. The majority label for each rated picture is assigned by comparing the confidence-adjusted ratings of each available class, calculated as follows:

Confidence_class = ( Σ_{i=1}^{n} trust_class ) / ( Σ_{i=1}^{n} trust_all ),

where the confidence score of the class is given by the sum of the trust scores from all n raters of that picture. The trust score is based on a crowdworker's historical accuracy (relative to test questions) on all the jobs he/she has participated in. For example, if the confidence score of 'perfect' is 0.66 and 'medium quality' is 0.72, then the chosen majority label is 'medium quality' (0.72 > 0.66).
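The confidence-adjusted majority vote above can be illustrated with a minimal sketch. The trust scores below are hypothetical; the labels follow the paper's rating scale.

```python
# Trust-weighted majority vote: each rater's label is weighted by that
# rater's trust score, normalised by the total trust for the picture.
def majority_label(ratings):
    """ratings: list of (label, rater_trust) pairs for one picture."""
    total_trust = sum(trust for _, trust in ratings)
    confidence = {}
    for label, trust in ratings:
        confidence[label] = confidence.get(label, 0.0) + trust / total_trust
    # The class with the highest trust-weighted confidence wins.
    return max(confidence, key=confidence.get), confidence

label, conf = majority_label(
    [("perfect", 0.90), ("medium quality", 0.85), ("medium quality", 0.95)]
)
# label == "medium quality"; its confidence is 1.80 / 2.70 ≈ 0.667
```

Note that a single highly trusted rater can be outvoted only when the combined trust of the opposing raters exceeds theirs, which is the intended effect of the adjustment.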
The results (see Table 1) show 'High quality' as the most frequent class. Sixty percent (60%) of the generated pictures are rated as either 'Perfect' or 'High quality'. The average quality score was 3.7 out of 5 (SD = 0.91) when calculated from majority votes and 3.8 when calculated from all the ratings. 9.9% of the pictures were rated as 'Low quality', and none was rated as 'Unusable'.
4.3. Reliability analysis
To assess the reliability of the crowd ratings, we measured the interrater agreement of the quality ratings among crowdworkers. For this, we used two metrics: Gwet's AC1 (AC1) and percentage agreement (PA). Using AC1 is appropriate when the outcome is ordinal, the number of ratings varies across items (Gwet 2008), and where the Kappa metric is low despite a high level
BEHAVIOUR & INFORMATION TECHNOLOGY 7
of agreement (Banerjee et al. 1999; Salminen et al. 2018a). Because of these properties, we chose AC1 with ordinal weights as the interrater agreement metric. In addition, PA was calculated as a simple baseline measure. Standard errors were used to construct the 95% confidence interval (CI) for AC1. For PA, the 95% CI was calculated using 100 bootstrapped samples.
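The PA baseline with its bootstrapped CI can be sketched as below. The per-picture ratings are hypothetical, and the exact bootstrap scheme (resampling pictures with replacement, 100 replicates per the text) is an assumption about implementation details the paper does not spell out.

```python
import random
import statistics

# Percentage agreement: fraction of agreeing rater pairs per picture,
# averaged over pictures, with a percentile bootstrap CI.
def pairwise_agreement(ratings):
    pairs = [(a, b) for i, a in enumerate(ratings) for b in ratings[i + 1:]]
    return sum(a == b for a, b in pairs) / len(pairs)

def overall_pa(items):
    return statistics.mean(pairwise_agreement(r) for r in items)

def bootstrap_ci(items, n_boot=100, alpha=0.05, seed=0):
    rng = random.Random(seed)
    boot = sorted(
        overall_pa([rng.choice(items) for _ in items]) for _ in range(n_boot)
    )
    return boot[int(alpha / 2 * n_boot)], boot[int((1 - alpha / 2) * n_boot) - 1]

items = [[4, 4, 4], [3, 3, 5], [5, 5, 5], [4, 5, 4]]
pa = overall_pa(items)  # (1 + 1/3 + 1 + 1/3) / 4 = 2/3
```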
Results (see Table 2) show a high PA agreement (86.2%). The interrater reliability (AC1) was 0.627, in the range of good (i.e. 0.6–0.8) (Wongpakaran et al. 2013). The results were statistically significant (p < 0.001), with the probability of observing such results by chance being less than 0.1%. Therefore, the crowd ratings can be considered to have satisfactory internal validity.

However, the quality of some pictures is more easily agreed upon than others. When stratified, the overall agreement and AC1 were similar across low, moderate, and high quality labels (PA ≈ 85%, AC1 ≈ 0.75). However, the agreement was lower when the picture was rated perfect (PA = 76.7%, AC1 = 0.498). This implies that 'perfect' is more difficult to determine than the other rating labels.
5. STUDY 2: effects on persona perceptions
5.1. Experiment design
We created two base personas using the APG (An et al. 2018b) methodology described previously; one male and one female. We leave the evaluation of other genders for future research. The experiment variable is the use of an artificial image in the persona profile. The other elements of the persona profiles are identical between the two treatments. For this, we manipulated the base personas by introducing either (a) a real photograph of a person or (b) a demographically matching artificial picture (see Figure 4).

The demographic match was determined manually by two researchers who judged that the chosen pictures were similar for gender, age, and race. Using a modified Delphi method, a seed image of either a real or artificial picture was selected using the meta-data attributes of gender, age, and race. The researchers independently selected matching images for each. The two researchers then jointly selected the mutually agreed-upon image for the treatments. The artificial pictures were chosen from the ones rated 'perfect' by the crowd raters. The real photos were sourced from online repositories with a Creative Commons license.
In total, four persona treatments were created: Male Persona with Real Picture (MPR), Male Persona with Artificial Picture (MPA), Female Persona with Real Picture (FPR), and Female Persona with Artificial Picture (FPA). The created personas were mixed into four sequences:

• Sequence 1: MPR, FPA
• Sequence 2: MPA, FPR
• Sequence 3: FPR, MPA
• Sequence 4: FPA, MPR

Each participant was randomly assigned to one of the sequences. To counterbalance the dataset, we ensured an even number of participants (N = 520/4 = 130) for each sequence. Technically, the participants self-selected the sequence, as each participant could only take one survey. The participants were excluded from answering in more than one survey based on their (anonymous) Respondent ID. The gender distribution for each of the four sequences was as follows: S1 (M:
Table 1. The results of crowd evaluation based on a majority vote of the picture quality. Most frequent class bolded. Example facial
image from each of the 5 classes shown for comparison.
Table 2. Agreement metrics for the crowdsourced ratings showing satisfactory internal validity.

Measure   Value   SE      95% CI           p
PA        86.2%   0.6%    85.27%, 87.2%    –
AC1       0.627   0.017   0.59, 0.66       <0.001
41.5%, F: 58.5%), S2 (M: 39.0%, F: 61.0%), S3 (M: 39.8%, F: 60.2%), S4 (M: 34.5%, F: 64.5%).
5.2. Recruitment of participants
We created a survey for each sequence. In each survey, we (a) explain to participants what the research is about and what personas are (a persona is defined as a fictive person describing a specific customer group). Then, we (b) show an example persona with explanations of the content, and (c) explain the task scenario ('Imagine that you are creating a YouTube video for the target group that the persona you will be shown next describes.'). After this, (d) the participants are shown one of the four treatments, asked to review the information carefully, and complete the PPS questionnaire.
In total, 520 participants were recruited using Prolific, an online survey platform often applied in social science research (Palan and Schitter 2018). Prolific was chosen for this evaluation step (as opposed to the previously used Figure Eight), as it provides background information on the participants (e.g. gender, age) that can be deployed for further analyses. The average age of the participants was 35 years (SD = 7.2), with 59.1% being female, overall. The nationality of the participants was the United Kingdom, they had at least an undergraduate degree, and none were students. We verified the quality of the answers using an attention check question ('It's important that you pay attention to this study. Please select "Slightly agree".'). Out of 520 answers, 19 (3.7%) failed the attention check; these answers were removed. In addition, five answers were timed out by the Prolific platform. Therefore, we ended up with 496 qualified participants (95.4% of the total).
5.3. Measurement
The perceptions are measured using the PPS (Salminen et al. 2018c), a survey instrument measuring what individuals think about specific personas (see Table 3). The PPS has previously been deployed in several persona experiments (see [Salminen et al. 2019f; Salminen et al. 2019g]). Note that the authenticity construct is similar to constructs in earlier artificial image evaluation, specifically realism (Zhou et al. 2019b) and naturalness (Iizuka, Simo-Serra, and Ishikawa 2017). However, the other constructs expand the perceptions typically used for image evaluation. In this sense, the hypotheses (a) add novelty to the measurement of user perceptions regarding the employment of artificial images in a real system output, and (b) are relevant for the design and use of personas.
5.4. Analysis procedure
The participants were grouped based on the persona presented (either the male or the female one), and whether the persona picture was artificial or real. The data was re-arranged to disentangle the gender of the persona, leading to one male-persona dataset (with a 'real' and an 'artificial' group), and a female-persona dataset with similar groups. This allowed the usage of a standard MANOVA (Hair et al. 2009) to determine whether the measurements differed across artificial and real pictures; both genders were analysed independently.

To enhance the robustness of the findings, Bayesian independent samples tests were used to estimate Bayes Factors (BF), comparing the likelihoods between the null and alternative hypotheses (Lee 2014). A naïve approach was employed with regard to priors.
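The paper reports Bayes Factors without spelling out the estimator; as a minimal sketch, a BF favouring the null over a two-group alternative can be approximated from BIC values (the common approximation of Wagenmakers 2007). The rating lists below are hypothetical 7-point scores, and this is not necessarily the authors' exact procedure.

```python
import math

def bic(sse, n, k):
    # BIC for a Gaussian model with k mean parameters and unknown variance
    return n * math.log(sse / n) + k * math.log(n)

def bf01(group_a, group_b):
    """BIC-approximated Bayes Factor: BF01 > 1 favours the null
    hypothesis of no difference between the two group means."""
    data = group_a + group_b
    n = len(data)
    grand_mean = sum(data) / n
    sse_null = sum((x - grand_mean) ** 2 for x in data)  # one common mean
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    sse_alt = sum((x - mean_a) ** 2 for x in group_a) + sum(
        (x - mean_b) ** 2 for x in group_b
    )
    return math.exp((bic(sse_alt, n, 2) - bic(sse_null, n, 1)) / 2)
```

Identical groups yield BF01 > 1 (evidence for the null), while clearly separated groups drive BF01 toward zero, mirroring how the BF values are interpreted in the results below.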
Figure 4. Artificial male picture [A], Real male picture [B], Artificial female picture [C], and Real female picture [D]. Among the male/female personas, all other content in the persona profile was the same except the picture, which alternated between Artificial and Real. Pictures of the full persona profiles are provided in Supplementary Material.
5.5. Results
Male persona: Beginning with the multivariate tests, no significant effects were registered for Type of Picture (Pillai's Trace = 0.017, F(5, 476) = 1.651, η²p = 0.017, p = 0.145), indicating that none of the measurements differed across male real and artificial pictures for the persona profile. Nevertheless, we proceeded with an analysis of univariate tests, which confirmed that none of the measurements differed across types of pictures. The univariate differences for between-subjects are summarised in Table 4.
The lack of differences in scale ratings (see Figure 5) also indicates that the use of real or artificial pictures results in no differences for authenticity, clarity, empathy, or willingness to use for the male persona.

The Bayesian analysis on the male persona indicates a strong lack of evidence for differences regarding clarity (BF = 13.856; F(1, 480) = 0.002; p = 0.965) and willingness to use (BF = 10.030; F(1, 480) = 0.658; p = 0.418), and a moderate lack of evidence for differences regarding authenticity (BF = 5.126; F(1, 480) = 2.023; p = 0.156) and empathy (BF = 5.040; F(1, 480) = 2.057; p = 0.152) (Jeffreys 1998).
Female persona: Beginning with the multivariate tests, unlike with the male persona, significant effects were registered for Type of Picture (Pillai's Trace = 0.051, F(5, 476) = 5.081, η²p = 0.051, p < 0.001), indicating that at least one of the measurements differed between real and artificial pictures for the female persona. Thus, we proceeded with univariate testing to determine which of the measurements exhibited differences across picture type (see Table 4). Authenticity had significant differences across types of picture (BF = 0.032; F(1, 480) = 12.479, p < 0.001). Artificial female pictures were perceived as more authentic (M = 5.075, SD = 1.016) than real pictures (M = 4.711, SD = 1.235). None of the other measurements differed across types of pictures. Figure 6 illustrates the comparison between the two groups for the female persona.
This was corroborated by the Bayes Factors, which indicate a strong lack of evidence regarding differences for clarity (BF = 13.865; F(1, 480) = 0.001, p = 0.980) and willingness to use (BF = 13.828; F(1, 480) = 0.006, p = 0.938), and a moderate lack of evidence for empathy (BF = 8.290; F(1, 480) = 1.045, p = 0.307) (Jeffreys 1998).
Finally, as one of the statements in the PPS specifically dealt with the picture of the persona (Item 3: 'The picture of the persona looks authentic.'), we inspected the mean scores of this statement separately. In line with our other findings, the artificial female persona picture is in fact considered to be more authentic than the real photograph (M_FPA = 5.74 vs. M_FPR = 4.89). This difference is statistically significant (t(480) = 6.896, p < 0.001). For
Table 3. Survey statements. The participants answered using a 7-point Likert scale, ranging from Strongly Disagree to Strongly Agree. The statements were validated in (Salminen et al. 2018c). WTU = willingness to use.

Perception          Statements
Authenticity        The persona seems like a real person.
                    I have met people like this persona.
                    The picture of the persona looks authentic.
                    The persona seems to have a personality.
Clarity             The information about the persona is well presented.
                    The text in the persona profile is clear enough to read.
                    The information in the persona profile is easy to understand.
Empathy             I feel like I understand this persona.
                    I feel strong ties to this persona.
                    I can imagine a day in the life of this persona.
Willingness To Use  I would like to know more about this persona.
                    This persona would improve my ability to make decisions about the customers it describes.
                    I would make use of this persona in my task [of creating a YouTube video].
Table 4. Univariate tests for between-subjects effects (df(error) = 1(480)).

                                                            Male persona               Female persona
Independent variable                  Dependent variable    F       η²p     p-value    F       η²p     p-value
Type of Picture (real or artificial)  Authenticity          2.023   0.004   0.156      12.479  0.025   <0.001
                                      Clarity               0.002   <0.001  0.965      0.001   <0.001  0.980
                                      Empathy               2.057   0.004   0.152      1.045   0.002   0.307
                                      WTU                   0.658   0.001   0.418      0.006   <0.001  0.938
the male persona, differences are minimal (M_MPA = 5.11 vs. M_MPR = 5.14) and not statistically significant (t(480) = −0.187, p = 0.851).
In summary, for H01, there were no significant differences in the perceptions for the male persona; however, for the female persona, artificial pictures actually increased the perceived authenticity. For the other perceptions, there was no significant change when replacing the real photo with the artificially generated picture. Therefore,

• There was no evidence that using artificial pictures decreases the perceived authenticity of the persona (H01: supported).
• There was no evidence that using artificial pictures decreases the clarity of the persona profile (H02: supported).
• There was no evidence that using artificial pictures decreases empathy towards the persona (H03: supported).
• There was no evidence that using artificial pictures decreases the willingness to use the persona (H04: supported).
6. Discussion
6.1. Can artificial pictures be used for DDPs?
Our analysis focuses on a timely problem in a relevant, yet underexplored area. However, it is one of increasing importance in a media-rich online environment (Church, Iyer, and Zhao 2019). The impact of artificial facial pictures on user perceptions has not been studied thoroughly in previous HCI design literature. The lack of applied user studies is understandable given that, until recently, the generated facial pictures were not close to realistic, so the research focus was on improving algorithms. However, as the quality of the facial pictures improves, the focus ought to shift towards evaluation studies in real-world use cases, systems, and applications. As there is a lack of literature in this regard, the research presented here represents a step forward in analysing the use of artificially generated facial pictures in real systems.
In terms of results, the crowd evaluation suggests that more than half of the artificial pictures are considered either perfect or high quality. The ratio of 'perfect' and 'high-quality' pictures to the rest is around 1.5, implying that most of the pictures are satisfactory according to the guidelines we provided. The persona perception analysis shows that the use of artificial pictures vs. real pictures in persona profiles does not reduce the authenticity of the persona or people's willingness to use the persona, two crucial concerns of persona applicability. Therefore, we find the state of the art of AIG satisfactory for personas and most likely for other systems requiring the substantial use of facial images. So, it is possible to replace the need for manually retrieving pictures from online photo banks with a process of automatically generated pictures.
6.2. Gender differences in perception
Regarding the female persona with an artificial picture being perceived as more authentic, we surmise that there might be a 'stock photo' effect involved, rather than a gender effect. This proposition is backed up by previous findings of stock photos being perceived differently by individuals than non-stock photos (Salminen et al. 2019f). Visually, to the respondents, the real photo chosen for the female persona appears different from the one chosen for the male persona (see Figure 4). It is difficult to explain or quantify why this is. We interpret this finding such that the choice of pictures for a persona profile, and perhaps other system contexts, is a delicate matter; even small nuances can affect user perceptions.

This interpretation is generally in line with previous HCI research regarding the foundational impact of photos in persona profiles (Hill et al. 2017; Salminen et al. 2018d; Salminen et al. 2019b). Possibly, stock photos can appear, at times, less realistic than photos of real people because they are too shiny, too perfect
Figure 5. Means of the scale variables for the male persona. Error bars indicate standard error.

Figure 6. Means of the scale variables for the female persona. Error bars indicate standard error.
(or 'too smiling' [Salminen et al. 2019e]). Thus, if the generator's outputs are closer to real people than stock photos in their appearance, it is possible that these pictures are deemed more realistic than stock photos. However, this does not explain why the effect was found for the female persona and not for the male one. The only way to establish if there is a gender effect that influences perceptions of stock photos is to conduct repeated experiments with stock photos of different people. In addition to repeated experiments, for future research, the gender difference suggests another variable to consider: the degree of photo-editing or 'shiny factor' (i.e. how polished the stock photo is and how this affects persona perceptions). The proper adjustment of this variable is best ensured via a manipulation check.
6.3. Wider implications for HCI and system implementation

In a broader context, the manuscript contributes to evaluating a machine learning tool for UX/UI design work. For the use of artificial images, guidelines are provided.
• Solutions for Mitigating Subjectivity: The results indicate that evaluating the quality of artificial facial pictures contains a moderate to high level of subjectivity, making reliable evaluation for production systems costlier. We hypothesise that there will always be some degree of subjectivity, as individuals vary in their ability to pay attention to details. This can be partially remedied by choosing the pictures with the highest agreement between the raters, or by using a binary rating scale (i.e. 'good enough' vs. 'not good enough'), as agreement is generally easier to obtain with fewer classes (Alonso 2015). The observed 'disagreement' may be partly fallacious because people might agree whether a picture is either usable (4 or 5) or non-usable (1 or 2), but the exact agreement between 4 or 5, for example, is lower. As stated, for practical purposes, it does not matter if a picture is 'Perfect' or 'High quality', as both classes are decent, at least for this use case.
• Handling of Borderline Cases. Regarding pictures to use in a production system, we recommend a borderline principle: if in doubt of the picture quality, reject it. The marginal cost of generating new pictures is diminishingly low, but showing a low-quality picture decreases user experience, sometimes drastically. For this reason, the economics of automatic image generation are in favour of rejecting borderline images rather than letting through distorted images. However, rejecting borderline images does increase the total cost of evaluation because, to obtain n useful pictures, one now has to obtain n × (1 + false positive rate) ratings, which is (n × (1 + false positive rate) − n)/n ratings more than n ratings. Additionally, as we have shown, the higher the disagreement among the crowd raters, the more ratings are required.
• Final Choice for Humans. In evaluating the suitability of artificial pictures for use in real applications, domain expertise is needed because, irrespective of quality guidelines, the crowd may have different quality standards than domain experts. For example, the crowd can be used to filter out low-quality photos, but the better-quality photos should be evaluated specifically by domain experts, as different domains likely have different quality standards. For personas, the pictures need to be of high quality, but when implementing them in the system, they are cropped to a smaller resolution that helps obfuscate minor errors.
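The rating-cost arithmetic in the borderline-case guideline above can be made concrete. The quantities below (100 useful pictures, 20% rejection rate, 3 raters per picture) are illustrative, not figures from the study.

```python
import math

# Cost of the borderline-rejection policy: to end up with n useful
# pictures, one must generate and rate n * (1 + false_positive_rate)
# candidates, each needing a fixed number of crowd ratings.
def ratings_needed(n_useful, false_positive_rate, ratings_per_picture=3):
    candidates = math.ceil(n_useful * (1 + false_positive_rate))
    return candidates * ratings_per_picture

# 100 usable pictures at a 20% rejection rate with 3 raters each:
total = ratings_needed(100, 0.20)   # 120 pictures * 3 ratings = 360
overhead = total - 100 * 3          # 60 extra ratings from rejections
```

The overhead grows linearly with the rejection rate, which is why a low per-picture generation cost makes the reject-when-in-doubt policy economical.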
6.4. Future research avenues
The following avenues for future research are proposed.
• Suitability in Other Domains. For example, how do quality standards and requirements by users and organizations differ across domains and use cases? How well are artificial (fake) pictures detected by end users, such as consumers and voters? This research ties in with the nascent field of 'deep fakes' (Yang, Li, and Lyu 2019), i.e. images and videos purposefully manipulated for a political or commercial agenda. To this end, future studies could investigate the wider impact of using AI-generated images for profile pictures on sharing economy platforms, or social media and news sites, and how that impacts user perceptions, such as trust. Another interesting domain for suitability studies is marketing, as facial pictures are widely deployed to advertise products such as fashion and luxury items.
• Algorithmic Bias. It would be important to investigate if the generated pictures involve an algorithmic bias; given that the training data may be biased, it would be worthwhile to analyse how diverse the generated pictures are for different ethnicities, ages, and genders. Regarding persona perceptions, race could be a confounding factor in our research and should be analysed separately in future research. A related question is: does the picture quality vary by demographic factors such as gender and race? Studies on algorithmic bias have been carried out within the HCI community (Eslami et al. 2018; Salminen et al. 2019b) and should be extended to this context.
• Demographically Conditional Images. For future development, we envision a system that automatically generates persona-specific pictures based on specific features/attributes of the personas; this would enable 'on-demand' picture creation for new personas generated by APG, whereas currently, the pictures need to be manually tagged for age, gender, and country.
7. Conclusion
Our research goal was to evaluate the applicability of artificial pictures for personas along two dimensions: their quality and their impact on user perceptions. We found that more than half of the pictures were rated as perfect or high quality, with none as unusable. Moreover, the use of artificial pictures did not decrease the perceptions of personas that are found important in the HCI literature. These results can be considered a vote of confidence for the current state of technology concerning the automatic generation of facial pictures and their use in data-driven persona profiles.
Notes
1. A demo of the system can be accessed at https://persona.qcri.org
2. https://persona.qcri.org
3. https://github.com/NVlabs/stylegan
4. https://www.tensorflow.org/
5. https://github.com/NVlabs/stylegan
6. Confidence is defined as agreement adjusted by the trust score of each rater.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
Ablanedo, J., E. Fairchild, T. Griffith, and C. Rodeheffer. 2018. "Is This Person Real? Avatar Stylization and Its Influence on Human Perception in a Counseling Training Environment." In International Conference on Virtual, Augmented and Mixed Reality, edited by J. Chen and G. Fragomeni, 279–289. Lecture Notes in Computer Science, vol. 10909. Cham: Springer. doi:10.1007/978-3-319-91581-4_20.
Alam, F., F. Ofli, and M. Imran. 2018. "CrisisMMD: Multimodal Twitter Datasets from Natural Disasters." In Twelfth International AAAI Conference on Web and Social Media. Palo Alto, CA: AAAI.
Alonso, O. 2015. "Practical Lessons for Gathering Quality Labels at Scale." In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1089–1092. doi:10.1145/2766462.2776778.
An, J., H. Kwak, S. Jung, J. Salminen, and B. J. Jansen. 2018a. "Customer Segmentation Using Online Platforms: Isolating Behavioral and Demographic Segments for Persona Creation via Aggregated User Data." Social Network Analysis and Mining 8 (1). doi:10.1007/s13278-018-0531-0.
An, J., H. Kwak, J. Salminen, S. Jung, and B. J. Jansen. 2018b. "Imaginary People Representing Real Numbers: Generating Personas From Online Social Media Data." ACM Transactions on the Web (TWEB) 12 (4): Article No. 27. doi:10.1145/3265986.
Antipov, G., M. Baccouche, and J.-L. Dugelay. 2017. "Face Aging with Conditional Generative Adversarial Networks." In 2017 IEEE International Conference on Image Processing (ICIP), 2089–2093. Beijing: IEEE.
Araujo, T. 2018. "Living up to the Chatbot Hype: The Influence of Anthropomorphic Design Cues and Communicative Agency Framing on Conversational Agent and Company Perceptions." Computers in Human Behavior 85: 183–189. doi:10.1016/j.chb.2018.03.051.
Ashraf, M., N. I. Jaafar, and A. Sulaiman. 2019. "System- vs. Consumer-Generated Recommendations: Affective and Social-Psychological Effects on Purchase Intention." Behaviour & Information Technology 38 (12): 1259–1272. doi:10.1080/0144929X.2019.1583285.
Banerjee, M., M. Capozzoli, L. McSweeney, and D. Sinha. 1999. "Beyond Kappa: A Review of Interrater Agreement Measures." Canadian Journal of Statistics 27 (1): 3–23. doi:10.2307/3315487.
Barratt, S., and R. Sharma. 2018. "A Note on the Inception Score." ArXiv:1801.01973 [Cs, Stat]. http://arxiv.org/abs/1801.01973.
Barrett, P. 2007. "Structural Equation Modelling: Adjudging Model Fit." Personality and Individual Differences 42 (5): 815–824.
Baxter, K., C. Courage, and K. Caine. 2015. Understanding Your Users: A Practical Guide to User Requirements Methods, Tools, and Techniques. 2nd ed. Burlington, MA: Morgan Kaufmann.
Bazzano, A. N., J. Martin, E. Hicks, M. Faughnan, and L. Murphy. 2017. "Human-centred Design in Global Health: A Scoping Review of Applications and Contexts." PloS One 12 (11).
Brangier, E., and C. Bornet. 2011. "Persona: A Method to Produce Representations Focused on Consumers' Needs." In Human Factors and Ergonomics in Consumer Product Design, edited by Waldemar Karwowski, Marcelo M. Soares, and Neville A. Stanton, 37–61. Taylor and Francis.
Brauner, P., R. Philipsen, A. C. Valdez, and M. Ziefle. 2019. "What Happens when Decision Support Systems Fail? The Importance of Usability on Performance in Erroneous Systems." Behaviour & Information Technology 38 (12): 1225–1242. doi:10.1080/0144929X.2019.1581258.
Brickey, J., S. Walczak, and T. Burgess. 2012. "Comparing Semi-Automated Clustering Methods for Persona Development." IEEE Transactions on Software Engineering 38 (3): 537–546.
Chapman, C. N., E. Love, R. P. Milham, P. ElRif, and J. L. Alford. 2008. "Quantitative Evaluation of Personas as Information." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 52 (16): 1107–1111. doi:10.1177/154193120805201602.
Chapman, C. N., and R. P. Milham. 2006. "The Personas' New Clothes: Methodological and Practical Arguments against a Popular Method." Proceedings of the Human Factors and Ergonomics Society Annual Meeting 50 (5): 634–636. doi:10.1177/154193120605000503.
Chen, A., Z. Chen, G. Zhang, K. Mitchell, and J. Yu. 2019. "Photo-Realistic Facial Details Synthesis from Single Image." In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 9428–9438. doi:10.1109/ICCV.2019.00952.
Choi, Y., M. Choi, M. Kim, J.-W. Ha, S. Kim, and J. Choo. 2018. "StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 8789–8797. Salt Lake City, UT: IEEE.
Church, E. M., L. Iyer, and X. Zhao. 2019. "Pictures Tell a Story: Antecedents of Rich-Media Curation in Social Network Sites." Behaviour & Information Technology 38 (4): 361–374. doi:10.1080/0144929X.2018.1535620.
Cooper, A. 1999. The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity. 1st ed. Carmel, IN: Sams Pearson Education.
Cooper, A. 2004. The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity. 2nd ed. Carmel, IN: Pearson Higher Education.
Dey, R., F. Juefei-Xu, V. N. Boddeti, and M. Savvides. 2019. "RankGAN: A Maximum Margin Ranking GAN for Generating Faces." In Computer Vision – ACCV 2018 (Vol. 11363). Cham: Springer. doi:10.1007/978-3-030-20893-6_1.
Di, X., and V. M. Patel. 2017. "Face Synthesis from Visual Attributes via Sketch Using Conditional VAEs and GANs." ArXiv:1801.00077 [Cs]. http://arxiv.org/abs/1801.00077.
Di, X., V. A. Sindagi, and V. M. Patel. 2018. "GP-GAN: Gender Preserving GAN for Synthesizing Faces from Landmarks." In 2018 24th International Conference on Pattern Recognition (ICPR), 1079–1084. doi:10.1109/ICPR.2018.8545081.
Dong, J., K. Kelkar, and K. Braun. 2007. "Getting the Most Out of Personas for Product Usability Enhancements." In Usability and Internationalization. HCI and Culture, 291–296. http://www.springerlink.com/index/C0U2718G14HG1263.pdf.
dos Santos, T. F., D. G. de Castro, A. A. Masiero, and P. T. A. Junior. 2014. "Behavioral Persona for Human-Robot Interaction: A Study Based on Pet Robot." In International Conference on Human-Computer Interaction, edited by M. Kurosu, 687–696.
Duffy, B. R. 2003. "Anthropomorphism and the Social Robot." Robotics and Autonomous Systems 42 (3–4): 177–190.
Edwards, A., C. Edwards, P. R. Spence, C. Harris, and A. Gambino. 2016. "Robots in the Classroom: Differences in Students' Perceptions of Credibility and Learning Between 'Teacher as Robot' and 'Robot as Teacher'." Computers in Human Behavior 65: 627–634. doi:10.1016/j.chb.2016.06.005.
Eslami, M., S. R. Krishna Kumaran, C. Sandvig, and K. Karahalios. 2018. "Communicating Algorithmic Process in Online Behavioral Advertising." In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Paper 432. New York, NY, USA.
Faily, S., and I. Flechais. 2011. "Persona Cases: A Technique for Grounding Personas." In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2267–2270. doi:10.1145/1978942.1979274.
Friess, E. 2012. "Personas and Decision Making in the Design Process: An Ethnographic Case Study." In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1209–1218. doi:10.1145/2207676.2208572.
Gao, F., J. Zhu, H. Jiang, Z. Niu, W. Han, and J. Yu. 2020. "Incremental Focal Loss GANs." Information Processing & Management 57 (3): 102192. doi:10.1016/j.ipm.2019.102192.
Gecer, B., B. Bhattarai, J. Kittler, and T.-K. Kim. 2018. "Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model." In Computer Vision – ECCV 2018 (Vol. 11215), 230–248. Cham: Springer. doi:10.1007/978-3-030-01252-6_14.
Go, E., and S. Shyam Sundar. 2019. "Humanizing Chatbots: The Effects of Visual, Identity and Conversational Cues on Humanness Perceptions." Computers in Human Behavior. doi:10.1016/j.chb.2019.01.020.
Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. 2014. "Generative Adversarial Nets." In Advances in Neural Information Processing Systems 27, edited by Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, 2672–2680. Curran Associates, Inc. http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf.
Goodwin, K. 2009. Designing for the Digital Age: How to Create Human-Centered Products and Services. 1st ed. New York, NY: Wiley.
Gwet, K. L. 2008. "Computing Inter-Rater Reliability and Its Variance in the Presence of High Agreement." British Journal of Mathematical and Statistical Psychology 61 (1): 29–48. doi:10.1348/000711006X126600.
Hair, J. F., W. C. Black, B. J. Babin, and R. E. Anderson. 2009. Multivariate Data Analysis. 7th ed. New York, NY: Pearson.
Heusel, M., H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. 2017. "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium." Advances in Neural Information Processing Systems, 6626–6637.
Hill, C. G., M. Haag, A. Oleson, C. Mendez, N. Marsden, A. Sarma, and M. Burnett. 2017. "Gender-Inclusiveness Personas vs. Stereotyping: Can We Have It Both Ways?" In Proceedings of the 2017 CHI Conference, 6658–6671. doi:10.1145/3025453.3025609.
Holz, T., M. Dragone, and G. M. O'Hare. 2009. "Where Robots and Virtual Agents Meet." International Journal of Social Robotics 1 (1): 83–93.
Hong, B. B., E. Bohemia, R. Neubauer, and L. Santamaria. 2018. "Designing for Users: The Global Studio." In DS 93: Proceedings of the 20th International Conference on Engineering and Product Design Education (E&PDE 2018), Dyson School of Engineering, Imperial College, London, 6th–7th September 2018, 738–743.
Huang, W., I. Weber, and S. Vieweg. 2014. "Inferring Nationalities of Twitter Users and Studying Inter-National Linking." ACM HyperText Conference. https://works.bepress.com/vieweg/18/.
Idoughi, D., A. Seffah, and C. Kolski. 2012. "Adding User Experience into the Interactive Service Design Loop: A
14 J. SALMINEN ET AL.
Persona-Based Approach.Behaviour & Information
Technology 31 (3): 287303. doi:10.1080/0144929X.2011.
563799.
Iizuka, S., E. Simo-Serra, and H. Ishikawa. 2017. "Globally and Locally Consistent Image Completion." ACM Transactions on Graphics 36 (4): 107:1–107:14. doi:10.1145/3072959.3073659.
Isola, P., J.-Y. Zhu, T. Zhou, and A. A. Efros. 2017. "Image-to-Image Translation with Conditional Adversarial Networks." In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Honolulu, HI, 5967–5976.
Jansen, B. J., J. O. Salminen, and S.-G. Jung. 2020. "Data-Driven Personas for Enhanced User Understanding: Combining Empathy with Rationality for Better Insights to Analytics." Data and Information Management 4 (1): 1–17. doi:10.2478/dim-2020-0005.
Jansen, A., M. Van Mechelen, and K. Slegers. 2017. "Personas and Behavioral Theories: A Case Study Using Self-Determination Theory to Construct Overweight Personas." In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2127–2136. doi:10.1145/3025453.3026003.
Jeffreys, H. 1998. Theory of Probability. 3rd ed. Oxford: Oxford University Press.
Jenkinson, A. 1994. "Beyond Segmentation." Journal of Targeting, Measurement and Analysis for Marketing 3 (1): 60–72.
Jung, S., J. Salminen, J. An, H. Kwak, and B. J. Jansen. 2018a. "Automatically Conceptualizing Social Media Analytics Data via Personas." In Proceedings of the International AAAI Conference on Web and Social Media (ICWSM 2018), San Francisco, California, USA, June 25.
Jung, S., J. Salminen, H. Kwak, J. An, and B. J. Jansen. 2018b. "Automatic Persona Generation (APG): A Rationale and Demonstration." In Proceedings of the ACM 2018 Conference on Human Information Interaction & Retrieval, ACM, New Brunswick, NJ, 321–324.
Karras, T., S. Laine, and T. Aila. 2019. "A Style-Based Generator Architecture for Generative Adversarial Networks." In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4396–4405. doi:10.1109/CVPR.2019.00453.
King, A. J., A. J. Lazard, and S. R. White. 2020. "The Influence of Visual Complexity on Initial User Impressions: Testing the Persuasive Model of Web Design." Behaviour & Information Technology 39 (5): 497–510. doi:10.1080/0144929X.2019.1602167.
Lee, M. D. 2014. Bayesian Cognitive Modeling: A Practical Course. Cambridge: Cambridge University Press.
Lee, H.-Y., H.-Y. Tseng, J.-B. Huang, M. K. Singh, and M.-H. Yang. 2018. "Diverse Image-to-Image Translation via Disentangled Representations." In Computer Vision – ECCV 2018 (Vol. 11205). Cham: Springer. doi:10.1007/978-3-030-01246-5_3.
Li, T., R. Qian, C. Dong, S. Liu, Q. Yan, W. Zhu, and L. Lin. 2018. "BeautyGAN: Instance-Level Facial Makeup Transfer with Deep Generative Adversarial Network." In Proceedings of the 26th ACM International Conference on Multimedia, 645–653. doi:10.1145/3240508.3240618.
Lin, C. H., C.-C. Chang, Y.-S. Chen, D.-C. Juan, W. Wei, and H.-T. Chen. 2019. "COCO-GAN: Generation by Parts via Conditional Coordinating." In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 4511–4520. doi:10.1109/ICCV.2019.00461.
Liu, Y., Z. Qin, T. Wan, and Z. Luo. 2018. "Auto-painter: Cartoon Image Generation from Sketch by Using Conditional Wasserstein Generative Adversarial Networks." Neurocomputing 311: 78–87. doi:10.1016/j.neucom.2018.05.045.
Liu, S., Y. Sun, D. Zhu, R. Bao, W. Wang, X. Shu, and S. Yan. 2017. "Face Aging with Contextual Generative Adversarial Nets." In Proceedings of the 25th ACM International Conference on Multimedia, 82–90. doi:10.1145/3123266.3123431.
Long, F. 2009. "Real or Imaginary: The Effectiveness of Using Personas in Product Design." In Proceedings of the Irish Ergonomics Society Annual Conference, Dublin, Ireland, 1–4.
Lu, Y., Y.-W. Tai, and C.-K. Tang. 2017. "Conditional CycleGAN for Attribute Guided Face Image Generation." ArXiv preprint arXiv:1705.09966.
Matthews, T., T. Judge, and S. Whittaker. 2012. "How Do Designers and User Experience Professionals Actually Perceive and Use Personas?" In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 1219–1228. doi:10.1145/2207676.2208573.
McGinn, J. J., and N. Kotamraju. 2008. "Data-Driven Persona Development." In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, New York, NY, 1521–1524.
Neumann, A., C. Pyromallis, and B. Alexander. 2018. "Evolution of Images with Diversity and Constraints Using a Generative Adversarial Network." In International Conference on Neural Information Processing, edited by L. Cheng, A. Leung, and S. Ozawa, 452–465.
Nie, D., R. Trullo, J. Lian, C. Petitjean, S. Ruan, Q. Wang, and D. Shen. 2017. "Medical Image Synthesis with Context-Aware Generative Adversarial Networks." In International Conference on Medical Image Computing and Computer-Assisted Intervention, edited by M. Descoteaux, L. Maier-Hein, A. Franz, P. Jannin, D. Collins, and S. Duchesne, 417–425.
Nielsen, L. 2019. Personas – User Focused Design. 2nd ed. Springer.
Nielsen, L., K. S. Hansen, J. Stage, and J. Billestrup. 2015. "A Template for Design Personas: Analysis of 47 Persona Descriptions from Danish Industries and Organizations." International Journal of Sociotechnology and Knowledge Development 7 (1): 45–61. doi:10.4018/ijskd.2015010104.
Nielsen, L., S.-G. Jung, J. An, J. Salminen, H. Kwak, and B. J. Jansen. 2017. "Who Are Your Users?: Comparing Media Professionals' Preconception of Users to Data-Driven Personas." In Proceedings of the 29th Australian Conference on Computer-Human Interaction, 602–606. doi:10.1145/3152771.3156178.
Nielsen, L., and K. Storgaard Hansen. 2014. "Personas Is Applicable: A Study on the Use of Personas in Denmark." In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, New York, NY, 1665–1674.
Özmen, M. U., and E. Yucel. 2019. "Handling of Online Information by Users: Evidence from TED Talks." Behaviour & Information Technology 38 (12): 1309–1323. doi:10.1080/0144929X.2019.1584244.
Palan, S., and C. Schitter. 2018. "Prolific.ac – A Subject Pool for Online Experiments." Journal of Behavioral and Experimental Finance 17: 22–27.
Pitkänen, L., and J. Salminen. 2013. "Managing the Crowd: A Study on Videography Application." In Proceedings of Applied Business and Entrepreneurship Association International (ABEAI), November.
Pröbster, M., M. E. Haque, and N. Marsden. 2018. "Perceptions of Personas: The Role of Instructions." In 2018 IEEE International Conference on Engineering, Technology and Innovation (ICE/ITMC), 1–8. doi:10.1109/ICE.2018.8436339.
Pruitt, J., and T. Adlin. 2006. The Persona Lifecycle: Keeping People in Mind Throughout Product Design. 1st ed. Morgan Kaufmann.
Rönkkö, K. 2005. "An Empirical Study Demonstrating How Different Design Constraints, Project Organization and Contexts Limited the Utility of Personas." In Proceedings of the 38th Annual Hawaii International Conference on System Sciences – Volume 08. doi:10.1109/HICSS.2005.85.
Rönkkö, K., M. Hellman, B. Kilander, and Y. Dittrich. 2004. "Personas Is Not Applicable: Local Remedies Interpreted in a Wider Context." In Proceedings of the Eighth Conference on Participatory Design: Artful Integration: Interweaving Media, Materials and Practices – Volume 1, 112–120. doi:10.1145/1011870.1011884.
Salimans, T., I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, and X. Chen. 2016. "Improved Techniques for Training GANs." In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 2234–2242. Curran Associates, Inc. http://papers.nips.cc/paper/6125-improved-techniques-for-training-gans.pdf.
Salminen, J., H. Almerekhi, P. Dey, and B. J. Jansen. 2018a. "Inter-Rater Agreement for Social Computing Studies." In Proceedings of the Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS 2018), Valencia, Spain, October 15.
Salminen, J., B. J. Jansen, J. An, H. Kwak, and S. Jung. 2018b. "Are Personas Done? Evaluating Their Usefulness in the Age of Digital Analytics." Persona Studies 4 (2): 47–65. doi:10.21153/psj2018vol4no2art737.
Salminen, J., B. J. Jansen, J. An, H. Kwak, and S.-G. Jung. 2019a. "Automatic Persona Generation for Online Content Creators: Conceptual Rationale and a Research Agenda." In Personas – User Focused Design, edited by L. Nielsen, 135–160. London: Springer. doi:10.1007/978-1-4471-7427-1_8.
Salminen, J., S. Jung, J. An, H. Kwak, L. Nielsen, and B. J. Jansen. 2019b. "Confusion and Information Triggered by Photos in Persona Profiles." International Journal of Human-Computer Studies 129: 1–14. doi:10.1016/j.ijhcs.2019.03.005.
Salminen, J., S.-G. Jung, and B. J. Jansen. 2019c. "Detecting Demographic Bias in Automatically Generated Personas." In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, LBW0122:1–LBW0122:6. doi:10.1145/3290607.3313034.
Salminen, J., S. G. Jung, and B. J. Jansen. 2019d. "The Future of Data-Driven Personas: A Marriage of Online Analytics Numbers and Human Attributes." In ICEIS 2019 – Proceedings of the 21st International Conference on Enterprise Information Systems, 596–603. https://pennstate.pure.elsevier.com/en/publications/the-future-of-data-driven-personas-a-marriage-of-online-analytics.
Salminen, J., S.-G. Jung, J. M. Santos, and B. J. Jansen. 2019e. "The Effect of Smiling Pictures on Perceptions of Personas." In UMAP'19 Adjunct: Adjunct Publication of the 27th Conference on User Modeling, Adaptation and Personalization. doi:10.1145/3314183.3324973.
Salminen, J., S.-G. Jung, J. M. Santos, and B. J. Jansen. 2019f. "Does a Smile Matter if the Person Is Not Real?: The Effect of a Smile and Stock Photos on Persona Perceptions." International Journal of Human–Computer Interaction 0 (0): 1–23. doi:10.1080/10447318.2019.1664068.
Salminen, J., H. Kwak, J. M. Santos, S.-G. Jung, J. An, and B. J. Jansen. 2018c. "Persona Perception Scale: Developing and Validating an Instrument for Human-Like Representations of Data." In CHI'18 Extended Abstracts: CHI Conference on Human Factors in Computing Systems Extended Abstracts Proceedings, Montréal, Canada. doi:10.1145/3170427.3188461.
Salminen, J., L. Nielsen, S.-G. Jung, J. An, H. Kwak, and B. J. Jansen. 2018d. "'Is More Better?': Impact of Multiple Photos on Perception of Persona Profiles." In Proceedings of the ACM CHI Conference on Human Factors in Computing Systems (CHI2018), ACM, Montréal, Canada, Paper 317.
Salminen, J., J. M. Santos, S.-G. Jung, M. Eslami, and B. J. Jansen. 2019g. "Persona Transparency: Analyzing the Impact of Explanations on Perceptions of Data-Driven Personas." International Journal of Human–Computer Interaction 0 (0): 1–13. doi:10.1080/10447318.2019.1688946.
Salminen, J., S. Şengün, H. Kwak, B. J. Jansen, J. An, S. Jung, S. Vieweg, and F. Harrell. 2017. "Generating Cultural Personas from Social Data: A Perspective of Middle Eastern Users." In Proceedings of the Fourth International Symposium on Social Networks Analysis, Management and Security (SNAMS-2017), IEEE, Prague, Czech Republic, 1–8.
Salminen, J., S. Şengün, H. Kwak, B. J. Jansen, J. An, S. Jung, S. Vieweg, and F. Harrell. 2018e. "From 2,772 Segments to Five Personas: Summarizing a Diverse Online Audience by Generating Culturally Adapted Personas." First Monday 23 (6). doi:10.5210/fm.v23i6.8415.
Şengün, S. 2014. "A Semiotic Reading of Digital Avatars and Their Role of Uncertainty Reduction in Digital Communication." Journal of Media Critiques 1 (Special): 149–162.
Şengün, S. 2015. "Why Do I Fall for the Elf, When I Am No Orc Myself? The Implications of Virtual Avatars in Digital Communication." Comunicação e Sociedade 27: 181–193.
Shmelkov, K., C. Schmid, and K. Alahari. 2018. "How Good Is My GAN?" In Computer Vision – ECCV 2018, edited by V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss, 218–234. Springer International Publishing. doi:10.1007/978-3-030-01216-8_14.
Shmueli-Scheuer, M., T. Sandbank, D. Konopnicki, and O. P. Nakash. 2018. "Exploring the Universe of Egregious Conversations in Chatbots." In Proceedings of the 23rd International Conference on Intelligent User Interfaces Companion, ACM, New York, NY, Article 16.
Song, Y., D. Li, A. Wang, and H. Qi. 2019. "Talking Face Generation by Conditional Recurrent Adversarial Network." In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, Macao, China, 919–925.
Tan, W. R., C. S. Chan, H. E. Aguirre, and K. Tanaka. 2017. "ArtGAN: Artwork Synthesis with Conditional Categorical GANs." In 2017 IEEE International Conference on Image Processing (ICIP), IEEE, 3760–3764.
Weiss, J. K., and E. L. Cohen. 2019. "Clicking for Change: The Role of Empathy and Negative Affect on Engagement with a Charitable Social Media Campaign." Behaviour & Information Technology 38 (12): 1185–1193. doi:10.1080/0144929X.2019.1578827.
Wongpakaran, N., T. Wongpakaran, D. Wedding, and K. L. Gwet. 2013. "A Comparison of Cohen's Kappa and Gwet's AC1 when Calculating Inter-Rater Reliability Coefficients: A Study Conducted with Personality Disorder Samples." BMC Medical Research Methodology 13 (1): 61.
Yang, X., Y. Li, and S. Lyu. 2019. "Exposing Deep Fakes Using Inconsistent Head Poses." In ICASSP 2019 – 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 8261–8265.
Yin, W., Y. Fu, L. Sigal, and X. Xue. 2017. "Semi-Latent GAN: Learning to Generate and Modify Facial Images from Attributes." arXiv:1704.02166 [cs]. http://arxiv.org/abs/1704.02166.
Yuan, Y., H. Su, J. Liu, and G. Zeng. 2020. "Locally and Multiply Distorted Image Quality Assessment via Multi-Stage CNNs." Information Processing & Management 57 (4): 102175. doi:10.1016/j.ipm.2019.102175.
Zhang, X., H.-F. Brown, and A. Shankar. 2016. "Data-driven Personas: Constructing Archetypal Users with Clickstreams and User Telemetry." In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, ACM, New York, NY, 5350–5359.
Zhang, R., P. Isola, A. A. Efros, E. Shechtman, and O. Wang. 2018. "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric." In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 586–595. doi:10.1109/CVPR.2018.00068.
Zhao, R., Y. Xue, J. Cai, and Z. Gao. 2020. "Parsing Human Image by Fusing Semantic and Spatial Features: A Deep Learning Approach." Information Processing & Management 57 (6): 102306. doi:10.1016/j.ipm.2020.102306.
Zhou, M. X., W. Chen, Z. Xiao, H. Yang, T. Chi, and R. Williams. 2019a. "Getting Virtually Personal: Chatbots who Actively Listen to You and Infer Your Personality." In Proceedings of the 24th International Conference on Intelligent User Interfaces: Companion, ACM, New York, NY, 123–124.
Zhou, H., Y. Liu, Z. Liu, P. Luo, and X. Wang. 2019b. "Talking Face Generation by Adversarially Disentangled Audio-Visual Representation." Proceedings of the AAAI Conference on Artificial Intelligence 33 (01): 9299–9306. doi:10.1609/aaai.v33i01.33019299.
... Yet, current literature does not adequately address them. We are aware of only one study that addresses these questions, evaluating the applicability of AI-generated pictures in persona profiles using a sample of 496 participants (Salminen, Jung, Ahmed Mohamed Sayed Kamel, Santos, & Jansen, 2020d). The study found that using artificial images in persona profiles did not negatively affect perceptions of authenticity, clarity, empathy, or willingness to use the personas (Salminen et al., 2020d). ...
... We are aware of only one study that addresses these questions, evaluating the applicability of AI-generated pictures in persona profiles using a sample of 496 participants (Salminen, Jung, Ahmed Mohamed Sayed Kamel, Santos, & Jansen, 2020d). The study found that using artificial images in persona profiles did not negatively affect perceptions of authenticity, clarity, empathy, or willingness to use the personas (Salminen et al., 2020d). The interesting feat is that this study is from 2020, a period predating the current state-of-the-art models like DALL-E and Midjourney that generate pictures based on contextual prompts. ...
Article
HCI research is facing a vital question of the effectiveness of AI-generated personas. Addressing this question, this research explores user perceptions of AI-generated personas for textual content (GPT-4) and two image generation models (DALL-E and Midjourney). We evaluate whether the inclusion of images in AI-generated personas impacts user perception or if AI text descriptions alone suffice to create good personas. Recruiting 216 participants, we compare three GPT-generated personas without images and those with either DALL-E or Midjourney-created images. Contrary to initial expectations from persona literature, the presence of images in AI-generated personas did not significantly impact user perceptions. Rather, the participants generally perceived AI-generated personas to be of good quality regardless of the inclusion of images. These findings suggest that textual content, i.e., the persona narrative, is the primary driver of user perceptions in AI-generated personas. Our findings contribute to the ongoing AI-HCI discourse and provide recommendations for designing AI-generated personas.
... The first steps of deepfake were taken in 1997 when Bregler et al. (1997) released their paper Video Rewrite: driving visual speech with audio in which the researchers presented their innovation of using existing footage of people speaking words not present in the original footage. In recent years, deepfake technologies have become increasingly common for crafting realistic human portrayals in both images (Karras et al., 2019;Salminen et al., 2020c) and video (Tahir et al., 2021), with video being the more dominant media format. For personas, deepfakes are an attractive modality option since they could make the user representation livelier and more appealing (André et al., 1998). ...
Article
Deepfakes, realistic portrayals of people that do not exist, have garnered interest in research and industry. Yet, the contributions of deepfake technology to human-computer interaction remain unclear. One possible value of deepfake technology is to create more immersive user personas. To test this premise, we use a commercial-grade service to generate three deepfake personas (DFs). We also create counterparts of the same persona in two traditional modalities: classic and narrative personas. We then investigate how persona modality affects the perceptions and task performance of the persona user. Our findings show that the DFs were perceived as less empathetic, credible, complete, clear, and immersive than other modalities. Participants also indicated less willingness to use the DFs and less sense of control, but there were no differences in task performance. We also found a strong correlation between the uncanny valley effect and other user perceptions, implying that the tested deepfake technology might lack maturity for personas, negatively affecting user experience. Designers might also be accustomed to using traditional persona profiles. Further research is needed to investigate the potential and downsides of DFs.
... Social media data can be used to process standard demographics and segment customer types based on product parameters and market analysis [38]. In this case, persona evaluation becomes a study in complexity, since the human evaluator is faced with a mix of artificially generated data, such as pictures [39], automatically collected standard user demographic data, and post-processed information from various sources of user interaction from linguistically rich environments [40]. ...
Chapter
Full-text available
Automatic persona generation has been shown to have specific measurable benefits for application creators and users. In most situations, personas are adequately descriptive and diversified to achieve user type accuracy and coverage. For specific market segments, such as online gaming, using personas may accurately describe existing user base but not changing habit and need that are introduced by the fluidity of the offerings and the delivery methods. Changes in the ways that applications are marketed, such as new payment methods, for example, subscription models, pay-to-play and pay-to-win, payment-driven-gamification, seriously affect user needs and result in direct impact on user acceptance. This work utilises structured user needs from online gaming players to augment personas using personalisation techniques. The personas are finetuned and de-diversified to result in concise personas, based on user needs that successfully convey information for creators and users alike.KeywordsPersonaOnline gamingDiversificationPersonalisationData-driven methodsCollaborative filteringUser studyUsability evaluation
... For both samples, a carefully selected number of participants from the online survey platform Prolific was recruited. Prolific has been used in several persona user studies in the past [94,96,99,100], and its data quality has been found satisfactory for academic research [82,83]. We applied custom prescreening to increase the validity of the responses with the following sampling criteria: ...
Article
Full-text available
User-centric design within organizations is crucial for developing information technology that offers optimal usability and user experience. Personas are a central user-centered design technique that puts people before technology and helps decision makers understand the needs and wants of the end-user segments of their products, systems, and services. However, it is not clear how ready organizations are to adopt persona thinking. To address these concerns, we develop and validate the Persona Readiness Scale (PRS), a survey instrument to measure organizational readiness for personas. After a 12-person qualitative pilot study, the PRS was administered to 372 professionals across different industries to examine its reliability and validity, including 125 for exploratory factor analysis and 247 for confirmatory factor analysis. The confirmatory factor analysis indicated a good fit with five dimensions: Culture readiness, Knowledge readiness, Data and systems readiness, Capability readiness, and Goal readiness. Higher persona readiness is positively associated with the respondents’ evaluations of successful persona projects. Organizations can apply the resulting 18-item scale to identify areas of improvement before initiating costly persona projects towards the overarching goal of user-centric product development. Located at the cross-section of information systems and human–computer interaction, our research provides a valuable instrument for organizations wanting to leverage personas towards more user-centric and empathetic decision making about users.
... Nowadays, we are constantly surrounded with multiple streaming data sources (Internet Of Things, sensors…) that support our daily lives and activities. From that observation, design engineering also becomes more influenced by data-driven inspirations in several areas of interest, for instance: datadriven construction of personas (Stevenson and Mattson, 2019) (Salminen et al., 2020); creation of an undergraduate curriculum on data materialisation, i.e. learning how to interact with data (Beghelli, Huerta-Canepa and Segal, 2019); exploration of the design space in Design by shopping (Abi-Akle, Minel and Yannou, 2017); elaboration of a data-driven concept generation and evaluation approaches for supporting designers in the early phases of the design process (Han et al., 2020). Design for mobility systems becomes an increasing challenging topic for the design community since mobility is a major societal issue regarding its impact on global sustainability, liveability of cities and also well-being of citizens. ...
Article
Full-text available
Massive data are surrounding us in our daily lives. Urban mobility generates a very high number of complex data reflecting the mobility of people, vehicles and objects. Transport operators are primary users who strive to discover the meaning of phenomena behind traffic data, aiming at regulation and transport planning. This paper tackles the question "How to design a supportive tool for visual exploration of digital mobility data to help a transport analyst in decision making?” The objective is to support an analyst to conduct an ex post analysis of train circulation and passenger flows, notably in disrupted situations. We propose a problem-solution process combined with data visualisation. It relies on the observation of operational agents, creativity sessions and the development of user scenarios. The process is illustrated for a case study on one of the commuter line of the Paris metropolitan area. Results encompass three different layers and multiple interlinked views to explore spatial patterns, spatio-temporal clusters and passenger flows. We join several transport network indicators whether are measured, forecasted, or estimated. A user scenario is developed to investigate disrupted situations in public transport.
Chapter
Full-text available
In this chapter, we discuss the aspects of getting data that is useful for creating data-driven personas. We start by introducing the concept of persona information needs, which refers to stakeholders’ requests for information. We then proceed to persona information display and design, which refers to how the selected information is displayed to end-users of data-driven personas. After this, we present the primary data collection strategies for data-driven personas, including surveys, text quantification, and automated data collection. Finally, we discuss five central data challenges: (1) availability; (2) specifications; (3) unknown measurement error; (4) bias; and (5) ethical concerns. We conclude by presenting takeaways and educational questions.
Article
Full-text available
There has been little research into whether a persona's picture should portray a happy or unhappy individual. We report a user experiment with 235 participants, testing the effects of happy and unhappy image styles on user perceptions, engagement, and personality traits attributed to personas using a mixed-methods analysis. Results indicate that the participant's perceptions of the persona's realism and pain point severity increase with the use of unhappy pictures. In contrast, personas with happy pictures are perceived as more extroverted, agreeable, open, conscientious, and emotionally stable. The participants’ proposed design ideas for the personas scored more lexical empathy scores for happy personas. There were also significant perception changes along with the gender and ethnic lines regarding both empathy and perceptions of pain points. Implications are the facial expression in the persona profile can affect the perceptions of those employing the personas. Therefore, persona designers should align facial expressions with the task for which the personas will be employed. Generally, unhappy images emphasize realism and pain point severity, and happy images invoke positive perceptions.
Chapter
Full-text available
During exceptional times when researchers do not have physical access to users of technology, the importance of remote user studies increases. We provide recommendations based on lessons learned from conducting online user studies utilizing four online research platforms (Appen, MTurk, Prolific, and Upwork). Our recommendations aim to help those inexperienced with online user studies. They are also beneficial for those interested in increasing their proficiency, employing this increasingly important research methodology for studying people’s interactions with technology and information.
Article
Full-text available
Persona is a common human-computer interaction technique for increasing stakeholders' understanding of audiences, customers, or users. Applied in many domains, such as e-commerce, health, marketing, software development, and system design, personas have remained relatively unchanged for several decades. However, with the increasing popularity of digital user data and data science algorithms, there are new opportunities to progressively shift personas from general representations of user segments to precise interactive tools for decision-making. In this vision, the persona profile functions as an interface to a fully functional analytics system. With this research, we conceptually investigate how data-driven personas can be leveraged as analytics tools for understanding users. We present a conceptual framework consisting of (a) persona benefits, (b) analytics benefits, and (c) decision-making outcomes. We apply this framework for an analysis of digital marketing use cases to demonstrate how data-driven personas can be leveraged in practical situations. We then present a functional overview of an actual data-driven persona system that relies on the concept of data aggregation in which the fundamental question defines the unit of analysis for decision-making. The system provides several functionalities for stakeholders within organizations to address this question.
Article
During natural and man-made disasters, people use social media platforms such as Twitter to post textual and multimedia content to report updates about injured or dead people, infrastructure damage, missing or found people, among other information types. Studies have revealed that this online information, if processed timely and effectively, is extremely useful for humanitarian organizations to gain situational awareness and plan relief operations. In addition to the analysis of textual content, recent studies have shown that imagery content on social media can boost disaster response significantly. Despite extensive research that mainly focuses on textual content to extract useful information, limited work has focused on the use of imagery content or the combination of both content types. One of the reasons is the lack of labeled imagery data in this domain. Therefore, in this paper, we aim to tackle this limitation by releasing a large multi-modal dataset from natural disasters collected from Twitter. We provide three types of annotations, which are useful to address a number of crisis response and management tasks for different humanitarian organizations.
Article
Social media analytics is insightful but can also be difficult to use within organizations. To address this, we present Automatic Persona Generation (APG), a system and methodology for quantitatively generating personas using large amounts of online social media data. The APG system is operational, deployed in a pilot version with several organizations in multiple industry verticals. APG uses a robust web and stable back-end database framework to process tens of millions of user interactions with thousands of online digital products on multiple social media platforms, including Facebook and YouTube. APG identifies both distinct and impactful audience segments for an organization to create persona profiles by enhancing the social media analytics data with pertinent features, such as names, photos, interests, etc. We demonstrate the architecture development, and main system features. APG provides value for organizations distributing content via online platforms and is unique in its approach to leveraging social media data for audience understanding. APG is online at https://persona.qcri.org.
Book
People relate to other people, not to simplified types or segments; this is the concept that underpins this book. Personas, a user-centered design methodology, covers topics from interaction design within IT to issues surrounding product design, communication, and marketing. Project developers need to understand how users approach their products from the product's infancy, regardless of what the product might be. Developers should be able to describe the users of the product via vivid depictions, as if they – with their different attitudes, desires, and habits – were already using the product. In doing so, developers can more clearly formulate how to turn the product's potential into reality. Based on 20 years' experience solving problems for businesses and 15 years of research, currently at the IT University of Copenhagen, Lene Nielsen is Denmark's leading expert on the persona method. She has a PhD in personas and scenarios and, through her research and practical experience, has developed her own approach to the method: 10 Steps to Personas. This second edition of Personas – User Focused Design presents a step-by-step persona methodology that will be of interest to developers of IT, communication solutions, and innovative products. The book also includes three new chapters and considerably expands on the material in the first edition.
Article
Parsing a human image to obtain the text labels corresponding to the human body is a critical task for human-computer interaction. Although previous methods have significantly improved parsing performance, the problems of parsing confusion and missed tiny targets remain unresolved, leading to errors and incomplete inference. Targeting these drawbacks, we fuse semantic and spatial features to mine human-body information with a Dual Pyramid Unit convolutional neural network, named DPUNet. DPUNet is composed of a Context Pyramid Unit (CPU) and a Spatial Pyramid Unit (SPU). First, we design the CPU to aggregate local-to-global semantic information, exporting a semantic feature that eliminates semantic confusion. To capture tiny targets and prevent details from being missed, the SPU is proposed to incorporate multi-scale spatial information and output a spatial feature. Finally, the features of the two complementary units are fused to produce accurate and complete human parsing results. Our approach outperforms state-of-the-art methods on both single-human and multiple-human parsing datasets, while remaining efficient at a fast speed of 41.2 fps.
Article
Generative Adversarial Networks (GANs) have achieved inspiring performance in both unsupervised image generation and conditional cross-modal image translation. However, generating high-quality images at an affordable cost remains challenging. We argue that the vast number of easy examples disturbs the training of GANs, and we propose to address this problem by down-weighting the losses assigned to easy examples. Our novel Incremental Focal Loss (IFL) progressively focuses training on hard examples and prevents easy examples from overwhelming the generator and discriminator during training. In addition, we propose an enhanced self-attention (ESA) mechanism to boost the representational capacity of the generator. We apply IFL and ESA to a number of unsupervised and conditional GANs and conduct experiments on various tasks, including face photo-sketch synthesis, map↔aerial-photo translation, single-image super-resolution reconstruction, and image generation on CelebA, LSUN, and CIFAR-10. Results show that IFL improves the learning of GANs over existing loss functions. Moreover, both IFL and ESA enable GANs to produce high-quality images with realistic details in all of these tasks, even when no task adaptation is involved.
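The core mechanism described in this last abstract, progressively down-weighting easy (low-loss) examples as training proceeds, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the exponential mapping from loss to an "easiness" score, and the linear ramp schedule for the focusing parameter are all assumptions made for illustration.

```python
import numpy as np

def incremental_focal_weights(losses, step, total_steps, gamma_max=2.0):
    """Focal-style per-example weights with an incremental schedule.

    The focusing parameter gamma ramps from 0 to gamma_max over training,
    so early on all examples weigh ~1, while later easy (low-loss)
    examples are increasingly down-weighted (hypothetical schedule).
    """
    gamma = gamma_max * min(step / total_steps, 1.0)  # incremental ramp
    # Map each loss to (0, 1]: p close to 1 means the example is "easy".
    p = np.exp(-np.asarray(losses, dtype=float))
    # Focal modulation: as p -> 1 (easy), the weight -> 0.
    return (1.0 - p) ** gamma

# Early in training (gamma ~ 0) every example gets weight ~1;
# late in training, harder (higher-loss) examples dominate.
losses = [0.05, 0.5, 2.0]
early = incremental_focal_weights(losses, step=0, total_steps=100)
late = incremental_focal_weights(losses, step=100, total_steps=100)
```

In a GAN training loop, such weights would multiply the per-example generator and discriminator losses before averaging, so gradients from easy examples shrink as training matures.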