Sharing Information with Web Services – A Mental Model Approach in the Context of Optional Information

Oksana Kulyk², Benjamin Maximilian Reinheimer², and Melanie Volkamer¹,²
¹ Karlstad University, Sweden
² Technische Universität Darmstadt, Germany
name.surname@secuso.org
Abstract. Web forms are a common way for web service providers to
collect data from their users. Usually, the users are asked for a lot of
information, with some items labeled as optional and others as mandatory.
When filling in the web form, users have to decide which data, often of a
personal and sensitive nature, they want to share. The factors that
influence the decision whether or not to share some information have been
studied in the literature in various contexts. However, it is unclear to
what extent these results can be transferred to other contexts. In this
work we conduct a qualitative user study to verify whether the reasons
for sharing optional information from previous studies [12] are relevant
in the context of interacting with a commercial website. We found that
only a few of them were named by the participants of our study.
Keywords: Web forms, optional fields, mental models, interviews
1 Introduction
Web forms have been a known component on websites since the 1990s. They are
often used by web service providers to collect personal data of their users, which
is either directly required for the functionality of the service, or serves other
purposes such as enabling data analytics (e.g. for personalized advertisements
or service improvements). Usually, the users are asked for a lot of information,
often of personal nature, while some items are labeled as optional and others
as mandatory. When filling in the web form, users have to decide which data
they want to share. The only ways for users not to share data requested in
mandatory fields are either not to use the service at all, or to provide
information that is fake but has a format that the service provider accepts
(e.g. a wrong birthday that is still a valid date). Users have more power in
deciding whether to share information or not when the fields are optional.
A number of studies have been dedicated to users’ behaviour and perception
regarding web form fields. In particular, several studies focused on the
link between the users’ willingness to provide data by filling in web form
fields that are not mandatory and the users’ personality traits [5, 11].
The paper by Preibusch et al. [12] provides a list of reasons which
might explain why users provide data for optional fields on a website form.
This list, however, is only partially supported by existing studies. Furthermore,
these studies have been performed in specific contexts, such as creating an ac-
count on social lending networks. Hence it is unclear whether their results can
be generalised to other types of web services.
In order to check the relevance of their list, Preibusch et al. conducted a
quantitative study in [12]. One of the goals of the study has been to find out the
reasons why the participants filled in the optional fields. Their results confirmed
the relevance of some of the reasons from their initial list. Furthermore, the study
found anecdotal evidence for additional reasons such as user extroversion or
feeling compelled to complete all the fields in the form. The authors, however, did
not elaborate on the additional reasons. Moreover, since this was only one of
several research goals of the study, the reasons for filling in the forms were
the focus of only one open question, which did not allow clarifying follow-up
questions. The study also focused on the specific context of the mTurk platform.
As such, the participants expected that their input data would be used for
statistical research (which provided additional motivation for some of them to
input more), and that the more data they input, the more rewards they would
gain from the mTurk platform. The authors themselves also recognize that the
active users of mTurk might be more inclined to fill in web forms out of
interest than the general population.
In this work we conduct a user study to verify whether the reasons for
filling in the optional fields of web forms are consistent with the initial list
by Preibusch et al. in the context of interacting with a commercial website.
Concretely, the scenario for our study was that the participants had to register
a user account on a mock website of the Deutsche Bahn (German railways)
company¹. They were told that the goal of the study was to evaluate the usability
of the new design proposed by the company. After the participants filled in the
registration form, semi-structured interviews were conducted with them, in which
they were asked to explain why they had or had not filled in the optional form fields.
The interviews were qualitatively analysed using the reasons from the initial list
of Preibusch et al. as pre-existing categories. The results show which reasons have
been mentioned by our participants and whether additional reasons have been
mentioned that cannot be assigned to the initial list. We found correspondences
for three out of ten items in the list in our interviews. We also found evidence
for two additional categories of reasons that the participants gave when asked
to explain why they filled in an optional field.
We furthermore looked at the reasons why the participants were reluctant to
share additional data, and at the countermeasures they used, such as providing
fake data, in order to avoid sharing more than they would want.
¹ https://www.bahn.de/p/view/index.shtml, last accessed 10.02.2017.
2 Methodology
In this section we describe the user study that we have conducted, and the
methods we used to analyse the resulting interviews.
2.1 User Study
We first describe the study design and the demographics of our participants.
Mock Registration Website For our user study we set up a mock registration
website that contained a cloned and modified registration form of the “Deutsche
Bahn” company on a local virtual machine. The DNS entries in the hosts file
of the operating system were manipulated so that the participants were not
able to tell that the website was not online. For the same purpose, the Internet
connection status bar was hidden.
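The redirection described above can be illustrated with a minimal sketch. The paper does not state the actual entries, so the address of the local virtual machine (192.168.56.101) is a hypothetical value, and the hostname is simply taken from the Deutsche Bahn URL given in the footnote; the helper names are our own.

```python
def hosts_entry(ip: str, hostname: str) -> str:
    """Format one hosts-file line that maps a hostname to an IP address."""
    return f"{ip}\t{hostname}"


def add_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts_text with a mapping for hostname appended, unless the
    hostname is already mapped on some non-comment line."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return hosts_text  # already mapped, leave the text unchanged
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + hosts_entry(ip, hostname) + "\n"


# Hypothetical values: the address of the local VM is an assumption.
VM_IP = "192.168.56.101"
SITE = "www.bahn.de"

original = "127.0.0.1\tlocalhost\n"
patched = add_entry(original, VM_IP, SITE)
print(patched)
```

With such an entry in place, a browser on the lab computer would resolve the hostname to the local machine while the address bar still shows the familiar name.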
The form on our mock website resembled the design of the original Deutsche
Bahn registration website, but contained a different set of form fields, namely,
eight mandatory fields and six optional fields. The fields that were included
in the form were chosen as fields common on the Alexa top 50 websites².
Namely, we included such optional fields as title, date of birth and phone number.
In addition to these fields that were commonly encountered on the websites,
we also chose to include two optional fields that are rarely used in web forms,
namely, the marital status and country of origin. We further included an optional
checkbox that asked whether the participants consent to using cookies.
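As a rough illustration of the form structure, the following sketch lists the six optional fields named above and splits a submission into optional fields that were filled in and those that were skipped. The field identifiers and the example submission are hypothetical; the eight mandatory fields are not enumerated in the text and are therefore left out.

```python
# Optional fields named in the text; identifiers are our own naming.
# The eight mandatory fields are not listed in the paper, so they are
# deliberately omitted from this sketch.
OPTIONAL_FIELDS = (
    "title",
    "date_of_birth",
    "phone_number",
    "marital_status",
    "country_of_origin",
    "cookie_consent",
)


def split_optional(submission: dict) -> tuple:
    """Split the optional fields into those filled in and those skipped.

    A field counts as filled if its submitted value is non-empty after
    stripping whitespace (for the checkbox: if the value is True).
    """
    filled, skipped = [], []
    for name in OPTIONAL_FIELDS:
        value = submission.get(name)
        if isinstance(value, str):
            value = value.strip()
        (filled if value else skipped).append(name)
    return filled, skipped


# Hypothetical submission, for illustration only.
example = {"title": "", "date_of_birth": "1980-01-01", "cookie_consent": True}
filled, skipped = split_optional(example)
print(filled, skipped)
```

A split of this kind is one way to decide which optional fields to ask about in a follow-up interview.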
Participants The study had 16 participants, eight women and
eight men. The youngest participant was 23 years old and the oldest 58 years
old, with a mean age of 36.5. In order to prevent priming the participants
towards thinking about their privacy, the participants of the study were told
that they would participate in a study done in collaboration with the
Deutsche Bahn, and that the goal of the study was the usability evaluation of a
new registration form for the Deutsche Bahn website. The participants were
offered either 10 Euros or one credit point as reimbursement for their participation
in the study. Most of the participants rated their IT knowledge highly: When
asked to agree or disagree with the statement that their IT knowledge is good,
11 participants answered that they either “strongly agree” or “agree”, three
neither agreed nor disagreed, and three disagreed with this statement. All but
one participant answered either “strongly agree” or “agree” to the statement
that their privacy is important to them (the remaining participant did not answer
that question), and all but three answered “strongly agree” or “agree” to the
statement that they take active measures to protect their privacy (out of the
remaining participants, two neither agreed nor disagreed, and one did not answer
the question).
² http://www.alexa.com/topsites, last accessed on 10.02
Study Design After the participants were welcomed, the study proceeded in two
parts.
Registration. In the first part, the participants were told to fill in the registration
form on the mock website that we had set up. At the beginning of the study
every participant had to read the same study description, describing the goal of
the study, namely, usability evaluation of the registration form. The participants
were then told to register themselves using the modified registration form. It was
furthermore stated that no questions were allowed during the registration process,
in order not to interfere with the process. Still, the participants were encouraged to
think out loud during the registration. The registration process was completed
when the participant clicked on the send button.
Follow-up questions. The follow-up questions were asked in the form of a
semi-structured interview. After the registration, the participants were told that
we had exclusive access to their registered dataset in order to further discuss
their perceived usability of the registration form. To obfuscate the real intention
of our interview and to comply with our communicated research goals, the
introductory questions started with usability topics. Afterwards, based on the
displayed data type fields, the participants were asked a set of questions about
why they had or had not filled in the optional fields.
The study concluded with a debriefing and the gathering of demographic data.
2.2 Analysis Methodology
Our main research goal was to find out which reasons for filling in the
optional fields from the Preibusch et al. initial list [12] were mentioned by our
participants, and whether there were any reasons not on this list. For this
purpose, the interviews were transcribed and analysed using a qualitative
semi-open coding approach. We took the list of Preibusch et al. as the pre-defined
categories and classified the participants’ responses in the interviews according
to these categories. In case we encountered responses that did not fit into the
pre-defined categories, we assigned them to new categories. Each transcript was
analysed independently by two authors, and the findings were then discussed
and agreed upon among the authors. The categories were supplemented with
quotes from the interviews, translated from German to English.
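The bookkeeping behind such a semi-open coding step can be sketched as follows. This is only an illustration under our own assumptions: the category labels are the ones used in Section 3.1, the codings are hypothetical and not data from the study, and the actual analysis was done by hand.

```python
from collections import Counter

# Pre-defined categories: the reasons from the initial list as named in
# Section 3.1 of this paper.
PREDEFINED = {
    "over-disclosure by accident",
    "over-disclosure by proxy",
    "limit disclosure is costly",
    "building social capital",
    "expecting monetary return",
    "expecting non-monetary return",
    "expecting infrastructure improvements",
    "acting reciprocally/altruistically",
    "personality",
}


def tally(coded_statements):
    """Count statements per category, separating pre-defined from new ones.

    coded_statements is an iterable of (participant_id, category) pairs
    that the coders have already agreed upon.
    """
    predefined, new = Counter(), Counter()
    for _participant, category in coded_statements:
        target = predefined if category in PREDEFINED else new
        target[category] += 1
    return predefined, new


# Hypothetical codings, not data from the study.
codings = [
    ("P1", "over-disclosure by accident"),
    ("P2", "expecting monetary return"),
    ("P3", "filling out fields as default behaviour"),
    ("P4", "over-disclosure by accident"),
]
known, emerged = tally(codings)
print(known, emerged)
```

Keeping the new categories in a separate counter mirrors the distinction between pre-existing categories and new findings.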
As additional research goals, we decided to consider the reasons that the
participants gave for not providing their personal data to websites, and the
countermeasures they used when a website requested some kind of personal
data they did not want to disclose. For these goals, the interviews were analysed
by two authors using an open coding approach, and the resulting categories were
further discussed among the authors and agreed upon. As with the main research
goal, we provide a quote supporting each one of the categories, translated from
German to English.
3 Results
In this section we describe the evaluation results of our study.
3.1 Reasons for Filling in Optional Data Fields
We first describe the findings relevant to our main research goal. We provide the
list of the pre-existing categories and specify whether we found any correspon-
dences to them in our dataset. We further describe the new categories that were
derived from our analysis.
Pre-Existing Categories We first describe the correspondences we found in
our interviews to the list of Preibusch et al. in [12].
Over-disclosure by accident. Commonly, the users do not distinguish between
optional and mandatory fields, either due to the website’s design or due to
not paying attention to the cues that indicate that a field is optional. As such,
a significant number of participants in our study reported not seeing the red
star that appears only next to mandatory fields³, and then mentioned that they
would not have filled in the data if they had seen that it was optional.
“It was not intended, I would not have filled it in if I did not think that
I had to input it.”
Over-disclosure by proxy. This item relates to the cases, where the autocomplete
function of one’s browser ends up filling in more data than the user intended
to. As the participants in our study used a lab computer to fill in the form,
over-disclosure by proxy was not relevant for them.
Limit disclosure is costly. It has been suggested that some users fill in all the
fields in the form, since distinguishing between optional and mandatory fields
requires too much time or effort, for example, if the website requires sending
the filled form first before telling whether there is data missing in some of the
mandatory fields. However, none of the participants named this reason for filling
in optional fields explicitly.
Building social capital. Studies on websites that maintain public or
semi-public profiles (i.e. open only to friends on social networks, or to recruiters
on job hunting websites) have shown [7] that some users provide more data in their
accounts in order to create a better image of themselves. In our study, however,
the participants did not have to create a public profile of themselves; hence, they
could not build social capital based on the data they provided. Therefore, as
expected, none of them mentioned this reason.
³ Note that our mock registration form used the same indicator for distinguishing
between mandatory and optional fields as the real Deutsche Bahn website.
Expecting monetary return. The data provided by the users is often used by
the companies to provide additional offers to the users such as personalised
advertisements. Hence, it has been suggested that the users might input their
data in order to potentially benefit from such offers. A number of
our participants mentioned that they disclose data such as their date of birth,
expecting special offers sent to them on their birthday, or expecting information
on discounts tailored to their interests.
“Okay, it can also present a benefit, if I, for example, register myself
somewhere or fill in some form, and in this way the personalized offers
can be tailored to me. This can be an advantage.”
Note that although the participants interacted with the mock website, they
did not attempt to surf the website in order to find the information about the
exact benefits they might get from disclosing additional data. They also did not
mention that they tend to research the potential benefits of data disclosure on
other websites they use before they actually input their data on these websites.
Hence, their expectations relied more on their reasoning and previous experience
than on the information provided by the service prior to the data disclosure.
Expecting non-monetary return. Similar to monetary benefits, the companies
might provide additional features to the user based on their input data, such as
personalised recommendations of products or services or additional functionality.
Some of our participants mentioned expecting such a non-monetary return in
the form of additional functionality in exchange for providing additional data,
such as getting phone notifications when the transport is late if the phone number
is provided.
“...while booking a bus trip in Germany on the Internet, one has to input
the phone number in order to be notified about the delays. And I see a
benefit in this, that I leave my phone number, although generally I am
reluctant. This would be an example where I see that it makes sense for
me to leave my phone number.”
Similar to the expectations of monetary return, our participants neither
attempted to find out prior to the registration whether the Deutsche Bahn provides
additional functionality in exchange for disclosing optional data, nor did they
mention researching potential benefits of data disclosure before providing their
data on other websites.
Expecting infrastructure improvements. Preibusch et al. suggest that
companies can use the information gathered from the users to better adjust their
services to the demands of their customers. Hence, expecting such adjustments,
the users might choose to provide additional data. However, none of our
participants mentioned such a motivation for disclosing data on web forms.
Acting reciprocally/altruistically. Studies have shown [11] that people who gen-
erally tend to act reciprocally also provided more data by filling in the fields
in the study questionnaires. Since, however, our study focused on filling in the
registration forms on commercial websites, it is not surprising that our partici-
pants did not mention the motivation to act reciprocally or altruistically as their
reason for providing additional data.
Personality. Preibusch et al. suggest that for some users their personality might
influence their decision to input more data, for example, if the user enjoys filling
in the questionnaires. Indeed, the study in [12] included a significant number of
participants who mentioned that they enjoyed participating in surveys or
found the activity of filling in forms fun and interesting. However, none of our
participants mentioned their personal preferences as a motivation for providing
more data. It is worth noting, however, that a number of participants mentioned
their personality traits as the reason not to provide their data on the websites,
which we describe in Section 3.2.
New Findings We further describe additional reasons mentioned by our par-
ticipants but not included in the initial list in [12].
“It makes sense for them to request this information” Several users mentioned
filling in fields that they expected to be mandatory, even though the fields
were marked as optional. The expectations of the participants were either due to
their previous experience with similar services, or due to their assumption that
the particular data is required for the service functionality.
“Maybe for some... maybe at the Esprit online shop, there I would think,
why are they interested in my date of birth, they are only interested in
what I order. [...] They do not need to know my date of birth. And here
I thought that it might be relevant for ordering the train ticket. I would
relate the date of birth to the registration.”
“Country of origin... I saw that it is not mandatory... I deliberately filled
it in, because I think that this is an important category for the
classification. This was just my interpretation.”
“No, I think, when I fill something in, do they really need this, or not?
And all that I filled in is important... so, in my opinion.”
As with the case of expecting monetary or non-monetary return from provid-
ing additional data (see Section 3.1), the participants in this category relied on
their own reasoning in deciding whether the requested data is indeed required
by the service instead of attempting to get this information from the service
provider itself.
“I trust that they have their reasons for requesting this information” Similar
to the previous category, some participants claimed to disclose optional data if
they believed that there was a good reason for the service provider to request
the information. However, while the participants in the previous category based
their beliefs on their own reasoning, others relied more on their trust in the
service provider to use their data responsibly.
“Now, for example, I have an airline in mind, they need some data in
any case. I do not have any problems with it, since I trust that the data
stays confidential with them.”
Similar to the previous category, the participants did not attempt to find out
the reasons why the service collects the requested data.
Filling out fields as default behaviour. Some of the participants claimed that they
generally tend to fill in all the fields on a website, unless they have a particular
reason not to. While these claims can be considered close to the pre-existing
categories “over-disclosure by accident” and “limit disclosure is costly”, we still
decided to categorize them separately, since the participants neither claimed
to have overlooked the indicator and disclosed more than they intended to, nor
mentioned making a conscious decision to save time or effort by filling in all
available fields.
“I just did not see any disadvantage, so I thought, I fill this in.”
In particular, some stressed that they would disclose the information if the
website is trusted.
“And the fields I do not fill in, these are, for example, address stuff, but
I have no concerns with the Deutsche Bahn.”
3.2 Other Findings
We now describe the findings for our additional research goals by providing
an overview of the reasons that our participants mentioned for not disclosing
their personal data, and the countermeasures they mentioned using when
confronted with a request to share more data than they wanted.
Reasons for Not Filling in Optional Data Fields The responses of the
participants who were reluctant to share their data can be grouped into two
categories.
Personal feelings. A number of participants mentioned that they did not share
their data due to their personality, or because they “had a bad feeling” sharing
more than they considered absolutely needed. As such, this group focused on
their subjective feelings and personal preferences:
“I do not like disclosing it, but this is really a very personal and subjective
thing!”
Concrete threats. Another group mentioned specific threats that they wanted to
protect themselves against, such as spam mails or phone calls, or identity theft:
“I do not like it when people just call. I have experienced this a couple
of times, that someone just calls me, and I do not like it.”
“Some [companies] really try [to protect the data], but then it’s like,
yeah, we have been hacked, or... and this is just great. Then they have
all the data, all the credit cards... this did not yet happen to me, but...
this is why I do not have a lot to do with the Internet services.”
Countermeasures We asked our participants what they would do if there was a
registration form on some website with mandatory fields that require data the
participants do not want to disclose. The responses can be grouped into the
following categories:
Boycott the website. The most obvious solution, mentioned by several
participants, was that they would refuse to use a website if it required data
they considered too private. In particular, participants mentioned looking for
alternatives that provide a similar service but either require less data or are
more trusted not to misuse the collected data:
“I already had this, that I wanted to register, for example, in the online
shop, and then I did not want to fill in the data. And then I did not
register, and bought it at Amazon for a couple of euros more.”
Input fake data. A solution also mentioned by our participants was to input
fake data if the real data is considered too private to disclose. The types of data
that are faked, as well as the settings in which fake data is given, vary. As such,
a number of participants mentioned that they often input fake data, except in
situations where it could hinder the functionality offered by the service:
“So I am always the one who under circumstances also inputs fake data,
when it does not suit me. This is possible.”
Some mentioned that they are reluctant to input fake data on
websites owned by governmental institutions:
“Actually, always, except for, I would say, official institutions, where it
has to be correct.”
Another approach that has been mentioned in the interviews was to input
fake data which, however, is not misleading. One particular example is the date
of birth: as the website’s intention is to find out whether the user is older than
18, the specific age is assumed to be irrelevant; hence, fake data can be given.
“There is the Rotkäppchen sparkling wine, and when one goes to this
website, then one has to input the date of birth. Maybe minors under
18 years old are not allowed to visit the website. So I could imagine.
And when I look at something on the website, then I just click on some
number. I mean, I am not under 18, but I just click something, since it
does not matter whether I am 30, or 40, or 50 years old, for me to go
there.”
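The participant's reasoning can be made concrete with a minimal sketch of an age gate. It assumes that the website only checks whether the entered date implies an age of at least 18; how any real website implements this check is not known to us, and all dates below are hypothetical.

```python
from datetime import date


def is_of_age(birth_date: date, today: date, min_age: int = 18) -> bool:
    """Return True if a person born on birth_date is at least min_age on today."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years >= min_age


today = date(2017, 2, 10)  # arbitrary reference date for the example
real = date(1977, 6, 15)   # hypothetical real date of birth
fake = date(1987, 1, 1)    # hypothetical made-up date of birth

# Both dates pass the gate, so the made-up date does not change the outcome.
print(is_of_age(real, today), is_of_age(fake, today))
```

Under this assumption the check yields the same result for the real and the made-up date, which is why the participant regards the made-up date as not misleading.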
It is worth noting, however, that a number of participants claimed that they
never input fake data due to their personality traits.
“No, I am very honest.”
Avoid registration, but still use the service. One possible solution to avoid
filling in unwanted web form fields was to look for ways to use the website’s
functionality without registration.
“I actually never register, and continue without login. [...] They do not
know who I am, what my name is, where I live and so on, and I do not
have to remember any login and can always do that in another way, so
to say.”
Use throw-away contact information. The reluctance to fill in contact
information was often mentioned by our participants, either due to privacy reasons
or in order to avoid unwanted advertisements. Hence, in order to be able to
register on websites that demanded the user’s e-mail address, some of the
participants mentioned registering a separate address just for registration
purposes, which is not checked as often as their regular address.
“The e-mail address is in any case a second email address, so it is not
an important one. When too much junk gets there, then it will not be
read.”
“Then one can have a spam e-mail. Then they can spam me as they
want, that does not bother me.”
While the practice of using throw-away phone numbers appears to be much
less frequent than using throw-away e-mail addresses, it has been mentioned as
well. In particular, one of the participants reported registering a phone number
from an Internet phone company, so that the calls to this number went to the
participant’s email instead of going to their regular phone.
“When one has to input the phone number as a mandatory field, then
I often input a Sipgate phone number, that lands in a normal mail box.
[...] This is a Voice-over-IP phone number, there I get at most an e-mail,
when someone calls it. But my mobile phone does not ring.”
4 Related Work
For describing the related work we focus on research that studied the factors
that influence the data disclosure of the users and the tools that aim to prevent
the users from disclosing too much data. We furthermore describe the works in
other domains that study the mental models of the users and the motive for
their behaviour concerning various security mechanisms.
Reasons and factors that influence data exposure. A number of studies focused on
the topic of web forms and optional fields. Preibusch et al. [12] constructed
the list of reasons for users to expose their data that serves as our initial
list. They also conducted a quantitative user study to gauge additional reasons,
but its context was limited (the users thought that the purpose of the study was
to gather and analyse their data). Their further findings include a quantitative
analysis of whether users are likely to fill in optional fields, whether the
presence of mandatory fields increases their likelihood to enter data, and how
long it takes to enter data.
The personality traits of the users that influence their data disclosure have
been the topic of several studies. As such, Egelman [5] studied the personality
traits that help predict the decision making and risk-taking attitudes of the users.
The focus of other studies was more specific. Adams et al. [2] studied
the trade-offs that the users consider acceptable for disclosing their personal
data, and Ackerman et al. studied the users’ attitudes towards providing data
in e-commerce [1]. The study in [11] focused on the dependencies between
personality traits such as fairness or the desire to act reciprocally and filling in
forms. All those studies strengthen the assumption that attitudes and personality
traits should receive more attention when trying to understand differences in
privacy behavior.
Other researchers studied disclosure of personal data on social lending sites
[4]. They argue that this exposure is related to the theory of descriptive
social norms: similarity of context, social proximity, and mimicry of success
factors lead people to expose their data because of social norms rather than
because of rational decisions. Kramer conducted a similar study that looked
specifically at privacy on Facebook [8]. Furthermore, Korff et al. studied the
effect on privacy behavior of differences in the choice amount, by changing the
number of checkboxes, and in the choice structure, by varying the sensitivity of
the personal data items presented [7]. They expect the amount and the structure
to have an effect on privacy behavior similar to that in everyday decisions
like shopping. Acquisti et al. studied the extent to which users are ready to
sacrifice their privacy in exchange for a monetary return.
Tools that prevent data exposure. A number of researchers focused on the devel-
opment of different tools to support more privacy-aware behaviour of the users in
the process of filling out web forms. Knijnenburg et al. conducted a study where
they compared new and more detailed forms of auto-completion tools with a
traditional one [6]. The main purpose was to revive the privacy calculus for
filling out web forms. They argue that users may skip this privacy calculus out
of convenience and therefore use the traditional auto-completion tools. Krol et
al. developed a tool for alerting users when they are about to fill in an optional
form field [9], thus making people more aware of unnecessary data exposure.
Mental models of privacy-preserving behaviour in other domains. Aside from
web forms, a number of papers studied the reasons why the users do not engage
in privacy-preserving behaviour in various domains. As such, a qualitative study
has been conducted by Renaud et al. [13] in order to derive the mental models
of users regarding e-mail encryption. The study in [3] researched the reasons
mentioned by the participants for not using password managers, and the study
in [16] considered the reasons that prevent smartphone users from engaging in
various secure behaviours such as setting a screen lock or installing
anti-virus software. A general overview of mental models in security is provided by
Volkamer et al. in [15], stressing that understanding the users’ mental models and
comprehension of security mechanisms is crucial in supporting the
users in their privacy-related decisions.
5 Conclusion
In this section we summarize our findings, as well as discuss their implications
and possible directions of future work.
5.1 Summary
As web-based services attract more users, websites also tend to gather
more personal data. The users are seldom provided an explanation of what the
purpose of the data collection is, and often the website design makes it hard
for the user to notice which data is mandatory to provide for using the service.
As a result, users end up filling in the optional fields in website forms, providing
more personal data than needed for their intentions.
We have conducted a study to find out the reasons why users fill in
optional fields on websites. We based our assumptions on what these reasons
are on existing literature, namely, on the list provided in [12]. The reasons on
this list, however, were either not confirmed in an empirical study at all, or
the study was done in a specific context (such as the study of users’ behaviour
on social networks or providing data for a research survey) which is not directly
transferable to other types of websites and services. Our study focused on finding
out whether the aforementioned reasons would be relevant for the scenario
where the users have to fill in the registration form on a website of a company
that provides commercial services, which is one of the most common contexts
encountered on the web. In our study we asked the participants to register an
account on our mock registration website, which, as they were told, belonged
to the Deutsche Bahn company (German Railways) that assigned our research
group to conduct a usability study of their new registration form. After the
participants registered an account, they were asked to explain what data they
decided to share and why.
We found that only three out of ten reasons from the initial list in [12]
were mentioned by our participants when asked to explain why they filled in
the optional fields in the forms. Namely, the reasons that were mentioned by
our participants were over-exposure by accident (i.e. not being able to notice
an indicator that shows whether a field is optional or mandatory), expecting
monetary return (e.g. special birthday offers, if the date of birth is provided) and
expecting non-monetary return (e.g. a phone notification for a delayed transport,
if the phone number is provided).
We have further identified three categories that were not explicitly present in
the initial list by Preibusch et al., but mentioned by our participants. In the first
category, the participants decided to fill in optional fields because they believed
that the service required the particular data in order to provide the necessary
functionality. Despite the fields being marked as optional by the service, the
participants in this category strongly relied on their own reasoning to decide
whether it makes sense for the service to request a particular piece of data,
and hence whether they should provide this data. The second category, on the other
hand, included the statements from the participants that generally relied on trust
in the service. Even if the participants noticed that some fields were optional
and they could not themselves think of a good reason for the service to require
some particular data, they still decided to fill in these fields since they trusted
that the service would not request the data unless it had a good reason to do
so. The third category consisted of the statements that concerned the default
behaviour of the users. Especially if the service itself was found trustworthy, the
participants decided to fill in all the fields, since they saw no disadvantage in
doing so.
Further findings indicate that many of our participants were reluctant to
disclose their data, due to either concrete concerns of data misuse or a general
feeling of uneasiness. Moreover, we provided a list of countermeasures that the
participants would use if the website requests some data they are not comfortable
sharing, such as providing fake data, registering a separate, rarely checked e-mail
address to provide on the website, or boycotting the website entirely.
5.2 Discussion and Future Work
Our findings indicate the following factors that determine whether the users are likely
to input their optional data. The first factor is the users’ trust that the service
would not collect data without good reason, is unlikely to misuse it and is capable
of ensuring its security against external attacks. The second important factor is
transparency, meaning that it is important for some users to understand what
their data is used for before they decide to disclose it. Note that the factor of
transparency has also been found relevant in privacy-related decisions in other
domains, such as in deciding to install a smartphone app if the permissions
that the app requests make sense to the user [10]. The final factor is awareness,
meaning that many users provide more personal data than they would want to, only
because they did not notice an option to do otherwise.
Note that all these factors are reflected in the EU General Data Protection
Regulation [14] (GDPR). As such, Art. 5 states that “Personal data shall be [...]
collected for specified, explicit and legitimate purposes and not further processed
in a manner that is incompatible with those purposes; [...] adequate, relevant
and limited to what is necessary in relation to the purposes for which they are
processed; [...] processed in a manner that ensures appropriate security of the
personal data, including protection against unauthorised or unlawful processing
and against accidental loss, destruction or damage, using appropriate technical
or organisational measures (integrity and confidentiality)”, which corresponds
to the factor of trust as expressed by our participants. The regulation further
demands that the users are provided with “the purposes of the processing for
which the personal data are intended as well as the legal basis for the processing”
(Art. 13), which corresponds to the factor of transparency. As our results show,
our participants relied on their expectations of which benefit they would get from
disclosing additional data, or what the purpose of collecting specific information
was, instead of attempting to find out this information from the service itself.
Still, they were more likely to disclose the data if they could think of a purpose
behind its collection. The factor of awareness is addressed by the regulation
requiring the consent of the users for data processing (Art. 6). The regulation
defines consent in Art. 4 as “any freely given, specific, informed and unambiguous
indication of the data subject’s wishes by which he or she, by a statement or
by a clear affirmative action, signifies agreement to the processing of personal
data relating to him or her”. The participants in our study, on the other hand,
overlooked the information on the website, thus providing more data than they
would otherwise do.
Given our findings related to the GDPR, an important direction of future
work is the improvement of communication between the service providers and
the users. Since trust in the service provider has been shown to be an
important factor for the decision making of the users, tools for trust assessment
(e.g. in form of an evaluation of the extent to which the service provider complies
with the GDPR) and communication, possibly from independent institutions, would
be helpful in supporting the users. Furthermore, our study has shown that the
users rely on their own considerations of what the potential benefits of their data
disclosure would be, or how the service could use their data, while deciding which
data to disclose. Hence, input from the service provider with this information
can help the users make a more informed decision. Finally, as a number of users
tend to overlook the indicators for optional fields, providing more data than
they would want to, more visible indicators on the website would help ensure
that accidental disclosure without the users’ explicit consent is minimized.
Our study has further shown that there is a discrepancy in the participants’
attitudes towards data disclosure. As such, while some of the participants filled
in the data without having any concerns, others claimed being reluctant to disclose
their data. Given that all the participants had to interact with the same
website, it would be interesting to investigate the further differences between
those two groups that influence their decision making and attitudes towards
data disclosure. Furthermore, an interesting direction of future work would be
investigating other contexts in which the users have to decide whether to disclose
data. As such, it would be interesting to compare whether the user behaviour
and reasons for either disclosing or not disclosing data differ while interacting
with a trustworthy website such as that of the well-known Deutsche Bahn company, as
opposed to interacting with a small and unknown online shop or other service
that might be deemed less trustworthy by the participants.
The prevalence of various countermeasures, such as using fake data, that the
participants use in order to avoid filling in the mandatory fields shows that the
reluctance to share personal data, even at the expense of the user’s convenience,
is a significant factor in decision making for many users. These findings
suggest that collecting too much data without providing a sufficient explanation
can be detrimental for the web services as well. On the other hand, since the
countermeasures mentioned by our participants are not an optimal solution for
every user, better tools for supporting the users who do not want to disclose
their personal data are needed.
Acknowledgements
This work has been co-funded by the DFG as part of project D.1 within the RTG
2050 “Privacy and Trust for Mobile Users”. This research has also received funding
from the European Union’s Horizon 2020 research and innovation programme
under grant agreement No 653454. It has also been supported by the German
Federal Ministry of Education and Research (BMBF) as well as by the Hessen
State Ministry for Higher Education, Research and the Arts within CRISP.
References
1. Ackerman, M.S., Cranor, L.F., Reagle, J.: Privacy in e-commerce: examining user
scenarios and privacy preferences. In: 1st ACM conference on Electronic commerce.
pp. 1–8. ACM (1999)
2. Adams, A., Sasse, M.A.: Privacy in multimedia communications: Protecting users,
not just data. In: People and Computers XV Interaction without Frontiers, pp.
49–64. Springer (2001)
3. Alkaldi, N., Renaud, K.: Why do people adopt, or reject, smartphone password
managers? In: EuroUSEC 2016: European Workshop on Usable Security. vol. 18,
pp. 1–14 (2016)
4. Böhme, R., Pötzsch, S.: Collective exposure: Peer effects in voluntary disclosure of
personal data. In: FC 2011: International Conference on Financial Cryptography
and Data Security. pp. 1–15. Springer (2011)
5. Egelman, S., Peer, E.: Predicting privacy and security attitudes. ACM SIGCAS
Computers and Society 45(1), 22–28 (2015)
6. Knijnenburg, B.P., Kobsa, A., Jin, H.: Counteracting the negative effect of form
auto-completion on the privacy calculus. In: ICIS 2013: International Conference
on Information Systems. AIS eLibrary (2013)
7. Korff, S., Böhme, R.: Too much choice: End-user privacy decisions in the context of
choice proliferation. In: SOUPS 2014: Symposium on Usable Privacy and Security.
pp. 69–87. USENIX (2014)
8. Krämer, N.C., Haferkamp, N.: Online self-presentation: Balancing privacy concerns
and impression construction on social networking sites. In: Privacy Online, pp.
127–141. Springer (2011)
9. Krol, K., Preibusch, S.: Control versus effort in privacy warnings for webforms. In:
WPES 2016: ACM on Workshop on Privacy in the Electronic Society. pp. 13–23.
ACM (2016)
10. Kulyk, O., Gerber, P., El Hanafi, M., Reinheimer, B., Renaud, K., Volkamer,
M.: Encouraging privacy-aware smartphone app installation: What would the
technically-adept do. In: USEC 2016: Usable Security Workshop. Internet Soci-
ety (2016)
11. Malheiros, M., Preibusch, S., Sasse, M.A.: “Fairly truthful”: The impact of per-
ceived effort, fairness, relevance, and sensitivity on personal data disclosure. In:
Trust 2013: International Conference on Trust and Trustworthy Computing. pp.
250–266. Springer (2013)
12. Preibusch, S., Krol, K., Beresford, A.R.: The privacy economics of voluntary over-
disclosure in web forms. In: The Economics of Information Security and Privacy,
pp. 183–209. Springer (2013)
13. Renaud, K., Volkamer, M., Renkema-Padmos, A.: Why doesn’t Jane protect her
privacy? In: PETS 2014: International Symposium on Privacy Enhancing Tech-
nologies Symposium. pp. 244–262. Springer (2014)
14. The European Parliament and of the Council of European Union: Regulation (EU)
2016/679 of the European Parliament and of the Council of 27 April 2016 on the
protection of natural persons with regard to the processing of personal data and
on the free movement of such data, and repealing Directive 95/46/EC (2016),
http://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32016R0679, last
accessed on 10.02.2017
15. Volkamer, M., Renaud, K.: Mental models–general introduction and review of their
application to human-centred security. In: Number Theory and Cryptography, pp.
255–280. Springer (2013)
16. Volkamer, M., Renaud, K., Kulyk, O., Emeröz, S.: A socio-technical investigation
into smartphone security. In: STM 2015: International Workshop on Security and
Trust Management. pp. 265–273. Springer (2015)