The AI Ghostwriter Eect: Users Do Not Perceive Ownership of AI-Generated
Text But Self-Declare as Authors
FIONA DRAXLER and ANNA WERNER,LMU Munich, Germany
FLORIAN LEHMANN,University of Bayreuth, Germany
MATTHIAS HOPPE and ALBRECHT SCHMIDT,LMU Munich, Germany
DANIEL BUSCHEK,University of Bayreuth, Germany
ROBIN WELSCH,Aalto University, Finland
Human-AI interaction in text production increases complexity in authorship. In two empirical studies (n1 = 30 and n2 = 96), we investigate authorship and ownership in human-AI collaboration for personalized language generation models. We show an AI Ghostwriter Effect: users do not consider themselves the owners and authors of AI-generated text but refrain from publicly declaring AI authorship. The degree of personalization did not impact the AI Ghostwriter Effect, and control over the model increased participants' sense of ownership. We also found that the discrepancy between the sense of ownership and the authorship declaration is stronger in interactions with a human ghostwriter, and that people use similar rationalizations for authorship with AI ghostwriters and human ghostwriters. We discuss how our findings relate to psychological ownership and human-AI interaction to lay the foundations for adapting authorship frameworks and user interfaces for AI in text-generation tasks.
CCS Concepts: • Computing methodologies → Natural language generation; • Applied computing → Publishing; Text editing; • Social and professional topics → Intellectual property.

Additional Key Words and Phrases: ownership, authorship, large language models, text generation
ACM Reference Format:
Fiona Draxler, Anna Werner, Florian Lehmann, Matthias Hoppe, Albrecht Schmidt, Daniel Buschek, and Robin Welsch. 2023. The AI Ghostwriter Effect: Users Do Not Perceive Ownership of AI-Generated Text But Self-Declare as Authors. ACM, New York, NY, USA, 34 pages. https://doi.org/XXXXXXX.XXXXXXX
1 INTRODUCTION
Imagine your short visit to New York is coming to an end, and rather than spending time writing a postcard yourself, you might ask GPT-3 to write a personalized postcard for you. Would you sign this postcard with your own name? The use of personalized artificial intelligence (AI) in human-AI collaboration has the potential to significantly impact the concept of authorship. One example of such collaboration can be seen in the use of language generation models, such as GPT-3, to assist with writing tasks. Previous HCI research has investigated user perspectives on collaborative creative writing [44, 71, 98], style transfer [87], text summarization [47], and perceived authorship with AI suggestions [72]. However, it has not yet investigated how people attribute and declare authorship for the generated text. Previous research in the social sciences has investigated the prevalence and rationalization of ghostwriting [22, 43, 88]. Ghostwriting, as a practice of using text produced by someone without crediting them, is common in some academic fields, such as medicine [49, 107], but also in writing autobiographies or political speeches [17]. Reasons for the use of ghostwriters include
financial or political interests [49], a lack of time and writing expertise [17], and the pressure to obtain gratification, e.g., good grades [74]. Considering that personalized Large Language Models (LLMs) are becoming accessible to the public and thus have the potential to become ubiquitous for everyday tasks that include text production, it is necessary to understand how authorship is declared with personalized AI and to what extent AI is used similarly to human ghostwriters.
In this paper, we approach human-AI interaction for personalized LLMs from an authorship perspective. Thus, we are interested in investigating the psychological processes involved when attributing authorship for text generation with personalized AI. Note that this differs from the focus of ongoing legal and ethical discussions of AI-generated material, which aim to yield a prescriptive framework, i.e., how authorship should be declared, and which focus largely on non-personalized material [90].
To tackle the topic of authorship in personalized LLMs, we conducted two empirical studies that show the AI Ghostwriter Effect of personalized AI. Overall, we found that users do not judge themselves to be the authors of AI-generated text, yet they refrain from publicly declaring the AI's authorship. In Study 1, we compared whether the level of personalization and different interaction methods affect perceived and declared authorship. In Study 2, we replicate the AI Ghostwriter Effect in a large sample and compare it to a human ghostwriter. We show that the degree of personalization is not critical to the AI Ghostwriter Effect (Study 1), that subjective control over the interaction and the content increases the sense of authorship, and that people use similar rationalizations for AI ghostwriters and human ghostwriters (Study 2, pre-registered at https://aspredicted.org/RKV_ZXX)^1. From this, we motivate how to expand common authorship declaration frameworks (e.g., the CRediT taxonomy^2 [1, 3, 16]) to account for the support of AI from a user-centered perspective.
2 RELATED WORK
This section lays out the current landscape of text-generation models and of working with text-generation systems from a technological and human-centered perspective. In addition, it discusses the concepts of ownership and authorship and how they are currently changing with automated generation.
2.1 Generative Models and Personalization
In recent years, various generative models have emerged for tasks such as text-to-text [18] and text-to-image^3 generation, music composition based on prompts^4, and text simplification [110]. Other models generate images [89], poetry [101], rap lyrics [84], or music [58] in the style of previous works. Generative models often implement a Transformer architecture that includes an encoder and a decoder, utilizing an attention mechanism that allows back-references to prior content [104]. These models are typically trained to predict probable next tokens from previous text. Related tasks include infilling masked tokens. Thus, they learn connections between subsequent words and are well-suited for text generation.
In this work, we focus on text generation with GPT-3. Released in 2020, GPT-3 currently is a particularly large and powerful language model [29, 32]. It uses deep learning methods such as attention mechanisms and the Transformer architecture with autoregressive pretraining [18].
^1 Note that both studies were conducted before AI crediting policies as put forward by Nature and arXiv, cf. Section 2.4.
^2 https://credit.niso.org
^3 https://openai.com/dall-e-2/
^4 https://mubert.com
This architecture has improved the generation of long coherent text [102] because of the model's ability to give special attention to key textual features and to connect interrelated words over longer passages [37]. GPT-3 produces text^5 that statistically fits well with a given natural language prompt, using its syntactic ability to associate words without understanding the semantics and context of the query [39]. The text quality is often comparable to human-written texts in terms of grammar, coherency, and fluency, but it can also produce nonsensical or incorrect content [37, 102]. Like many other machine learning systems, GPT-3 may also reproduce biases found in its training data, such as racial or gender biases [18, 76]. One use case in which GPT-3 performs well is "few-shot learning", where demonstrations of input-output pairs for the task are included with the prompt [18]. The choice of these in-context examples has a significant influence on the content and the quality of the generated text [75].
The same idea can also be used for personalization: the output of the system is conditioned on the information available about the user [110] and their context [34] to better match generated texts to a writer's expectations and needs. According to Wang et al. [106], personalization can happen at two levels: the first includes factual knowledge such as personal data, user attributes, and user preferences; the second is defined as stylistic modifiers, which can be either situational or personal. For example, Gmail Smart Compose interpolates between a global model and a personal model trained on an individual's sent messages [23]. Research in related AI-supported scenarios shows that personalization and adaptation can improve the perceived quality of results [65, 82] and engagement [57]. It is likely that similar effects also hold for text generation. However, it is challenging to match generated texts with a writer's values or perspective, which influences acceptance [15].
2.2 Interaction with Text-Generation Systems
In HCI research, text generation has also been investigated and applied in user interfaces for text entry. Originally, such "intelligent" or "predictive" text entry methods were developed as augmentative and alternative communication (AAC) tools to reduce manual effort for people with cognitive or motor impairments. Concretely, this included suggesting whole words, which users can then select instead of typing them letter by letter [41, 54]. Later, similar approaches were applied towards the grand goal of efficiency [69]: the bottleneck of an idealized text entry method is the cognitive process of coming up with ideas, not the physical process of entering them. In systems such as Google's Smart Reply, short text suggestions are designed to possibly skip manual writing altogether [61].
Typical text entry systems use language models to show one to three suggestions [35, 41, 46, 85]. Each suggestion can be a single word [14, 105] or even a whole sentence [6, 20]. The former have become a standard feature in mobile keyboard apps (e.g., SwiftKey^6).
While suggestions remain a popular feature, they might in fact reduce typing speed or negatively impact the experience of some users [9, 19, 30, 81]. For illustration, fast typists might enter text faster than they could read and select from a list of suggestions. This is particularly likely as suggested text becomes more extensive (e.g., an increasing number of parallel suggestions and/or longer suggested phrases [20]). Suggestions may also make texts shorter and more predictable [5] or influence the expressed sentiment [4].
On the one hand, these insights motivate conservative suggestion presentation, such as only giving a suggestion if it is an extremely likely sentence continuation, an approach taken by Google in their Smart Compose feature in Gmail [24]. On the other hand, recent HCI research has started to explore text suggestions beyond efficiency.
This direction ts well to the rise of large language models, which are much more suited to generate longer pieces
of text while respecting prior context. Here, we now see systems being conceptualized, presented and/or reected
on as writing assistants, which may provide some degree of inspiration [
13
,
26
,
71
,
96
,
111
]. Generative models could
5Primarily in English
6https://www.microsoft.com/en-us/swiftkey
Generative models could also enhance the versatility of interactive storytelling systems, as proposed by Swanson and Gordon [98]. This links traditional HCI questions (e.g., UI design factors) to emerging questions of authorship (cf. [72]).
Indeed, many researchers (implicitly) touch upon such questions in their interaction studies without addressing them explicitly in the study design, since that is not their focus: For instance, Lee et al. [71] call their system "CoAuthor" and introduce measures of "mutuality" and "equality" of the text contributions by users and the AI. Similarly, WordCraft is presented as a "collaboration" and "conversation" with the assisting language model system [111]. Focusing on input behavior with such systems, Buschek et al. [20] quantitatively identified nine patterns, from ignoring suggestions to chaining several suggested phrases in a row. Arnold et al. [6] indicated that phrases are perceived more as ideas, in contrast to words, which are seen more as predicted continuations. Relatedly, Singh et al. [96] describe how writers actively invest in taking 'leaps' to fit their story around a suggested phrase. Similarly, Dang et al. [31] found evidence of writers changing their original text to influence AI-generated summaries, as well as interpreting these summaries as an external perspective on their text. Qualitatively, Bhat et al. [13] dissect such AI influences on writers and their text through the lens of the cognitive process model of writing from Hayes [53], which distinguishes between proposing ideas, translating those into text, and finally transcribing these with a given input method. Suggestions may influence all three of these processes. Recent work shows potential influences on the writer's opinion in this context [60].
In summary, research on text entry at the intersection of HCI and Natural Language Processing currently sees a strong shift towards much more extensive AI support for human writing. While many studies take note of aspects of (perceived) authorship along the way, a dedicated investigation is still missing. This motivates our work here, including our consideration of different interaction designs in Study 1.
2.3 Declared Authorship, Copyright, and Ghostwriting
Intuitively, the creator of a text or artwork should also be declared as its author and hold the copyright. Practice shows that this is not always the case. Notably, ghostwriting describes the practice of not crediting a creator [27, 107]. There are many reasons why credits differ from the actual work. For example, politicians and CEOs hire speechwriters because they lack the time, background details, or rhetorical skills [17, 88]. Similarly, public personas at times seek ghostwriter support for publishing their autobiographies, blog articles, or social media posts [22, 43], where the public persona's name as the author is essential for marketing. In the scientific domain, good publication records are important for researchers' careers and funding opportunities. In a survey by Nylenna et al. [79], 29% of respondents (researchers in medicine) stated that they had experienced not being credited as an author when they felt they should be, and 36% felt pressurized to add authors who did not deserve this. Moreover, there have been cases of pharmaceutical companies offering authorship on pre-written reports on medical trials, trading visibility for financial gains [49, 100]. This unethical practice obfuscates conflicts of interest and may make drug trial results seem more credible than is warranted [50].
In response to unethical authorship practices, many publishing organizations define fixed authorship criteria [27]. For example, the International Committee of Medical Journal Editors (ICMJE) defined that someone should be declared an author of a work if they made a significant contribution to the conception of the work or analyzing data for the work, drafted or critically revised the work, approved the version to be published, and are taking responsibility for all aspects of the work^7. Other organizations, such as the ACM^8, define similar criteria. In a slightly different approach, the CRediT taxonomy classifies different types of contributions to clearly state the role of each individual contributor [1], and concepts as proposed by Bd et al. [11] allow interactive role declaration.

^7 https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html
^8 https://www.acm.org/publications/policies/roles-and-responsibilities#h-criteria-for-authorship
This is particularly important when declarations of authorship become complex, for example, when several parties are involved or when work is derived from previous work [33]. Nonetheless, detailed attribution is sometimes not even supported by deployment platforms [25].
2.4 AI Support and Declared Authorship
The application of personalized generative systems affects copyright and authorship declaration. Copyright and intellectual property attribution with AI is a pressing issue from a legal perspective [8, 42, 97]. Previously, these rights were attributed to natural persons or organizations, and there is an ongoing debate whether a computer algorithm or the developers of an algorithm can also claim them [7, 59, 90]. Legislation for computer-generated work also differs between countries, either crediting the humans working with an algorithm or the developers of an algorithm, or holding that no copyright can be granted [90]. Moreover, aspects such as the copyright of training data need to be considered [42]. For example, there was a recent lawsuit against using copyrighted material for training the code-generation model Copilot^9. OpenAI states that their customers using the text-generation model GPT-3 or the image-generation model DALL·E possess full rights over the produced content^10, including commercial use. Texts generated by GPT-3 must be attributed to the user's name or their company, and it must be stated that the content is AI-generated. OpenAI, on the other hand, retains all rights, titles, and interests in the language model behind GPT-3 and its API.
Declarations of AI contributions in practice can be seen in journalism, where contributions are sometimes, but not always, declared [48, 64, 77]. In his Guide to Automated Journalism, Graefe [48] notes that AI authorship declarations are important for transparency. Montal and Reich [77] suggest referencing software vendors in the byline of an automatically generated article, whereas a mention in a full disclosure statement would be sufficient for articles where a human writer used algorithmic text parts. In rare cases, AI contributors have also been listed as authors of a scientific publication [70, 80, 109]. However, several scientific associations have recently published policies that discourage or prohibit listing text-generation tools as an author, e.g., Nature^11 and arXiv^12, because such tools cannot be held accountable. Instead, they recommend adding the tools to a Methods section.
2.5 AI Support and Sense of Ownership
Authorship is not only a question of declaration but also of individual judgment. Whether someone feels like an author depends on factors such as their perceived control and agency. Feelings of ownership (of objects, but also of intangible entities such as ideas) can develop through creation and control (psychological ownership; [66, 83]). For example, in a video remixing context, Diakopoulos et al. [33] found that creators often felt they had to make substantial changes to previous work before they could see it as their own. This again indicates that one's investment is essential for perceived ownership [83]. Research on perceived authorship with generative AI is still scarce. Notably, Lehmann et al. [72] showed a correlation between authorship and control in interaction design: in their study, perceived authorship was stronger when participants wrote their own texts than when they received text suggestions. In an interview study published in 2017, members of news organizations did not perceive algorithms as authors [77]. Instead, they mentioned the respective journalistic organization or the people who contributed to creating the algorithm, suggesting an anthropomorphic image of authorship. Therefore, the writing process and the resulting product have to be regarded separately; with this paper, we address the interplay between the two dimensions in terms of perceived ownership and declaration of authorship.
^9 https://www.theverge.com/2022/11/8/23446821/microsoft-openai-github-copilot-class-action-lawsuit-ai-copyright-violation-training-data
^10 https://openai.com/api/policies/sharing-publication/#content-co-authored-with-the-openai-api-policy
^11 https://www.nature.com/nature/for-authors/initial-submission
^12 https://info.arxiv.org/help/moderation/index.html#policy-for-authors-use-of-generative-ai-language-tools
3 RESEARCH QUESTIONS AND HYPOTHESES
The overall goal of our paper is to investigate what affects authorship attribution for texts generated with personalized LLMs from an HCI perspective. Therefore, we consider the process of attributing authorship (from the sense of ownership to the declaration of authorship), the degree of control in human-AI interaction, the quality of personalization, and the comparison to human ghostwriters. With the sense of ownership, we address the subjective side: does a person feel that they are the author of a text and that they own it? This also encompasses a sense of control over the product, as the two concepts are tightly linked (cf. Section 2.5). The declaration of authorship, on the other hand, refers to the entity a text is attributed to, e.g., in the header or byline. We explicitly investigate the case of personalized text generation because, in this case, the lines between the AI model and the user blur. Concretely, we address the following research questions:

RQ1: Does the sense of ownership match the declaration of authorship for personalized AI-generated texts?
RQ2: Does the degree of control in human-AI interaction affect the sense of ownership?
RQ3: Is the sense of ownership dependent on the quality of personalization?
RQ4: In what ways does the AI Ghostwriter Effect differ from human ghostwriters?
We consider these questions in a very simple writing task. Specifically, study participants pictured being in New York and writing a postcard with the support of GPT-3. This task has several advantages over those of previous studies, e.g., [72], that are relevant for our study. First, writing postcards is a task that we expected potential participants to be familiar with. Second, postcards are semi-personal in the sense that they are individually written but often describe similar experiences, e.g., famous sights or local food. Third, we chose New York as the destination because it is one of the most well-known cities in the world. If the AI system, here GPT-3, is used much like a human ghostwriter in this task, then

H1.1: participants will consider the AI (and not themselves) to be the owner of personalized AI-generated texts, but
H1.2: they will not declare AI authorship or AI support when publishing personalized AI-generated texts.
We subsequently refer to this pattern of results as the AI Ghostwriter Effect. To effectively design interactive LLM-based applications with authorship in mind, we have to know how different interaction methods affect authorship for personalized AI-generated texts. Lehmann et al. [72] found that the sense of authorship is positively correlated with the degree of influence over the AI contribution. Thus, we also assess perceived control and leadership, and we hypothesize that

H2.1: the degree of influence over the generated text, as realized by different interaction methods, affects the sense of control, and
H2.2: the degree of influence over the generated text affects the sense of leadership.
Recent studies have found that usability and user experience do not depend on the quality of personalization and adaptation for fine-tuned AI systems [103]. Even non-adaptive systems are deemed adaptive and personalized if they are introduced as such [68]. If this holds for personalized LLMs, merely labeling the AI as personalized (i.e., pseudo-personalization) should yield comparable effects on the sense of ownership. We hypothesize that

H3: the sense of ownership is independent of the quality of personalization for the AI model.
To test H1-H3 in Study 1, we varied the level of influence over the AI-generated texts in the postcard writing task by employing different Interaction Methods (within-subject factor). Based on the correlation of influence and sense of ownership suggested in [72, 83], we defined four methods in descending order of the level of user influence: Writing, Editing, Choosing, and Getting. We generated texts with personalization or pseudo-personalization (fine-tuned vs. base model; between-subject factor).
To answer RQ4 and to replicate and extend Study 1, we conducted Study 2 (pre-registered, see https://aspredicted.org/RKV_ZXX)^13, comparing AI-supported writing to the case of a human author supporting the writing task. Based on the finding that algorithms/AI models are more likely to be exploited than humans [62], we posit the following hypotheses:

H4: The sense of ownership is lower for texts written by a human ghostwriter than for texts written by an AI ghostwriter.
H5: People are more likely to attribute (co-)authorship to a human ghostwriter than to an AI ghostwriter.
H6: When using AI-generated texts, people do not always declare the AI as a (co-)author, even when they do not have a sense of ownership (AI Ghostwriter Effect). This serves to confirm H1, but without considering personalization.
4 STUDY 1: SENSE OF OWNERSHIP VERSUS DECLARED AUTHORSHIP
We varied the level of influence over the AI-generated texts in the postcard writing task by employing different Interaction Methods. In Study 1, our primary goal was to evaluate how participants attribute ownership to personalized AI-generated texts, whether a sense of control in the interaction affects that, and how participants declare authorship for different Interaction Methods. The study was conducted in July 2022.
4.1 Method
We implemented four Interaction Methods that vary with regard to user influence over the output, from writing a text manually (full influence) to receiving a text from the AI (no influence). First, a sample of 30 participants filled out a survey that enabled us to fine-tune the parameters of the text-generation model. Then, the same participants interacted with an actually personalized or a pseudo-personalized model with all Interaction Methods, evaluated them, and were prompted to publish the texts online. Finally, after approximately two days, we explored whether participants could still differentiate texts generated with the different Interaction Methods. Note that, for the sake of brevity, we report the results of this exploratory part of Study 1 as supplementary material.
4.1.1 Implementation of the Interaction Methods. The four experimental conditions of Interaction Methods (IMs) for Study 1 were represented through four different interaction methods on scripted websites (see Figure 1). Specifically, we designed a text generation platform where we varied the user interaction from writing a postcard from New York to receiving a fully auto-generated postcard text.

Interaction Method 1: Writing a postcard. In this baseline condition, participants wrote a postcard on their own without AI assistance, i.e., they fully influenced the result. They were required to enter between 100 and 200 words.

Interaction Method 2: Editing a postcard. Participants were provided with a generated postcard but could change (insert, delete, or replace) up to 30 words. Thus, they had some influence on the content of the postcard. The editing limits were adjusted based on a pretest.

Interaction Method 3: Choosing a postcard. Participants were given a selection of three generated postcards and were asked to choose one.

Interaction Method 4: Getting a postcard. Participants were given one generated postcard only. Thus, they had no influence over the content.

^13 Note that we have slightly adapted the wording to match the manuscript.
Fig. 1. Screenshots of the Interaction Methods in Study 1: (a) Writing, (b) Editing, (c) Choosing, (d) Getting.
The generated texts had already been produced before the study. Nevertheless, they were presented 3 seconds after the participants had clicked on a "generate" button to make it seem as if the postcard was actually generated on demand.
4.1.2 Procedure. The study consisted of three main parts: (1) a fine-tuning survey for personalization and background processing for text generation; (2) postcard text production with the interaction methods (Section 4.1.1), followed by an authorship and interaction questionnaire; and (3) two days later, a final questionnaire on text recognition, text rating, and general conclusions. In accordance with the Declaration of Helsinki, we informed participants about their rights and the study procedure and only started the study once they had given their consent to take part.
Part 1: Fine-Tuning and Background Processing. For collecting fine-tuning input, we first asked participants to write responses of approximately 300-800 characters to 18 different prompts. The style and theme of the prompts resembled the prompt that we used for postcard generation in Part 2 (see below). Examples of prompts are "Please describe yourself in about 5 sentences." and "What was your favorite vacation?" The participants were then asked to upload five texts they had previously written (e.g., blog posts or emails to friends) and match each one with a suitable prompt. Once the fine-tuning questionnaire was complete, we randomly assigned half the participants to the personalized condition and the other half to the pseudo-personalized condition. All participants responded to the fine-tuning questions regardless of their assigned personalization group. This served to mitigate response bias and the placebo effect in human-computer interaction [21, 68].
Before inviting participants to Part 2, we personalized Davinci models^14 with the data of the participants in the personalization group and generated five individual postcards for the postcard prompt ("Write a postcard from New York to a friend. The postcard should have between 100 and 200 words."). For the pseudo-personalization group, we randomly alternated between three distinct text sets that contained 5 of the 15 pre-generated texts^15.
Part 2: Writing & Uploading, Authorship Evaluation. One day later, participants started Part 2 of the study. Here,
their task was to create four postcards addressed to a friend as if they were currently visiting New York and upload the
created postcards to a blog-style website. They pasted the postcard texts into a text eld and provided metadata for
a title, the date, and author declaration, and their access key (the study ID). The author declaration was a text eld
such that participants could list any entity they wanted. Each of the postcards was composed with one of the four
Interaction Methods (i.e., dierent levels of inuence). The order of the Interaction Methods was counterbalanced.
After all Interaction Methods, we presented a nal questionnaire on the interaction experience, perceived control,
and how they liked the resulting postcards. For questions on a specic Interaction Method, we included a screenshot
of the corresponding UI.
Part 3: Text Recognition & Rating. The Prolic invitations for Part 3 were sent out two days after completing Part 2.
In the nal questionnaire, participants were asked how well they remembered the postcard texts and what interaction
method they had been created with. Finally, we queried participants’ opinions on automatic text generation and whether
it should be mandatory to mark texts that were created with the help of articial intelligence. We add the results for
this part as supplementary material.
4.1.3 Apparatus & Measures. Study 1 was designed as a browser-based online experiment including three questionnaires,
the custom interaction methods described in Section 4.1.1, and a custom blog-style website. The components were
14https://beta.openai.com/docs/guides/ne- tuning
15Technical details and the text sets are provided as supplementary material.
The components were coordinated with a study framework that handled counterbalancing, logging, and participant flow between the study components. In the fine-tuning questionnaire, we recorded the texts and prompts the participants wrote. In Part 2 and the final questionnaire (Part 3), we primarily measured the following constructs:

The sense of ownership indicates to what extent the participants felt as the authors of the text and, conversely, to what extent they felt that the AI was the author.
The sense of leadership denotes who participants felt took the lead during the writing process.
With the sense of control, we assessed if participants felt that they had control over the resulting text.
The declaration of authorship refers to the entity or entities participants listed as the credited author(s) in the blog upload form.
The subjective quality of the personalization addresses the participants' impression of how well the texts produced with the Interaction Methods matched their style.
The objective quality of the personalization provides a comparison between the texts used for fine-tuning and the generated texts with or without personalization.
User experience with the Interaction Methods was assessed with the AttrakDiff questionnaire [52] and open-ended questions. This provides insights into the Interaction Methods and future implications for designing user interfaces for LLMs.

Appendix A provides the full list of measures for these constructs and for auxiliary measures (e.g., perceived creepiness for a broader perspective on user acceptance). Demographic information was exported from Prolific.
4.1.4 Participants. We recruited 43 native English speakers via Prolific, of whom 30 correctly completed all required steps of the study. Seven participants withdrew from the fine-tuning questionnaire. Of the remaining 36 participants, one never started Part 2 and one had technical issues. A third participant was excluded because they took more than two days between the first and last step of Part 2, which would have affected their answers in Part 3. Thus, we invited 33 participants to Part 3, and 32 completed it. Two of these uploaded duplicate texts, which meant their data had to be discarded. Of the final set of 30 participants, 13 identified as female and 17 as male, and none as non-binary or other. They were 37.2 years old on average (SD = 11.8, min = 20, max = 70). They listed the UK (17), the US (9), South Africa (2), Canada (1), and Zimbabwe (1) as their country of residence. Five participants had no prior experience with text-generation systems. Nineteen (63%) had used word or sentence suggestions, and 20 (67%) had used auto-correction features on their devices. Half of the participants had previously written texts with auto-completion, and 8 (27%) had used a smart-reply option. Additionally, two participants stated that other people sometimes write texts on their behalf, for example, an assistant at work. The participants received a compensation of £18 for Part 1, £13.50 for Part 2, £3 for Part 3, and a bonus payment of £6 for finishing all steps, i.e., a total of £40.50 for approximately four hours of their time.
4.2 Results
We first investigate whether the personalization of the texts was successful. We then turn to the analysis of the sense of ownership and the declaration of authorship. Finally, we examine whether the sense of control and the sense of leadership were manipulated by the interaction method, and whether the sense of control relates to the sense of ownership. An in-depth analysis of the remaining measures (including the open-ended questions) was beyond the scope of this paper. We provide a short overview of additional results as supplementary material.
Table 1. Participants' perception of the text with the interaction method Getting (1 = Strongly disagree, 7 = Strongly agree)

                                                            Personalized      Pseudo-Personalized
                                                            MD      SD        MD      SD
The postcard mostly contains words and/or phrases that I
usually use when writing in English.                        5.07    1.67      4.20    1.90
I would have written a similar postcard by myself.          4.40    1.99      3.53    2.10
Analysis. Below, we analyze differences between the two Personalization groups and four Interaction Methods regarding authorship, using mixed analysis of variance (ANOVA) with Bonferroni-corrected t-tests for post-hoc comparisons. For repeated-measures nominal data, we use Cochran's Q test. For estimating regression coefficients in hierarchical repeated measures, we use linear mixed models (LMMs) with Kenward-Roger estimation of the degrees of freedom. We use effect coding for two-level variables. To compare interaction methods in Study 1, we contrast each condition to the Writing condition. Non-parametric ordinal data is tested with a rank-aligned repeated-measures ANOVA [63]; here, follow-up analysis is based on Bonferroni-corrected ART-C contrasts [36]. Null hypotheses relevant to our main research questions are evaluated using Bayes factors computed with the BayesFactor package [78]. Finally, we compute cosine similarity to compare texts using the tidytext package in R [95]. The repeated-measures ANOVA model is robust to violations of normality with regard to its residuals [45, 51]. Thus, for the sake of brevity, in cases where a d'Agostino normality test [28] indicates normality violations (the Shapiro-Wilk test is too sensitive for n > 50) and the results of the non-parametric and parametric models align, the main text only reports the parametric model. We apply a Greenhouse-Geisser correction when the assumption of sphericity is violated. For statistical inference, α was set at 5%.
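As an illustration of the Bayesian follow-up tests reported below, here is a minimal sketch with the BayesFactor package; the data frame `d` and its columns are hypothetical stand-ins for the study data.

```r
library(BayesFactor)

# Hypothetical data frame `d`: one mean ownership rating per participant,
# with group membership (personalized vs. pseudo-personalized).
pers   <- d$ownership[d$group == "personalized"]
pseudo <- d$ownership[d$group == "pseudo"]

# Default Cauchy prior with scale sqrt(2)/2 on the standardized effect size.
bf10 <- ttestBF(x = pers, y = pseudo, rscale = sqrt(2) / 2)

# Evidence for the null relative to the alternative (BF01 = 1 / BF10).
1 / extractBF(bf10)$bf
```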
Manipulation Check: Personalization of the Generated Texts. To check whether the personalization was successful, we computed the cosine similarity of training texts and generated texts for each participant. We had to remove one participant because the returned cosine similarity was zero. We expected the similarity of texts to be larger for the true personalization group than for the pseudo-personalization group that received non-personalized texts. Indeed, on average, the cosine similarity was larger for the true personalization group (M = .55, SD = .06) than for the pseudo-personalization condition (M = .53, SD = .06), t(30.90) = 3.86, p < .001, d = 1.32. Table 1 confirms that the personalized texts tended to be a better match for the participants' personal writing styles than the pseudo-personalized texts.
H1.1 & H3: Sense of Ownership. Next, we tested whether the applied Interaction Methods changed the sense
of authorship for the text generated in the human-AI interaction. In a repeated-measures ANOVA with the factors
Personalization (between, 2 levels; true personalization vs. pseudo-personalization) and Interaction Method
(within, 4 levels; Writing,Getting,Choosing,Editing), we analyzed the item “To whom should this text belong”
that participants answered on a visual analog scale (range of -50 me and +50 AI)
16
. Due to the violation of the sphericity
assumption, 𝑊>.45,𝑝<.001, we applied a Greenhouse-Geisser correction to all within-subject factors.
We found a signicant eect of Interaction Method,
𝐹(
2
.
02
,
6
.
58
)=
55
.
75,
𝑝<.
001,
𝜂2
𝑝=.
67. All AI-related Inter-
action Methods diered from the Writing condition (all
𝑝
<.001), but none of the AI-related Interaction Methods
^16 The assumption of normality of residuals was violated (p < .001). Therefore, we computed a non-parametric ANOVA [63]. As with the parametric model, there was an effect of Interaction Method, F(3, 84) = 45.24, p < .05. Post-hoc tests resembled the parametric model; all other effects were p > .05.
Table 2. Participants' responses to selected Likert matrix questions (1 = Strongly disagree, 7 = Strongly agree)

                                                            Writing      Editing      Choosing     Getting
                                                            MD   SD      MD   SD      MD   SD      MD   SD
I felt like the AI was acting as a tool which I could
control.                                                    4    2.15    5    1.52    5    1.75    4    2.27
I felt like the AI system was acting as a ghostwriter,
writing the postcard on my behalf.                          2.5  2.18    5    1.63    6    1.71    6.5  2.32
I felt like I was writing the text and the artificial
intelligence was assisting me.                              1.5  2.44    3    1.99    2.5  2.16    2    2.03
I felt like the artificial intelligence was writing the
text and I was assisting.                                   1    1.73    5.5  1.91    5    1.82    4    2.13
Fig. 2. Mean ratings for each Interaction Method of the item "To whom should this text belong?", answered on a visual analog scale (range of -50 = me to +50 = AI), with individual data points as boxes. Error bars indicate ± one standard error of the mean.
diered from each other (all
𝑝>.
05), see also Figure 2. We did not nd an eect of Personalization (
𝐹(
1
,
28
)=
0
.
52,
𝑝=.
477,
𝜂2
𝑝=.
02), or an interaction eect of Interaction Method
×
Personalization (
𝐹(
2
.
02
,
56
.
58
)=
0
.
48,
𝑝=.
625,
𝜂2
𝑝=.
02). We followed up on the non-signicant main eect of Personalization with an independent
Bayesian
𝑡
-test and found that the model assuming no dierence is 2.38 times more like than the model assuming
dierences in means given the priors17.
Therefore, we conclude that text generated by a (pseudo-)personalized AI changed the sense of ownership (H1.1): all AI-related methods elicited a sense of ownership different from Writing. However, personalization did not affect whether participants attributed authorship (H3). Note that in Figure 2, one can see that some participants attributed a sense of ownership to the AI even in the Writing condition. This could be due to participants taking over formulations from the AI-generated texts in the other conditions, or due to careless responses.
H1.2: Declaration of Authorship. Did the sense of ownership translate into declaring the AI as an author on the website? For this, we considered the free-text entries in the author field of our blog upload form (see Section 4.1.2).

^17 Priors were set to default here with √2/2 on the standardized Cauchy prior as implemented in [78]. Prior sensitivity checks showed that with wider priors, the model assuming no difference between means became more likely.
We classified the entries as referring to the participants themselves (i.e., a name or acronym), the AI, an impersonal attribution ("Human", "Random"), or a combination of several categories (e.g., "AI and Kyle").

Curiously, although participants had little sense of ownership, they still added their name to the postcard when publishing it on the website; see Figure 3. We find that people declared themselves as authors for the Writing condition, but that they also declared themselves as authors for the (pseudo-)personalized AI-generated texts. We tested the differences in proportions against chance with Cochran's Q test. Interaction Method had a significant effect on the frequency of mentioning "AI", Q(3) = 18.54, p < .001. Only six to seven participants per Interaction Method (except Writing) mentioned AI when declaring authorship on the postcard. For the Editing condition, all mentions of the AI were in collaboration with the participant or an impersonal attribution; for example, they put their name and "AI". For the Choosing condition, three of six mentions listed only "AI"; the other three were declared as a collaboration of the AI and the participant or a "human". For the Getting condition, this was the case for only one of the seven occurrences; all other cases mentioned "AI" as the sole author. Planned comparisons between the Writing and the other conditions were all significant (all p < .03); however, no pairwise comparisons among Editing, Getting, and Choosing were significant (all p > .05). Thus, we conclude that while AI was declared as an author for postcards produced with AI interactions, this was not the case for all participants (partial support for H1.2).
Figure 3 further illustrates the discrepancy between the sense of authorship and the declared authorship, contrasting the sense of authorship against the count data for all conditions with AI-generated personalized texts. Although participants judged the text to belong to the AI in most cases, they still put their own name underneath the postcard in most cases. We refer to this discrepancy as the AI Ghostwriter Effect.
The questionnaire results and participant statements provide insights into reasons for the AI Ghostwriter Effect. For example, when participants were not writing their own texts, they tended to feel that the AI was a tool and that it acted as a ghostwriter (see Table 2). Seventeen participants (56.7%) stated that in their opinion, it should not be mandatory to mark texts that were created with the help of AI as such. Reasons they mentioned include that the AI "is really just a tool to say what the writer wants to say" and that "as long as the person using the AI reviews the text before it is sent then it should be ok". Several participants also felt that disclosing AI support would not be valued, e.g.: "People would feel offended if they thought I couldn't be bothered to text time myself". Others explicitly mentioned human ghostwriters as a comparison: "prior to machine generated text there was a whole industry of humans filling content space in a similar manner". The (pseudo-)personalization also made a difference: one participant from the non-personalized group noted that marking is not necessary, "especially if it reflects who you are and you chose the specific message that suits you the best".

Conversely, 13 participants (43.3%) felt that disclosing AI contributions should be mandatory. Arguments included ethics and transparency, e.g.: "It's important to know whether you are interacting with a human or an algorithm" and "to understand if something has been said that doesn't really make sense". Others mentioned self-protection and that otherwise "it feels a bit like cheating".
H2.1: Sense of Control. We analyzed participants' responses to the item "Who was in control over the content of the postcard?", which was answered on a visual analog scale (range of -50 = me to +50 = AI), with a repeated-measures ANOVA model. Again, the sphericity assumption was violated, W > .60, p < .02, and we applied a Greenhouse-Geisser correction to all within-subject factors and their interaction terms. The normality of residuals was not violated (p = .14). We found that Interaction Method affected the sense of control, F(2.25, 62.95) = 28.18, p < .001, ηp² = .50. Bonferroni-corrected t-tests revealed significant differences between all comparisons (all p < .05), with the exception of the comparison between Getting and Choosing (p > .99; see also Figure 4).
Fig. 3. Sense of ownership (x-axis: "To whom should this text belong?", -50 = me to +50 = AI) and declaration of authorship as a function of Interaction Method for AI-generated texts. Error bars indicate ± one standard error of the mean. Counts of authorship declarations, binarized as no mention vs. mention of AI as an author (n = 30): Writing 30 (100%) vs. 0 (0%), Editing 23 (76.7%) vs. 7 (23.3%), Choosing 24 (80%) vs. 6 (20%), Getting 23 (76.7%) vs. 7 (23.3%). Note that all participants underwent all Interaction Methods. One participant declared the writer of the postcard to be "random" for all Interaction Methods that involved AI-generated texts; these were allocated to the no-mention-of-AI category.
Fig. 4. Sense of control ("Who was in control over the content of the postcard?", -50 = me to +50 = AI) as a function of Interaction Method for AI-generated texts. Error bars indicate ± one standard error of the mean.
Therefore, we can consider that our manipulation of Interaction Method successfully varied the sense of control in the human-AI interaction. There was neither an effect of Personalization, F(1, 28) = 1.05, p = .315, ηp² = .04, nor an interaction effect, F(2.25, 62.95) = 0.60, p = .570, ηp² = .02. Comparing the personalization groups with a Bayesian t-test^18, we could quantify that the model assuming no difference was 1.98 times more likely than the model assuming differences between groups.

^18 Priors were set to default here with √2/2 on the standardized Cauchy prior, as implemented in [78]. Again, prior sensitivity analysis indicated that with wider priors, the model assuming no difference between means became more likely.
Fig. 5. A: Sense of control ("Who was in control over the content of the postcard?") as a function of the sense of ownership ("To whom should this text belong?") for AI-generated texts. B: Sense of leadership ("Who took the lead in writing?") as a function of the sense of ownership for AI-generated texts. Dots show the raw data and the line represents the marginal effect of the sense of control on the sense of authorship without the effect of Interaction Method.
We expanded the ANOVA model by fitting a linear mixed model to predict the sense of ownership from the sense of control and the Interaction Method. The model included participants as a random effect to account for the structure of the repeated measures in the data. The model's total explanatory power was substantial (conditional R² = 64%; marginal R² = 55%). As in the prior analysis of authorship, we find a significant effect of Interaction Method on the sense of ownership, F(3, 86.02) = 28.00, p < .001. The sense of control also predicted the sense of ownership, F(1, 101.55) = 6.47, p = .02. There was a statistically significant and positive effect (β = 0.21, 95% CI [0.05, 0.37], t(109) = 2.57, p = .012; βz = 0.20, 95% CI [0.05, 0.36]). For a predictive plot of the effect of the sense of control on the sense of ownership with raw data, see Figure 5A. Neither the main effect of Personalization nor the interaction of Personalization × Interaction Method was significant (both p > .05).
H2.2: Sense of Leadership. Participants' responses to "Who took the lead in writing?", again answered on a visual analog scale (range of -50 = me to +50 = AI), were analyzed in a mixed ANOVA model^19. There was a significant effect of Interaction Method, F(2.25, 62.95) = 28.18, p < .001, ηp² = .50, but no effect of Personalization, F(1, 28) = 1.05, p = .315, ηp² = .04, nor an interaction effect, F(2.25, 62.95) = 0.60, p = .570, ηp² = .02. Note that a Bayesian t-test, as modeled for the sense of control, indicated that the model assuming no difference for Personalization was 2.52 times more likely than the model assuming differences. Post-hoc tests showed significant differences for all Interaction Methods, with the exception of the difference between Getting and Choosing, p > .99; see Figure 6.
We again expanded the ANOVA by fitting a linear mixed model, this time predicting the sense of ownership from the sense of leadership. We found a similar pattern of results.

^19 Again, the assumption of normality of residuals was violated (p = .03). We therefore computed a non-parametric ANOVA [63]. Resembling the parametric model, there was an effect of Interaction Method, F(3, 84) = 30.26, p < .05. Non-parametric post-hoc tests resembled the parametric analysis.
Fig. 6. Sense of leadership ("Who took the lead in writing during this interaction method?", -50 = me to +50 = AI) as a function of Interaction Method for AI-generated texts. Error bars indicate ± one standard error of the mean.
There was a significant effect of Interaction Method on the sense of ownership, F(3, 84.91) = 23.71, p < .001, no main or interaction effect involving Personalization (both p > .05), but a significant main effect of the sense of leadership, F(1, 95.65) = 5.93, p = .016. The relation between the sense of ownership and the sense of leadership was positive (β = 0.21, 95% CI [0.04, 0.37], t(109) = 2.46, p = .015; βz = 0.19, 95% CI [0.04, 0.34]); see also Figure 5B.

In sum, the Interaction Methods affected the sense of control over the content and the sense of leadership in the interaction. These, in turn, predicted the sense of authorship and, thus, to whom participants felt a text should belong (H2).
4.3 User Experience
We analyzed all subscales of the AttrakDiff with a repeated-measures ANOVA model with Greenhouse-Geisser correction. For hedonic quality identification, there was no effect of Personalization, F(1, 28) = 0.76, p = .391, ηp² = .03, but a significant effect of Interaction Method, F(2.23, 62.32) = 8.50, p < .001, ηp² = .23. Bonferroni-corrected post-hoc comparisons on Interaction Method indicated that the Getting (M = 3.74, SD = 1.28) interaction was perceived worse than Writing (M = 4.78, SD = 0.88), Choosing (M = 4.47, SD = 1.21), and Editing (M = 4.81, SD = 1.06; all other comparisons p > .05). The Personalization × Interaction Method term did not reach significance, F(2.23, 62.32) = 2.24, p = .109, ηp² = .07. Therefore, the Getting interaction can be regarded as the least self-relatable method of interaction.
For the ANOVA on hedonic quality stimulation, we find a similar pattern of results: Personalization, F(1, 28) = 0.34, p = .562, ηp² = .01; Interaction Method, F(3, 84) = 4.61, p = .005, ηp² = .14; and Personalization × Interaction Method, F(3, 84) = 2.29, p = .084, ηp² = .08. Bonferroni-corrected comparisons on the factor Interaction Method only indicated differences between the Getting (M = 3.91, SD = 1.33) and the Editing (M = 4.81, SD = 1.06) condition (all other comparisons p > .05). Unsurprisingly, the Getting interaction was thus deemed less stimulating than the Editing method.
For the analysis of attractiveness, the same pattern applies: Personalization, F(1, 28) = 0.97, p = .332, ηp² = .03; Interaction Method, F(3, 84) = 8.07, p < .001, ηp² = .22; and Personalization × Interaction Method, F(3, 84) = 2.27, p = .086, ηp² = .08. Again, Getting (M = 3.99, SD = 1.66) was significantly less attractive than Writing (M = 5.24, SD = 0.89), Choosing (M = 4.94, SD = 1.39), and Editing (M = 5.13, SD = 1.38; all other comparisons p > .05).
Fig. 7. AttrakDiff portfolio for the Interaction Methods in Study 1, plotting hedonic quality against pragmatic quality (regions ranging from "too self-oriented" through "self-oriented", "desired", "neutral", "task-oriented", and "superfluous" to "too task-oriented").
The ANOVA on the pragmatic qualities revealed a different pattern of results. There was again no effect of Personalization, F(1, 28) = 1.85, p = .185, η²p = .06, but an effect of Interaction Method, F(3, 84) = 4.20, p = .008, η²p = .13. Note, however, that this was qualified by a significant Personalization × Interaction Method interaction term, F(3, 84) = 3.46, p = .020, η²p = .11. A Bonferroni-corrected comparison of means across Personalization × Interaction Method revealed that pragmatic qualities for the participants who did not receive personalized texts in the Getting condition (M = 4.32, SD = 1.04) differed significantly from the Choosing condition (M = 5.14, SD = 0.84) and also from the Writing condition (M = 5.70, SD = 0.74; all other comparisons p > .05) in the group that received personalized texts. This indicates that Getting is not subjectively perceived as very pragmatic, even though it is clearly the Interaction Method with the most time-efficient interaction pattern.
Overall, Figure 7 shows that Getting was the most neutral Interaction Method in absolute terms and that the user experience for Writing and Editing was comparable. Choosing required minimal interaction with the AI model and was thus probably rated as more pragmatic than Getting.
5 USER STUDY 2: DECLARED AUTHORSHIP WITH HUMAN AND AI GHOSTWRITERS
With Study 1, we demonstrated that personalized AI-generated texts diminish the sense of ownership compared to self-written texts. This stands in contrast to the declaration of authorship: in the majority of cases, participants did not declare the AI system as an author of the text when publishing it online. We summarized this as the AI Ghostwriter Effect. We found that Interaction Methods affect the sense of control in the human-AI interaction, which increases the sense of ownership. Personalization did not increase the sense of ownership, the sense of control, or, descriptively, authorship declaration. Therefore, true personalization was not a necessary condition for the AI Ghostwriter Effect. This pattern of results allowed us to run a Wizard-of-Oz study with respect to personalization: in Study 2, we describe the postcards as personalized but actually assign non-personalized texts randomly to the participants.
In Study 2, we address four important limitations of the previous study. First, we found that participants attributed a sense of authorship to the AI even for the self-written texts. This could be due to our experimental design, in which people underwent all Interaction Methods and only then were asked to declare their sense of ownership for the four postcards. Participants could have taken up information in their self-written texts that appeared in the previous AI-generated texts. We address this limitation by using only one Interaction Method. To thoroughly test the AI Ghostwriter Effect, we chose Getting, because it had the lowest sense of ownership and the most mentions of AI as an author. Second, to attribute any authorship to the AI-generated texts, participants had to come up with a label for the AI; this prompted one participant to label the postcard with "random". In Study 2, we provide participants with six pre-defined authorship labels, motivated by Study 1, that they can choose from. Third, one could argue that not declaring collaboration is not specific to human-AI interaction but also happens in human-human interaction, i.e., in the case of human ghostwriters. Thus, we add a fictional human ghostwriter. Fourth, the small sample size did not allow for an accurate estimation of the relative frequencies of declaring the AI as an author; a larger study allows for a more accurate estimation of this effect. As listed in Section 3, we hypothesize that we can replicate the AI Ghostwriter Effect and that it exceeds the ghostwriter effect for human-human interaction. Study 2 was conducted in December 2022.
5.1 Method
5.1.1 Procedure. The study consisted of two parts: (1) a pseudo-personalization survey and (2) interaction with
generated postcards. Before starting with Part 1, we informed participants about the study procedure and their rights
and obtained consent.
Part 1: Personalization Questions. First, participants were asked to upload five texts of at least 500 words that they had written in the past. We told them that the texts should preferably be informal texts such as personal emails and that they had the option to remove or change any detail they did not want to share. This part was intended to make participants believe that texts would be generated for them. Since the previous study had shown no effect of personalization, the texts were not actually used. However, participants entered a name or pseudonym, and we performed a pseudo-personalization by applying the provided name as the greeting in the postcard (see the sketch below). Participants also answered questions on their attitude towards AI (cf. Table 5).
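As an illustration, this step can be sketched as follows (the greeting format and the example text are invented; only the use of the provided name as the greeting reflects our procedure):

    # Minimal sketch: the name or pseudonym from Part 1 is applied as the
    # greeting of an otherwise fixed, pre-generated postcard text.
    def pseudo_personalize(postcard_body: str, name: str) -> str:
        """Prepend a personalized greeting to a non-personalized postcard body."""
        return f"Dear {name},\n\n{postcard_body}"  # greeting format is an assumption

    print(pseudo_personalize("Greetings from New York! The city was amazing...", "Alex"))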
Part 2: Authorship Attribution. One day later, we randomly presented participants with one of the 15 previously generated postcard texts. They were told that this text was written either by a human or by an AI ghostwriter and that it was personalized using their input from Part 1. They were then shown a blog-style upload form for the generated text and asked to choose an image and provide metadata: a title, the date, and an author attribution. This time, the author(s) were selected from a randomized list including the options AI, GPT-3, Human, Sasha (the name of the fictional ghostwriter), Other (+ text field), and the participant's name, initials, or pseudonym provided in Part 1. Directly after the upload, participants answered questions on their perceived agency and how they liked the resulting postcard. This process was repeated for the second ghostwriter type, i.e., AI or human; the order of the conditions was counterbalanced. After the interaction with the two ghostwriter types, we administered questions on the participants' reasoning on authorship attribution and re-assessed their attitude towards AI after participation in the study.
5.1.2 Apparatus & Measures. The study was implemented with two Qualtrics surveys. As in Study 1, we measured the
participants’ sense of ownership, control, and leadership. Moreover, we counted how often AI was mentioned as an
author. An overview of all recorded measures is shown in Table 5.
5.1.3 Participants. We recruited 107 native English speakers via Prolific. One participant was excluded after Part 1 for not following the instructions. All others were invited to participate in Part 2, and 100 did so. Of these, four failed an attention check. Analyses only include the remaining 96 participants who correctly completed both parts. These participants reported their gender as male (53) or female (43); the options non-binary, other, and undisclosed were not selected. They were between 20 and 75 years old (M = 36.3, SD = 13.3). Their current countries of residence were the UK (41), South Africa (25), Canada (15), and Ireland (5); 1-3 participants each listed Mexico, Australia, Spain, Israel, Portugal, and the US. As in Study 1, we asked for their experience with text generation: 55 said they had previously used smart-reply features, 81 word or sentence suggestions, 63 auto-completion, and 86 auto-correction. Thirteen participants sometimes have texts written on their behalf by other people. The full study took around 40 minutes to complete, and participation was compensated with £5.
5.2 Results

H4: Differences in Sense of Ownership. We compared the sense of ownership for the Human and the AI condition and found a significant difference, t(95) = 2.05, p = .043, d = 0.21.²⁰ People attributed more ownership to themselves when the ghostwriter was presented as an AI, compared to when it was presented as a human, see Figure 8.
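A minimal sketch of this comparison (not our analysis code; the arrays are invented placeholders for the per-participant ownership ratings):

    # Minimal sketch: paired t-test on the sense-of-ownership ratings for
    # the AI vs. the Human ghostwriter condition, plus Cohen's d for
    # paired data and the non-parametric check from footnote 20.
    import numpy as np
    from scipy import stats

    ownership_ai = np.array([10.0, -5.0, 20.0, 0.0, 15.0, -10.0, 5.0, 25.0])
    ownership_human = np.array([0.0, -15.0, 10.0, -5.0, 5.0, -20.0, 0.0, 10.0])

    t, p = stats.ttest_rel(ownership_ai, ownership_human)
    diff = ownership_ai - ownership_human
    d = diff.mean() / diff.std(ddof=1)  # standardized mean of the paired differences
    print(f"t({len(diff) - 1}) = {t:.2f}, p = {p:.3f}, d = {d:.2f}")

    # Wilcoxon signed-rank test as a robustness check (normality violated):
    w, p_w = stats.wilcoxon(ownership_ai, ownership_human)
    print(f"W = {w:.0f}, p = {p_w:.3f}")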
H5: Differences in Authorship Declaration. We found differences in the declaration of authorship (H5). Generally, while people mentioned AI as an author more often than before (about 71.3% in Study 2 versus a maximum of 23.3% in Study 1), they still mentioned AI significantly less often, χ²(1) = 31.92, p < .001, than mentioning "Sasha" or any entity other than themselves, see Table 3.
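As an illustration of how such frequency comparisons can be computed (this uses the aggregated counts from Table 3 and is not the exact contrast behind the reported χ²(1) = 31.92; with the raw per-participant data, the paired design would call for a McNemar test):

    # Illustrative sketch only: chi-square test of independence on the
    # aggregated Table 3 counts (mention vs. no mention per condition).
    from scipy.stats import chi2_contingency

    table = [[71, 25],   # AI condition: mention / no mention of ghostwriter
             [79, 17]]   # Human condition: mention / no mention
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")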
This trend also shows in the question of whether it should be mandatory to mark texts that were created with the help of a human/AI as such. For the human condition, 63 participants (65.6%) answered yes, compared to 59 (61.5%) for the AI condition (all others selected no). For the declaration of human contributions, participants listed reasons such as emotional involvement, e.g., “we don’t want to hurt their feelings, right?” and “I would expect to be given credit if it was me”, and that “it is their intellectual property”. Another argument was derived from the perceived authenticity of human-written texts, e.g., “if it’s truly from a human telling you about the lived experience then it should have their name on it” and “because it truly reflects the value of a human being”. One participant who felt that only human ghostwriters should be credited explained that “that person is not [a] tool, whereas the system AI is a tool that can [be] used.” Reasons for not naming a human ghostwriter were references to existing ghostwriting practices, e.g., “people that do this for a living” and “it is the equivalent of a personal assistant, who wouldn’t necessarily sign their own name”. One participant again referred to the personalization: “i am a little bit in between but as i had given examples of my writing style then no”.
As in Study 1, participants argued that AI contributions should be credited for ethics and transparency, e.g.: “It seems ethical to declare this”, and it “help[s] give an idea to the receiver to understand the background context”. In addition, not declaring contributions “could have legal ramifications”. Interestingly, several participants added that crediting AI confirms its potential, e.g., “this will also help put a positive image on AI.” and “It gives people an understanding of how far AI has come”. Those who did not consider marking AI mandatory frequently described AI as a tool, e.g., “it expresses the feelings and viewpoints of the owner. it is just a tool used by the owner.” and “AI is a tool we can use, just like we don’t mark that we used Word or Google Docs to show we made something”. Similarly, one person said that “Artificial Intelligence is used on behalf of a person. It should be up to the person publishing whether or not to disclose AI usage”. The context also played a role, e.g., “the recipient might think that not much thought would have been sent into the sending of the postcard”.

²⁰Normality was violated (p < .01). A non-parametric Wilcoxon test was also significant, W = 1095, p = .018, r = .10.

Fig. 8. Sense of authorship (“To whom should this text belong?”, slider from Me (-50) to AI/Sasha (+50)) as a function of Ghostwriter. Error bars indicate ±one standard error of the mean.

Fig. 9. Sense of leadership (“Who took the lead in writing during this interaction method?”, slider from Me (-50) to AI/Sasha (+50)) as a function of Ghostwriter. Error bars indicate ±one standard error of the mean.

Table 3. Counts for each Ghostwriter of declared authorship, binarized by any mention of AI/Human as authors. Note that all participants underwent both Ghostwriter conditions.

                               AI           Human
  mention of Ghostwriter       71 (73.9%)   79 (82.3%)
  no mention of Ghostwriter    25 (26.0%)   17 (17.7%)
H6: AI Ghostwriter Effect. To evaluate H6, we computed the Spearman rank correlation of the sense of ownership and the declaration of AI as an author. The link between the sense of ownership and declaring AI's authorship was significant but very small, rs = .24, p = .017. Note that the same holds true for the link for the human ghostwriter, rs = .26, p = .011. Therefore, the sense of ownership neither maps onto the declaration of authorship in a 1:1 relation nor is strongly correlated with it, supporting H6.
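A minimal sketch of this correlation (with invented placeholder values; declared_ai codes whether AI was listed as an author):

    # Minimal sketch: Spearman rank correlation between the sense of
    # ownership and the binary declaration of AI as an author.
    from scipy.stats import spearmanr

    ownership = [-40, -10, 25, 30, -5, 15, 40, -20]  # toy slider values, Me (-50) to AI (+50)
    declared_ai = [0, 1, 1, 1, 0, 1, 1, 0]           # 1 = AI listed as an author

    rho, p = spearmanr(ownership, declared_ai)
    print(f"rs = {rho:.2f}, p = {p:.3f}")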
Fig. 10. Sense of control (“Who was in control over the content of the postcard during this interaction method?”, slider from Me (-50) to AI/Sasha (+50)) as a function of Ghostwriter. Error bars indicate ±one standard error of the mean.
Subjective Control and Leadership. We found no difference in the sense of control between the two conditions, t(95) = 0.28, p = .784, d = 0.03²¹ (see Figure 10). For the sense of leadership, we found no significant difference in means with a Welch-corrected t-test, t(95) = 1.10, p = .274, d = 0.11. However, there was a significant effect when comparing ranks with a Wilcoxon test, W = 876, p = .021, r = .12, which was probably due to the large skew in the data (normality was violated, p < .01; see Figure 9).

²¹A Bayesian t-test with default priors indicated that the model assuming no differences is about 8.5 times more likely than the model assuming a difference.
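The robustness check used here can be sketched as follows (a minimal sketch with randomly generated placeholder ratings; scipy's normaltest implements the D'Agostino-Pearson omnibus test, cf. [28]):

    # Minimal sketch: test the normality of the paired differences and
    # fall back to a rank-based Wilcoxon test when it is violated.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)                    # placeholder data only
    lead_ai = rng.integers(-50, 51, size=96).astype(float)
    lead_human = rng.integers(-50, 51, size=96).astype(float)

    diff = lead_ai - lead_human
    k2, p_norm = stats.normaltest(diff)               # D'Agostino-Pearson omnibus test
    if p_norm < .05:                                  # skewed data: compare ranks
        w, p = stats.wilcoxon(lead_ai, lead_human)
        print(f"Wilcoxon W = {w:.0f}, p = {p:.3f}")
    else:
        t, p = stats.ttest_rel(lead_ai, lead_human)
        print(f"t({len(diff) - 1}) = {t:.2f}, p = {p:.3f}")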
6 DISCUSSION
Our studies show that there is a discrepancy between the sense of ownership and the declaration of authorship when working with personalized AI text generation. This constitutes the AI Ghostwriter Effect: in a text-generation scenario, people will often declare themselves, and not the AI that generated the text for them, as the author. This is the case even when they feel that the ownership lies with the AI.
Study 1 gave a first indication of the AI Ghostwriter Effect in a postcard-writing scenario: For all Interaction Methods where participants received AI support, they tended to state that the text should belong to the AI (H1.1). Despite this, less than one-fifth of the participants declared the AI as an author of the postcard (H1.2). The study also showed that the sense of control and leadership were positively correlated with the level of influence over the text generation, which ranged from receiving a non-editable AI text (Getting) to writing the text without AI support (Writing; H2). Whether a participant received a personalized or pseudo-personalized generated text made no difference for the sense of ownership and the declaration of authorship (H3). Therefore, it was not necessary to include Personalization as an independent variable in Study 2.
With the pre-registered Study 2, we were able to replicate the AI Ghostwriter Effect in a larger sample and to compare it to human ghostwriting. In a Wizard-of-Oz design, the same pseudo-personalized texts were presented as being produced either by an "AI" or by a ghostwriter introduced as a person named "Sasha". The participants' sense of ownership was significantly lower when they received a supposedly human-written text than an AI-written text. Together with the participant statements, this suggests that the AI is seen as a tool rather than an independent co-author (H4). In line with the tendency of humans to more readily exploit AI than a human [62], the human ghostwriter was also credited more often than the AI ghostwriter (H5). Study 2 provided participants with a multiple-choice list of possible options to choose from, rather than an open author declaration. Thus, they could immediately see that it was possible to choose the AI as an author. Nonetheless, there were still cases where they did not acknowledge the AI or human ghostwriter (H6).
Our ndings extend the research on human-AI collaboration in writing. Specically, we focus on cognitive and
behavioral implications and on interaction design. They also motivate recommendations for the design of AI-text-
generation interfaces and for the extension of authorship frameworks to support text production.
6.1 Implications for Cognition and Behavior in Writing
When AI systems are introduced to the writing process, they can impact cognitive and behavioral patterns in text production, such as changes in control and in the dynamics of the established writing process that, in turn, change ownership and authorship for users.
Involvement and Psychological Ownership. With the automation introduced by interactive AI, influence over the interaction is one core issue [12, 93]. Recall that levels of influence did not affect the sense of ownership directly in Study 1, but that the sense of ownership did change as a function of the sense of control over the content and the sense of leadership in the interaction, both of which were affected by our Interaction Methods. The regression models, which take into account individual variation with respect to control, highlight that there is large variation in the mental model of human-AI interaction with respect to user involvement. This is in line with Pierce et al. [83], who found that ownership increases with one's own subjective involvement. Thus, not only objective control, e.g., as induced by different Interaction Methods, but also the subjective experience of control in human-AI interaction is essential to understanding how users relate to AI-generated content in terms of ownership.
AI Support and the Cognition of Writing. Including text suggestion tools in writing tasks also shapes the cognitive processes involved in writing. Originally, the commonly used Flower and Hayes model [40] describes writing as an iterative process comprising the subprocesses of planning, translating, and revising at different levels of granularity, all internal to the writer. Hayes' revised model [53] extends and adapts this model, e.g., by introducing the "Task Environment" as an element. This also acknowledges the role of technology in transcription, but its position is still that of a tool at the writer's command. Bhat et al. [13] argue that, by now, suggestion systems surpass this passive role. In fact, they found that suggestion systems influence what people write, even when the suggestions do not reflect the writers' own opinions (see also [60]). Consequently, they propose an adaptation of Hayes' model with interconnections between the system and the text written so far. The active role of AI systems also has consequences for authorship: if the text is influenced by a system, can an author still take full responsibility?
Algorithm Exploitation. We found a large gap between the declaration of authorship in human-AI interaction when compared to human-human interaction (Study 2): relative to a human writer, our participants mentioned AI support less often. This is in line with recent studies of algorithm exploitation. Karpus et al. [62] found that while users of automated vehicles are well aware of traffic rules and of the role of automated vehicles on the road, drivers are keen to exploit automated cars as compared to cars driven by human drivers. Thus, we find that algorithm exploitation also holds for text production with AI with respect to ghost-authorship. Nevertheless, how algorithm exploitation develops and how it can be mitigated still needs to be researched more closely.
6.2 Designing With and For Perceived Authorship

The AI Ghostwriter Effect has implications for interaction design: human-centered design for AI writing tools should consider (perceived) authorship as a design dimension. A fundamental next step is to gain more insight into the relevant UI characteristics. These might include levels of control [94], roles in the writing process [72], features for reflecting on produced text, the representation of the AI (e.g., anthropomorphized or not, considering the results for the AI versus the human ghostwriter in Study 2), and how users can attribute authorship after finishing writing (see Section 6.3).
For instance, as our UI comparison in Study 1 suggests, the UI would at least need to afford manual edits of AI-written text to facilitate perceived authorship and control on the user's part, if that is a design goal in a particular use case. While Getting put the model in full control over writing the text, providing editing options shifted perceived control and leadership towards the user, and descriptively also (slightly) shifted perceived ownership. Beyond our UIs here, displaying more than one suggestion might also affect this perception if combined with editing, as users could then express more "editorial decisions" overall. More generally, rich GUI components could be studied that allow users to make decisions about the AI in advance or intermittently: For example, a GUI could offer item pickers to adjust the sentiment, scope, or length of the text (cf. the UI of Grammarly²²). A future study could examine how such controls over the generative process (vs. the final result) affect the effects observed here.
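To make this concrete, a hypothetical sketch of how such picker values could be mapped onto the instruction for a generative model (function name and format are invented, not an existing API):

    # Hypothetical sketch: GUI item pickers (sentiment, scope, length) are
    # translated into the generation instruction, giving users control
    # over the generative process rather than only over the final result.
    def build_instruction(topic: str, sentiment: str, length: str) -> str:
        """Compose a text-generation instruction from GUI picker values."""
        return (f"Write a {length}, {sentiment} postcard about {topic}. "
                "Keep the tone personal and informal.")

    print(build_instruction("a short visit to New York", "enthusiastic", "brief"))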
6.3 Implications for UI Design for Authorship Declaration

Besides implications for UI design for writing with AI, our findings also motivate a closer look at UI design for attributing authorship. In our studies, we used a free text field (Study 1) and a drop-down list (Study 2), but the design space is much larger: For instance, let us assume that a design goal is transparency about the use of AI text generation. In this case, we might take inspiration from (early) mobile email "disclaimers" (e.g., "Written on the go, may contain typos" or just "Sent from my iPhone"). Similarly, authors might end with an annotation such as "Written with model X", which would reveal AI influence on the level of the whole text.

On a deeper level, such as sentences, interactions following the "details-on-demand" concept from information visualization [92] might reveal more: For instance, news articles typically list author names. These could be extended such that clicking on a name highlights the parts of the text authored by that person. "AI generated" might then simply be another option, besides human names. Note that this does not necessarily imply that the AI is presented as on par with human authors. Therefore, interfaces for declaring AI contributions in the writing process should be designed around the user's mental model.
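A hypothetical data structure for such details-on-demand attribution could look as follows (all names and offsets are invented):

    # Hypothetical sketch: each span of the published text carries its own
    # attribution, so clicking a byline can highlight the matching parts,
    # with "AI generated" as one attribution option among human names.
    from dataclasses import dataclass

    @dataclass
    class AttributedSpan:
        start: int    # character offset in the published text
        end: int
        author: str   # e.g., a person's name or "AI generated"

    article = [
        AttributedSpan(0, 120, "Jane Doe"),
        AttributedSpan(120, 310, "AI generated"),
        AttributedSpan(310, 420, "Jane Doe"),
    ]

    def spans_by(author: str) -> list:
        """Spans to highlight when the given byline is clicked."""
        return [s for s in article if s.author == author]

    print(spans_by("AI generated"))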
6.4 Extending Contribution Frameworks for Human-AI Collaboration

We found that when authors are given the explicit option to credit the AI (Study 2), they are more likely to do so than when they are not (Study 1). Echoing this, researchers in recent publications struggle with how they should report text generation [70], and policies vary from banning text generation to mandatory labeling of AI-based text generation [38]. Our data cannot support such a simple approach to the problem: for example, we found that perceived ownership varies heavily with subjective controllability when using the AI model. Thus, a more continuous approach to AI declaration may be needed to fit the writer's mental model. The CRediT taxonomy [2, 3, 16] for roles in authorship contribution serves the very purpose of finely declaring author involvement in different stages of scientific manuscript production (e.g., A wrote the draft, B supervised the study). Adapting frameworks like CRediT to suit the contributions of algorithms could enrich the debate from a human-centered perspective grounded in common practices.
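As a sketch of what such an adaptation could look like (the role label for the AI contributor is invented; the human roles follow CRediT):

    # Hypothetical sketch: a CRediT-style contribution record extended to
    # algorithmic contributors, so AI involvement is declared per role
    # rather than as a binary author/non-author label.
    contributions = [
        {"contributor": "A", "kind": "human",
         "roles": ["Conceptualization", "Writing - original draft"]},
        {"contributor": "B", "kind": "human",
         "roles": ["Supervision", "Writing - review & editing"]},
        {"contributor": "GPT-3", "kind": "ai",
         "roles": ["Text generation - first draft"]},  # invented role label
    ]

    for c in contributions:
        print(f"{c['contributor']} ({c['kind']}): {', '.join(c['roles'])}")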
²²https://app.grammarly.com/
6.5 Limitations

Some limitations have to be taken into account with regard to the scenario, the personalization, and the interfaces used in the studies. First, the scenario of writing a postcard could be judged as somewhat artificial and simple; ideally, future studies should replicate the AI Ghostwriter Effect in other writing tasks that require longer and more nuanced texts. However, previous studies have often used academic writing tasks (e.g., [1, 79]), where personalized writing might not be as relevant, and where the context may not generalize with regard to authorship, given that in academic settings, there are established practices and rules in place.
Second, we only personalized the generated texts for half of the participants in Study 1 and not at all in Study 2. Moreover, the number of prompt-completion pairs used for fine-tuning (the data format is sketched below) was lower than what is recommended by OpenAI.²³ One could thus assert that true personalization might produce different results. However, we found no differences between true and pseudo-personalization in terms of the sense of ownership and declared authorship. This suggests that the AI Ghostwriter Effect does not depend on the quality of personalization but only on whether participants can relate to the generated text. Likewise, some of the participants were not fully satisfied with the level of personalization. Nevertheless, as the quality of personalization in AI-generated texts increases, the size of the AI Ghostwriter Effect should further increase. Thus, we deem that the choice of pseudo-personalization does not limit our results but rather specifies that the existence of the effect is not tied to the quality of personalization.

²³https://platform.openai.com/docs/guides/fine-tuning
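For context, GPT-3 fine-tuning at the time expected a JSONL file of prompt-completion pairs (cf. the OpenAI guide in footnote 23); a minimal sketch with invented placeholder texts:

    # Illustrative sketch of the prompt-completion training format; the
    # example texts are invented placeholders, not the study materials.
    import json

    pairs = [
        {"prompt": "Write a postcard about a city trip. ->",
         "completion": " Dear all, the city has been amazing so far..."},
        {"prompt": "Write a postcard about a beach holiday. ->",
         "completion": " Hi everyone, sun, sand, and sea every day..."},
    ]

    with open("finetune_data.jsonl", "w") as f:
        for pair in pairs:
            f.write(json.dumps(pair) + "\n")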
Third, we found strong differences in author declaration between the two studies: In Study 1, only about 20% of participants mentioned AI in the free text field underneath the postcard, whereas in Study 2, about 70% mentioned AI when choosing from a selection menu. Future studies should investigate more closely how to design authorship declaration interfaces so that the contributions of AI systems are declared. Here, future studies could also look at whether GPT or other LLMs might be disclosed as tools used while writing, e.g., in the methods section of a scientific article.
Fourth, our studies can merely describe how people currently declare authorship for personalized AI-generated texts. While interdisciplinary efforts in cognitive science and HCI may develop prescriptive frameworks that recommend how people should declare authorship when supported by AI, it remains open how people will apply such prescriptive frameworks.
Fifth, for the sake of brevity, the data analyzed in this study are purely quantitative and can thus only describe single points of judgment, impressions, and behavior. A more qualitative approach to studying writing, following Bhat et al. [13], could paint a more nuanced picture with regard to participants' reasoning in crediting or not crediting AI in the writing of a personalized text.
Sixth, we focused on (pseudo-)personalized AI-generated texts, as they parallel the process of ghostwriting. Therefore, our results may be biased in favor of attributing authorship to oneself. Whether the discrepancy also holds for interaction in non-personalized AI writing contexts still needs to be investigated.
7 CONCLUSION AND OUTLOOK

Our findings, based on two empirical studies with a total of 126 participants, reveal the AI Ghostwriter Effect: people often do not disclose AI as an author when using personalized AI-generated texts in writing, although they attribute ownership to the AI. This effect was independent of the interaction method and was reduced when switching from open-ended author declaration fields to predefined response options. Comparing human and AI ghostwriters, we found attributing authorship to oneself to be more prevalent when writing with AI support. Given the differences in perceived ownership and author declaration, we call for the development of authorship attribution frameworks that take the user and their relation to the generative model into account.
REFERENCES
[1]
Liz Allen, Alison O’Connell, and Veronique Kiermer. 2019. How can we ensure visibility and diversity in research contributions? How the
Contributor Role Taxonomy (CRediT) is helping the shift from authorship to contributorship. Learned Publishing 32, 1 (Jan. 2019), 71–74.
https://doi.org/10.1002/leap.1210
[2]
Liz Allen, Alison O’Connell, and Veronique Kiermer. 2019. How can we ensure visibility and diversity in research contributions? How the
Contributor Role Taxonomy (CRediT) is helping the shift from authorship to contributorship. Learned Publishing 32, 1 (2019), 71–74.
[3]
Liz Allen, Jo Scott, Amy Brand, Marjorie Hlava, and Micah Altman. 2014. Publishing: Credit where credit is due. Nature 508, 7496 (2014), 312–313.
[4]
Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2018. Sentiment Bias in Predictive Text Recommendations Results in Biased Writing.
In Proceedings of the 44th Graphics Interface Conference (Toronto, Canada) (GI ’18). Canadian Human-Computer Communications Society, Waterloo,
CAN, 42–49. https://doi.org/10.20380/GI2018.07
[5]
Kenneth C. Arnold, Krysta Chauncey, and Krzysztof Z. Gajos. 2020. Predictive Text Encourages Predictable Writing. In Proceedings of the 25th
International Conference on Intelligent User Interfaces (Cagliari, Italy) (IUI ’20). Association for Computing Machinery, New York, NY, USA, 128–138.
https://doi.org/10.1145/3377325.3377523
[6]
Kenneth C. Arnold, Krzysztof Z. Gajos, and Adam T. Kalai. 2016. On Suggesting Phrases vs. Predicting Words for Mobile Text Composition. In
Proceedings of the 29th Annual Symposium on User Interface Software and Technology. ACM, Tokyo Japan, 603–608. https://doi.org/10.1145/2984511.
2984584
[7] Clark D. Asay. 2020. Independent Creation in a World of AI. FIU Law Review 14, 2 (Jan. 2020). https://doi.org/10.25148/lawrev.14.2.5
[8]
Rosa Maria Ballardini. 2019. AI-generated content: authorship and inventorship in the age of artificial intelligence. In Online Distribution of Content
in the EU. Edward Elgar Publishing, 117–135. https://doi.org/10.4337/9781788119900.00015
[9]
Nikola Banovic, Ticha Sethapakdi, Yasasvi Hari, Anind K. Dey, and Jennifer Mankoff. 2019. The Limits of Expert Text Entry Speed on Mobile
Keyboards with Autocorrect. In Proceedings of the 21st International Conference on Human-Computer Interaction with Mobile Devices and Services
(Taipei, Taiwan) (MobileHCI ’19). Association for Computing Machinery, New York, NY, USA, Article 15, 12 pages. https://doi.org/10.1145/3338286.
3340126
[10]
Christoph Bartneck, Dana Kulic, and Elizabeth Croft. 2017. Measuring the anthropomorphism, animacy, likeability, perceived intelligence, and
perceived safety of robots. (2017), 311700 Bytes. https://doi.org/10.6084/M9.FIGSHARE.5154805 Artwork Size: 311700 Bytes Publisher: figshare.
[11]
Ac Bd, Christine Bauer, and Afsaneh Doryab. 2016. Solving the Battle of First-Authorship: Using Interactive Technology to Highlight Contributions.
In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, San Jose California USA, 609–620.
https://doi.org/10.1145/2851581.2892582
[12]
Joanna Bergström, Jarrod Knibbe, Henning Pohl, and Kasper Hornbæk. 2022. Sense of Agency and User Experience: Is There a Link? ACM
Transactions on Computer-Human Interaction 29, 4 (Aug. 2022), 1–22. https://doi.org/10.1145/3490493
[13]
Advait Bhat, Saaket Agashe, Niharika Mohile, Parth Oberoi, Ravi Jangir, and Anirudha Joshi. 2022. Studying writer-suggestion interaction: A
qualitative study to understand writer interaction with aligned/misaligned next-phrase suggestion. https://doi.org/10.48550/arXiv.2208.00636
arXiv:2208.00636 [cs].
[14]
Xiaojun Bi, Tom Ouyang, and Shumin Zhai. 2014. Both Complete and Correct? Multi-Objective Optimization of Touchscreen Keyboard. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Toronto, Ontario, Canada) (CHI ’14). Association for Computing
Machinery, New York, N Y, USA, 2297–2306. https://doi.org/10.1145/2556288.2557414
[15]
Oloff C. Biermann, Ning F. Ma, and Dongwook Yoon. 2022. From Tool to Companion: Storywriters Want AI Writers to Respect Their Personal
Values and Writing Strategies. In Designing Interactive Systems Conference. ACM, Virtual Event Australia, 1209–1227. https://doi.org/10.1145/
3532106.3533506
[16]
Amy Brand, Liz Allen, Micah Altman, Marjorie Hlava, and Jo Scott. 2015. Beyond authorship: attribution, contribution, collaboration, and credit.
Learned Publishing 28, 2 (2015), 151–155.
[17]
Deborah Brandt. 2007. "Who’s the President?": Ghostwriting and Shifting Values in Literacy. College English 69, 6 (2007), 549–571. Publisher:
JSTOR.
[18]
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish
Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler,
Jerey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner,
Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. 2020. Language Models are Few-Shot Learners. In Advances in Neural
Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 1877–1901.
https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
[19]
Daniel Buschek, Benjamin Bisinger, and Florian Alt. 2018. ResearchIME: A Mobile Keyboard Application for Studying Free Typing Behaviour
in the Wild. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’18). ACM, New York, NY, USA. https://doi.org/10.1145/3173574.3173829 event-place: Montreal, Quebec, CA.
[20]
Daniel Buschek, Martin Zürn, and Malin Eiband. 2021. The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition
Behaviour of Native and Non-Native English Writers. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama,
Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA, Article 732, 13 pages. https://doi.org/10.1145/3411764.3445372
[21]
Ana Caraban, Evangelos Karapanos, Daniel Gonçalves, and Pedro Campos. 2019. 23 Ways to Nudge: A Review of Technology-Mediated Nudging
in Human-Computer Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (Glasgow, Scotland Uk) (CHI
’19). Association for Computing Machinery, New York, NY, USA, 1–15. https://doi.org/10.1145/3290605.3300733
[22]
John Carvalho, Angie Chung, and Michael Koliska. 2021. Defying transparency: Ghostwriting from the Jazz Age to social media. Journalism 22, 3
(March 2021), 709–725. https://doi.org/10.1177/1464884918804700
[23]
Mia Xu Chen, Benjamin N. Lee, Gagan Bansal, Yuan Cao, Shuyuan Zhang, Justin Lu, Jackie Tsay, Yinan Wang, Andrew M. Dai, Zhifeng Chen,
Timothy Sohn, and Yonghui Wu. 2019. Gmail Smart Compose: Real-Time Assisted Writing. In Proceedings of the 25th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining. ACM, Anchorage AK USA, 2287–2295. https://doi.org/10.1145/3292500.3330723
[24]
Mia Xu Chen, Benjamin N. Lee, Gagan Bansal, Yuan Cao, Shuyuan Zhang, Justin Lu, Jackie Tsay, Yinan Wang, Andrew M. Dai, Zhifeng Chen,
Timothy Sohn, and Yonghui Wu. 2019. Gmail Smart Compose: Real-Time Assisted Writing. In Proceedings of the 25th ACM SIGKDD International
Conference on Knowledge Discovery & Data Mining (Anchorage, AK, USA) (KDD ’19). Association for Computing Machinery, New York, NY, USA,
2287–2295. https://doi.org/10.1145/3292500.3330723
[25]
Hyerim Cho, Chris Hubbles, and Heather Moulaison-Sandy. 2022. Individuals responsible for video games: an exploration of cataloging practice,
user need and authorship theory. Journal of Documentation 78, 6 (Oct. 2022), 1420–1436. https://doi.org/10.1108/JD-10- 2021-0198
[26]
John Joon Young Chung, Shiqing He, and Eytan Adar. 2021. The Intersection of Users, Roles, Interactions, and Technologies in Creativity Support
Tools. In Designing Interactive Systems Conference 2021. ACM, Virtual Event USA, 1817–1833. https://doi.org/10.1145/3461778.3462050
[27]
Larry D. Claxton. 2005. Scientic authorship. Mutation Research/Reviews in Mutation Research 589, 1 (Jan. 2005), 31–45. https://doi.org/10.1016/j.
mrrev.2004.07.002
[28] Ralph B d’Agostino. 1971. An omnibus test of normality for moderate and large size samples. Biometrika 58, 2 (1971), 341–348.
[29]
Robert Dale. 2021. GPT-3: What’s it good for? Natural Language Engineering 27, 1 (Jan. 2021), 113–118. https://doi.org/10.1017/S1351324920000601
[30]
Girish Dalvi, Shashank Ahire, Nagraj Emmadi, Manjiri Joshi, Anirudha Joshi, Sanjay Ghosh, Prasad Ghone, and Narendra Parmar. 2016. Does
Prediction Really Help in Marathi Text Input? Empirical Analysis of a Longitudinal Study. In Proceedings of the 18th International Conference on
Human-Computer Interaction with Mobile Devices and Services (Florence, Italy) (MobileHCI ’16). Association for Computing Machinery, New York,
NY, USA, 35–46. https://doi.org/10.1145/2935334.2935366
[31]
Hai Dang, Karim Benharrak, Florian Lehmann, and Daniel Buschek. 2022. Beyond Text Generation: Supporting Writers with Continuous Automatic
Text Summaries. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (Bend, OR, USA) (UIST ’22). Association
for Computing Machinery, New York, N Y, USA, Article 98, 13 pages. https://doi.org/10.1145/3526113.3545672
[32]
N Dehouche. 2021. Plagiarism in the age of massive Generative Pre-trained Transformers (GPT-3). Ethics in Science and Environmental Politics 21
(March 2021), 17–23. https://doi.org/10.3354/esep00195
[33]
Nicholas Diakopoulos, Kurt Luther, Yevgeniy (Eugene) Medynskiy, and Irfan Essa. 2007. The evolution of authorship in a remix society. In
Proceedings of the 18th conference on Hypertext and hypermedia - HT ’07. ACM Press, Manchester, UK, 133. https://doi.org/10.1145/1286240.1286272
[34]
Shiran Dudy, Steven Bedrick, and Bonnie Webber. 2021. Refocusing on Relevance: Personalization in NLG. (2021). https://doi.org/10.48550/ARXIV.2109.05140 Publisher: arXiv Version Number: 1.
[35]
Mark Dunlop and John Levine. 2012. Multidimensional Pareto Optimization of Touchscreen Keyboards for Speed, Familiarity and Improved
Spell Checking. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Austin, Texas, USA) (CHI ’12). Association for
Computing Machinery, New York, N Y, USA, 2669–2678. https://doi.org/10.1145/2207676.2208659
[36]
Lisa A. Elkin, Matthew Kay, James J. Higgins, and Jacob O. Wobbrock. 2021. An Aligned Rank Transform Procedure for Multifactor Contrast
Tests. In The 34th Annual ACM Symposium on User Interface Software and Technology. ACM, Virtual Event USA, 754–768. https://doi.org/10.1145/
3472749.3474784
[37]
Katherine Elkins and Jon Chun. 2020. Can GPT-3 Pass a Writer’s Turing Test? Journal of Cultural Analytics 5, 2 (Sept. 2020). https://doi.org/10.
22148/001c.17212
[38]
Annette Flanagin, Kirsten Bibbins-Domingo, Michael Berkwits, and Stacy L. Christiansen. 2023. Nonhuman “Authors” and Implications for
the Integrity of Scientic Publication and Medical Knowledge. JAMA 329, 8 (02 2023), 637–639. https://doi.org/10.1001/jama.2023.1344
arXiv:https://jamanetwork.com/journals/jama/articlepdf/2801170/jama_flanagin_2023_ed_230004_1676659370.84192.pdf
[39]
Luciano Floridi and Massimo Chiriatti. 2020. GPT-3: Its Nature, Scope, Limits, and Consequences. Minds and Machines 30, 4 (Dec. 2020), 681–694.
https://doi.org/10.1007/s11023-020- 09548-1
[40]
Linda Flower and John R. Hayes. 1981. A Cognitive Process Theory of Writing. College Composition and Communication 32, 4 (Dec. 1981), 365.
https://doi.org/10.2307/356600
[41]
Andrew Fowler, Kurt Partridge, Ciprian Chelba, Xiaojun Bi, Tom Ouyang, and Shumin Zhai. 2015. Effects of Language Modeling and Its
Personalization on Touchscreen Typing Performance. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems
(Seoul, Republic of Korea) (CHI ’15). Association for Computing Machinery, New York, NY, USA, 649–658. https://doi.org/10.1145/2702123.2702503
[42]
Giorgio Franceschelli and Mirco Musolesi. 2022. Copyright in generative deep learning. Data & Policy 4 (2022), e17. https://doi.org/10.1017/dap.2022.10
[43]
Tiany Derville Gallicano, Kevin Brett, and Toby Hopp. 2013. Is ghost blogging like speechwriting? A survey of practitioners about the ethics of
ghost blogging. Public Relations Journal 7, 3 (2013), 1–41.
[44]
Maliheh Ghajargar, Jeffrey Bardzell, and Love Lagerkvist. 2022. A Redhead Walks into a Bar: Experiences of Writing Fiction with Artificial
Intelligence. In 25th International Academic Mindtrek conference. ACM, Tampere Finland, 230–241. https://doi.org/10.1145/3569219.3569418
[45]
Gene V Glass, Percy D Peckham, and James R Sanders. 1972. Consequences of failure to meet assumptions underlying the fixed effects analyses of
variance and covariance. Review of educational research 42, 3 (1972), 237–288.
[46]
Mitchell Gordon, Tom Ouyang, and Shumin Zhai. 2016. WatchWriter: Tap and Gesture Typing on a Smartwatch Miniature Keyboard with Statistical
Decoding. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). Association for
Computing Machinery, New York, N Y, USA, 3817–3821. https://doi.org/10.1145/2858036.2858242
[47]
Tanya Goyal, Junyi Jessy Li, and Greg Durrett. 2022. News Summarization and Evaluation in the Era of GPT-3. (2022). https://doi.org/10.48550/ARXIV.2209.12356 Publisher: arXiv Version Number: 1.
[48] Andreas Graefe. 2016. Guide to Automated Journalism. (2016). https://doi.org/10.7916/D80G3XDJ Publisher: Columbia University.
[49]
Peter C Gøtzsche, Asbjørn Hróbjartsson, Helle Krogh Johansen, Mette T Haahr, Douglas G Altman, and An-Wen Chan. 2007. Ghost Authorship in
Industry-Initiated Randomised Trials. PLoS Medicine 4, 1 (Jan. 2007), e19. https://doi.org/10.1371/journal.pmed.0040019
[50]
Peter C Gøtzsche, Jerome P Kassirer, Karen L Woolley, Elizabeth Wager, Adam Jacobs, Art Gertel, and Cindy Hamilton. 2009. What Should Be Done
To Tackle Ghostwriting in the Medical Literature? PLoS Medicine 6, 2 (Feb. 2009), e1000023. https://doi.org/10.1371/journal.pmed.1000023
[51]
Michael R Harwell, Elaine N Rubinstein, William S Hayes, and Corley C Olds. 1992. Summarizing Monte Carlo results in methodological research:
The one- and two-factor fixed effects ANOVA cases. Journal of educational statistics 17, 4 (1992), 315–339.
[52]
Marc Hassenzahl and Andrew Monk. 2010. The Inference of Perceived Usability From Beauty. Human-Computer Interaction 25, 3 (July 2010),
235–260. https://doi.org/10.1080/07370024.2010.500139
[53]
John R. Hayes. 2012. Modeling and Remodeling Writing. Written Communication 29, 3 (July 2012), 369–388. https://doi.org/10.1177/0741088312451260 Publisher: SAGE Publications Inc.
D. Jeffery Higginbotham. 1992. Evaluation of keystroke savings across five assistive communication technologies. Augmentative and Alternative Communication 8, 4 (Jan. 1992), 258–272. https://doi.org/10.1080/07434619212331276303
[55]
Joo-Wha Hong, Ignacio Cruz, and Dmitri Williams. 2021. AI, you can drive my car: How we evaluate human drivers vs. self-driving cars. Computers
in Human Behavior 125 (Dec. 2021), 106944. https://doi.org/10.1016/j.chb.2021.106944
[56]
Joo Wha Hong, Qiyao Peng, and Dmitri Williams. 2021. Are you ready for articial Mozart and Skrillex? An experiment testing expectancy
violation theory and AI music. New Media & Society 23, 7 (July 2021), 1920–1935. https://doi.org/10.1177/1461444820925798
[57]
Anna Y.Q. Huang, Owen H.T. Lu, and Stephen J.H. Yang. 2023. Effects of artificial Intelligence–Enabled personalized recommendations on learners' learning engagement, motivation, and outcomes in a flipped classroom. Computers & Education 194 (March 2023), 104684. https://doi.org/10.1016/j.compedu.2022.104684
[58]
Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman,
Monica Dinculescu, and Douglas Eck. 2018. Music Transformer. (2018). https://doi.org/10.48550/ARXIV.1809.04281 Publisher: arXiv Version
Number: 3.
[59]
Jani Ihalainen. 2018. Computer creativity: articial intelligence and copyright. Journal of Intellectual Property Law & Practice 13, 9 (Sept. 2018),
724–728. https://doi.org/10.1093/jiplp/jpy031
[60]
Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, and Mor Naaman. 2023. Co-Writing with Opinionated Language Models Affects
Users’ Views. https://doi.org/10.1145/3544548.3581196 arXiv:2302.00560 [cs].
[61]
Anjuli Kannan, Karol Kurach, Sujith Ravi, Tobias Kaufmann, Andrew Tomkins, Balint Miklos, Greg Corrado, Laszlo Lukacs, Marina Ganea, Peter
Young, and Vivek Ramavajjala. 2016. Smart Reply: Automated Response Suggestion for Email. In Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (San Francisco, California, USA) (KDD ’16). Association for Computing Machinery, New York,
NY, USA, 955–964. https://doi.org/10.1145/2939672.2939801
[62]
Jurgis Karpus, Adrian Krüger, Julia Tovar Verba, Bahador Bahrami, and Ophelia Deroy. 2021. Algorithm exploitation: Humans are keen to exploit
benevolent AI. iScience 24, 6 (June 2021), 102679. https://doi.org/10.1016/j.isci.2021.102679
[63]
Matthew Kay, Lisa A. Elkin, James J. Higgins, and Jacob O. Wobbrock. 2021. mjskay/ARTool: ARTool 0.11.0. https://doi.org/10.5281/ZENODO.594511
[64]
Tom Kent. 2019. An ethical checklist for robot journalism. https://medium.com/@tjrkent/an-ethical-checklist-for-robot-journalism-1f41dcbd7be2
[65]
Dongwhan Kim and Joonhwan Lee. 2019. Designing an Algorithm-Driven Text Generation System for Personalized and Interactive News Reading.
International Journal of Human–Computer Interaction 35, 2 (Jan. 2019), 109–122. https://doi.org/10.1080/10447318.2018.1437864
[66]
Sangmi Kim, Seong-Gyu Kim, Yoonsin Jeon, Soojin Jun, and Jinwoo Kim. 2016. Appropriate or Remix? The Effects of Social Recognition and Psychological Ownership on Intention to Share in Online Communities. Human–Computer Interaction 31, 2 (March 2016), 97–132. https://doi.org/10.1080/07370024.2015.1022425
[67]
Joanna Kisker, Thomas Gruber, and Benjamin Schöne. 2021. Virtual reality experiences promote autobiographical retrieval mechanisms: Electrophysiological correlates of laboratory and virtual experiences. Psychological Research 85, 7 (Oct. 2021), 2485–2501. https://doi.org/10.1007/s00426-020-01417-x
[68]
Thomas Kosch, Robin Welsch, Lewis Chuang, and Albrecht Schmidt. 2022. The Placebo Effect of Artificial Intelligence in Human-Computer
Interaction. ACM Transactions on Computer-Human Interaction (June 2022), 3529225. https://doi.org/10.1145/3529225
[69]
Per Ola Kristensson and Keith Vertanen. 2014. The inviscid text entry rate and its application as a grand goal for mobile text entry. In Proceedings
of the 16th international conference on Human-computer interaction with mobile devices & services (MobileHCI ’14). Association for Computing
Machinery, Toronto, ON, Canada, 335–338. https://doi.org/10.1145/2628363.2628405
[70]
Tiany H. Kung, Morgan Cheatham, ChatGPT, Arielle Medenilla, Czarina Sillos, Lorie De Leon, Camille Elepaño, Maria Madriaga, Rimel Aggabao,
Giezel Diaz-Candido, James Maningo, and Victor Tseng. 2022. Performance of ChatGPT on USMLE: Potential for AI-Assisted Medical Education Using
Large Language Models. preprint. Medical Education. https://doi.org/10.1101/2022.12.19.22283643
[71]
Mina Lee, Percy Liang, and Qian Yang. 2022. CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model
Capabilities. In CHI Conference on Human Factors in Computing Systems. ACM, New Orleans LA USA, 1–19. https://doi.org/10.1145/3491102.3502030
[72]
Florian Lehmann, Niklas Markert, Hai Dang, and Daniel Buschek. 2022. Suggestion Lists vs. Continuous Generation: Interaction Design for Writing
with Generative Models on Mobile Devices Affect Text Length, Wording and Perceived Authorship. In Mensch und Computer 2022. ACM, Darmstadt
Germany, 192–208. https://doi.org/10.1145/3543758.3543947
[73]
Angelica Lermann Henestrosa, Hannah Greving, and Joachim Kimmerle. 2023. Automated journalism: The effects of AI authorship and evaluative information on the perception of a science journalism article. Computers in Human Behavior 138 (Jan. 2023), 107445. https://doi.org/10.1016/j.chb.2022.107445
[74]
Lisa Lines. 2016. Ghostwriters guaranteeing grades? The quality of online ghostwriting services available to tertiary students in Australia. Teaching
in Higher Education 21, 8 (Nov. 2016), 889–914. https://doi.org/10.1080/13562517.2016.1198759
[75]
Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, and Weizhu Chen. 2021. What Makes Good In-Context Examples for
GPT-3? (2021). https://doi.org/10.48550/ARXIV.2101.06804 Publisher: arXiv Version Number: 1.
[76]
Li Lucy and David Bamman. 2021. Gender and Representation Bias in GPT-3 Generated Stories. In Proceedings of the Third Workshop on Narrative
Understanding. Association for Computational Linguistics, Virtual, 48–55. https://doi.org/10.18653/v1/2021.nuse-1.5
[77]
Tal Montal and Zvi Reich. 2017. I, Robot. You, Journalist. Who is the Author?: Authorship, bylines and full disclosure in automated journalism.
Digital Journalism 5, 7 (Aug. 2017), 829–849. https://doi.org/10.1080/21670811.2016.1209083
[78]
Richard D. Morey and Jeffrey N. Rouder. 2022. BayesFactor: Computation of Bayes Factors for Common Designs. https://CRAN.R-project.org/package=BayesFactor R package version 0.9.12-4.4.
[79]
Magne Nylenna, Frode Fagerbakk, and Peter Kierulf. 2014. Authorship: attitudes and practice among Norwegian researchers. BMC Medical Ethics
15, 1 (Dec. 2014), 53. https://doi.org/10.1186/1472-6939-15-53
Siobhan O’Connor and ChatGPT. 2023. Open artificial intelligence platforms in nursing education: Tools for academic progress or abuse? Nurse
Education in Practice 66 (Jan. 2023), 103537. https://doi.org/10.1016/j.nepr.2022.103537
[81]
Kseniia Palin, Anna Maria Feit, Sunjun Kim, Per Ola Kristensson, and Antti Oulasvirta. 2019. How do People Type on Mobile Devices? Observations
from a Study with 37,000 Volunteers. In Proceedings of the 21st International Conference on Human-Computer Interaction with Mobile Devices and
Services (MobileHCI ’19). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3338286.3340120
[82]
Fengjiao Peng, Veronica Crista LaBelle, Emily Christen Yue, and Rosalind W. Picard. 2018. A Trip to the Moon: Personalized Animated Movies
for Self-reection. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, Montreal QC Canada, 1–10. https:
//doi.org/10.1145/3173574.3173827
[83]
Jon L. Pierce, Tatiana Kostova, and Kurt T. Dirks. 2003. The State of Psychological Ownership: Integrating and Extending a Century of Research.
Review of General Psychology 7, 1 (March 2003), 84–107. https://doi.org/10.1037/1089-2680.7.1.84
[84]
Peter Potash, Alexey Romanov, and Anna Rumshisky. 2015. Ghostwriter: Using an LSTM for automatic rap lyric generation. In Proceedings of the
2015 Conference on Empirical Methods in Natural Language Processing. 1919–1924.
[85]
Philip Quinn and Shumin Zhai. 2016. A Cost-Benefit Study of Text Entry Suggestion Interaction. In Proceedings of the 2016 CHI Conference on
Human Factors in Computing Systems (San Jose, California, USA) (CHI ’16). Association for Computing Machinery, New York, N Y, USA, 83–88.
https://doi.org/10.1145/2858036.2858305
[86]
Björn Rasch and Jan Born. 2013. About Sleep’s Role in Memory. Physiological Reviews 93, 2 (April 2013), 681–766. https://doi.org/10.1152/physrev.
00032.2012
[87]
Emily Reif, Daphne Ippolito, Ann Yuan, Andy Coenen, Chris Callison-Burch, and Jason Wei. 2021. A Recipe For Arbitrary Text Style Transfer with
Large Language Models. (2021). https://doi.org/10.48550/ARXIV.2109.03910 Publisher: arXiv Version Number: 4.
[88]
Linda A. Riley and Stuart C. Brown. 1996. Crafting a public image: An empirical study of the ethics of ghostwriting. Journal of Business Ethics 15, 7
(July 1996), 711–720. https://doi.org/10.1007/BF00381736
[89]
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, and Kfir Aberman. 2022. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. (2022). https://doi.org/10.48550/ARXIV.2208.12242 Publisher: arXiv Version Number: 1.
[90] Pamela Samuelson. 2020. AI authorship? Commun. ACM 63, 7 (June 2020), 20–22. https://doi.org/10.1145/3401718
[91]
Benjamin Schöne, Marlene Wessels, and Thomas Gruber. 2019. Experiences in Virtual Reality: a Window to Autobiographical Memory. Current
Psychology 38, 3 (June 2019), 715–719. https://doi.org/10.1007/s12144-017-9648-y
[92]
Ben Shneiderman. 2003. The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. In The Craft of Information Visualization,
Benjamin B. Bederson and Ben Shneiderman (Eds.). Morgan Kaufmann, San Francisco, 364–371. https://doi.org/10.1016/B978-155860915-0/50046-9
[93]
Ben Shneiderman. 2020. Bridging the Gap Between Ethics and Practice: Guidelines for Reliable, Safe, and Trustworthy Human-centered AI Systems.
ACM Transactions on Interactive Intelligent Systems 10, 4 (Dec. 2020), 1–31. https://doi.org/10.1145/3419764
[94]
Ben Shneiderman. 2020. Human-Centered Articial Intelligence: Reliable, Safe & Trustworthy. International Journal of Human–Computer Interaction
36, 6 (April 2020), 495–504. https://doi.org/10.1080/10447318.2020.1741118
[95]
Julia Silge and David Robinson. 2016. tidytext: Text Mining and Analysis Using Tidy Data Principles in R. Journal of Open Source Software 1, 3
(2016), 37. https://doi.org/10.21105/joss.00037
[96]
Nikhil Singh, Guillermo Bernal, Daria Savchenko, and Elena L. Glassman. 2022. Where to Hide a Stolen Elephant: Leaps in Creative Writing with
Multimodal Machine Intelligence. ACM Transactions on Computer-Human Interaction (Feb. 2022), 3511599. https://doi.org/10.1145/3511599
[97]
Bob L. T. Sturm, Maria Iglesias, Oded Ben-Tal, Marius Miron, and Emilia Gómez. 2019. Artificial Intelligence and Music: Open Questions of
Copyright Law and Engineering Praxis. Arts 8, 3 (Sept. 2019), 115. https://doi.org/10.3390/arts8030115
[98]
Reid Swanson and Andrew S. Gordon. 2012. Say Anything: Using Textual Case-Based Reasoning to Enable Open-Domain Interactive Storytelling.
ACM Transactions on Interactive Intelligent Systems 2, 3 (Sept. 2012), 1–35. https://doi.org/10.1145/2362394.2362398
[99]
Adam Tapal, Ela Oren, Reuven Dar, and Baruch Eitam. 2017. The Sense of Agency Scale: A Measure of Consciously Perceived Control over One’s
Mind, Body, and the Immediate Environment. Frontiers in Psychology 8 (Sept. 2017), 1552. https://doi.org/10.3389/fpsyg.2017.01552
[100]
The PLoS Medicine Editors. 2009. Ghostwriting: The Dirty Little Secret of Medical Publishing That Just Got Bigger. PLoS Medicine 6, 9 (Sept. 2009),
e1000156. https://doi.org/10.1371/journal.pmed.1000156
[101]
Alexey Tikhonov and Ivan P Yamshchikov. 2018. Guess who? Multilingual approach for the automated generation of author-stylized poetry. In
2018 IEEE Spoken Language Technology Workshop (SLT). IEEE, Athens, Greece, 787–794. https://doi.org/10.1109/SLT.2018.8639573
[102]
Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, and Dongwon Lee. 2021. TURINGBENCH: A Benchmark Environment for Turing Test in the Age
of Neural Text Generation. (2021). https://doi.org/10.48550/ARXIV.2109.13296 Publisher: arXiv Version Number: 1.
[103]
Kristen Vaccaro, Dylan Huang, Motahhare Eslami, Christian Sandvig, Kevin Hamilton, and Karrie Karahalios. 2018. The Illusion of Control: Placebo
Effects of Control Settings. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, Montreal QC Canada, 1–13.
https://doi.org/10.1145/3173574.3173590
[104]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is
All you Need. In Advances in Neural Information Processing Systems, I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan,
and R. Garnett (Eds.), Vol. 30. Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
[105]
Keith Vertanen, Haythem Memmi, Justin Emge, Shyam Reyal, and Per Ola Kristensson. 2015. VelociTap: Investigating Fast Mobile Text Entry using
Sentence-Based Decoding of Touchscreen Keyboard Input. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing
Systems (CHI ’15). Association for Computing Machinery, Seoul, Republic of Korea, 659–668. https://doi.org/10.1145/2702123.2702135
[106]
Ziwen Wang, Jie Wang, Haiqian Gu, Fei Su, and Bojin Zhuang. 2018. Automatic Conditional Generation of Personalized Social Media Short Texts.
In PRICAI 2018: Trends in Articial Intelligence, Xin Geng and Byeong-Ho Kang (Eds.). Vol. 11013. Springer International Publishing, Cham, 56–63.
https://doi.org/10.1007/978-3-319-97310-4_7 Series Title: Lecture Notes in Computer Science.
[107]
J. S. Wislar, A. Flanagin, P. B. Fontanarosa, and C. D. DeAngelis. 2011. Honorary and ghost authorship in high impact biomedical journals: a cross
sectional survey. BMJ 343, oct25 1 (Oct. 2011), d6128–d6128. https://doi.org/10.1136/bmj.d6128
[108]
Paweł W. Woźniak, Jakob Karolus, Florian Lang, Caroline Eckerth, Johannes Schöning, Yvonne Rogers, and Jasmin Niess. 2021. Creepy Technology: What Is It and How Do You Measure It?. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. ACM, Yokohama
Japan, 1–13. https://doi.org/10.1145/3411764.3445299
[109]
Beta Writer. 2019. Lithium-Ion Batteries: A Machine-Generated Summary of Current Research. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-030-16800-1
[110] Diyi Yang and Lucie Flek. 2021. Towards User-Centric Text-to-Text Generation: A Survey. In Text, Speech, and Dialogue, Kamil Ekštein, František Pártl, and Miloslav Konopík (Eds.). Vol. 12848. Springer International Publishing, Cham, 3–22. https://doi.org/10.1007/978-3-030-83527-9_1 Series Title: Lecture Notes in Computer Science.
[111] Ann Yuan, Andy Coenen, Emily Reif, and Daphne Ippolito. 2022. Wordcraft: Story Writing With Large Language Models. In 27th International Conference on Intelligent User Interfaces (IUI '22). Association for Computing Machinery, New York, NY, USA, 841–852. https://doi.org/10.1145/3490099.3511105
A STUDY 1: FULL LIST OF MEASURES
Table 4. Measures applied in Study 1. Unless otherwise mentioned, we used 7-point Likert scales (1 = strongly disagree, 7 = strongly agree). When necessary, questions were slightly modified to be applicable in the context of text generation (e.g., "used" instead of "worn" for the creepiness scale).
Part 2a: Before the system interaction
Prior experience
Please list all previous experiences you have in writing with generated text (None - Writing with word or sentence suggestions - Writing with auto-completion - Writing with auto-correction - Using the smart reply feature - Other (+ text field))
Are texts sometimes written on your behalf by other people, for example by an assistant at work who prepares
texts, presentations, or speeches for you? (Yes / No)
Mood
How is your mood today? (Rating: 1 = sad smiley to 5 = happy smiley)
Part 2b: Repeated for each interaction method (IM)
Sense of Ownership
Questions on perceived authorship and ownership derived from the authorship criteria defined by the International Committee of Medical Journal Editors (https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html) and Nylenna et al. [79]:
I am the main contributor to the content of this postcard.
I have made substantial contributions to the content of the text.
I drafted this postcard.
I revised this postcard critically.
I have given nal approval of this text being uploaded.
I am accountable for all aspects of the text.
I am responsible for at least part of the text.
My name should appear underneath this postcard.
I feel like I am the author of the text. [72]
To whom should this text belong? (Slider: Me - AI)
Sense of Leadership
I felt like I was writing the text and the artificial intelligence was assisting me. [72]
I felt like the artificial intelligence was writing the text and I was assisting. [72]
Who took the lead in writing during this interaction method? (Slider: Me - AI)
Sense of Control
Who was in control of the content? (Slider: Me - AI)
Sense of Agency Scale [99]
I was just an instrument in the hands of the AI system.
The resulting text just happened without my intention.
I was the author of my actions while interacting with the system.
The consequences of my actions felt like they didn't logically follow my actions while interacting with the system.
The outcomes of my actions generally surprised me.
While I am interacting with the system, I feel like I am a remote-controlled robot.
I am completely responsible for the resulting text.
It felt like I was in control of the text during the task. [12]
I felt like the AI system was acting as a ghostwriter, writing the postcard on my behalf.
I felt like the AI system was acting as a tool which I could control.
Text match
I would actually send this postcard to my friends and family.
I would have written a similar postcard by myself.
The postcard mostly contains words and/or phrases that I usually use when writing in English. [72]
I am satised with the text. [72]
User Experience
AttrakDi [52]
How much did you like this postcard? (Rating: 1-5 stars)
How hard was it to understand and use this method? (Very difficult - very easy)
I had fun while interacting with the system.
Open questions on the interaction method
How could this interaction method be improved?
What are the advantages of this method with respect to the writing interaction?
What are the disadvantages of this method with respect to the writing interaction?
Specic questions for individual interaction methods
Choosing: Would you have liked to have more than three texts to choose from?
Choosing: What would be a good number of texts to choose from?
Choosing: Why did you choose your selected postcard and not one of the two other postcards?
Choosing: Which strategies did you apply when selecting one of the postcards?
Editing: Did you want to change more words than was possible in this task?
Editing: Number of words changed
Editing: Change logs (e.g., added, deleted, or changed words or punctuation)
Part 2c: Overall assessment after all IMs
Mood
The participants’ current mood [91]
Creepiness
The Perceived Creepiness of Technology Scale (PCTS) for the system in general [108]
Part 3: Two days after the interaction
Memory
Sleep and mood to identify possible impacts on the participants’ memory [86,91]
Text remembrance based on [67]
What do you recognize this text as? (Vividly remembered - Familiar - Rather unknown - Definitely unknown)
How was this text created? (Text was written by myself - Text was generated by AI and edited by me - Text was generated by AI and chosen by me - Text was generated by AI - I don't know)
Retrospective assessment of the postcard quality
How much did you like this postcard? (Rating: 1-5 stars)
The role of AI in writing
Open questions on the interaction methods and the role of AI technology in general
Advantages of automatic text generation by an AI
Disadvantages/risks of automatic text generation by an AI
Use such technologies in the future
Use cases for AI to generate text
Should it be mandatory to mark texts that were created with the help of AI as such? (Yes - No - Other) + open text field for reasons
Feedback
B STUDY 2: FULL LIST OF MEASURES
Table 5. Measures applied in Study 2. Unless otherwise mentioned, we used 7-point Likert scales (1 = strongly disagree, 7 = strongly agree).
Part 1
Prior experience
Please list all previous experiences you have in writing with generated text (None - Writing with word or sentence suggestions - Writing with auto-completion - Writing with auto-correction - Using the smart reply feature - Other (+ text field))
Are texts sometimes written on your behalf by other people, for example by an assistant at work who prepares
texts, presentations, or speeches for you? (Yes / No)
Attitudes towards AI
Perceived relevance and usefulness of AI [55]
AI is a positive force in the world.
AI research should be funded more.
AI is generally helpful.
There is a need to use AI.
Perceived AI competence [55]. How would you rate your confidence in the following:
Explaining what articial intelligence is.
Having a conversation about articial intelligence.
My knowledge about articial intelligence.
Creativity of AI as assessed in [56]
I think AI can be creative on its own.
I believe AI can make something new by itself.
Products developed by AI can be considered as creative works.
Part 2a: One iteration each for the human- and AI-generated text (counterbalanced)
Sense of Ownership
As in Study 1
Sense of Leadership
As in Study 1
Sense of Control
As in Study 1
Text match
Questions from Study 1
I felt like the text was written for me personally.
Part 2b: After both texts
Attitudes towards AI
Questions on relevance, usefulness, and creativity of AI and participants' AI competence, as in Part 1
Godspeed questionnaires on anthropomorphism and intelligence [10], as used in [73]
Open questions on authorship
How did you decide what author attribution should be added to the postcard written by <AI/Sasha>?
In your opinion, should it be mandatory to mark texts that were created with the help of an <AI/human> as such?
Do you think that other people expect you to attribute <AI/Sasha> in your text? Why?
Preference
Which postcard did you like better? (The one written by the AI - The one written by Sasha - I'm indifferent)
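For illustration only, the following minimal Python sketch shows one plausible way to aggregate responses to the 7-point Likert items above into per-construct composite scores, flipping reverse-keyed items first. All column names, the example data, and the choice of mean aggregation are assumptions made for this sketch; they are not taken from the paper's analysis.

import pandas as pd

# Illustrative sketch with hypothetical column names: aggregating
# 7-point Likert items into per-construct composite scores.

# Example responses: one row per participant, items coded 1-7.
responses = pd.DataFrame({
    "own_main_contributor": [2, 5, 3],  # "I am the main contributor ..."
    "own_substantial":      [1, 6, 2],  # "I have made substantial contributions ..."
    "own_drafted":          [1, 4, 2],  # "I drafted this postcard."
    "soa_instrument":       [6, 2, 5],  # "I was just an instrument ..." (reverse-keyed)
})

# On a 1-7 scale, a reverse-keyed item is flipped as 8 - raw score,
# so that higher values consistently indicate more perceived
# agency/ownership.
reverse_keyed = ["soa_instrument"]
responses[reverse_keyed] = 8 - responses[reverse_keyed]

# Composite per construct: mean of that construct's items per participant.
ownership_items = ["own_main_contributor", "own_substantial", "own_drafted"]
responses["sense_of_ownership"] = responses[ownership_items].mean(axis=1)

print(responses)

Whether to average items or analyze them individually (e.g., in an ordinal model) is an analysis decision; the sketch only demonstrates the mechanics of reverse-keying and aggregation.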