Table 1 - uploaded by Kenneth Holmqvist
Mean Frequencies and Percentages of Pauses Occurring at Different Locations in the Text, by Length of Pause Boundary at Which Pause Occurred 

Source publication
Article
Full-text available
Writers typically spend a certain proportion of time looking back over the text that they have written. This is likely to serve a number of different functions, which are currently poorly understood. In this article, we present two systems, ScriptLog+ TimeLine and EyeWrite, that adopt different and complementary approaches to exploring this activit...

Context in source publication

Context 1
... for keystrokes also report the nature of the in this way, was much more common at sentence boundaries and paragraph boundaries than between characters and words (mean percentages of boundary locations associated with pausing were, for character, word, sentence, and paragraph boundaries, 2%, 6%, 31%, and 45%, respectively). Also consistent with previous findings (Chanquoy, Foulin, & Fayol, 1996; Spelman Miller, 2000), and as Table 1 indicates, pause lengths tended to be greater for pauses occurring at sentence and paragraph boundaries. Figure 5 shows fixations occurring during pauses at different kinds of text location, by distance in number of words from pause location. ...
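The percentages quoted above are rates of pausing per boundary type. A minimal Python sketch of that kind of calculation follows. It is not the procedure used in the source publication: the 2000 ms pause threshold, the boundary rules in `classify_boundary`, and the `(char, time_ms)` log format are all assumptions made for illustration, and deletions are ignored.

```python
from collections import defaultdict

# Assumed threshold: an inter-key interval at or above this counts as a pause.
PAUSE_THRESHOLD_MS = 2000


def classify_boundary(text_so_far):
    """Label the location at which the next key will be typed (hypothetical rules)."""
    if text_so_far.endswith("\n"):
        return "paragraph"
    if text_so_far.endswith(" "):
        stripped = text_so_far.rstrip(" ")
        if stripped and stripped[-1] in ".!?":
            return "sentence"
        return "word"
    return "character"


def pause_rates(events, threshold_ms=PAUSE_THRESHOLD_MS):
    """Return, per boundary type, the percentage of locations followed by a pause.

    `events` is a time-ordered list of (char, time_ms) insertions.
    """
    locations = defaultdict(int)
    pauses = defaultdict(int)
    text = ""
    for i, (char, time_ms) in enumerate(events):
        if i > 0:
            boundary = classify_boundary(text)
            locations[boundary] += 1
            if time_ms - events[i - 1][1] >= threshold_ms:
                pauses[boundary] += 1
        text += char
    return {b: 100.0 * pauses[b] / locations[b] for b in locations}


# Toy log: one long interval after "Hi. " and one inside the last word.
sample_log = [("H", 0), ("i", 100), (".", 200), (" ", 300), ("O", 3000),
              ("k", 3100), (" ", 3200), ("n", 3300), ("o", 6000)]
sample_rates = pause_rates(sample_log)
```

In a real log the boundary rules would also need to handle revisions, cursor movements, and punctuation typed after a pause, which this sketch leaves out.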

Similar publications

Chapter
Full-text available
Although writing pauses can be considered as main location of high-level processes, the latter can also occur in parallel with graphomotor execution. When a writer composes a text from source documents, the combined analysis of eye and pen movements makes it possible to identify some of these parallel processes and infer their nature. The present s...
Article
Full-text available
This study aimed to develop a reading profile of children from the third to seventh grade levels of elementary school. Fifty five children, between seven and 14 years of age, participated in the study. Four texts were previously developed by the researcher for reading evaluation - one composed of short words, another with long words, a third syntac...
Article
Full-text available
The thesis of this article is that Bereiter and Scardamalia's (1987) knowledge-telling strategy may be viewed as a family of strategies. In particular, when young writers compose expository themes from their own knowledge, they may use one of three writing strategies: a flexible-focus strategy, a fixed-topic strategy, or a topic-elaboration strateg...
Article
Full-text available
Face-to-face communication has several sources of contextual information that enables language comprehension. This information is used, for instance, to perceive mood of interlocutors, clarifying ambiguous messages. However, these contextual cues are absent in text-based communication. Emoticons have been proposed as cues used to stress the emotion...
Conference Paper
Full-text available
We describe the Annodis corpus of discourse structures for French. The corpus joins two perspectives on discourse on a variety of textual genres: a bottom-up approach and a top-down approach. The bottom-up view builds incrementally a structure from elementary discourse units, while the top-down view focuses on the selective annotation of multi-leve...

Citations

... Eye tracking was introduced as a method in cognitive writing research in the early 2000s, when different research groups combined eye tracking with keystroke logging (ScriptLog; Andersson et al., 2006. See also Wengelin et al., 2009 for an overview of the methodological challenges) and handwriting (Eye & Pen; Alamargot et al., 2006). Later solutions include EyeWrite (Simpson & Torrance, 2007), and combinations with Inputlog and eye tracking (Leijten & Van Waes, 2013). ...
... Some assumptions can further be made regarding gaze behavior and linguistic processing during writing: we can assume that when a familiar word is fixated, the reader has access to the word's morphology, syntax, and associated semantic information. If the reader fixates another word, the information of this word is instead available (cf. Wengelin et al., 2009). However, one important difference between reading in general and reading concurrently with one's own writing is that writers will know their own texts and that the reading therefore rarely occurs due to information acquisition. ...
... When eye tracking was first introduced in writing research, one challenge was to accurately align and synchronize the time feed from the gaze behavior with the time feed from text production. Some of the first attempts to combine writing and eye tracking solved this issue by developing integrated systems which allowed for automatic display and analyses: Eye & Pen, for handwriting (Alamargot et al., 2006); ScriptLog and EyeWrite, for keystroke logging (Wengelin et al., 2009; Simpson & Torrance, 2007); and recent examples including a web editor solution with CyWrite (Chukharev-Hudilainen et al., 2019). These attempts have proven useful in several studies (see below). ...
Chapter
This volume brings together the perspectives of new and established scholars who have connected with the broad fields of first language (L1) and second language (L2) writing to discuss critically key methodological developments and challenges in the study of L2 writing processes. The focus is on studies of composing and of engagement with feedback on written drafts, with particular attention to methods of process-tracing through data such as concurrent or stimulated verbal reports, interviews, diaries, digital recording, visual screen capture, eye tracking, keystroke logging, questionnaires, and/or ethnographic observation. The chapters in the book illustrate how progress has been made in developing research methods and empirical understandings of writing processes, in introducing methodological innovations, and in pointing to future methodological directions. It will be an essential methodological guide for novice and experienced researchers, senior students, and educators investigating the processes of writing in additional languages.
... While direct observation of the writing processes can provide information about what writers do (Wengelin et al., 2009), understanding what writers intend to do, their objectives, and their perceptions of the difficulties encountered during the process is not straightforward with direct observation (Baaijen & Galbraith, 2018; De Smet et al., 2014; Leijten & Van Waes, 2013), especially when writing complex texts (Sala-Bubaré et al., 2021). Understanding writers' goals, intentions, and reasons is crucial as writing is generally understood as a problem-solving and goal-directed activity (Bereiter & Scardamalia, 1987; Flower & Hayes, 1981) in historically and socially situated scenarios (Bazerman & Prior, 2003; Castelló, 2023; Castelló & Sala-Bubaré, 2023). ...
Article
Writing is a critical skill in many academic and professional contexts, and multilingual writers often struggle to learn and master it. Understanding the processes and products involved in writing in these contexts is crucial to design better interventions and resources to help writers succeed in their writing endeavors. Yet, writing studies exploring the writing processes in authentic communicative situations are still scarce, partly due to the complexity of natural writing processes. In the article, we present a pedagogically and methodologically innovative task to explore multilingual writers’ processes and products when writing authentic texts. The task combines a range of unintrusive instruments that allow us to observe the writing processes (keystroke logging and screen recorder), collect writers’ perceptions and goals (writing logs, survey, and discussion) and assess their text's evolution, an extended research article abstract. The analysis integrates all data sources into Episodes to understand how and why writing processes and texts evolve. In the article, we describe the task in detail and discuss the main pedagogical and methodological benefits, as well as the challenges and future lines for writing research and teaching.
... However, there often exists a disconnect between keystroke-level logs and the useful insight on cognitive processes that can be derived from them, as the data is too fine-grained. Complementary techniques such as eye-tracking and thinking-aloud protocols are often used in combination to capture additional context on the writing [22] [40]. In addition, newer graphic and statistical data analysis techniques offer new perspectives on the writing process. ...
Conference Paper
Full-text available
With the recent release of Chat-GPT by OpenAI, the automated text generation capabilities of GPT-3 are seen as transformative and potentially systemically disruptive for higher education. While the impact on teaching and learning practices is still unknown, it is apparent that alongside risks these tools offer the potential to augment human intelligence (intelligence augmentation, or IA). However, strategies for such IA, involving partnership of tool-human, will be needed to support learning. In the context of writing, an investigation of potential approaches is needed given empirical data and studies are currently limited. We introduce a novel visual representation CoAuthorViz to examine keystroke logs from a writing assistant where writers interacted with GPT-3 writing suggestions to co-write with the machine. We demonstrate the use of our visualization by exemplifying different kinds of writing behaviour from users writing with GPT-3 support and derive metrics such as their usage of GPT-3 suggestions in relation to overall writing quality indicators. We also release the materials open source to further progress our understanding of desirable user behaviour when working with such state-of-the-art AI tools.
... However, similar to programmers, writers objected to always producing complete, well-formed sentences, as this was not compatible with their writing habits. It also does not reflect the writing process as observed in various studies: authors often start revising a sentence before a complete first version of this sentence is finished [see 34,41,56]. Dale [12] predicted in 1997: ...
Preprint
Full-text available
Research on writing tools started with the increased availability of computers in the 1970s. After a first phase addressing the needs of programmers and data scientists, research in the late 1980s started to focus on writing-specific needs. Several projects aimed at supporting writers and letting them concentrate on the creative aspects of writing by having the writing tool take care of the mundane aspects using NLP techniques. Due to technical limitations at that time the projects failed and research in this area stopped. However, today's computing power and NLP resources make the ideas from these projects technically feasible; in fact, we see projects explicitly continuing from where abandoned projects stopped, and we see new applications integrating NLP resources without making references to those old projects. To design intelligent writing assistants with the possibilities offered by today's technology, we should re-examine the goals and lessons learned from previous projects to define the important dimensions to be considered.
... Lastly, keystroke logging records each character typed (or removed) by a writer to investigate writing in real time. This approach has the advantage of being unobtrusive on its own, but is often combined with other methods and measures, including protocol analysis [10], eye-tracking [11], fMRI, galvanic skin response (GSR) or EEG (see [7] for an overview). However, keystroke logging can be too fine-grained a measure if used in isolation: "it is often difficult to connect the fine grain of logging data to the underlying cognitive processes" [12] (p.358). ...
Article
Full-text available
Writing is a complex process at the center of much of modern human activity. Despite appearing to be a linear process, writing conceals many highly non-linear processes. Previous research has focused on three phases of writing: planning, translation and transcription, and revision. While research has shown these are non-linear, they are often treated linearly when measured. Here, we introduce measures to detect and quantify subcycles of planning (exploration) and translation (exploitation) during the writing process. We apply these to a novel dataset that recorded the creation of a text in all its phases, from early attempts to the finishing touches on a final version. This dataset comes from a series of writing workshops in which, through innovative versioning software, we were able to record all the steps in the construction of a text. 61 junior researchers in science wrote a scientific essay intended for a general readership. We recorded each essay as a writing cloud, defined as a complex topological structure capturing the history of the essay itself. Through this unique dataset of writing clouds, we expose a representation of the writing process that quantifies its complexity and the writer's efforts throughout the draft and through time. Interestingly, this representation highlights the phases of "translation flow", where authors improve existing ideas, and exploration, where creative deviations appear as the writer returns to the planning phase. These turning points between translation and exploration become rarer as the writing process progresses and the author approaches the final version. Our results and the new measures introduced have the potential to foster the discussion about the non-linear nature of writing and support the development of tools that can lead to more creative and impactful writing processes.
... One of the reasons for this is that, in contrast to the static texts that are used in traditional reading research (for overviews, see Engbert et al., 2002; Rayner, 1998), the texts that emerge over time in writing are dynamic and typically comprise information that is constantly growing or changing, more or less unpredictably, as the writing task proceeds (cf. Alamargot et al., 2011; Wengelin et al., 2009). Thus, it is challenging for writing researchers to create one-to-one relations between the writer's gaze direction and relevant information in the emerging text (e.g., a word) and this is particularly true when the text exceeds one screen and requires scrolling (e.g., de Smet et al., 2018; Torrance et al., 2016a; Wengelin et al., 2009). Therefore, the established method in traditional reading research, where gaze behaviour is analysed in respect to fixed areas of interest on a static text image presented on the computer screen, is not optimal for corresponding purposes in writing research. ...
... To solve this, researchers have developed and gradually refined software that combines eye tracking with keystroke logging or handwriting capture to allow unobtrusive collection of detailed real-time temporal data of writers' eye movement that enables them to link writers' visual behaviour to cognitive processes. Examples of such software include ScriptLog + TimeLine (Andersson et al., 2006; Wengelin et al., 2009), EyeWrite (Simpson & Torrance, 2007), Eye and Pen (Alamargot et al., 2006), Inputlog's merging function (Leijten & Van Waes, 2013), New ScriptLog (Wengelin et al., 2019), and CyWrite (Chukharev-Hudilainen et al., 2019). Synchronizing keystroke logging with eye tracking, or handwriting capture, enables researchers to observe how writers fixate and move their eyes between elements of their own emerging texts, and through that gain understanding of how gaze behaviour and typing interact during composition. ...
Article
Full-text available
Knowledge about writers’ eye movements and their effects on the writing process, and its product—the finally edited text—is still limited. Previous research has demonstrated that there are differences between reading texts written by someone else and reading one’s own emerging text and that writers frequently look back into their own texts (Torrance et al., 2016). For handwriting, Alamargot et al. (2007) found support that these lookbacks could occur in parallel with transcription, but to our knowledge this type of parallel processing has not been explored further, and definitely not in the context of computer writing. Considering that language production models are moving away from previous sequential or serial models (e.g., Levelt, 1989) towards models in which linguistic processes can operate in parallel (Olive, 2014), this is slightly surprising. In the present paper, we introduce a methodological approach to examine writers’ parallel processing in which we take our point of departure in visual attention rather than in the keystrokes. Capitalizing on New ScriptLog’s feature to link gaze with typing across different functional units in the writing task, we introduce and describe a method to capture and examine sequences of typing during fixations, outline how these can be examined in relation to each other, and test our approach by exploring typing during fixations in a text composition task with 14 competent adult writers.
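The idea of capturing sequences of typing during fixations can be illustrated with a short Python sketch. This is not New ScriptLog's implementation. It assumes that both data streams already share one clock, that fixations are given as `(start_ms, end_ms)` intervals, and that keystrokes are a time-sorted list of `(time_ms, char)` pairs. The function name and data layout are hypothetical.

```python
from bisect import bisect_left


def keystrokes_during_fixations(fixations, keystrokes):
    """For each fixation, return the characters typed while it lasted.

    `fixations`  : list of (start_ms, end_ms), treated as half-open intervals.
    `keystrokes` : list of (time_ms, char), sorted by time (assumed).
    """
    times = [time_ms for time_ms, _ in keystrokes]
    typed = []
    for start_ms, end_ms in fixations:
        first = bisect_left(times, start_ms)  # first key at or after fixation onset
        last = bisect_left(times, end_ms)     # first key at or after fixation offset
        typed.append("".join(char for _, char in keystrokes[first:last]))
    return typed


# Toy data: the second fixation contains no typing at all.
sample_fixations = [(0, 300), (350, 500), (600, 900)]
sample_keystrokes = [(50, "t"), (120, "h"), (280, "e"), (320, " "),
                     (610, "c"), (700, "a"), (900, "t")]
sample_typed = keystrokes_during_fixations(sample_fixations, sample_keystrokes)
```

Binary search keeps the lookup cheap for long sessions. A keystroke that falls between two fixations, for example during a saccade, is deliberately assigned to neither.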
... Lastly, keystroke logging records each character typed (or removed) by a writer to investigate writing in real time. This approach has the advantage of being unobtrusive on its own, but is often combined with other methods and measures, including protocol analysis [39], eye-tracking [40], fMRI, galvanic skin response (GSR) or EEG (see [27] for an overview). However, keystroke logging can be too fine-grained a measure if used in isolation: "it is often difficult to connect the fine grain of logging data to the underlying cognitive processes" [44] (p.358). ...
Preprint
Full-text available
Writing is a complex process at the center of much of modern human activity. Despite appearing to be a linear process, writing conceals many highly non-linear processes. Previous research has focused on three phases of writing: planning, translation and transcription, and revision. While research has shown these are non-linear, they are often treated linearly when measured. Here, we introduce measures to detect and quantify subcycles of planning (exploration) and translation (exploitation) during the writing process. We apply these to a novel dataset that recorded the creation of a text in all its phases, from early attempts to the finishing touches on a final version. This dataset comes from a series of writing workshops in which, through innovative versioning software, we were able to record all the steps in the construction of a text. More than 60 junior researchers in science wrote a scientific essay intended for a general readership. We recorded each essay as a writing cloud, defined as a complex topological structure capturing the history of the essay itself. Through this unique dataset of writing clouds, we expose a representation of the writing process that quantifies its complexity and the writer's efforts throughout the draft and through time. Interestingly, this representation highlights the phases of "translation flow", where authors improve existing ideas, and exploration, where creative deviations appear as the writer returns to the planning phase. These turning points between translation and exploration become rarer as the writing process progresses and the author approaches the final version. Our results and the new measures introduced have the potential to foster the discussion about the non-linear nature of writing and support the development of tools that can support more creative and impactful writing processes.
... For example, findings from keystroke analyses suggest that text production (i.e., writing) often has bursts with more keyboard presses and pauses where nothing is written/pressed. The alternation between these two phases and the lengths of them can be informative of mental activity; some pauses may indicate thinking about what to say next, whereas others may signal evaluation or even disengagement (Bixler & D'Mello, 2013;Wengelin et al., 2009). ...
Article
Full-text available
Task-unrelated thought (TUT), commonly referred to as mind wandering, is a mental state where a person’s attention moves away from the task-at-hand. This state is extremely common, yet not much is known about how to measure it, especially during dyadic interactions. We thus built a model to detect when a person experiences TUTs while talking to another person through a computer-mediated conversation, using their keystroke patterns. The best model was able to differentiate between task-unrelated thoughts and task-related thoughts with a kappa of 0.363, using features extracted from a 15 second window. We also present a feature analysis to provide additional insights into how various typing behaviors can be linked to our ongoing mental states.
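The kappa reported above is a chance-corrected agreement score. For readers unfamiliar with it, the sketch below computes Cohen's kappa as (observed agreement minus chance agreement) divided by (1 minus chance agreement). The label lists are invented toy data and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Proportion of items on which the two label sequences agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each sequence's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Toy example: self-reports versus model predictions for ten windows.
reported = ["TUT"] * 4 + ["task"] * 6
predicted = ["TUT", "TUT", "task", "task", "TUT"] + ["task"] * 5
toy_kappa = cohens_kappa(reported, predicted)
```

Here the two sequences agree on 7 of 10 windows, but 54% agreement would be expected by chance, so kappa is well below the raw agreement rate.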
... However, pauses exist only temporarily during the writing process, which makes them difficult to study. To capture the writing process and make pauses tangible, keystroke logging is used (Sullivan & Lindgren, 2006;Wengelin, 2006;Wengelin et al., 2009). Keystroke logging is a technique to register the entire writing process by recording each keystroke and the time that elapses before, after and during its use (Leijten & Van Waes, 2013). ...
Article
Researchers often decide on the number of trials included in an experiment without adhering to an empirical method or framework. This might compromise generalizability and unnecessarily increase participant burden. In this article we want to put forward generalizability theory as a guide for task reduction. We will use a sentence production task to demonstrate how a generalizability and a decision study can help researchers to estimate the minimum number of trials and of items per trial that are necessary to generalize over trials. We obtained writing process data for 116 participants. Each of them completed a sentence production task that had 40 trials. Pause times between and within all words, target nouns and target verbs were logged with the keystroke logging tool ScriptLog. Results demonstrate that generalizability theory can serve as an empirical framework to ensure generalizable measurements on the one hand, and reduce participant burden to a minimum on the other. This finding is particularly valuable for studies with vulnerable target groups, such as participants suffering from aphasia, dyslexia or Alzheimer's disease.
... Chanquoy et al., 1990; Matsuhashi, 1981; Schilperoord, 2001). Some of the relatively well established findings from such research include: (i) a relationship between fluency and writing quality (Alves & Limpo, 2015; Alves et al., 2008; Chenoweth & Hayes, 2001; Connelly et al., 2006; Medimorec & Risko, 2016; Medimorec et al., 2017; Olive et al., 2009); (ii) a relationship between the frequency and duration of pauses and units of the text-typically pauses are more frequent and longer at more global text boundaries, increasing as one moves up from within-word boundaries, through boundaries between words and sentences to paragraph boundaries (Alamargot et al., 2007; Baaijen et al., 2012; Medimorec & Risko, 2017; Spelman Miller, 2000; Wengelin et al., 2009); and (iii) a relationship between the complexity of the writing task and the frequency and duration of pauses (Beauvais et al., 2011; Medimorec & Risko, 2017; Van Hell et al., 2008). An important qualification here is that these relationships-particularly those between fluency and writing quality-may be strongly moderated by the age and experience of the writers. ...
Article
Full-text available
This paper argues that traditional threshold-based approaches to the analysis of pauses in writing fail to capture the complexity of the cognitive processes involved in text production. It proposes that, to capture these processes, pause analysis should focus on the transition times between linearly produced units of text. Following a review of some of the problematic features of traditional pause analysis, the paper is divided into two sections. These are designed to demonstrate: (i) how to isolate relevant transitions within a text and calculate their durations; and (ii) the use of mixture modelling to identify structure within the distributions of pauses at different locations. The paper uses a set of keystroke logs collected from 32 university students writing argumentative texts about current affairs topics to demonstrate these methods. In the first section, it defines how pauses are calculated using a reproducible framework, explains the distinction between linear and non-linear text transitions, and explains how relevant sections of text are identified. It provides Excel scripts for automatically identifying relevant pauses and calculating their duration. The second section applies mixture modelling to linear transitions at sentence, sub sentence, between-word and within-word boundaries for each participant. It concludes that these transitions cannot be characterised by a single distribution of “cognitive” pauses. It proposes, further, that transitions between words should be characterised by a three-component distribution reflecting lexical, supra-lexical and reflective processes, while transitions at other text locations can be modelled by two-component distributions distinguishing between fluent and less fluent or more reflective processing. 
The paper concludes by recommending that, rather than imposing fixed thresholds to distinguish processes, researchers should instead impose a common set of theoretically informed distributions on the data and estimate how the parameters of these distributions vary for different individuals and under different conditions.
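To make the contrast between a fixed threshold and a fitted distribution concrete, here is a minimal Python sketch of a two-component mixture fitted by expectation-maximisation. It is not the paper's procedure. The choice of Gaussian components on log-transformed durations, the initialisation, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, iterations=200):
    """Fit a two-component Gaussian mixture to 1-D data with plain EM.

    Returns (weights, means, sds), each a list of two floats.
    """
    xs = sorted(values)
    n, half = len(xs), len(xs) // 2
    # Assumed initialisation: means of the lower and upper halves, shared spread.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    spread = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n) or 1.0
    sds, weights = [spread, spread], [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            if mass < 1e-12:
                continue
            weights[k] = mass / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / mass
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / mass
            sds[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsing component
    return weights, means, sds


# Toy transition times in milliseconds: a fluent group and a slower group.
toy_intervals_ms = [120, 130, 140, 150, 160, 170, 180, 190,
                    1500, 1800, 2000, 2400, 3000]
toy_fit = fit_two_component_mixture([math.log(v) for v in toy_intervals_ms])
```

Instead of labelling every interval above a cut-off as a pause, the fit returns the share, location, and spread of each component. Those are the kinds of parameters that could then be compared across writers or conditions.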