Online Writing Data Representation: A Graph Theory Approach

Gilles Caporossi1 and Christophe Leblay2
1 GERAD and HEC Montréal, Montréal, Canada
2 ITEM, Paris, France and SOLKI, Jyväskylä, Finland
gilles.caporossi@gerad.ca, Christophe.Leblay@kolumbus.fi
Abstract. There are currently several keystroke logging systems for
collecting online writing data. Each of these systems provides reliable
and very precise data. Unfortunately, due to the large amount of data
recorded, the data are almost impossible to analyze except for very
limited recordings. In this paper, we propose a representation technique
based upon graph theory that provides a new viewpoint for understanding
the writing process. The current application is aimed at representing the
data provided by ScriptLog, although the concepts can be applied in other
contexts.
1 Introduction
Recent approaches to the study of writing based upon online records contrast
with those based upon paper versions. The latter are oriented toward the page
space, whereas the former emphasize the temporal dimension.
Models based upon online records were developed in the 1980s, with the
pioneering work of Matsuhashi [13]. Returning to a bipolar division, Matsuhashi
suggests distinguishing between the conceptual level (semantics, grammar and
spelling) and the sequential plan (planning and phrasing).
Most of the work that followed is tied to recording software. Thus, the work
of Ahlsén & Strömqvist [2] and Wengelin [19] is directly related to the
ScriptLog application, that of Sullivan & Lindgren [16] to JEdit, that of Van
Waes & Schellens [17] and Van Waes & Leijten [18] to Inputlog, that of
Jakobsen [9] to Translog, and finally that of Chesnet & Alamargot [6] to Eye
and Pen.
Without dismissing work on revision seen as a product (the final text), these
new approaches all point out that writing is primarily a temporal activity. The
multitude of software tools developed for online recording of the writing
activity shows a clear interest in the study of the writing process. Whether
from the standpoint of cognitive psychology or of didactics, analyzing the
writing activity as a process is very important for researchers. What all these
studies share is a common concern: the record of writing is tied to its
representation. Because the raw data collected by the software are so detailed,
they are difficult to analyze without preprocessing and without a proper
representation that the researcher can use conveniently.
J. Gama, E. Bradley, and J. Hollmén (Eds.): IDA 2011, LNCS 7014, pp. 80–89, 2011.
© Springer-Verlag Berlin Heidelberg 2011
In this paper, we propose a new representation technique that makes it possible
to identify visually both basic operations (such as insertion, deletion, etc.)
and portions of the document according to the processing activity of the
writer. Each time this representation technique was presented to psychologists
or linguists, it received very positive feedback.
The paper is organized as follows. The next section describes the data involved
and the third presents some visualization techniques. In the fourth and fifth
sections, we describe the proposed methodology and some transpositions of
classical linguistic transformations of texts. The concepts proposed in the
paper are illustrated by examples from recordings of the writing of a short
15-minute essay under ScriptLog 1.4 (Mac version). This corpus, composed
entirely of Finnish-speaking writers [10], includes novice writers (first year
of college Finnish, French, Core) and expert writers (university teachers).
2 Data
The elementary events recorded by software such as ScriptLog are keyboard
keystrokes or mouse clicks, as shown in figure 1. They are the basic units that
technically suffice to represent the whole writing process. Using them without
any preprocessing, however, may not be a convenient way to look at the data.
For instance, studying pauses, their lengths and locations does not require the
same level of information as studying the text revision process. In both cases,
the same basic information is used, but the researcher needs to aggregate it at
a different level depending on his or her needs. Preprocessing of the data
sometimes cannot be avoided. Given the large amount of data produced by the
system (the log file for a 15-minute recording may contain up to 2000 lines),
this preprocessing should, if possible, be automated to avoid errors.
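As an illustration of aggregating the same elementary events at a level suited
to one particular question, the following minimal Python sketch extracts pauses
from a list of timestamped events. The `Event` fields, the millisecond
timestamps and the 2000 ms threshold are assumptions made for this example; they
do not reproduce the actual ScriptLog log format.

```python
from dataclasses import dataclass


@dataclass
class Event:
    time_ms: int  # milliseconds since the start of the recording (assumed unit)
    kind: str     # "key" or "mouse" (hypothetical labels)
    value: str    # character typed or description of the click


def pauses(events, threshold_ms=2000):
    """Return (start_ms, duration_ms) for every gap of at least threshold_ms."""
    found = []
    for previous, current in zip(events, events[1:]):
        gap = current.time_ms - previous.time_ms
        if gap >= threshold_ms:
            found.append((previous.time_ms, gap))
    return found


log = [
    Event(0, "key", "T"),
    Event(200, "key", "h"),
    Event(400, "key", "e"),
    Event(3400, "key", " "),
    Event(3600, "key", "c"),
]
print(pauses(log))  # [(400, 3000)]
```

A study of revisions would start from the same `log` but aggregate it
differently, for example by grouping contiguous keystrokes rather than measuring
the gaps between them.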
Fig. 1. Excerpt from a log file (log) obtained using ScriptLog
3 Visualization Techniques
Visualization is an important part of studying the writing process, but few
visualization techniques actually exist. One of them is the so-called linear
representation, in which every character written is displayed. Should a portion
be deleted, it is crossed out rather than removed, in order to show the text
production as a process and not only as a final product. Cursor movements made
with the arrow keys or the mouse are also identified, so that it is possible to
follow the text construction process. An example of linear representation is
given in figure 2. Such a representation has the advantage of displaying the
text but remains difficult to understand in the case of a complex creation.
Fig. 2. Linear representation of a short 15-minute text
Another way to visualize the creation process is based upon a few values
representing the text produced so far. Au fil de la plume in Genèse du texte [5]
displays the position of the cursor as well as the total length of the text as
a function of time. Such an approach indicates zones where the writer modifies
already written text (see figure 3). This type of representation is also
referred to as "GIS representation" and is used in various software such as
Inputlog [16].
One weakness of the "GIS representation" is that the recorded position of
visible text may no longer be correct as soon as an insertion or deletion occurs
at a prior position. Because the position of every later character is altered,
it is difficult to determine which part of the text a subsequent modification
affects. Another weak point is the lack of reference to the text corresponding
to points on the graphic.
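The two quantities plotted in such a representation, and the weakness just
described, can be reproduced with a short Python sketch. The event tuples
`(time_ms, op, arg)` and the operation names are hypothetical and chosen only
for this illustration; they are not the format of any of the tools cited.

```python
def replay(events):
    """Replay events and return ([(time_ms, cursor, length), ...], final_text).

    Assumed operations:
      ("insert", s): insert string s at the cursor
      ("delete", n): remove up to n characters before the cursor
      ("move", p):   place the cursor at absolute position p
    """
    text = []
    cursor = 0
    series = []
    for time_ms, op, arg in events:
        if op == "insert":
            text[cursor:cursor] = list(arg)
            cursor += len(arg)
        elif op == "delete":
            n = min(arg, cursor)
            del text[cursor - n:cursor]
            cursor -= n
        elif op == "move":
            cursor = max(0, min(arg, len(text)))
        series.append((time_ms, cursor, len(text)))
    return series, "".join(text)


events = [
    (0, "insert", "world"),      # "world" is written at absolute position 0
    (1000, "move", 0),
    (2000, "insert", "hello "),  # an insertion at a prior position
]
series, text = replay(events)
print(series)               # [(0, 5, 5), (1000, 0, 5), (2000, 6, 11)]
print(text.index("world"))  # 6
```

The word "world" was recorded at position 0 but ends up at position 6, so the
positions stored for earlier events no longer point at the text they produced.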
Fig. 3. Au fil de la plume - Genèse du texte
4 Method: Graph Representation
To address the problem of the moving position of written text after revision, we
propose here a slightly different approach in which each character is not
described by its absolute position when written. Instead, we use a relative
position, which proves more suitable for representing the dynamic aspect of
the writing activity. Sequences of keystrokes are merged to form an entity
involved in the conception of the text, which is represented by a node in the
graph. Should two nodes interact, through either a chronological or a spatial
relation, they are joined by an edge, or link, showing this relation.
Graphs are mathematical tools based on nodes, or vertices, that may be
connected by links, or edges. Various application fields are related to graph
theory to a greater or lesser degree. Chemistry is closely related, as some
graph-theoretical results apply directly to chemistry [3][4]. Other
applications, such as transportation, scheduling and communication, draw on the
algorithmic part underlying graph theory and networks. Since the 1990s, graphs
have also been used for representation purposes in the human sciences by means
of concept maps [14]. In the present paper, we propose to use graph
representation to visualize a new kind of data.
Some examples of graph representations of the writing process are shown in
figure 4, for a novice production, and figure 5, for an expert.
– The size of a vertex is related to the number of elementary operations it
represents. In the case of the novice writer, there are few large nodes, which
indicates a higher frequency of errors or typos. The text corresponding to each
node could be displayed within the node, which would provide a representation
close to the linear representation, but we decided not to do so here in order
to keep the representations as simple as possible.
– The structure of the graph is also very informative. The graph of the novice
is almost linear, while a portion in the middle of the graph of the expert is
much more complicated. This complex portion, between nodes 37 and 79,
represents a part of the production that was rewritten and changed at a higher
level, clearly not just from the lexical point of view.
Fig. 4. Graph visualization: an example of a novice writer
The key or mouse events found in the records are of three types: (i) additions
or insertions of a character or a space, (ii) deletions of characters or spaces,
and (iii) cursor moves by means of the arrow keys or the mouse. Spatially and
temporally contiguous sequences are merged and represented by the nodes of the
graph.
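The merging step can be sketched in Python as follows. This is only a minimal
illustration under stated assumptions, not the procedure actually used to build
the figures: events are hypothetical `(kind, pos, char)` tuples, a deletion is
assumed to be a backspace (so positions decrease), and temporal contiguity is
approximated by adjacency in the log.

```python
def merge_into_nodes(events):
    """Merge contiguous events into nodes and link the nodes chronologically.

    events: chronological list of (kind, pos, char), kind in {"add", "del", "move"}.
    For "add", pos is the index where char is inserted; for "del", pos is the
    index of the removed char; "move" events only break contiguity.
    """
    nodes = []
    last_pos = None
    for kind, pos, char in events:
        if kind == "move":
            last_pos = None  # a cursor move ends the current run
            continue
        step = 1 if kind == "add" else -1
        current = nodes[-1] if nodes else None
        contiguous = (
            current is not None
            and current["kind"] == kind
            and last_pos is not None
            and pos == last_pos + step
        )
        if contiguous:
            current["text"] += char  # deleted text accumulates in removal order
        else:
            nodes.append({"id": len(nodes), "kind": kind, "text": char})
        last_pos = pos
    chronological_edges = [(n["id"], n["id"] + 1) for n in nodes[:-1]]
    return nodes, chronological_edges


# Typing "cat", erasing the "t", then typing "r".
events = [
    ("add", 0, "c"), ("add", 1, "a"), ("add", 2, "t"),
    ("del", 2, "t"),
    ("add", 2, "r"),
]
nodes, edges = merge_into_nodes(events)
print([(n["kind"], n["text"]) for n in nodes])  # [('add', 'cat'), ('del', 't'), ('add', 'r')]
print(edges)                                    # [(0, 1), (1, 2)]
```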
4.1 Nodes
The size and color of each node indicate, respectively, the number of
elementary events it represents and their nature. An addition that has later
been removed appears in yellow, an addition that remains until the final text
is drawn in red, and deletions are displayed in blue. The final text thus
appears in red, while modifications that do not appear in the final text are
either yellow or blue depending on their nature. The nodes are numbered
according to their creation sequence.
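The coloring rule can be written down directly. The function name and the
`kind` labels below are hypothetical; only the mapping to yellow, red and blue
comes from the description above.

```python
def node_color(kind, survives_to_final_text=False):
    """Return the display color of a node.

    kind: "addition" or "deletion" (assumed labels).
    survives_to_final_text: for an addition, whether it is still in the final text.
    """
    if kind == "deletion":
        return "blue"
    if kind == "addition":
        return "red" if survives_to_final_text else "yellow"
    raise ValueError(f"unknown node kind: {kind!r}")


print(node_color("addition", survives_to_final_text=True))   # red
print(node_color("addition", survives_to_final_text=False))  # yellow
print(node_color("deletion"))                                # blue
```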
4.2 Links
The nodes are connected by links, or edges, representing a spatial or temporal
relation. The shape and color of an edge indicate the nature of this relation. A
solid line represents the chronological link (solid lines draw a path from node
0 to the last node through all nodes in chronological order). Other links
between nodes necessarily correspond to spatial relations. The link between an
addition node and its deletion counterpart is drawn in blue, and the spatial
link between nodes that are part of the final text appears in red. Reading the
content of nodes along the red links will therefore display the whole text in
its final version. Note that the path describing the final text is composed of
red nodes that are linked by red links.
Fig. 5. Graph visualization: an example of an expert writer
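Reading the final text along the red links amounts to a simple path traversal.
The sketch below assumes a hypothetical storage scheme in which `nodes` maps a
node number to its content and color, and `red_links` maps each red node to its
successor in the final text; the tiny graph is invented for illustration.

```python
def final_text(nodes, red_links, start):
    """Concatenate node contents by following red (final text) links from start."""
    parts = []
    seen = set()
    current = start
    while current is not None:
        if current in seen:
            raise ValueError("red links must form a simple path")
        seen.add(current)
        parts.append(nodes[current]["text"])
        current = red_links.get(current)
    return "".join(parts)


nodes = {
    0: {"text": "The cat", "color": "red"},
    1: {"text": " sat", "color": "yellow"},  # addition that is later removed
    2: {"text": " sat", "color": "blue"},    # deletion of node 1
    3: {"text": " slept", "color": "red"},
}
chronological = [(0, 1), (1, 2), (2, 3)]  # solid lines, in creation order
blue_links = [(1, 2)]                     # addition and its deletion counterpart
red_links = {0: 3}                        # spatial order of the final text

print(final_text(nodes, red_links, start=0))  # The cat slept
```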
5 Analysis of Graphical Patterns
Several avenues are possible for the analysis of graphs, and we will concentrate
here on the most useful ones. We will first identify patterns that correspond
to some classical operations involved in the writing process. From a technical
point of view, some operations correspond to special subgraphs that can easily
be recognized. Identifying these subgraphs is useful for analyzing the graph as
a representation of the writing process.
5.1 Additions and Insertions
Adding text can occur in three ways: (i) adding text at the end of the node
that is being written is not represented by any special pattern; (ii) inserting
text in the node currently being written, but not at its end, causes this node
to be split, and a triangle with a solid red line appears, as illustrated in
figure 6. This solid red line is crossed in one direction or the other depending
on whether we follow the spatial or the chronological order; (iii) inserting
text in a node that is already written causes this node to split, and the
corresponding configuration is shown in figure 7. From the graphic and
linguistic standpoints, an insertion corresponds to an addition inside the text
(ii - iii), while an addition is the development of the text at its end (i).
Fig. 6. Insertion in current node
Fig. 7. Insertion
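The three cases can be told apart from the node that receives the new text and
the offset at which it is inserted. The following Python sketch uses
hypothetical function and argument names; treating any insertion into an earlier
node as case (iii), even at a node boundary, is a simplifying assumption.

```python
def classify_addition(target_id, current_id, offset, target_length):
    """Classify new text inserted at `offset` inside node `target_id`.

    current_id is the node currently being written; target_length is the number
    of characters already in the target node.
    """
    if target_id == current_id:
        if offset == target_length:
            return "addition"                 # case (i): no special pattern
        return "insertion in current node"    # case (ii): current node is split
    return "insertion in earlier node"        # case (iii): earlier node is split


def split_node(text, offset):
    """Split a node's content in two at the insertion point."""
    return text[:offset], text[offset:]


print(classify_addition(5, 5, 7, 7))  # addition
print(classify_addition(5, 5, 3, 7))  # insertion in current node
print(classify_addition(2, 5, 3, 7))  # insertion in earlier node
print(split_node("The cat", 3))       # ('The', ' cat')
```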
5.2 Deletions
For deletions, as for additions, different subgraphs are found depending on
whether we are erasing the end of the last node, a part of the last node, or a
part of a node that was already written. The case of an immediate suppression
(e.g. after a typing error) is shown in figure 8, where the text from node 4 is
immediately removed by node 5. A deletion in the last node but not at its end
is presented in figure 9, where node 110 is removed by node 112. A delayed
removal results in the subgraph shown in figure 10.
Fig. 8. Immediate deletion
Fig. 9. Deletion of a part of the last node
Fig. 10. Delayed elimination
5.3 Substitutions
In addition to these simple operations, some more complex operations may be
viewed as sequences of simple operations; they nevertheless correspond to
special subgraphs that may easily be recognized. For instance, a replacement
may be viewed as a deletion immediately followed by an insertion at the same
place. Figure 11 represents the subgraph corresponding to a replacement in a
node that was already written. The replacement of a string in the last node, but
not at its end, is shown in figure 12. The interpretation of a replacement at
the end of the last node is more complex because, from a technical point of
view, it is impossible to know whether the addition comes instead of the deleted
portion or after it. The deletion is not bounded and the addition may extend
beyond the replacement, which is difficult to identify. In this case, the
researcher must use some other information to interpret this sequence one way
or the other.
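A replacement detector following this description can be sketched in a few
lines of Python. The operation records (`kind`, `pos`, `at_end`) are a
hypothetical format: `pos` is the text position where the operation takes
place, and `at_end` marks an operation at the end of the last node. The
ambiguous case is only flagged, since the text above states that it cannot be
resolved from the log alone.

```python
def find_replacements(ops):
    """Return (index, label) for each deletion immediately followed by an
    addition at the same position.

    label is "replacement" when the deletion is inside the text, and "ambiguous"
    when it is at the end of the last node.
    """
    found = []
    for i in range(len(ops) - 1):
        first, second = ops[i], ops[i + 1]
        same_place = first["pos"] == second["pos"]
        if first["kind"] == "del" and second["kind"] == "add" and same_place:
            label = "ambiguous" if first["at_end"] else "replacement"
            found.append((i, label))
    return found


ops = [
    {"kind": "del", "pos": 4, "at_end": False},
    {"kind": "add", "pos": 4, "at_end": False},
    {"kind": "del", "pos": 9, "at_end": True},
    {"kind": "add", "pos": 9, "at_end": True},
]
print(find_replacements(ops))  # [(0, 'replacement'), (2, 'ambiguous')]
```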
Fig. 11. Delayed substitution
Fig. 12. Substitution in the last node
6 Summary and Future Research Directions
In this paper, we propose a new representation technique for written language
production in which the problem of moving text position is handled. This
technique allows the researcher to easily identify the portion of the document
the writer is modifying. According to the reaction of linguistics researchers
working on the writing process, this representation is easier to understand
than those previously available. An important aspect is the ability to
visualize modification patterns from both a spatial and a temporal point of view
in the same representation. It also seems that intuition is more stimulated by
a graph representation than by linear or GIS representations.
Some important aspects of the graph representation of writing need further
investigation.
– Emphasize the temporal aspect by inserting nodes corresponding to long
pauses (the minimum duration of a pause may be defined by the user), or by
indicating the time and duration corresponding to each node.
– Distinguish the various levels of text improvement as defined by Faigley
and Witte [7] by separating surface modifications (correction of typos,
orthographical adjustments, etc.) from text-based modifications
(reformulation, syntactic changes, etc.). A first step in this direction would
be to differentiate nodes involving more than a single word, which may be
identified by the presence of a space before and after visible characters. A
second step would be to improve the qualification of the nature of the
transformation represented by a node. This last part requires tools from
computational linguistics.
– Automate graph drawing, which is currently done by hand, by devising an
algorithm that automatically places vertices in such a way that (i) patterns
are easy to recognize and (ii) the spatial aspect is preserved as much as
possible, so that following the writing process remains easy.
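The first step mentioned in the second item, spotting nodes that involve more
than a single word, could be prototyped with a regular expression. This is one
possible reading of "a space before and after visible characters", and the
function name is hypothetical.

```python
import re

# A whitespace character, one or more visible characters, then whitespace again.
WORD_BETWEEN_SPACES = re.compile(r"\s\S+\s")


def involves_more_than_one_word(node_text):
    """Return True if the node content holds visible characters framed by spaces."""
    return WORD_BETWEEN_SPACES.search(node_text) is not None


print(involves_more_than_one_word("cat"))          # False
print(involves_more_than_one_word("the cat"))      # False
print(involves_more_than_one_word("the cat sat"))  # True
```

Under this reading, a node such as "the cat" is not flagged because no word in
it is framed by spaces on both sides; a looser criterion would be needed if
such nodes should count as well.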
References
1. Alamargot, D., Chanquoy, L.: Through the Models of Writing. Studies in Writing.
Kluwer Academic Publishers, Dordrecht (2001)
2. Ahlsén, E., Strömqvist, S.: ScriptLog: A tool for logging the writing process and its
possible diagnostic use. In: Loncke, F., Clibbens, J., Arvidson, H., Lloyd, L. (eds.)
Augmentative and Alternative Communication: New Directions in Research and
Practice, pp. 144–149. Whurr Publishers, London (1999)
3. Caporossi, G., Cvetković, D., Gutman, I., Hansen, P.: Variable Neighborhood
Search for Extremal Graphs. 2. Finding Graphs with Extremal Energy. J. Chem.
Inf. Comput. Sci. 39, 984–996 (1999)
4. Caporossi, G., Gutman, I., Hansen, P.: Variable Neighborhood Search for Extremal
Graphs. 4. Chemical Trees with Extremal Connectivity Index. Computers and
Chemistry 23, 469–477 (1999)
5. Chenouf, Y., Foucambert, J., Violet, M.: Genèse du texte. Technical report 30802
- Institut National de Recherche Pédagogique (1996)
6. Chesnet, D., Alamargot, D.: Analyse en temps réel des activités oculaires et
graphomotrices du scripteur: intérêts du dispositif Eye and Pen. L'Année
Psychologique 32, 477–520 (2005)
7. Faigley, L., Witte, S.: Analysing revision. College Composition and
Communication 32, 400–414 (1981)
8. Jakobsen, A.L.: Logging target text production with Translog. In: Hansen, G.
(ed.) Probing the Process in Translation. Methods and Results, Samfundslitteratur,
Copenhagen, pp. 9–20 (1999)
9. Jakobsen, A.L.: Research Methods in Translation: Translog. In: Sullivan, K.P.H.,
Lindgren, E. (eds.) Computer Key-Stroke Logging and Writing: Methods and
Applications, pp. 95–105. Elsevier, Amsterdam (2006)
10. Leblay, C.: Les invariants processuels. En deçà du bien et du mal écrire. Pratiques
143/144, 153–167 (2009)
11. Leijten, M., Van Waes, L.: Writing with speech recognition: the adaptation process
of professional writers. Interacting with Computers 17, 736–772 (2005)
12. Lindgren, E., Sullivan, K.P.H., Lindgren, U., Spelman Miller, K.: GIS for writing:
applying geographic information system techniques to data-mine writing's
cognitive processes. In: Rijlaarsdam, G. (series ed.), Torrance, M., Van Waes, L.,
Galbraith, D. (vol. eds.) Writing and Cognition: Research and Applications, pp.
83–96. Elsevier, Amsterdam (2007)
13. Matsuhashi, A.: Revising the plan and altering the text. In: Matsuhashi, A. (ed.)
Writing in Real Time, pp. 197–223. Ablex Publishing Corporation, Norwood (1987)
14. Novak, J.D.: Concept maps and Vee diagrams: Two metacognitive tools for science
and mathematics education. Instructional Science 19, 29–52 (1990)
15. Strömqvist, S., Karlsson, H.: ScriptLog for Windows - User's manual. Technical
report - University of Lund: Department of Linguistics and University College of
Stavanger: Centre for Reading Research (2002)
16. Sullivan, K.P.H., Lindgren, E. (eds.): Computer Keystroke Logging and Writing:
Methods and Applications. Elsevier, Amsterdam (2006)
17. Van Waes, L., Schellens, P.J.: Writing profiles: The effect of the writing mode on
pausing and revision patterns of experienced writers. Journal of Pragmatics 35(6),
829–853 (2003)
18. Van Waes, L., Leijten, M.: Inputlog: New Perspectives on the Logging of On-Line
Writing Processes in a Windows Environment. In: Sullivan, K.P.H., Lindgren, E.
(eds.) Computer Key-Stroke Logging and Writing: Methods and Applications, pp.
73–93. Elsevier, Amsterdam (2006)
19. Wengelin, Å.: Examining pauses in writing: Theories, methods and empirical data.
In: Sullivan, K.P.H., Lindgren, E. (eds.) Computer Key-Stroke Logging and
Writing: Methods and Applications, pp. 107–130. Elsevier, Amsterdam (2006)
... Le tout premier programme proche des approches génétiques est Genèse du texte (Foucambert, 1992). Plusieurs autres travaux suivront : Doquet-Lacoste (2004) avec ce même programme, puis Leblay (2011) Caporossi et Leblay (2011, 2015 proposent une modélisation très simple utilisée déjà dans de nombreux domaines. À considérer l' efficacité des plans de métro avec lesquels nous nous déplaçons dans les transports en commun, il est tentant de copier ce mode de représentation de l'information pour l'appliquer à l' objet qui nous intéresse. ...
... 4a), la seconde, celle d'un expert ( fig. 4b) (Caporossi et Leblay, 2011, 2015. Afin de confirmer le degré d'expertise des cinq écritures monolingues pressenties comme expertes et de s'assurer que certaines écritures ont bien été des productions caractéristiques d'une expertise scripturale, toutes les copies ont été soumises à trois évaluateurs. ...
Article
Le choix méthodologique de ce travail pose la génétique textuelle comme discipline de référence ; celle-ci prend en compte prioritairement une description des phénomènes intrinsèques de scripturalité, bien avant de s’intéresser au seul produit final. Les concepts clés de la génétique textuelle (successivité, immédiateté, linéarité) sont réactualisés et transférables à de nouveaux corpus hétérogènes, collaboratifs, polygraphiques et professionnels. En ajoutant l’appareil conceptuel génétique, il devient possible de suivre et de visualiser le pas-à-pas des transformations : le retour critique sur sa propre écriture s’apprend, se raffine à l’aide de ressources visuelles. À partir de repères élémentaires en génétique textuelle, il s’agit de considérer les différentes manières d’étudier ce que la génétique nomme avant-texte, afin de pouvoir classer les principaux corpus génétiques, depuis les premiers corpus littéraires jusqu’aux plus récents, multimodaux, dont font partie les écritures professionnelles monolingue (rédaction) et plurilingue (traduction).
... One visualization technique that does represent word tokens can be seen in Fig. 2 (Caporossi and Leblay, 2011). While the discrete nodes show word tokens as well as the relationship between the order in which tokens were produced, this visualization fails to capture the temporal dynamics or varying rates of the LS Graph. ...
Preprint
Full-text available
TypeShift is a tool for visualizing linguistic patterns in the timing of typing production. Language production is a complex process which draws on linguistic, cognitive and motor skills. By visualizing holistic trends in the typing process, TypeShift aims to elucidate the often noisy information signals that are used to represent typing patterns, both at the word-level and character-level. It accomplishes this by enabling a researcher to compare and contrast specific linguistic phenomena, and compare an individual typing session to multiple group averages. Finally, although TypeShift was originally designed for typing data, it can easy be adapted to accommodate speech data, as well. A web demo is available at https://angoodkind.shinyapps.io/TypeShift/. The source code can be accessed at https://github.com/angoodkind/TypeShift.
... Document-level metrics (such as cohesion, and other linguistic measures) [5] do not distinguish slight changes made to a base text, and require finer grained measures for shorter texts. On the other hand, key strokes and character editing in writing which are used to visualize and study patterns of revision [9][10][11] are too finegrained to qualitatively study the actual changes made to the text. To meaningfully interpret what changes a student made to a given short text as a result of an intervention/ instruction, the need for automated visualizations to represent the process of drafting and revision at the sentence level arises. ...
Chapter
Full-text available
This paper introduces a novel technique of constructing Automated Revision Graphs (ARG) to facilitate the study of revisions in writing. ARG plots sentences of a written text as nodes, and their similarities to sentences from its previous draft as edges to visualize text as graph. Implemented in two forms: simple and multi-stage, the graphs demonstrate how sentence-level differences can be visualized in short texts to study revision products, processes, and student interaction with feedback in student writing.
... Finally, some scholars have examined ways to improve the analysis of keystroke logs or to combine keystroke log analysis with other methodologies to capture a richer description of the writing process. Major lines of work have included attempts to provide empirical evidence about the pause duration threshold that distinguishes execution from planning or evaluation behaviors (Chukharev-Hudilainen, 2014;Rosenqvist, 2015;Spelman Miller, 2002), exploration of more advanced (particularly statistical) methods for analyzing keystroke logs (Caporossi & Leblay, 2011;Chenu, Pellegrino, Jisa, & Fayol, 2014;Leblay & Caporossi, 2015;Perrin & Wildi, 2008;Van Waes & Leijten, 2015;Wallot & Grabowski, 2013), explorations of methodologies that combine keystroke logging with linguistic analysis (Macken, Hoste, Leijten, & Van Waes, 2012), and explorations of methodologies that combine keystroke logging with eye tracking (Andersson et al., 2006;Beers, Quinlan, & Harbaugh, 2010;Johansson, Wengelin, Johansson, & Holmqvist, 2010;Wengelin et al., 2009). ...
Technical Report
Full-text available
Writing process logs (keystroke logs) provide an excellent source of information about how writers have distributed their time and attention across the course of a writing task. However, relatively little is known about how the features that can readily be collected from such logs vary as a result of changes in writing task demands. In this study, we contrast the writing behavior of 463 8th-grade students in a school with low socioecomonic status in the western U.S. under 3 conditions: when they were copy typing (retyping an article), when they were drafting an essay, and when they were editing the essay they had drafted in a previous session. We observed striking differences in the characteristics of the resulting keystroke logs, reflecting differences in the mix of writing processes emphasized in each task. Copy typing was characterized by relatively slow typing, little time spent on long pause behaviors (except between words, when writers would have been scanning the text they were copying), and strictly local editing events. Drafting was characterized by relatively fluent typing, significant amounts of backspacing (reflecting false starts and sentence-level revision and editing), significantly longer pauses at sentence and word boundaries (reflecting idea generation and the process of translating ideas into words), and a moderate amount of time spent jumping to points within the most recently produced sentence to sentence in order to make edits to phrasing. Editing was characterized by very long pauses before jumping to another location to the text to make an edit, with very little time spent typing out individual words, phrases, or sentences. These differences indicate the importance of interpreting keystroke logs in the light of task demands and suggest that different features will be provide significant information about writers’ performance, depending on the writing processes most emphasized in a particular literacy task.
... Resource intensive manual observation and coding can be improved with advanced online trace data collection and analysis techniques to develop visualizations that represent the process of drafting and revision. To visualize modification patterns in an online document, Caporossi and Leblay [5] developed a graph theory approach to represent the movement of text through a document using log data of keystrokes and cursor movements from the document editing process. However, there is no evidence that educators would find keystroke-level data insightful for understanding revision patterns, nor that students would find this meaningful feedback to improve their writing. ...
Chapter
Text revision is regarded as an important process in improving written products. To study the process of revision activity from authentic classroom contexts, this paper introduces a novel visualization method called Revision Graph to aid detailed analysis of the writing process. This opens up the possibility of exploring the stages in students’ revision of drafts, which can lead to further automation of revision analysis for researchers, and formative feedback to students on their writing. The Revision Graph could also be applied to study the direct impact of automated feedback on students’ revisions and written outputs in stages of their revision, thus evaluating its effectiveness in pedagogic contexts.
... Despite their varied forms, WP visualizations typically convey information about the timing and location of writing behaviors of interest. They include written text marked up with special symbols, known as S-notation (Severinson Eklundh & Kollberg, 1996); timelines combining keystroke and eye-tracking information (Wengelin et al., 2009), progression diagrams showing number of revisions as a function of both timing and location (Perrin, 2003); dynamic, GIS-based representations (Lindgren, Sullivan, Lindgren, & Spellman Miller, 2007); and graph visualizations resembling network diagrams (Caporossi & Leblay, 2011). One issue in the design of WP visualizations is how to achieve sufficient amounts of macro-and micro-level detail so as to support analysis of larger trends (e.g., to see how much time was allocated to formulation versus revision) while also allowing text-level views (e.g., to see what type of revision was made at a particular point in time). ...
Article
Assessment for learning (AfL) seeks to support instruction by providing information about students’ current state of learning, the desired end state of learning, and ways to close the gap. AfL of second-language (L2) writing faces challenges insofar as feedback from instructors tends to focus on written products while neglecting most of the processes that gave rise to them, such as planning, formulation, and evaluation. Meanwhile, researchers studying writing processes have been using keystroke logging (KL) and eye-tracking (ET) to analyze and visualize process engagement. This study explores whether such technologies can support more meaningful AfL of L2 writing. Two Chinese L1 students studying at a U.S. university who served as case studies completed a series of argumentative writing tasks while a KL-ET system traced their processes and then produced visualizations that were used for individualized tutoring. Data sources included the visualizations, tutoring-session transcripts, the participants’ assessed final essays, and written reflections. Findings showed the technologies, in combination with the assessment dialogues they facilitated, made it possible to (1) position the participants in relation to developmental models of writing; (2) identify and address problems with planning, formulation, and revision; and (3) reveal deep-seated motivational issues that constrained the participants’ learning.
... Despite their varied forms, WP visualizations typically convey information about the timing and location of writing behaviors of interest. They include written text marked up with special symbols, known as S-notation (Severinson Eklundh & Kollberg, 1996); timelines combining keystroke and eye-tracking information (Wengelin et al., 2009), progression diagrams showing number of revisions as a function of both timing and location (Perrin, 2003); dynamic, GISbased representations (Lindgren et al., 2007); and graph visualizations resembling network diagrams (Caporossi & Leblay, 2011). One issue in the design of WP visualizations is how to achieve sufficient amounts of macro-and micro-level detail so as to support analysis of larger trends (e.g., to see how much time was allocated to formulation versus revision) while also allowing text-level views (e.g., to see what type of revision was made at a particular point in time). ...
Chapter
The use of Massive Open Online Courses (MOOCs) is rapidly increasing due to the convenience and ease that they provide to learners. However, MOOCs suffer from a high dropout rate, owing mostly to the confusion and frustration that accompany the learning process. Based on MOOC discussion forums, this paper aims to explore different levels of confusion about a specific concept, using a prerequisite-based ontology to extract relevant posts and the Bidirectional Encoder Representations from Transformers (BERT) classification algorithm to describe the degree of confusion in each post. The analysis of discussion posts from the Stanford University dataset affirms the effectiveness of our model. BERT achieves good classification accuracy; this will help in early dropout detection and also facilitate future support for learners in a state of confusion.
Chapter
Personalized recommendation, as a practical approach to overcoming information overload, has been widely used in e-learning. Based on learners' individual knowledge levels, we propose a new model that can predict learners' needs for recommendation using dynamic graph-based knowledge tracing. By applying the Gated Recurrent Unit (GRU) and the Attention model, this approach designs a dynamic graph over different time steps. By learning feature information and the topology representation of nodes/learners, this model can identify learners with low knowledge acquisition with an accuracy of 80.63% and prepare them for further recommendation.
Book
Alamargot, D. & Chanquoy, L. (2001). Through the models of writing. Dordrecht-Boston-London: Kluwer Academic Publishers. Denis Alamargot and Lucile Chanquoy's book offers a vivid and original presentation of the main trends in the research field devoted to writing. First, it provides both young and senior scientists with a comparative view of current theoretical models of composition, with different levels of reading made available: each element of these models is clearly situated in its historical context and scrutinized in its further evolution. Second, this well-documented theoretical analysis of writing mechanisms is checked against empirical data drawn from many recent experimental studies, and gaps in the necessary data are pointed out and defined where they occur. Following the usual description of writing phases initially proposed by Hayes and Flower, the first part of this book presents the planning, translating and revision processes and compares them to other researchers' conceptions (from Bereiter and Scardamalia to Kellogg or Galbraith). Such presentations of isolated models do exist in the literature, but the present work gives a good comparative analysis of the components inside each of the models, in a clear and cumulative way; a fine-grained observation of the differences between similar-looking models is also performed.
Article
Among the different ways of conducting real-time analysis of written productions, recording the variation of graphomotor activity is particularly interesting because it is objective, non-intrusive and offers a continuous measure of the temporal aspects of writing. Stemming from research on oral production, this method can nevertheless be insufficient to assess some specific processes of writing. Based on the synchronized measurement of the graphomotor and ocular activities of the writer, the "Eye and Pen" device offers a new framework to study the temporal characteristics of written composition. It becomes possible to improve investigations of the processes engaged during a pause as well as during a period of transcription. At an experimental level, this device will allow advances in the study of the visual component engaged during writing and of the functioning and dynamics of writing processes. At a methodological level, it already allows the study of handwriting in a multimedia computer environment (such as the upcoming screen-pad).
Article
This chapter presents the use of Geographical Information Systems (GIS) for data mining and visualising information about cognitive activities involved in writing. The information can be collected from various sources, such as keystroke logs, manual analysis of stimulated recall sessions and think-aloud protocols. After an introduction to GIS, an English as a foreign language (EFL) writing session is used to explain how to create the various GIS layers from the different information/analysis sources, and to show how they can be easily data mined using GIS techniques to improve our understanding of the cognitive processes in writing. The illustrative graphs used to provide an insight into the methodology are based on keystroke-logged data, manual researcher-based analyses and coded stimulated recall data that were collected after the writing session. As a tool for both visualisation and data mining, the GIS technique can support analysis of the interaction of cognitive processes during writing, focusing on the individual writer, differences between writers or the writing processes in general. Depending on the research question, GIS affords the possibility to aggregate data to the level of writers, de-aggregate data in any way chosen or display data as attributes of individuals. © 2007 by Elsevier Ltd. All rights of reproduction in any form reserved.
Article
Drawing on textual genetics, this work aims to capture a digitized avant-texte by recording both writing events and writing operations, which form process invariants that no writer escapes when producing written text. These invariants lie below the level of supposedly good or bad practices, and below that of expert or novice practices, whether the writing is done in a native, foreign or second language. The role of the already-written text then emerges as a determining factor: it makes it possible to set side by side the operations that follow on from the already-written text and the operations that return into it. It then becomes possible to compare expert productions with novice productions. What seems to differentiate them is the way experienced writers position themselves with respect to the already-written text, that is, with respect to the textual volume in motion, built from the gestures of addition and deletion, which are themselves interpreted as operations of addition, deletion, replacement and displacement. Thus, an experienced writer will return (much) more often to the already-written text, mainly to make additions and replacements.
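The distinction drawn in the abstract above, between operations that continue after the already-written text and operations that return into it, can be illustrated with a minimal Python sketch. It is not the authors' method or the ScriptLog format: the `Event` record, its fields and the `replay_and_classify` function are hypothetical names introduced here, and the sketch assumes a log reduced to two gestures (addition and deletion) with character offsets.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical log record (assumed for illustration only)."""
    kind: str      # "add" or "delete"
    position: int  # character offset where the gesture starts
    text: str      # text inserted, or text removed


def replay_and_classify(events):
    """Replay the gestures and label each one.

    "at_end" : the gesture touches the end of the text written so far.
    "return" : the gesture goes back into the already-written text.
    """
    buffer = ""
    labels = []
    for ev in events:
        if ev.kind == "add":
            at_end = ev.position == len(buffer)
            buffer = buffer[:ev.position] + ev.text + buffer[ev.position:]
        elif ev.kind == "delete":
            end = ev.position + len(ev.text)
            at_end = end == len(buffer)
            buffer = buffer[:ev.position] + buffer[end:]
        else:
            raise ValueError(f"unknown gesture: {ev.kind!r}")
        labels.append("at_end" if at_end else "return")
    return buffer, labels


events = [
    Event("add", 0, "The cat"),
    Event("add", 7, " sat"),
    Event("delete", 4, "cat"),  # goes back into the text
    Event("add", 4, "dog"),     # fills the gap left by the deletion
]
final_text, labels = replay_and_classify(events)
print(final_text)
print(labels)
```

In this toy log, the deletion followed by an addition at the same offset is what the abstract would interpret as a replacement; detecting such pairs, or displacements, would require a further pass over the labelled gestures.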