ReaderBench: An Integrated Cohesion-Centered Framework

Mihai Dascalu¹(✉), Larise L. Stavarache¹, Philippe Dessus², Stefan Trausan-Matu¹, Danielle S. McNamara³, and Maryse Bianco²

¹ Computer Science Department, University Politehnica of Bucharest, Bucharest, Romania
mihai.dascalu@cs.pub.ro, larise.stavarache@ro.ibm.com, stefan.trausan@cs.pub.ro
² LSE, Université Grenoble Alpes, Grenoble, France
{philippe.dessus,maryse.bianco}@upmf-grenoble.fr
³ LSI, Arizona State University, Tempe, USA
dsmcnama@asu.edu
Abstract. ReaderBench is an automated software framework designed to support both students and tutors by making use of text mining techniques, advanced natural language processing, and social network analysis tools. ReaderBench is centered on comprehension prediction and assessment based on a cohesion-based representation of the discourse applied on different sources (e.g., textual materials, behavior tracks, metacognitive explanations, Computer Supported Collaborative Learning – CSCL – conversations). Therefore, ReaderBench can act as a Personal Learning Environment (PLE) which incorporates both individual and collaborative assessments. Besides the a priori evaluation of textual materials’ complexity presented to learners, our system supports the identification of reading strategies evident within the learners’ self-explanations or summaries. Moreover, ReaderBench integrates a dedicated cohesion-based module to assess participation and collaboration in CSCL conversations.

Keywords: Textual complexity assessment · Identification of reading strategies · Comprehension prediction · Participation and collaboration evaluation
1 ReaderBench’s Purpose
Designed as support for both tutors and students, our implemented system, ReaderBench [1, 2], can be best described as an educational learning helper tool to enhance the quality of the learning process. ReaderBench is a fully functional framework that enhances learning using various techniques such as textual complexity assessment [1, 2], voice modeling for CSCL discourse analysis [3], topics modeling using Latent Semantic Analysis and Latent Dirichlet Allocation [2], and virtual communities of practice analysis [4]. Our system was developed building upon indices provided in renowned systems such as E-rater, iSTART, and Coh-Metrix. However, ReaderBench provides an integration of these systems. ReaderBench includes multi-lingual comprehension-centered analyses focused on semantics, cohesion and dialogism [5]. For tutors, ReaderBench provides (a) the evaluation of reading materials’ textual complexity, (b) the measurement of social collaboration within group endeavors, and (c) the evaluation of learners’ summaries and self-explanations. For learners, ReaderBench provides (a) the improvement of learning capabilities through the use of reading strategies, and (b) the evaluation of students’ comprehension levels and performance with respect to other students. ReaderBench maps directly onto classroom education, combining individual learning methods with Computer Supported Collaborative Learning (CSCL) techniques.

© Springer International Publishing Switzerland 2015
G. Conole et al. (Eds.): EC-TEL 2015, LNCS 9307, pp. 505–508, 2015.
DOI: 10.1007/978-3-319-24258-3_47
2 Envisioned Educational Scenarios
ReaderBench (RB) targets both tutors and students by addressing individual and collaborative learning methods through a cohesion-based discourse analysis and dialogical discourse model [1]. Overall, its design is not meant to replace the tutor, but to act as support for both tutors and students by enabling continuous assessment. Learners can assess their self-explanations or collaborative contributions within chat forums. Tutors, on the other hand, have the opportunity to analyze the proposed reading materials in order to best match the students’ reading level. They can also easily grade student summaries or evaluate students’ participation and collaboration within CSCL conversations. In order to better grasp the potential implementation of our system, the generic learning flows behind ReaderBench, which are easily adaptable to a wide range of educational scenarios, are presented in Figs. 1 and 2.
Fig. 1. Generic individual learning scenario integrating the use of ReaderBench (RB).
506 M. Dascalu et al.
Fig. 2. Generic collaborative learning scenario integrating the use of ReaderBench (RB).
3 Validation Experiments
Multiple experiments have been performed, out of which only three are selected for brief
presentation. Overall, various input sources were used for validating ReaderBench as a
reliable educational software framework.
Experiment 1 [6] included 80 students between 8 and 11 years old (3rd–5th grade), uniformly distributed in terms of their age, who were asked to explain what they understood from two French stories of about 450 words. The students’ oral self-explanations and their summaries were recorded and transcribed. Additionally, the students completed a posttest to assess their comprehension of the reading materials. The results indicated that paraphrases and the frequency of rhetorical phrases related to metacognition and self-regulation (e.g., “il me semble” – “it seems to me”, “je ne sais” – “I don’t know”, “je comprends” – “I understand”) and causality (e.g., “puisque” – “since”, “à cause de” – “because of”) were easier to identify than information or events stemming from students’ experiences. Furthermore, cohesion with the initial text, as well as specific textual complexity factors, increased accuracy for the prediction of learners’ comprehension.
Experiment 2 [3] included 110 students who were each asked to manually annotate 3 chats out of 10 selected conversations. We opted to distribute the evaluation of each conversation due to the high amount of time it takes to manually assess a single discussion (on average, users reported 1.5–4 h for a deep understanding). The results indicated a reliable automatic evaluation of both participation and collaboration. We validated the machine vs. human agreement by computing intra-class correlations between raters for each chat (avg ICC: participation = .97; collaboration = .90) and non-parametric correlations to the automatic scores (avg Rho: participation = .84; collaboration = .74). Overall, the validations supported the accuracy of the models built on cohesion and dialogism, whereas the proposed methods emphasized the dialogical perspective of collaboration in CSCL conversations.
Experiment 3 [7] consisted of building a textual complexity model that was distributed into five complexity classes and directly mapped onto five primary grade classes of the French national education system. Multiclass Support Vector Machine (SVM) classifications were used to assess exact agreement (EA = .733) and adjacent agreement (AA = .933), indicating that the accuracy of classification was quite high. Starting from the previously trained textual complexity model, a specific corpus comprising 16 documents was used to determine the alignment of each complexity factor to human comprehension scores. As expected, textual complexity cannot be reflected in a single factor, but through multiple categories. Although the 16 documents were classified within the same complexity class, significant differences for individual indices were observed.
In conclusion, through ReaderBench we aim to further explore and enhance the learning and instructional experiences of both students and tutors. Our goal is to provide more rapid assessment and to encourage collaboration and expertise sharing, while tracking the learners’ progress with the support of our integrated framework.
Acknowledgments. This research was partially supported by the 644187 RAGE H2020-ICT-2014 and the 2008-212578 LTfLL FP7 projects, by the NSF grants 1417997 and 1418378 to ASU, by the POSDRU/159/1.5/S/132397 and 134398 projects, as well as by the ANR DEVCOMP project ANR-10-blan-1907-01. We are also grateful to Cecile Perret for her help in preparing this paper.
References
1. Dascalu, M.: Analyzing Discourse and Text Complexity For Learning and Collaborating.
Studies in Computational Intelligence, vol. 534. Springer, Switzerland (2014)
2. Dascalu, M., Dessus, P., Bianco, M., Trausan-Matu, S., Nardy, A.: Mining texts, learners’ productions and strategies with ReaderBench. In: Peña-Ayala, A. (ed.) Educational Data Mining: Applications and Trends, pp. 335–377. Springer, Switzerland (2014)
3. Dascalu, M., Trausan-Matu, Ş., Dessus, P.: Validating the automated assessment of
participation and of collaboration in chat conversations. In: Trausan-Matu, S., Boyer, K.E.,
Crosby, M., Panourgia, K. (eds.) ITS 2014. LNCS, vol. 8474, pp. 230–235. Springer,
Heidelberg (2014)
4. Nistor, N., Trausan-Matu, S., Dascalu, M., Duttweiler, H., Chiru, C., Baltes, B., Smeaton, G.:
Finding student-centered open learning environments on the internet. Comput. Hum. Behav.
47(1), 119–127 (2015)
5. Dascalu, M., Trausan-Matu, S., Dessus, P., McNamara, D.S.: Discourse cohesion: A signature
of collaboration. In: 5th International Learning Analytics and Knowledge Conference (LAK
2015), pp. 350–354. ACM, Poughkeepsie, NY (2015)
6. Dascalu, M., Dessus, P., Bianco, M., Trausan-Matu, S.: Are automatically identified reading
strategies reliable predictors of comprehension? In: Trausan-Matu, S., Boyer, K.E., Crosby,
M., Panourgia, K. (eds.) ITS 2014. LNCS, vol. 8474, pp. 456–465. Springer, Heidelberg (2014)
7. Dascalu, M., Stavarache, L.L., Trausan-Matu, S., Dessus, P., Bianco, M.: Reflecting
comprehension through French textual complexity factors. In: ICTAI 2014, pp. 615–619.
IEEE, Limassol, Cyprus (2014)