Article

Reading Demands in Secondary School: Does the Linguistic Complexity of Textbooks Increase With Grade Level and the Academic Orientation of the School Track?


Abstract

An adequate level of linguistic complexity in learning materials is believed to be of crucial importance for learning. The implication for school textbooks is that reading complexity should differ systematically between grade levels and between higher and lower tracks in line with what can be called the systematic complexification assumption. However, research has yet to test this hypothesis with a real-world sample of textbooks. In the present study, we used automatic measures from computational linguistic research to analyze 2,928 texts from geography textbooks from four publishers in Germany in terms of their reading demands. We measured a wide range of lexical, syntactic, morphological, and cohesion-related features and developed text classification models for predicting the grade level (Grades 5 to 10) and school track (academic vs. vocational) of the texts using these features. We also tested ten linguistic features that are considered to be particularly important for a reader’s understanding. The results provided only partial support for systematic complexification. The text classification models showed accuracy rates that were clearly above chance but with considerable room for improvement. Furthermore, there were significant differences across grade levels and school tracks for some of the ten linguistic features. Finally, there were marked differences among publishers. The discussion outlines key components for a systematic research program on the causes and consequences of the lack of systematic complexification in reading materials.

... With the development of readers' capacities, texts should become more complex (Snow, 2002). Researchers have argued that reading materials can provide readers with comprehensible input and facilitate learning of a language when they are based on the concept of the zone of proximal development proposed by Vygotsky (Berendes et al., 2017). That is, what the readers can do with some adult or expert guidance. ...
... Although the matching of linguistic complexity and grade levels has been studied extensively (e.g., Rog and Burton, 2001; Mesmer et al., 2012; Berendes et al., 2017; Holster et al., 2017; Jin et al., 2020), relatively few studies have targeted Chinese EFL students and even fewer targeted very young EFL children's reading materials. Jin et al. (2020) analyzed a large corpus of teaching materials for Chinese EFL students from Grade 1 to 12. ...
... Syntactic complexity is widely discussed in the field of leveled reading (Mesmer et al., 2012; Frantz et al., 2015; Berendes et al., 2017; Jin et al., 2020). Traditional measurements are mean length of utterance (MLU), total number of utterances (TOT-UTT), average sentence length, and number of modifiers. ...
Article
Full-text available
We examined the linguistic features of texts in twenty-nine picture books used in an early English as a Foreign Language program in China. We used the software CLAN to automatically extract indices of linguistic complexity that are typically used to analyze child-directed speech and tested if these indices aligned with expert judgments on the books’ appropriate grade level (Kindergarten-1 through Kindergarten-3). Of the eleven characteristics investigated, seven showed significant between-level differences with moderate effect sizes. Across all levels, vocabulary complexity (i.e., frequency of types, frequency of tokens, and vocabulary diversity) and syntactic complexity (i.e., number of verbs per utterance, number of Developmental-Sentence-Scoring-eligible utterances, mean length of utterance in morphemes, and total number of non-zero morphemes) increased, also in alignment with experts’ judgments. Indices of child language development can thus be used to estimate text complexity in picture books. The study contributes to a better understanding of children’s picture book difficulty and has methodological implications for investigating text characteristics for very young children learning English as a foreign language.
... Readability analysis serves an important practical need as it helps to assess the accessibility of reading materials to readers. While a number of studies have been conducted on the accessibility of printed textbook materials (Berendes et al., 2018; Maslin, 2007), the accessibility of OER, and specifically the accessibility of OER to English learners, is underrepresented in the research literature, with most studies on the topic of online material accessibility concentrating in the field of healthcare, as mentioned in the previous section. While health information materials and Wikipedia articles are not necessarily produced by educational institutions, these are similar to OER in that they are online materials hosted on public platforms that aim to inform and educate a wide audience. ...
... The study found that there was a progression of difficulty with passages being easier in the beginning of the year and more difficult at the end. The evidence of grade-level-based complexification was further supported by the study of Berendes et al. (2018) who found that there were significant differences between textbook reading materials of Grades 5/6 and 9/10 for seven of the 10 linguistic features with 9/10 grade materials being more demanding. Three features (word length, ratio of genitive nouns to all nouns, and ratio of derived nouns to all nouns) showed significant differences for all grade comparisons. ...
... These differences concerned the measures of word and sentence length and amount of advanced lexis. This result supports the evidence on the contributors to text difficulty which have also been reported to be word and sentence structure (number of syllables per word and number of words per sentence) as well as word meaning (word rareness and corresponding level of proficiency) (Berendes et al., 2018; Harrison, 1980; Maslin, 2007). On the other hand, ANOVA analysis conducted with the Saylor courses showed that such differences occur only between 'remedial' level 0 courses and more senior courses. ...
Article
Open Educational Resources aim to offer learning to all, yet the language level used in resources could be a barrier to many potential learners. This paper examines the readability of 200 OER courses in English from two major OER course platforms. We compared the means of readability metrics between these OER courses at different educational levels and subject categories that the platforms offer using inferential statistics as well as cluster analyses. Results show that there is a progression of difficulty between lower and higher educational levels with introductory courses being easier to read. However, the analysis also highlighted that more than 86% of the courses require an advanced level of English language proficiency. On the other hand, subject matter does not appear to be linked with the readability of the courses. This study contributes further to the current discussion of the inclusiveness of OER and the factors that hinder its universal use. The study addresses a gap in the literature as, to our knowledge, no other studies have analysed the linguistic accessibility of OER to English learners and consideration of the meaning of the educational levels assigned to OER courses has been limited.
... Syntactic complexity refers to the difficulty of the phrase structure and the associated dependencies within a text (Berendes et al., 2018). For instance, complex syntactical structures comprise the use of nominals, or the inclusion of subordinate clauses. ...
... For instance, complex syntactical structures comprise the use of nominals, or the inclusion of subordinate clauses. Complex syntactical structures often unnecessarily increase the sentence length, and as such increase readers' cognitive efforts to process the text (Berendes et al., 2018). Contrarily, concreteness refers to the use of content words that are concrete and meaningful which make texts easier to process and comprehend (McNamara, 2013). ...
... As further treatment checks, that means that differences regarding students' (meta-) comprehension can be ascribed to textual differences of the learning materials, we realized two further safeguards. First, we conducted an automated linguistic complexity analysis to disentangle whether the low-complex and the high-complex texts descriptively differ on the central dimensions of linguistic text complexity by means of an established computer-linguistic analysis tool for German language (Berendes et al., 2018; Hancke, Vajjala, & Meurers, 2012). The findings indicated that the high-complex text was less cohesive and less concrete than the low-complex text (see Appendix A, Table A). ...
Article
In this experiment, we examined whether linguistic text complexity affects the effects of explaining modality on students’ learning. Students (N = 115) read a high-complex and a low-complex text. Additionally, they generated a written or an oral explanation to a fictitious peer. A control group of students retrieved the content. For the low-complex text, we found no significant differences between conditions. For the high-complex text, oral explaining yielded better comprehension than writing explanations. The retrieval condition showed the lowest performance. Mediation analyses revealed that the effect of explaining modality while learning from the high-complex text was mediated by the personal references and the comprehensiveness of the generated explanations. Our findings suggest that the effect of explaining modality emerges when students are required to learn from difficult texts. Furthermore, they show that oral explaining is effective as, likely due to increases of social presence, it triggers distinct generative processes during explaining.
... It is commonly asserted that disciplinary textbooks are important because they are the primary resource for disciplinary and academic vocabulary learning, although students also acquire academic language from oral speech as well (e.g., Berendes et al., 2018; ..., 2017; Ravid & Tolchinsky, 2002). A classic finding in reading research is that students can learn new word meanings from reading texts, including school textbooks (e.g., Cunningham & Stanovich, 1991, 1998; Nagy & Anderson, 1984). ...
... To our knowledge, studies of textbook academic vocabulary shifts across elementary grades are rare. However, it is well established that texts in general become more challenging as grades increase (e.g., Berendes et al., 2018), and some researchers have revealed specific linguistic features that change in nature and volume with the rise of grades. For instance, in one study, as grades rose from first through third grade, word decodability became more complex, and word meanings were more challenging, while repetition of words and phrases decreased (Fitzgerald et al., 2015). ...
... We first provide an overview of the statistical predictor classification model building process. The predictor classification model building process (Bishop, 2006; Breiman, 2001) has been commonly used in prior research including research involving production of general academic word lists (e.g., Berendes et al., 2018; Coxhead, 2000; Fitzgerald, Elmore et al., 2020; Fitzgerald, Relyea et al., 2021; Gardner & Davies, 2014). ...
Article
Full-text available
The purpose of the study was to assess the volume of academic vocabulary in elementary grades disciplinary textbooks. Academic vocabulary was examined in a corpus of best-selling elementary grades textbooks in three disciplinary areas-science, mathematics, and social studies. Academic words in texts were determined through automated procedures involving statistical modeling. Four academic vocabulary variables were created: Total Academic Words; Discipline-Match Academic Words (science domain-specific academic words in science textbooks, and so on); High Challenge Total Academic Words; and High Challenge Discipline-Match Academic Words. Longitudinal multi-level Poisson regression was conducted for selected research issues. Main conclusions were: (a) The estimated overall elementary grades volume of academic vocabulary in disciplinary textbooks was relatively high. Summed across all grades and disciplines, 31% of all of the estimated unique word types in the textbooks were academic word types. By the end of elementary school, children who read or listened to disciplinary textbooks like the ones in the present study corpus would have been exposed to approximately one academic word type for every three unique word types encountered. (b) Moreover, approximately one or two of every four or five academic word types was estimated to be a word that would present challenge to typically developing and struggling students. (c) For all three disciplines, with minor exception, the estimated volume of newly-appearing academic words in a grade increased through the earliest grades, tended to peak in third or fourth grade, and then decelerated slightly thereafter.
Preprint
In this experiment, we examined whether text difficulty moderates the effect of the modality of explaining on students’ learning. Students (N = 115) read a high-difficulty and a low-difficulty text. Additionally, students generated either a written or an oral explanation. A control group of students retrieved the content. For the low-difficulty text, we found no significant differences between conditions. For the high-difficulty text, however, oral explaining yielded better comprehension than writing explanations. The retrieval condition showed the lowest performance. Mediation analyses revealed that the effect of explaining modality was mediated by the number of personal references and the comprehensiveness of the generated explanations. Our findings suggest that the effect of explaining modality emerges when students are required to learn from difficult text materials. Furthermore, the findings show that oral explaining is effective, as it triggers distinct generative processes due to increased social presence during explaining.
... School textbooks are replete with various features of the academic register. For instance, information in school textbooks is often presented in an abstract way (Achugar & Schleppegrell, 2005; Berendes et al., 2018). As early as elementary school, textbooks in mathematics, science, and social studies expose children to relatively high amounts of academic vocabulary (Fitzgerald et al., 2020, 2022). ...
... As early as elementary school, textbooks in mathematics, science, and social studies expose children to relatively high amounts of academic vocabulary (Fitzgerald et al., 2020, 2022). Moreover, coherence in textbooks is often achieved through the use of connectives (e.g., perhaps, consequently, even; Rodgers, 1974) while extensive explanations are circumvented through the use of short, albeit complex, noun phrases (Berendes et al., 2018). ...
Article
Full-text available
The present study investigates the incremental validity of the traditional books-at-home measure and selected extensions (i.e., number of children’s books and number of ebooks) for explaining students’ academic achievement as measured by their academic language comprehension. Using multiple linear regressions, we additionally explore the role of the source of information (i.e., whether information is given by parents or children). Based on cross-sectional data of a German sample of 2353 elementary school children from Grades 2 through 4, we found that parents’ information on the number of books and children’s books contributed to students’ academic language comprehension over and above parental occupation and education. Children’s information on the number of books did not further increase the amount of explained variance, and the effects were smaller than those for parents’ information. Yet, when investigated separately, both parents’ and children’s information on the number of books and children’s books at home predicted students’ academic language comprehension and mediated the relationship between more distal structural features of socioeconomic status (i.e., parents’ occupational status and education) and the outcome variable. No effect emerged for the number of ebooks. Our findings point to the robustness of the traditional books-at-home measure when used in parent questionnaires.
... Textbooks: Textbooks have been a common source of training data for ARA research, where available, for several languages such as English (Heilman et al., 2007), Japanese (Sato et al., 2008), German (Berendes et al., 2018), Swedish (Pilán et al., 2016), French (François and Fairon, 2012) and Bangla (Islam et al., 2012), to name a few. They are considered to be naturally suited for ARA research as one would expect the linguistic characteristics of texts to become more complex as school grade increases. ...
... Apart from these, François (2014) conducted a qualitative and quantitative analysis of a French as Foreign Language textbook corpus and concluded that there is a lack of consistent correlation among expert ratings, and that the texts assigned at the same level by the expert annotators showed significant differences in terms of lexical and syntactic features. Berendes et al. (2018) reached similar conclusions using a multidimensional corpus of graded German textbooks covering two school tracks and four publishers. While there are a few user studies aiming to study the relationship between readability annotations and reader comprehension (Crossley et al., 2014;Vajjala and Lucic, 2019), conclusions have been mixed. ...
Preprint
Full-text available
Readability assessment is the task of evaluating the reading difficulty of a given piece of text. Although research on computational approaches to readability assessment is now two decades old, there is not much work on synthesizing this research. This article is a brief survey of contemporary research on developing computational models for readability assessment. We identify the common approaches, discuss their shortcomings, and identify some challenges for the future. Where possible, we also connect computational research with insights from related work in other disciplines such as education and psychology.
... As a valuable complement to experimental research, this research has the potential to advance our current understanding of (both first and second) language learning and development (Rebuschat et al., 2017; Ellis, 2019). Important steps have been made in this direction through both language input and language output perspectives: Regarding the former, a number of studies have examined whether and to what extent learning materials show an adequate level of linguistic complexity considered to be of crucial importance for successful learning outcomes (see, e.g., François and Fairon, 2012; Pilán et al., 2016; Xia et al., 2019; Chen and Meurers, 2018; Berendes et al., 2018). For example, Berendes et al. (2018) employ a text classification approach to examine whether and to what extent the reading complexity of school textbooks differs systematically across grade levels in line with the so-called 'systematic complexification assumption'. ...
... Important steps have been made in this direction through both language input and language output perspectives: Regarding the former, a number of studies have examined whether and to what extent learning materials show an adequate level of linguistic complexity considered to be of crucial importance for successful learning outcomes (see, e.g., François and Fairon, 2012; Pilán et al., 2016; Xia et al., 2019; Chen and Meurers, 2018; Berendes et al., 2018). For example, Berendes et al. (2018) employ a text classification approach to examine whether and to what extent the reading complexity of school textbooks differs systematically across grade levels in line with the so-called 'systematic complexification assumption'. They build text classification models using a Sequential Minimal Optimization (SMO) algorithm trained on a wide range of lexical, syntactic, morphological, and cohesion-related features to predict the grade level (fifth to tenth grade) and school track (high vs. low). ...
Conference Paper
Full-text available
In this paper we employ a novel approach to advancing our understanding of the development of writing in English and German children across school grades using classification tasks. The data used come from two recently compiled corpora: The English data come from the GiC corpus (983 school children in second-, sixth-, ninth- and eleventh-grade) and the German data are from the FD-LEX corpus (930 school children in fifth- and ninth-grade). The key to this paper is the combined use of what we refer to as ‘complexity contours’, i.e. series of measurements that capture the progression of linguistic complexity within a text, and Recurrent Neural Network (RNN) classifiers that adequately capture the sequential information in those contours. Our experiments demonstrate that RNN classifiers trained on complexity contours achieve higher classification accuracy than one trained on text-average complexity scores. In a second step, we determine the relative importance of the features from four distinct categories through a Sensitivity-Based Pruning approach.
... However, compared to oral discourse, written texts provide more opportunity for learning previously unknown word meanings (e.g., Corson, 1995; Elleman et al., 2017; Hayes & Ahrens, 1988; McKeown & Beck, 2011; Snow & Uccelli, 2009). Moreover, core disciplinary textbooks are a primary source for disciplinary and domain-specific vocabulary learning (e.g., Adams, 2010-2011; Berendes et al., 2018), even in the earliest years of schooling (Friend, Smolak, Liu, Poulin-Dubois, & Zesiger, 2018; Palincsar & Duke, 2004) where informational texts in general have been shown to be scarce (Duke, 2000). Disciplinary textbooks are special resources because they ...
... All domain-specific academic words used in analyses were determined computationally using classification modeling based on word usage patterns (cf. Berendes et al., 2018; Gardner & Davies, 2014). The process involved two phases. ...
Article
Full-text available
Academic vocabulary networks were examined in three elementary grades textbook programs (first through fifth grade) in three domains—science, mathematics, and social studies. Within each program, a given network consisted of a focal domain-specific academic word and the collection of words from all grades that overlapped in meaning with the focal word. Academic words and the networks for each domain-specific academic word were computationally determined. For each network, the time at which the first network word appeared represented the inception of the network. The network’s expansion across the grades was tracked as additional words appeared. The main research questions were: Did network growth patterns across grades vary according to domain; and within domain, did network patterns vary according to two focal domain-specific academic word characteristics. The two focal word characteristics were the age at which the focal word’s meaning was typically known and the timing of the introduction of the focal word. Multilevel growth model analyses were conducted. The dependent variable was number of words, called nodes, in a network at each time point across the grades. Main conclusions were the following. 1. Network patterns varied across domains, with social studies networks most different from those in the other two domains. 2. Network growth pattern varied according to selected focal word characteristics, and the focal word characteristic effect varied by domain.
... In the context of language and geography instruction, in addition to the textbook analyses of argumentation tasks, a study that examines the text difficulty of geography textbooks from Grades 5 to 10 is also relevant. After analyzing 2,928 texts, the study by Berendes et al. (2018) found only partial evidence of a systematic increase in text complexity across grade levels and school types. In addition, marked differences between publishers were apparent (cf. Berendes et al. 2018, p. 525). ...
... 3.3.1). A critical examination of the demands placed on textbook texts and, in turn, of the demands that result from textbook texts is nevertheless urgently needed, as the findings of Berendes et al. (2018) and Härtig et al. (2019) suggest (cf. Sect. ...
Book
Full-text available
This open-access book addresses two desiderata: developing design criteria for language-sensitive subject teaching, using geography instruction as an example, and gaining insights into its effectiveness with regard to the target variables of students' subject knowledge and subject-specific language. Relationships between subject competence and subject-specific language suggest that linguistic demands should also be addressed in content subjects in the form of language-sensitive subject teaching. Within the methodological framework of design-based research, design criteria are developed, operationalized in a teaching unit, and investigated over several design cycles. In each cycle, data are collected in a pre-post-follow-up design with an experimental and a control group. Both groups cover the same content in the same amount of time; the degree of language sensitivity is varied as the independent variable. The central finding regarding the effects of the treatment is that the experimental group benefits from language-sensitive geography instruction in acquiring subject knowledge and subject-specific language, with statistical significance and a medium effect size. The author, Santina Wey, earned her doctorate at the Chair of Geography Education at FAU Erlangen-Nürnberg on the topic of language in subject-specific teaching and learning processes. This focus has accompanied her since her teacher-training studies in German studies and geography and her work as a freelance teacher of German as a foreign language (DaF) in adult education.
... The following table can be used to interpret the readability values of both the German and the English metric (summarized from [34], [55]): [7], [29], [33], [35], [36], [49]. The interpretation of the value is thus the reverse of that of the FRES value already introduced (difficult to easy), since a low FKG value indicates easier readability. ...
... Another popular readability metric is the Gunning Fog Index, GFI for short [7], [12], [15], [17], [35], [45], [46]. Like the FKG test, this metric uses the reading/US school-year scale, with values ranging from 6 to 17 years (difficult to easy) [23]. ...
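The two grade-level metrics named in the excerpts above can be illustrated with a minimal sketch. It uses the standard English-language formulas for the Flesch-Kincaid Grade Level (FKG) and the Gunning Fog Index (GFI); the cited work may use German-adapted variants with different coefficients, so this is an assumption rather than that paper's method. The function names are hypothetical, and the word, sentence, syllable, and complex-word counts are assumed to be supplied by the caller (for example, from a tokenizer and syllable counter of one's choice).

```python
def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard English Flesch-Kincaid Grade Level from raw counts.

    Higher values indicate harder text (roughly a US school grade).
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog_index(words: int, sentences: int, complex_words: int) -> float:
    """Standard English Gunning Fog Index from raw counts.

    `complex_words` is the number of words with three or more syllables,
    counted by the caller (an assumption of this sketch).
    """
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


if __name__ == "__main__":
    # Hypothetical counts for a short passage, for illustration only.
    print(round(flesch_kincaid_grade(words=100, sentences=5, syllables=150), 2))
    print(round(gunning_fog_index(words=100, sentences=5, complex_words=10), 2))
```

Both scores rise with longer sentences and longer words, which is why a lower value corresponds to easier text on either scale.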
Preprint
Full-text available
Online privacy policies inform users about whether and how a service collects, processes, and stores personal data. Because users' doubts about online services with regard to privacy have tended to grow in the past, the privacy policy can provide information as an integral point of reference. In practice, however, consumers seem to read privacy policies rarely, which may reflect the general assumption that privacy policies are hard to read and that advanced reading skills are required to understand the text. The reading skills required are often far above the average of the actual consumers. For this reason, this work focuses on the readability of German and English privacy policies of the top 210 Android apps from the Google Play Store. To measure the readability of online texts objectively, generally accepted readability metrics are used that are applied in various areas of computer-assisted text analysis. Our results confirm that the privacy policies examined are indeed difficult to understand.
... • We show that German educational media language is successfully and broadly adapted towards their target audiences, unlike, e.g., German school textbooks (Berendes et al., 2017). ...
Presentation
Full-text available
We analyze two novel data sets of German educational media texts targeting adults and children. The analysis is based on 400 automatically extracted measures of linguistic complexity from a wide range of linguistic domains. We show that both data sets exhibit broad linguistic adaptation to the target audience, which generalizes across both data sets. Our most successful binary classification model for German readability robustly shows high accuracy between 89.4%–98.9% for both data sets. To our knowledge, this comprehensive German readability model is the first for which robust cross-corpus performance has been shown. The research also contributes resources for German readability assessment that are externally validated as successful for different target audiences: we compiled a new corpus of German news broadcast subtitles, the Tagesschau/Logo corpus, and crawled a GEO/GEOlino corpus substantially enlarging the data compiled by Hancke et al. (2012).
... The same or highly similar procedures have been used in other language-related educational studies, including studies involving reading and texts used for reading instruction (e.g., Berendes et al., 2018; Wilson, Chen, Sandbank, & Hebert, 2019). ...
Article
The purpose of the present study was to examine possible shifts in the presence of academic vocabulary across the past six decades for a continually best-selling first-grade core reading program. Seven program years dating from 1962 to 2013 were examined. Four categories of academic vocabulary (science, mathematics, social studies, and general academic) were computationally determined in each program. The primary research question was: Did the volume of academic words in a program year rise with advancing years? A secondary supplementary question was: Did the propensity toward academic affinity of a program considered as a whole rise with advancing years? Two types of academic word measures were employed: (a) a word was deemed to be academic or not, and if it was academic, it was assigned to one of the four academic categories, and then academic words were counted; and (b) a novel measure, academic affinity, was a continuous measure of the probability that a word was academic (in each of the four academic vocabulary categories). Poisson regression modeling and Hierarchical Generalized Linear Modeling were conducted. The main conclusions were: Later first-grade core reading program years included a moderately higher volume of science, social studies, and total academic words as compared to earlier years. The science, social studies, and general academic affinity of the program as a whole was statistically higher in later years, but in practical terms the change was not remarkable.
... We follow the well-established second-language acquisition (SLA) tradition of analysing language performance by assessing the multidimensional construct of linguistic complexity in terms of syntactic, lexical, and discursive elaborateness, variation, and inter-relatedness, as well as of language use and human language processing. Similar measures have been used in previous research to assess the adaptation of reading demands in German geography schoolbooks to different school types and grade levels (Berendes et al., 2018). The general linguistic complexity of student answers, which relates to research on tasks in foreign-language learning (Alexopoulou et al., 2017), provides first insights into the relationship between task complexity and general linguistic complexity. ...
Article
Full-text available
The purpose of history education in Austria has changed over at least the last decade. While the focus used to be to give students a master narrative of the national past based on positivist knowledge, the current objective of history education is to foster historical thinking processes that enable students to form transferable skills in the self-reflected handling and creation of history. A key factor in fostering historical thinking is the appropriation of learning tasks. This case study measures the complexity of learning tasks in Austrian history textbooks as one important aspect of their quality. It makes use of three different approaches to complexity to triangulate the notion: general task complexity (GTC), general linguistic complexity (GLC), and domain-specific task complexity (DTC). The question is which findings can be offered by the specific strengths and limitations of the different methodological approaches to give new insights into the study of task complexity in the domain of history education research. By pursuing multidisciplinary approaches in a triangulating way, the case study opens up new prospects for this field. Besides offering new insights on measuring the complexity of learning tasks, the study illustrates the need for further research in this field – not only related to the development of analytical frameworks, but also regarding the notion of complexity in the context of historical learning itself.
... Syntactic complexity is generally construed as the variety and degree of sophistication of the syntactic structures that are present in a text (Housen & Kuiken, 2009; Lu, 2011; Ortega, 2003; Pallotti, 2009). As part of the larger construct of linguistic complexity (Bulté & Housen, 2014), it has been considered to be a critical component in assessing the readability or comprehensibility of original and adapted reading texts for both first and second language (L2) readers (Berendes et al., 2018; Crossley et al., 2007; Frantz et al., 2015; Gamson et al., 2013; Graesser et al., 2011; Stevens et al., 2015), as well as a useful index of language proficiency (e.g., Bulté & Housen, 2014; Lu, 2011; Norris & Ortega, 2009), language development (e.g., Crossley & McNamara, 2014; Lu, 2009; Yoon & Polio, 2017), and the quality of language production (e.g., Biber, Gray, & Staples, 2016; Kyle & Crossley, 2018; Lu, 2017; Yang, Lu, & Weigle, 2015). ...
Article
An extensive body of research has investigated the role of syntactic complexity in gauging the linguistic complexity of reading texts, particularly for the purpose of determining their grade appropriateness. However, little such research has focused on adapted teaching materials for English as a foreign language (EFL) contexts, and to date there has been no systematic effort in establishing syntactic complexity benchmarks to guide text adaptation practices in such contexts. This paper reports on a large‐scale study that assessed the quantitative differences in syntactic complexity among adapted teaching materials for different grade levels in the EFL curricula in China. Our data consisted of 3,368 adapted English texts solicited from a corpus of teaching materials approved for use in the 12 primary and secondary grade levels in China by the Chinese Ministry of Education. All texts were analyzed using 8 syntactic complexity measures representing different dimensions of syntactic complexity. All 8 measures showed significant between‐level differences with moderate to large effect sizes and nonuniform patterns of progression, and 5 measures were identified as significant predictors of grade levels in a logistic regression analysis. The implications of our results for establishing syntactic complexity benchmarks to inform future text adaptation practices are discussed.
... In order to produce a language test that appropriately measures the academic language needs of the students, a detailed and precise analysis of the construct needs to be carried out (Butler et al., 2004). As stated by Berendes et al. (2018), academic language is the language used to transfer and acquire knowledge whether in spoken academic settings or in school textbooks. Ryan (2002) also suggests consulting stakeholders to decide about the assessment interpretation or test use to assure that the test is measuring what the students know and need. ...
Article
Full-text available
To investigate the congruence between the requisite post-graduate academic language skills and the language skills measured by the General English section of the Iranian National PhD Entrance exam, field-specialist informants, language-specialist informants and post-graduate students were questioned. The informants’ data were collected through interviews and the students’ data were obtained through a language skills’ questionnaire. The informants and students’ data were analyzed through content analysis and frequency analysis, respectively. The informants acknowledged that all four language skills were crucial for academic success. Considering congruity, both groups of informants asserted that there was little congruity between the language skills measured by the exam and those of the academic context. Post-graduate students believed that the reading section of the exam did not match their academic needs; they also believed that a writing section should be added and that a listening section need not be included in the exam. The findings have some implications for a change in the curriculum preceding the exam.
... Constituency trees can be used to derive different syntactic features. Features related to clausal and phrasal elaboration include the number of relative clauses per clause, the number of noun modifiers per noun, the length of noun phrases, the number of non-terminal nodes, or the number of connectors (Berendes et al., 2018). ...
Preprint
Full-text available
Beginning readers benefit in most situations from reading activities that are neither too difficult nor too easy. This study investigated which text features make reading comprehension difficult for third and fourth grade elementary school students. Specifically, 145 multiple-choice items from a reading comprehension test used in several cross-sectional studies (G3: N = 1387; G4: N = 868) and a longitudinal sub-study (N = 195) were analyzed using explanatory item response models to explain item difficulty and changes in item difficulty across grades. A multi-step feature selection procedure controlling for seven task features led to the selection of eight text features from a total of 268 linguistic text features examined. The results showed that lexical and syntactic features and text genre were the most relevant features and that the importance of specific text features changes from third to fourth grade. Expository texts were more difficult on average than narrative texts. This difference was only partially explained by lexical and syntactic features in third grade, but almost completely in fourth grade. The results suggest that text features have a dynamic effect on reading comprehension difficulty throughout third to fourth grade; this is especially true of text genre. Our results can help to select more appropriate texts for elementary students and to improve our understanding of the complex interaction between reader, text and activity as it develops over time.
... In order to succeed across their educational careers, students must acquire basic interpersonal communication skills in general and cognitive academic language proficiency more specifically (Berendes, Dragon, Weinert, Heppt, & Stanat, 2013). Academic language, the language used to impart and acquire knowledge, is spoken in academic settings and used in school textbooks (for textbooks, see Berendes et al., 2018). It is designed to be precise and concise in order to refer to complex processes and to express complicated ideas in contextually reduced settings (see Cummins, 2008). ...
Article
Full-text available
Silent reading is the primary mode of reading for proficient readers, and therefore, silent reading fluency is often assessed in research and practice. However, little is known about the validity of the tests administered to students with different language backgrounds. Given that academic language is assumed to be especially challenging for students with a non-German home language, one might wonder if effects of academic language proficiencies can be found on these tests. In the present study, we explored whether, owing to academic language demands, students with a non-German home language (N = 748) would be found to be at a greater disadvantage than their monolingual-home peers (N = 1669) on the most frequently used silent reading fluency test in Germany. Using differential item functioning (DIF) analyses, we found specific item difficulties to the disadvantage of the students with a non-German home language. This DIF was linked to the academic language features of the sentences.
... Some pre-graded reference texts are utilized to train the model and evaluate the classification accuracy. Some of the notable studies on this approach are Dell'Orletta, et al. [20], De Clercq, et al. [21], and Berendes, et al. [22], etc. In Vietnamese, studies based on this approach have only been carried out in recent years like those of Luong, et al. [12], Luong, et al. [13]. ...
... Although the method has proven its efficiency for not requiring any experts, the number of collected texts is limited, and the copyright must also be taken into consideration. This method has been implemented in various works, including but not limited to Sun et al. [10], Dell'Orletta et al. [6], Chen & Daowadung [7], Lee & Hasebe [22], Berendes et al. [23], Luong et al. [10,12], Diep et al. [13] etc. ...
... While teaching was more beneficial than retrieval for the high-complexity text, there were no differences among conditions with the low-complexity text. Low-complexity material in combination with tasks that require students to retrieve the previously learnt contents might already sufficiently support students in constructing a coherent representation of the text (Berendes et al., 2018; McNamara, 2013), and make subsequent generative activities, such as non-interactive teaching, obsolete. In contrast, high-complexity materials may require adding teaching activities to support students in establishing a coherent understanding of the text (see also Roelle & Nückles, 2019). ...
Article
Full-text available
Teaching the contents of study materials by providing explanations to fellow students can be a beneficial instructional activity. A learning-by-teaching effect can also occur when students provide explanations to a real, remote, or even fictitious audience that cannot be interacted with. It is unclear, however, which underlying mechanisms drive learning by non-interactive teaching effects and why several recent studies did not replicate this effect. This literature review aims to shed light on when and why learning by non-interactive teaching works. First, we review the empirical literature to comment on the different mechanisms that have been proposed to explain why learning by non-interactive teaching may be effective. Second, we discuss the available evidence regarding potential boundary conditions of the non-interactive teaching effect. We then synthesize the available empirical evidence on processes and boundary conditions to provide a preliminary theoretical model of when and why non-interactive teaching is effective. Finally, based on our model of learning by non-interactive teaching, we outline several promising directions for future research and recommendations for educational practice.
... Infrequent words are more often unknown than frequent words (Brysbaert et al., 2019). Texts are more challenging if they contain rarer words (Berendes et al., 2018; Fitzgerald et al., 2015) and word frequency calculated based on different corpora contributes to explaining text complexity (Chen & Meurers, 2018). ...
Preprint
Full-text available
The quality of tests in psychological and educational assessment is of great scholarly and public interest. Item difficulty models are vital to generating test result interpretations based on evidence. A major determining factor of item difficulty in knowledge tests is the opportunity to learn about the facts and concepts in question. Knowledge is mainly conveyed through language. Exposure to language associated with facts and concepts might be an indicator of the opportunity to learn. Thus, we hypothesize that item difficulty in knowledge tests should be related to the probability of exposure to the item content in everyday life and/or academic settings and therefore also to word frequency. Results from a study with 99 political knowledge test items administered to N = 250 German 7th (age: 11 – 14 years) and 10th (age: 15 – 18 years) graders showed that word frequencies in everyday settings (SUBTLEX-DE) explain variance in item difficulty, while word frequencies in academic settings (dlexDB) alone do not. However, both types of word frequency combined explain a considerable amount of the variance in item difficulty. Items with words that are relatively more frequent in everyday life compared to academic settings are easier, and items with words that are relatively more frequent in academic settings are more difficult. Difficult items have content that is rarely present in everyday settings and is instead more cultivated academically. Examining word frequency from different language settings can help researchers investigate test score interpretations and is a useful tool for predicting item difficulty and refining knowledge test items.
Article
Full-text available
CohViz is a feedback system that provides students with concept maps as feedback on the cohesion of their writing. Although previous studies demonstrated the effectiveness of CohViz, the accuracy of CohViz remains unclear. Thus, we conducted two comprehensive validation studies to assess the accuracy of CohViz in terms of its reliability and validity. In a reliability study, we compared the concept maps generated by CohViz with concept maps generated by four human expert raters based on a text corpus comprising students' explanatory texts (N = 100). Regarding the depiction of cohesion gaps, we obtained high accordance between the CohViz concept maps and the concept maps generated by the human expert raters. However, CohViz tended to overestimate the number of relations within the concept maps. In a validity study, we examined the validity of CohViz and compared central features of the CohViz concept maps with convergent linguistic features and divergent linguistic features based on a Wikipedia text corpus (N = 1020). We found medium to high agreement with the convergent cohesion features and low agreement with the divergent features. Together, these findings suggest that CohViz can be regarded as an accurate feedback system to provide feedback on the cohesion of students' writing.
Article
This study aims to analyze the text complexity of elementary school Korean textbooks for grades 1-6. A Korean analysis tool, Auto-Kohesion, was employed for this study based on many linguistic and psycholinguistic measures. The results showed that text complexity gradually increased across grades on the surface-level measures (e.g., the number of sentences). The increase in text complexity, however, was not reflected in most of the other measures (i.e., cohesion measures, lexical diversity, connectives, pronouns, word frequency, modifiers). The results carry some pedagogical implications for Korean textbook development to maximize Korean language learners’ learning gains.
Article
This study aims to investigate the text difficulty of the reading materials of Korean middle school English textbooks with Coh-Metrix, software developed by the Institute for Intelligent Systems at the University of Memphis to analyze the linguistic and psycholinguistic features of English text and textbooks with a wide range of indices on cohesion and language. In this study, the textbook corpus consisted of the text files extracted from 13 English textbooks. These files were used for analyzing the text difficulty among grades with Coh-Metrix. The Coh-Metrix indices selected for this study contained basic counts, word frequency, word features, lexical diversity, pronouns, connectives, readability, syntax complexity, syntax similarity, reference cohesion, semantic cohesion, and situation model measures. The results showed that there were significant differences among grades for basic counts, word features, first person pronouns, causal and temporal connectives, readability, reference and semantic cohesion, the number of words before main verbs, syntactic similarity, and situation model measures. The differences among grades, however, were not significant for word frequency, lexical diversity, second and third person pronouns, additive connectives, and NP density measures. The findings have educational implications for textbook design and language learning for English learners.
Article
This research article studies structural linguistic complexity, specifically that of descriptive-expository school texts. It investigates the linguistic factors (dimensions) that make it possible to measure complexity and classify the texts in a sample. Eighty school texts were analyzed using a set of features (grouped into two levels: morphosyntactic and semantic) and a factor analysis. The results show that the factors quantity, prototypicality, and informativeness were significant and that the factor variety was partially productive. Regarding the features, the results support the hypothesis that the analysis needs to include not only syntactic but also semantic and even pragmatic features; however, these results are provisional, since the criteria and the explanatory hypotheses given need to be rethought.
Article
In national educational standards, linguistic complexity is one of the core aspects characterizing the demands of reading material and student writing quality. In Germany, teachers are supposed to consider linguistic complexity in their assessment of language arts Abitur examinations, the high-stakes test qualifying for higher education admission in Germany. In the present study, we investigated if and how teachers consider linguistic complexity in their assessment of writing quality. To systematically identify a range of linguistic complexity aspects, we conducted computational linguistic analyses of student essays (N = 344) written during Abitur examinations in the subject German. Experienced teachers (N = 33) then regraded a subsample of the essays (n = 16) and rated their linguistic complexity. We used mixed-effects models to study the relationships between computational measures of linguistic complexity, teachers’ assessment of writing quality (language, content, and overall grade), and their linguistic complexity rating. Results confirm that teachers’ complexity rating and their assessment of the language aspect of writing quality are related to the computed linguistic complexity features, primarily measures of syntactic complexity and language use (word frequency measures). But findings also show that teachers have difficulties in separating linguistic complexity from aspects of content and language accuracy.
Article
Full-text available
It is widely known that boys, on average, have lower reading competencies than girls. With respect to the development of reading competencies, research has yet to determine whether performance differences between genders increase, decrease, or remain stable over the course of secondary school. Some studies, mainly from the United Kingdom and the United States, suggest that an increase in performance differences between boys and girls is related to the development of students from families with low socioeconomic status. Moreover, students' immigration background and the school track have been discussed as a moderator. In the present study, the aforementioned research questions were addressed with data from 2,505 students from Germany. Using data collected at four time points (Grades 5-8), we applied latent growth curve modeling to analyze the competence areas reading speed and reading comprehension. The results showed a fan-spread effect that illustrated a disadvantage for boys in reading speed and comprehension. No fan-spread effects of reading performance growth occurred in relation to socioeconomic status or immigration background. Furthermore, the analyses showed that the gender-related fan-spread effects were not moderated by socioeconomic status or immigration background. The school track was not a significant moderator of the gender effects.
Article
We investigate the linguistic complexity of oral classroom interactions in late primary and early secondary school across German school types. The goal is to explore whether teachers and students align in terms of their use of the academic language register. We empirically base this investigation on transcriptions of teacher and student contributions during content matter lessons on the vaporisation and condensation of water. Across school types and grade levels, we compare the extent to which teachers offer language that is adaptively rich in linguistic constructs commonly associated with academic language, such as deagentivation, nominal style, and cohesive devices. Putting this in relation to the developing academic language competence of the students, we then compare the language offered by the teachers to the use of these academic language constructs in the students’ spoken language contributions. We discuss the methodological challenges arising from analyzing oral classroom interactions and from applying automatic linguistic complexity analyses to such data.
Conference Paper
Full-text available
We analyze two novel data sets of German educational media texts targeting adults and children. The analysis is based on 400 automatically extracted measures of linguistic complexity from a wide range of linguistic domains. We show that both data sets exhibit broad linguistic adaptation to the target audience, which generalizes across both data sets. Our most successful binary classification model for German readability robustly shows high accuracy between 89.4%–98.9% for both data sets. To our knowledge, this comprehensive German readability model is the first for which robust cross-corpus performance has been shown. The research also contributes resources for German readability assessment that are externally validated as successful for different target audiences: we compiled a new corpus of German news broadcast subtitles, the Tagesschau/Logo corpus, and crawled a GEO/GEOlino corpus substantially enlarging the data compiled by Hancke et al. (2012).
Article
Recent years have witnessed a growing interest in the relationship between academic language registers and school success in the German-speaking education system. However, we still know very little about the actual effects that academic language has on the academic performance of students, for instance, to what extent the use of academic language in subject tasks actually makes these tasks more difficult. It is therefore vital that any operationalization of difficulty-inducing linguistic features of tasks is made on solid theoretical and empirical grounds. The purpose of this article is thus to present the linguistic foundation used in an interdisciplinary empirical study in which 1,346 7th and 8th graders solved a set of subject-oriented tasks from Maths, Physics, German, PE and Music, while the degree of linguistic demands in the tasks was systematically varied. First, the theoretical and empirical research on linguistic difficulty from a range of research discourses is discussed. The findings are merged into a model of linguistic demands. Its operationalization is then illustrated in three linguistically varied versions of the subject-specific tasks. Finally, an outlook on preliminary results of the empirical study is given, which indicate that the categories used in the model actually do produce differences in subject-task difficulty, even though there are a number of effects that need further investigation.
Article
Full-text available
The purpose of the research is to identify the influence of an educational text on the organisation of information activities of schoolchildren. Methods: content analysis, structural and semantic analysis of educational texts and the results of their interpretation. Research results: identification of features of the educational text as an information structure (the integrity of multi-layer content, reproducibility, presence of system connections with the thematic information field), description of the information activity of a complex of educational texts united by the function of updating the value attitude to knowledge, which can be represented in three main types: impersonal, authorised and personal. Each of these types is correlated with the level of perception of information by the student-reader, which can be a. detached and impersonal, b. in conscious collaboration with the author, c. to their own cognitive experience. As a result of the research, the basic information-significant structural components of educational texts are also identified: the title and the beginning correlated with the key meaning of the text, the means of axiological field and dialogisation.
Article
Full-text available
This study of English as a second language (ESL) reading textbooks investigates cohesion in reading passages from 27 textbooks. The guiding research questions were whether and how cohesion differs across textbooks written for beginning, intermediate, and advanced second language readers. Using a computational tool called Coh-Metrix, textual features were compared across the three levels using Multivariate Analysis of Variance (MANOVA). The results indicated that some features of cohesion yielded significant variation, but with small effect sizes. The majority of cohesion features considered were not different across the textbook levels. Larger effect sizes were found with factors like length, readability and lexical or syntactic complexity.
Article
Full-text available
The Common Core State Standards represent the first standards document to address whether students are able to read progressively more complex texts as they progress across the grades. This article gives an overview of the three components of the model of text complexity that were identified in Appendix A of the Standards and also were the basis for the selection of manuscripts for this special issue: (a) qualitative, (b) quantitative, and (c) reader-task considerations. This introduction gives an overview of the three components and the contributions of the articles in this special issue to extending understanding about these three components.
Article
Full-text available
The Common Core State Standards for English Language Arts have prompted enormous attention to issues of text complexity. The purpose of this article is to put text complexity in perspective by moving from a primary focus on the text itself to a focus on the comprehension of complex text. We argue that a focus on comprehension is at the heart of the Common Core Standards for ELA and that characteristics of the text represent only one of several factors that influence comprehension. Using both theoretical and empirical sources, we highlight the relationship between texts and tasks. We propose a Text-Task Scenario framework in which the simultaneous consideration of text and task results in a more nuanced and more instructionally responsive estimate of the comprehension of complex text.
Article
Full-text available
The CCSS framework indicates that more difficult texts are to be used with students. However, the rationale for increasing text difficulty, namely that text difficulty has been decreasing, is unsupported by research showing that texts have been increasing in difficulty for at least 50 years. Oral reading accuracy is a traditional method of estimating text difficulty. For 70 years, oral reading accuracy of at least 95% has been the accepted standard. The available research suggests that this traditional level of accuracy is supported by the evidence as optimal for developing reading proficiency.
Article
Full-text available
The Common Core Standards call for students to be exposed to a much greater level of text complexity than has been the norm in schools for the past forty years. Textbook publishers, teachers, and assessment developers are being asked to refocus materials and methods to ensure that students are challenged to read texts at steadily increasing complexity levels as they progress through school so that all students remain on track to achieve college and career readiness by the end of 12th grade. Although automated text analysis tools have been proposed as one method for helping educators achieve this goal, research suggests that existing tools are subject to three limitations: inadequate construct coverage; overly narrow criterion variables; and inappropriate treatment of genre effects. Modeling approaches developed to address these limitations are described. Recommended approaches are incorporated into a new text analysis system called SourceRater. Validity analyses implemented on an independent sample of texts suggest that, compared to existing approaches, SourceRater's estimates of text complexity are more reflective of the complexity classifications given in the new Standards. Implications for the development of learning progressions designed to help educators organize curriculum, instruction and assessment in reading are discussed.
Article
Full-text available
Largely due to technological advances, methods for analyzing readability have increased significantly in recent years. While past researchers designed hundreds of formulas to estimate the difficulty of texts for readers, controversy has surrounded their use for decades, with criticism stemming largely from their application in creating new texts as well as their utilization of surface-level indicators as proxies for complex cognitive processes that take place when reading a text. This review focuses on examining developments in the field of readability during the past two decades with the goal of informing both current and future research and providing recommendations for present use. The fields of education, linguistics, cognitive science, psychology, discourse processing, and computer science have all made recent strides in developing new methods for predicting the difficulty of texts for various populations. However, there is a need for further development of these methods if they are to become widely available.
Article
Matching reading materials to learners with the appropriate level of proficiency has been the focus of attention for many scholars. To this end, readability formulas have been developed. Despite being efficient and user friendly, the formulas have not been able to stand up to the test of research, thus undergoing some criticism on the grounds that they are not sensitive to the modification in the factors they are based on. Furthermore, they fail to consider other factors which play roles in the comprehension of written materials. Some scholars, based on such criticisms, have noticed the absence of some factors in readability formulas. Some of these factors are cultural origin, structure of theme and core/non-core words, and conjunctions. The present study constitutes an attempt to investigate the relationship between readability of written materials and the learners' performance at two proficiency levels, intermediate and advanced, the relationship between cohesion markers (grammatical markers, conjunctions and lexical markers) and the readability of written materials, and also the relationship between these cohesion markers and the performance of learners of English as a foreign language at the two aforementioned proficiency levels. To calculate the readability of the material, two prominent readability formulas, Flesch and Fog Index, were employed. The results indicated a significant correlation between the readability of passages and the learners' performance at both levels. Only grammatical cohesion markers were shown to be significantly correlated with the readability of the written materials. The learners' performance correlated significantly with grammatical cohesion markers at intermediate level and with lexical cohesion markers at advanced level.
Article
Our aim in the present paper is to discuss a “cognitive view” of reading comprehension, with particular attention to research findings that have the potential to improve our understanding of difficulties in reading comprehension. We provide an overview of how specific sources of difficulties in inference making, executive functions, and attention allocation influence reading comprehension processes and outcomes and may lead to reading comprehension problems. Finally, we discuss how the consideration of these potential sources of difficulty has practical implications for the design and selection of instructional materials.
Article
The widely adopted Common Core State Standards (CCSS) call for raising the level of text complexity in textbooks and reading materials used by students across all grade levels in the United States; the authors of the English Language Arts component of the CCSS build their case for higher complexity in part upon a research base they say shows a steady decline in the difficulty of student reading textbooks over the past half century. In this interdisciplinary study, we offer our own independent analysis of third- and sixth-grade reading textbooks used throughout the past century. Our data set consists of books from 117 textbook series issued by 30 publishers between 1905 and 2004, resulting in a linguistic corpus of roughly 10 million words. Contrary to previous reports, we find that text complexity has either risen or stabilized over the past half century; these findings have significant implications for the justification of the CCSS as well as for our understanding of a “decline” within American schooling more generally.
Article
The Common Core State Standards (CCSS) set a controversial aspirational, quantitative trajectory for text complexity exposure for readers throughout the grades, aiming for all high school graduates to be able to independently read complex college and workplace texts. However, the trajectory standard is presented without reference to how the grade-by-grade complexity ranges were determined or rationalized, and little guidance is provided for educators to know how to apply the flexible quantitative text exposure standard in their local contexts. We extend and elaborate the CCSS presentation and discussion, proposing that decisions about shifting quantitative text complexity levels in schools require more than implementation of a single, static standard. A rigorous two-part analytical strategy for decision making surrounding the quantitative trajectory standard is proposed, a strategy that can be used by state policy makers, district officials, and educators in general. First, borrowing methods from student growth modeling, we illustrate an analytical method for creation of multiple trajectories that can lead to the CCSS end-of-high-school target for text complexity exposure, resulting in trajectories that place greater burden for shifting text complexity levels on students in different grades. Second, we submit that knowledge of the multiple possibilities, in conjunction with a set of guiding principles for decision making, can support educators and policy makers in critiquing and using the CCSS quantitative standard for text complexity exposure to establish particular expectations for quantitative text complexity exposure for particular students in situ.
Article
Computer analyses of text characteristics are often used by reading teachers, researchers, and policy makers when selecting texts for students. The authors of this article identify components of language, discourse, and cognition that underlie traditional automated metrics of text difficulty and their new Coh-Metrix system. Coh-Metrix analyzes texts on multiple measures of language and discourse that are aligned with multilevel theoretical frameworks of comprehension. The authors discuss five major factors that account for most of the variance in texts across grade levels and text categories: word concreteness, syntactic simplicity, referential cohesion, causal cohesion, and narrativity. They consider the importance of both quantitative and qualitative characteristics of texts for assigning the right text to the right student at the right time.
Article
Developing academic, or school-based, literacy poses a significant challenge for many students, because the language through which academic subjects are presented is markedly different from the social language that students use in everyday ordinary life. This article focuses on one aspect of academic language, the functions of nouns and nominal structures in constructing knowledge in different subject areas and the challenges they present for comprehension of academic texts. Using a functional linguistics framework and analyzing written texts from language arts, science, and history, at elementary and secondary levels, we illustrate the ways nominal expressions expand the amount of information in a clause, establish and maintain reference, and enable information to be distilled and further expanded. We also show how the semantic features of the nominal elements vary in different academic registers, as texts introduce grammatical “participants” of different types according to the purposes of the text. We suggest that the notion of linguistic register offers a means of transcending debates about academic language, enabling a pedagogy that can raise students' consciousness about specific grammatical resources and how those resources function to construct knowledge in the language of schooling.
Chapter
Starting from a view of language as a combinatorial and hierarchically organized system, we assumed that a high syllable complexity favours a high number of syllable types, which in turn favours a high number of monosyllables. Relevant crosslinguistic correlations based on Menzerath's (1954) data on monosyllables in 8 languages turned out to be statistically significant. A further attempt was made to conceptualise "semantic complexity" and to relate it to complexity in phonology, word formation, and word order. In English, for instance, the tendency to phonological complexity and monosyllabism is associated with a tendency to homonymy and polysemy, to rigid word order and idiomatic speech. The results are explained by complexity trade-offs between rather than within the subsystems of language.
Article
Two experiments were conducted to examine the on-line processing mechanisms used by young children to comprehend pronouns. The work focuses on their use of two highly relevant sources of information: (1) the gender and number features carried by English pronouns, and (2) the differing accessibility of discourse entities, as influenced by order-of-mention in a clause. Adults use both evidential sources, as early as 200 ms after the offset of the pronoun (Arnold, Eisenband, Brown-Schmidt, & Trueswell, 2000). We find that like adults, 3–5-year-old children use a pronoun's gender to guide their choice of a referent, and that they use it rapidly on-line. But unlike adults, they show little or no signs of a first-mentioned bias, either off-line or on-line. This is consistent with a tendency for children to initially recruit reliable sources of constraint for language comprehension – in this case, the gender of the pronoun.
Article
We tested the effects of word length, frequency, and predictability on inspection durations (first fixation, single fixation, gaze duration, and reading time) and inspection probabilities during first‐pass reading (skipped, once, twice) for a corpus of 144 German sentences (1138 words) and a subset of 144 target words uncorrelated in length and frequency, read by 33 young and 32 older adults. For corpus words, length and frequency were reliably related to inspection durations and probabilities, predictability only to inspection probabilities. For first‐pass reading of target words all three effects were reliable for inspection durations and probabilities. Low predictability was strongly related to second‐pass reading. Older adults read slower than young adults and had a higher frequency of regressive movements. The data are to serve as a benchmark for computational models of eye movement control in reading.
Article
Readability research has a long and rich tradition, but there has been too little focus on general readability prediction without targeting a specific audience or text genre. Moreover, although NLP-inspired research has focused on adding more complex readability features, there is still no consensus on which features contribute most to the prediction. In this article, we investigate in close detail the feasibility of constructing a readability prediction system for English and Dutch generic text using supervised machine learning. Based on readability assessments by both experts and crowdsourcing, we implement different types of text characteristics ranging from easy-to-compute superficial text characteristics to features requiring deep linguistic processing, resulting in ten different feature groups. Both a regression and classification set-up are investigated reflecting the two possible readability prediction tasks: scoring individual texts or comparing two texts. We show that going beyond correlation calculations for readability optimization using a wrapper-based genetic algorithm optimization approach is a promising task that provides considerable insights in which feature combinations contribute to the overall readability prediction. Because we also have gold standard information available for those features requiring deep processing, we are able to investigate the true upper bound of our Dutch system. Interestingly, we will observe that the performance of our fully automatic readability prediction pipeline is on par with the pipeline using gold-standard deep syntactic and semantic information.
Article
The authors describe a book leveling system developed to support emergent literacy in one Canadian school district.
Book
Originally published in 1974, this volume presents empirical and theoretical investigations of the role of meaning in psychological processes. A theory is proposed for the representation of the meaning of texts, employing ordered lists of propositions. The author explores the adequacy of this representation, with respect to the demands made upon such formulations by logicians and linguists. A sufficiently large number of problems are encompassed by the propositional theory to justify its use in psychological research into memory and language comprehension. A number of different experiments are reported on a wide variety of topics, and these test central portions of this theory, and any that purports to deal with how humans represent meaning. Among the topics discussed are the role of lexical decomposition in comprehension and memory, propositions as the units of recall, and the effects of the number of propositions in a text base upon reading rate and recall. New problems are explored, such as inferential processes during reading, differences in levels of memory for text, and retrieval speed for textual information. On the other hand, a study of retrieval from semantic memory focusses on a problem of much current research. The final review chapter relates the present work to other current research in the area at the time.
Article
Teaching Academic ESL Writing: Practical Techniques in Vocabulary and Grammar fills an important gap in teacher professional preparation by focusing on the grammatical and lexical features that are essential for all ESL writing teachers and student-writers to know. The fundamental assumption is that before students of English for academic purposes can begin to successfully produce academic writing, they must have the foundations of language in place--the language tools (grammar and vocabulary) they need to build a text. This text offers a compendium of techniques for teaching writing, grammar, and lexis to second-language learners that will help teachers effectively target specific problem areas of students' writing. Based on the findings of current research, including a large-scale study of close to 1,500 non-native speakers' essays, this book works with several sets of simple rules that collectively can make a noticeable and important difference in the quality of ESL students' writing. The teaching strategies and techniques are based on a highly practical principle for efficiently and successfully maximizing learners' language gains. Part I provides the background for the text and a sample of course curriculum guidelines to meet the learning needs of second-language teachers of writing and second-language writers. Parts II and III include the key elements of classroom teaching: what to teach and why, possible ways to teach the material in the classroom, common errors found in student prose and ways to teach students to avoid them, teaching activities and suggestions, and questions for discussion in a teacher-training course. Appendices to chapters provide supplementary word and phrase lists, collocations, sentence chunks, and diagrams that teachers can use as needed. The book is designed as a text for courses that prepare teachers to work with post-secondary EAP students and as a professional resource for teachers of students in EAP courses. 
© 2004 by Lawrence Erlbaum Associates, Inc. All rights reserved.
Article
Coh-Metrix analyzes texts on multiple measures of language and discourse that are aligned with multilevel theoretical frameworks of comprehension. Dozens of measures funnel into five major factors that systematically vary as a function of types of texts (e.g., narrative vs. informational) and grade level: narrativity, syntactic simplicity, word concreteness, referential cohesion, and deep (causal) cohesion. Texts are automatically scaled on these five factors with Coh-Metrix-TEA (Text Easability Assessor). This article reviews how these five factors account for text variations and reports analyses that augment Coh-Metrix in two ways. First, there is a composite measure called formality, which increases with low narrativity, syntactic complexity, word abstractness, and high cohesion. Second, the words are analyzed with Linguistic Inquiry and Word Count, an automated system that measures words in texts on dozens of psychological attributes. One next step in automated text analyses is a topics analysis that scales the difficulty of conceptual topics.
Article
The present study was designed to analyze by quantitative methods a corpus of writing produced by four levels of American college students and by one group of professional German writers. Analysis was undertaken to (a) determine whether or not significant quantitative differences in the use of selected syntactic structures exist between the five groups; and (b) test the validity of the Hunt method of measuring syntactic maturity when applied to the writing of second language learners and native Germans. Basically, the Hunt method measures syntactic acquisition by quantifying the rate of occurrence of sentence-embedding transformation in writing samples. The findings indicate that developmental stages in the acquisition of written German syntax did exist in this study and that these stages were most clearly definable between every other level. Hunt's method of measuring syntactic maturity was successfully applied to measuring second language acquisition. In addition, some comparisons were made between this study and other first language acquisition studies and suggestions for further research were given.
Article
This research investigated the cognitive demands of reading curricula from 1910 to 2000. We considered both the nature of the text used and the comprehension tasks asked of students in determining the cognitive demands of the curricula. Contrary to the common assumption of a trend of simplification of the texts and comprehension tasks in third- and sixth-grade curricula, the results indicate that curricular complexity declined early in the century and leveled off over the middle decades but has notably increased since the 1970s, particularly for the third-grade curricula.
Article
Assessing text readability is a time-honored problem that has even more relevance in today’s information-rich world. This article provides background on how readability of texts is assessed automatically, reviews the current state-of-the-art algorithms in automatic modeling and predicting the reading difficulty of texts, and proposes new challenges and opportunities for future exploration not well-covered by current computational research.
Article
Readability formulas, such as the Flesch Reading Ease formula, the Flesch-Kincaid Grade Level Index, the Gunning Fog Index, and the Dale-Chall formula are often considered to be objective measures of language complexity. Not surprisingly, survey researchers have frequently used readability scores as indicators of question difficulty and it has been repeatedly suggested that the formulas be applied during the questionnaire design phase, to identify problematic items and to assist survey designers in revising flawed questions. At the same time, the formulas have faced severe criticism among reading researchers, particularly because they are predominantly based on only two variables (word length/frequency and sentence length) that may not be appropriate predictors of language difficulty. The present study examines whether the four readability formulas named above correctly identify problematic survey questions. Readability scores were calculated for 71 question pairs, each of which included a problematic (e.g., syntactically complex, vague, etc.) and an improved version of the question. The question pairs came from two sources: (1) existing literature on questionnaire design and (2) the Q-BANK database. The analyses revealed that the readability formulas often favored the problematic over the improved version. On average, the success rate of the formulas in identifying the difficult questions was below 50 percent and agreement between the various formulas varied considerably. Reasons for this poor performance, as well as implications for the use of readability formulas during questionnaire design and testing, are discussed.
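To make concrete why such formulas rest on only two surface variables, the following is a minimal Python sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level equations. It is not the procedure used in the study above. The helper names (`count_syllables`, `text_stats`) are hypothetical, and the vowel-group syllable counter is a crude assumption for English that a real tool would replace with a pronunciation dictionary.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_stats(text):
    """Return (sentence count, word count, syllable count) for an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text):
    """Higher scores mean easier text; only sentence length and word length enter."""
    n_sent, n_words, n_syll = text_stats(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(text):
    """Same two ratios, rescaled to an approximate U.S. school grade."""
    n_sent, n_words, n_syll = text_stats(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


if __name__ == "__main__":
    easy = "The cat sat on the mat. It was happy."
    hard = "Institutional considerations necessitate comprehensive organizational restructuring."
    print(round(flesch_reading_ease(easy), 2), round(flesch_reading_ease(hard), 2))
```

Because both scores depend only on words per sentence and syllables per word, a vague or syntactically awkward question written in short words can score as "easier" than a clearer rewording, which is consistent with the criticism summarized above.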
Article
Students from immigrant families achieve considerably lower educational outcomes in the German school system than peers without a migration background (cf., e.g., Stanat/Rauch/Segeritz 2010). Because success at school depends substantially on mastery of the language of instruction, differences in language proficiency are regarded as an important cause of the observed achievement gaps (cf. Baumert/Schümer 2001). Basic everyday language skills, however, are not considered decisive. Rather, emphasis is placed on mastery of so-called “Bildungssprache” (academic language), whose acquisition in a second language appears to be a particular hurdle (cf. Bailey et al. 2004; Gogolin/Lange 2011). The extent to which this is actually the case, and which specific features characterize academic language in German, has so far remained largely unclear. This question is therefore addressed in the project “Bildungssprachliche Kompetenzen (BiSpra): Anforderungen, Sprachverarbeitung und Diagnostik” (Academic Language Competencies: Demands, Language Processing, and Diagnostics), which is funded by the German Federal Ministry of Education and Research (BMBF) as part of the “Forschungsinitiative Sprachdiagnostik und Sprachförderung” (FiSS; Research Initiative on Language Diagnostics and Language Support). The aim is to determine which features of academic language cause particular difficulties for primary school children with different linguistic and social family backgrounds, and whether differences emerge between children whose home language is not German and monolingual German-speaking children from educationally disadvantaged families. To date, however, no instruments are available in the German-speaking world for assessing academic language competencies at primary school age. Two groups of tasks were therefore developed in BiSpra to capture different aspects of academic language.
After a theoretical introduction to the concept of academic language, we present the two groups of tasks and the results of initial pilot studies.
Article
It seems now to be established that schoolchildren who are native speakers of English embed a larger and larger number of sentence constituents as they get older. This developmental trend can be demonstrated by having them rewrite a passage made up of exceedingly simple sentences. The developmental trend is demonstrable both in speech and writing. And skilled adult writers carry the same tendency still farther, at least in their writing. Various applications suggest themselves: (1) it should be possible to discover whether this trend is universal: whether it is characteristic of the development of native speakers of all languages; (2) perhaps, too, this measure might be found useful in measuring a person's command of a second language; and (3) since Mellon has shown that American seventh graders can be taught to carry this tendency farther than they normally do, it is possible that drill in sentence-embedding should be part of second language instruction.
Book
Edited by Franz E. Weinert.
Article
With the lexical database dlexDB, we make statistical measures for a large number of processing-relevant properties of words available online to psychological and linguistic research. These measures include the variables known from CELEX (Baayen, Piepenbrock, and Gulikers, 1995), namely the frequencies of word forms and lemmas in written-language texts. In addition, we compute a range of new measures, such as the frequencies of syllables, morphemes, character sequences, and multiword combinations, as well as word similarity measures. The data are drawn from the core corpus of the Digitales Wörterbuch der deutschen Sprache (DWDS), which comprises more than 100 million running words. We illustrate the validity of these measures with new results on their influence on fixation durations during sentence reading.
Article
This paper reports the results of a detailed historical analysis of changes in lexical difficulty and diversity of the language used in elementary school reading textbooks widely adopted in the United States during the period 1905-2004. Applying a variety of analytical measures to a 5-million-word corpus of third-grade reading texts, we revisit the patterns of change in lexical complexity reported in previous research and examine the trends in more recent decades that have not yet been thoroughly explored. Our findings provide us with rich evidence for challenging some of the historical critiques of the American reading curriculum, and they have important implications for both educational history and policy.
Conference Paper
We investigate the problem of reading level assessment for German texts on a newly compiled corpus of freely available easy and difficult articles, targeted at adult and child readers respectively. We adapt a wide range of syntactic, lexical and language model features from previous research on English and combine them with new features that make use of the rich morphology of German. We show that readability classification for German based on these features is highly successful, reaching 89.7% accuracy, with the new morphological features making an important contribution.
Article
This study explores the development of multiple dimensions of linguistic complexity in the writing of beginning learners of German both as a group and as individuals. The data come from an annotated, longitudinal learner corpus. The development of lexicogrammatical complexity is explored at 2 intersections: (a) between cross-sectional trendlines and the individual development paths of 2 focal learners and (b) between different complexity variables. The study contributes to the empirical body of linguistic complexity research by close tracking of beginning learners over 4 semesters of collegiate study of German as a second language (L2). For this purpose, data for multiple variables were collected at dense time intervals using multiple waves, and correlation analysis between various datasets was performed. The results confirm some general developmental trends established in previous research. However, the study also found significant variability between individual and cross-sectional data. Furthermore, differences found for more specific complexity measures between this study's results and previous research are explained in terms of differences in instructional approaches. In addition, the study contributes to the discussion of methods and metrics appropriate for tracking the development of complexity in foreign language writing. The study concludes with implications for L2 pedagogy and further research, including applications of computational methods.
Article
In this article, we examine current practices in the measurement of syntactic complexity to illustrate the need for more organic and sustainable practices in the measurement of complexity, accuracy, and fluency (CAF) in second language production. Through in-depth review of examples drawn from research on instructed second language acquisition, we identify and discuss challenges to the evidentiary logic that underlies current approaches. We also illuminate critical mismatches between the interpretations that researchers want to make and the complexity measures that they use to make them. Building from the case of complexity, we point to related concerns with impoverished operationalizations of multidimensional CAF constructs and the lack of attention to CAF as a dynamic and interrelated set of constantly changing subsystems. In conclusion, we offer suggestions for addressing these challenges, and we call for much closer articulation between theory and measurement as well as more central roles for multidimensionality and dynamicity in future CAF research.
Article
This article provides an analysis of some linguistic features of school-based texts, relating the grammatical and lexical choices of the speaker/writer to the functions that language performs in school contexts. Broadly speaking, the context of schooling requires that students read and write texts that present information authoritatively in conventionally structured ways. This article describes some of the lexical and grammatical resources—the register features—that realize this context of schooling. It shows that the presentation of information typically requires technical and specific lexis and explicitly stated logical relations. Authoritativeness is reflected in the choice of declarative mood and the use of grammatical and lexical resources instead of intonation to convey speaker/writer stance or attitude toward what is said. A high degree of structure is expected in school-based language, realized through elaboration of noun phrases, sentence rather than prosodic segmentation, and clause-structuring strategies of nominalization and embedding. These features are functional for creating the texts students read and are expected to write at school.