Ulf Kroehne

DIPF - Leibniz Institute for Research and Information in Education · Technology Based Assessment

PhD

About

66
Publications
63,530
Reads
1,161
Citations
Citations since 2017
42 Research Items
970 Citations

Publications (66)
Article
Full-text available
Background: Mode effects, the variations in item and scale properties attributed to the mode of test administration (paper vs. computer), have stimulated research around test equivalence and trend estimation in PISA. The PISA assessment framework provides the backbone to the interpretation of the results of the PISA test scores. However, an identifi...
Article
Full-text available
In large-scale assessments, disengaged participants might rapidly guess on items or skip items, which can affect the score interpretation’s validity. This study analyzes data from a linear computer-based assessment to evaluate a micro-intervention that blocked the possibility to respond for 2 s. The blocked response was implemented to prevent parti...
Article
Full-text available
As researchers in the social sciences, we are often interested in studying not directly observable constructs through assessments and questionnaires. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed shortly but not read and engaged with in-depth. Hence, a res...
Article
Full-text available
Background: Computer‐based assessment allows for the monitoring of reader behaviour. The identification of patterns in this behaviour can provide insights that may be useful in informing educational interventions. Objectives: Our study aims to explore what different patterns of reading activity exist, and investigates their interpretation and consis...
Article
Objectives. Evaluate the block-adaptive number series task of reasoning, as a time-efficient proxy of general cognitive ability in the Level-2 sample of the German National Cohort (NAKO), a population-based mega cohort. Methods. The number series task consisted of two blocks of three items each, administered as part of the touchscreen-based assessm...
Article
Full-text available
In this article the affiliation details for Author A were incorrectly given as ‘EDUCATIONAL MEASUREMENT’ but should have been ‘IPN–Leibniz Institute for Science and Mathematics Education’.
Article
Multiple document comprehension (MDC) refers to the ability to integrate information from multiple sources into a coherent representation, which requires specific cognitive processes. Assuming that epistemic beliefs are domain-related, this study investigates exploratively how epistemic beliefs in the domains of science and history affect the maste...
Article
Full-text available
Careless and insufficient effort responding (C/IER) can pose a major threat to data quality and, as such, to validity of inferences drawn from questionnaire data. A rich body of methods aiming at its detection has been developed. Most of these methods can detect only specific types of C/IER patterns. However, typically different types of C/IER patt...
Article
Full-text available
International large-scale assessments such as PISA or PIAAC have started to provide public or scientific use files for log data; that is, events, event-related attributes and timestamps of test-takers’ interactions with the assessment system. Log data and the process indicators derived from it can be used for many purposes. However, the intended us...
Article
Full-text available
The study investigates automated and controlled cognitive processes that occur when university students read multiple documents (MDs). We examined data of 401 students dealing with two MD sets in a digital environment. Performance was assessed through several comprehension questions. Recorded log data gave indications about students’ time allocatio...
Article
Full-text available
The increased availability of time‐related information as a result of computer‐based assessment has enabled new ways to measure test‐taking engagement. One of these ways is to distinguish between solution and rapid guessing behavior. Prior research has recommended response‐level filtering to deal with rapid guessing. Response‐level filtering can le...
Article
Recent research suggests that readers' subjective task understanding influences reading processes and outcomes. Therefore, the present study's aim was to investigate whether the task demands that readers retrospectively report relate to multiple document comprehension strategies and outcome. A total of 310 university students completed three units...
Article
Full-text available
The digital revolution has made a multitude of text documents from highly diverse perspectives on almost any topic easily available. Accordingly, the ability to integrate and evaluate information from different sources, known as multiple document comprehension, has become increasingly important. Because multiple document comprehension requires the...
Article
Full-text available
By tailoring test forms to the test‐taker's proficiency, Computerized Adaptive Testing (CAT) enables substantial increases in testing efficiency over fixed forms testing. When used for formative assessment, the alignment of task difficulty with proficiency increases the chance that teachers can derive useful feedback from assessment data. The appli...
Chapter
The chapter gives an overview of how tests and questionnaires can be implemented with the help of computers in the broader sense, extending or clearly exceeding the possibilities of classical paper-and-pencil procedures. This concerns, for example, the development of computer-based items with innovative response formats and multi...
Chapter
Full-text available
The OECD Programme for the International Assessment of Adult Competencies (PIAAC) was the first computer-based large-scale assessment to provide anonymised log file data from the cognitive assessment together with extensive online documentation and a data analysis support tool. The goal of the chapter is to familiarise researchers with how to acces...
Article
Full-text available
Rapid guessing can threaten measurement invariance and the validity of large-scale assessments, which are often conducted under low-stakes conditions. Comparing measures collected under different administration modes or in different test settings necessitates that rapid guessing rates also be comparable. Response time thresholds can be used to iden...
Article
Full-text available
International large-scale assessments, such as the Program for International Student Assessment (PISA), are conducted to provide information on the effectiveness of education systems. In PISA, the target population of 15-year-old students is assessed every 3 years. Trends show whether competencies have changed in the countries between PISA cycles....
Article
Multiple document comprehension (MDC) is understood as the ability to construct an integrated representation of a content domain from various information sources. As such, it is, both for the successful completion of university studies and for societal participation...
Article
Full-text available
Educational large-scale assessments risk their temporal comparability when shifting from paper- to computer-based assessment. A recent study showed how text responses have altered alongside PISA’s mode change, indicating mode effects. Uncertainty remained, however, because it compared students from 2012 and 2015. We aimed at reproducing the findings in...
Chapter
Multiple document comprehension is the ability to construct an integrated representation of a specific topic based on several sources. It is an important competence for university students; however, there has been so far no established instrument to assess multiple document comprehension in a standardized way. Therefore, we developed a test coverin...
Chapter
Through their recurring surveys, studies such as PISA provide more than snapshots of the performance of education systems. They also generate data on how these systems develop and indicate, in particular, whether subsequent generations of fifteen-year-olds, compared with earlier generations, have a higher or lower competence...
Article
The study investigates the cognitive load of students working on tasks that require the comprehension of multiple documents (Multiple Document Comprehension, MDC). In a sample of 310 students, perceived task difficulty (PD) and mental effort (ME) were examined in terms of task characteristics, individual characteristics, and students' processing be...
Article
The transition from paper-based assessment (PBA) to computer-based assessment (CBA) requires mode effect studies to investigate the comparability of scores across modes. In the National Educational Panel Study experimental studies were conducted to investigate psychometric differences between modes. In the present study, the cross-mode equivalence...
Article
For many years, reading comprehension in the Programme for International Student Assessment (PISA) was measured via paper‐based assessment (PBA). In the 2015 cycle, computer‐based assessment (CBA) was introduced, raising the question of whether central equivalence criteria required for a valid interpretation of the results are fulfilled. As an exte...
Article
Full-text available
Background: With digital technologies, competence assessments can provide process data, such as mouse clicks with corresponding timestamps, as additional information about the skills and strategies of test takers. However, in order to use variables generated from process data sensibly for educational purposes, their interpretation needs to be valid...
Article
Full-text available
In this paper, we developed a method to extract item-level response times from log data that are available in computer-based assessments (CBA) and paper-based assessments (PBA) with digital pens. Based on response times that were extracted using only time differences between responses, we used the bivariate generalized linear IRT model framework (B...
Chapter
Full-text available
Many large-scale competence assessments such as the National Educational Panel Study (NEPS) have introduced novel test designs to improve response rates and measurement precision. In particular, unstandardized online assessments (UOA) offer an economic approach to reach heterogeneous populations that otherwise would not participate in face-to-face...
Presentation
Full-text available
A popular definition describes learning analytics as measuring, collecting, analyzing and reporting of data about learners. The main purpose thereof is to understand and support learning processes. Thus, the main research goals of learning analytics remarkably overlap with those of educational assessment and psychometrics. To demonstrate how these...
Article
Full-text available
Log data from educational assessments attract more and more attention and large-scale assessment programs have started providing log data as scientific use files. Such data generated as a by-product of computer-assisted data collection has been known as paradata in survey research. In this paper, we integrate log data from educational assessments i...
Article
Full-text available
The shadow testing approach (STA; van der Linden & Reese, 1998) is considered the state of the art in constrained item selection for computerized adaptive tests. The present paper shows that certain types of constraints (e.g., bounds on categorical item attributes) induce a matroid on the item bank. This observation is used to devise item selection...
Article
Full-text available
The behavioral sciences, including most of psychology, seek to explain and predict behavior with the help of theories and models that involve concepts (e.g., attitudes) that are subsequently translated into measures. Currently, some subdisciplines such as social psychology focus almost exclusively on measures that demand reflection or even introspe...
Article
Full-text available
A critical evaluation of results to find useful information is essential when doing a web search. In this study, we investigated the evaluation skills of secondary school students, based on their behavior in selecting links from a search engine result page (SERP). To clarify the role of reading when evaluating online information, we assessed studen...
Article
Full-text available
Receiving and using web-based information has become part of everyday life, but the non-linear presentation of information can make considerable demands on cognitive resources, affecting text comprehension. This study examined whether memory updating predicts students' comprehension of digital hypertext over and above skills in reading linearly str...
Article
Completing test items under multiple speed conditions avoids that the performance measure is confounded with individual differences in the speed-accuracy compromise, and offers insights into the response process, that is, how response time relates to the probability of a correct response. This relation is traditionally represented by two conceptual...
Article
The study examines relationships between reading comprehension and basic processes of reading comprehension at the word and sentence level, as well as working memory, in 15-year-old adolescents. The questions pursued were whether differences in the efficiency of the subcomponents considered, on the one hand, reading competence itself and, on the other, changes i...
Article
Full-text available
The effects of aging on response time were examined in a paper-based lexical-decision experiment with younger (age 18-36) and older (age 64-75) adults, applying Ratcliff's diffusion model. Using digital pens allowed the paper-based assessment of response times for single items. Age differences previously reported by Ratcliff and colleagues in compu...
Chapter
In mathematics education, the student’s ability to translate between different representations of functions is regarded as a key competence for mastering situations that can be described by mathematical functions. Students are supposed to interpret common representations like numerical tables (N), function graphs (G), verbally or pictorially repres...
Chapter
Even though multidimensional adaptive testing (MAT) is advantageous in the measurement of complex competences, operational applications are still rare. In an attempt to change this situation, this chapter presents four recent developments that foster the applicability of MAT. First, in a simulation study, we show that multiple constraints can be ac...
Article
Full-text available
Abstract. International student achievement studies such as the Programme for International Student Assessment (PISA) serve the participating countries in determining the performance of their school systems. In PISA, the target population (15-year-old students) is tested every 3 years. Of particular importance here are the trend inf...
Article
Full-text available
This paper provides an overview and recommendations on how to conduct a mode effect study in large-scale assessments by addressing criteria of equivalence between paper-based and computer-based tests. These criteria are selected according to the intended use of test scores and test score interpretations. A mode effect study can be implemented using...
Chapter
We present data-driven log file analyses of an electronic text book for history called the mBook to support teachers in preparing lessons for their students. We represent user sessions as contextualised Markov processes of user sessions and propose a probabilistic clustering using expectation maximisation to detect groups of similar (i) sessions an...
Article
Full-text available
Reading and understanding digital text that is organized in a non-linear hypertext format can be challenging for students as it requires a more self-directed selection of text pieces compared to reading linear texts. This study aims at investigating how individual differences in students' skills in comprehending digital text can be explained by the...
Article
The use of Information and Communication Technology (ICT) is of immense importance in today’s digital knowledge society. As a basis for private and vocational participation in society, ICT literacy has been widely discussed in recent decades. Although motivational and metacognitive facets play an important role in developing ICT literacy and compet...
Article
Full-text available
Multidimensional adaptive testing (MAT) can improve the efficiency of measuring traits that are known to be highly correlated. Content balancing techniques can ensure that tests fulfill requirements with respect to content areas, such as the number of items from various dimensions (target rates). However, content balancing does not restrict the ord...
Article
The Rasch-based, computerized adaptive assessment procedure RehaCAT allows to assess the ICF-oriented dimensions "activities in daily living", "functionality upper extremities" and "functionality lower extremities" as well as "depression" economically and reliably in orthopaedic rehabilitation patients. This validation study aimed at analyzing the...
Article
The speed-ability trade-off becomes a measurement problem if there is between-subject variation in the speed-ability compromise, as this may affect the comparability of ability estimates. To control individual speed differences, the response-signal (RS) paradigm was applied requiring an immediate response as soon as an acoustic signal is presented....
Article
Full-text available
Multidimensional adaptive testing (MAT) can improve the efficiency of measuring traits that are known to be highly correlated. Content balancing techniques can ensure that tests fulfill requirements with respect to content areas, such as the number of items from various dimensions (target rates). However, content balancing does not restrict the ord...
Article
ICT literacy suggests performance-based assessment, that is, assessment by means of test items that present interactive (simulated) computer environments and require a response via mouse and/or keyboard. Nevertheless, methods such as self-assessments or paper-based achievement tests are frequently used. The aim of the present study was...
Article
This study aimed at confirmatory testing the factorial structure of the established assessment instruments ODI, SF-12 and HADS-D by means of structural equation modeling in a sample of n=184 rehabilitation patients with musculo-skeletal diseases. According to local and global fit indices for each instrument an acceptable to good fit to the underlyi...
Article
This study conducted a simulation study for computer-adaptive testing based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to computer-adaptive test simulation, the ADIB was newly calibrated. Recalibration was performed in a sample of 161 patients treated for a...
Presentation
Full-text available
In the field of competence diagnostics adaptive testing is considered as an optimal approach for a highly efficient and economic measurement. As a prerequisite, items have to be part of a calibrated item pool and need to be administered in a computerized testing format. Thus, if one wants to benefit from using a computerized-adaptive testing (CAT)...
Article
To develop and evaluate a computer-adaptive test for the assessment of anxiety in cardiovascular rehabilitation patients (ACAT-cardio) that tailors an optimal test for each patient and enables precise and time-effective measurement. Simulation study, validation study (against the anxiety subscale of the Hospital Anxiety and Depression Scale (HADS-A...
Presentation
Full-text available
In competence diagnostics, adaptive testing procedures are regarded as an optimal method both for meeting psychometric quality standards and for adequately taking aspects of practicability into account. An important prerequisite for adaptive testing is that the items come from a calibrated item bank and in a computerized t...
Article
Full-text available
For diagnostics and outcome measurement in clinical rehabilitation a multitude of questionnaires is used. In order to gain comparability of the diagnostic findings, generally, the same information is gathered of all patients, regardless of their state of health or how severely ill they are, by using identical groups of items. In this kind of assess...
Article
Computerized competence tests promise a variety of advantages compared to paper pencil delivered tests, for instance, increased test security, more information about test takers and the test-taking process, instant scoring, and immediate feedback. Moreover, new innovative item types can be administered to broaden the test content. Three benefits sh...
Article
Full-text available
During the last two decades, Structural Equation Modeling (SEM) has evolved from a statistical technique for insiders to an established valuable tool for a broad scientific public. This class of analyses has much to offer, but at what price? This paper provides an overview on SEM, its underlying ideas, potential applications and current software....

Projects (7)
Project
Technology-based Testing (TBT) in the National Educational Panel Study (NEPS)
Project
The Programme for International Student Assessment (PISA) measures student achievement worldwide and compares it internationally. The three competence domains examined (science, reading, and mathematics) are a central component of lifelong learning. PISA determines the achievement level of young people, provides information on the outcomes of teaching and learning in schools, and shows developments in the education system. What matters here is less the match between the test items and the curricula of the participating countries than the measurement of basic competencies in various application situations. The literacy concept on which PISA is based is thus functionalist: 15-year-old students are to apply the competencies they have acquired at school in tasks that are as authentic as possible. In PISA 2018, the reading competencies of 15-year-old students are the focus of testing for the third time, after 2000 and 2009.
Project
Implementation of software that can be used to analyze log file data from educational large-scale assessments using the method described in: Kroehne, U., & Goldhammer, F. (2018). How to conceptualize, represent, and analyze log data from technology-based assessments? A generic framework and an application to questionnaire items. *Behaviormetrika*, 45 (2), 527–563. https://doi.org/10.1007/s41237-018-0063-y