Timo Gnambs
Leibniz Institute for Educational Trajectories · Educational Measurement
Doctor of Psychology
166 Publications
197,638 Reads
5,134 Citations
Additional affiliations
September 2016 - August 2022
March 2018 - February 2021
September 2015 - August 2016
Publications (166)
There is consensus that the ten items of the Rosenberg Self-Esteem Scale (RSES) reflect wording effects resulting from positively and negatively keyed items. The present study examined the effects of cognitive abilities on the factor structure of the RSES with a novel, non-parametric latent variable technique called Local Structural Equation Models...
Meta-analyses of treatment effects in randomized control trials are often faced with the problem of missing information required to calculate effect sizes and their sampling variances. Particularly, correlations between pre- and posttest scores are frequently not available. As an ad-hoc solution, researchers impute a constant value for the missing c...
Proctored remote testing of cognitive abilities in the private homes of test-takers is becoming an increasingly popular alternative to standard psychological assessments in test centers or classrooms. Because these tests are administered under less standardized conditions, differences in computer devices or situational contexts might contribute to...
Meta-analytic structural equation modeling (MASEM) combines the strengths of meta-analysis with the flexibility of path models to address multivariate research questions using summary statistics. Because many research questions refer to latent constructs, measurement error can distort effect estimates in MASEMs if the unreliability of study variabl...
Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that directly assess test-taking behavior (e.g., bogus items...
Artificial intelligence (AI) has profoundly transformed numerous facets of both private and professional life. Understanding how people evaluate AI is crucial for predicting its future adoption and addressing potential barriers. However, existing instruments measuring attitudes towards AI often focus on specific technologies or cross-domain evaluat...
Artificial intelligence (AI) has profoundly transformed numerous facets of both private and professional life. Understanding how people evaluate AI is crucial for predicting its future adoption and addressing potential barriers. However, existing instruments measuring attitudes towards AI often suffer from conceptual and psychometric limitations. T...
This editorial introduces a special issue of Large-Scale Assessments in Education (LSAE) that addresses key challenges in analyzing longitudinal data from large-scale studies. These challenges include ensuring fair measurement across time, developing common metrics, and correcting for measurement errors. The special issue highlights recent methodol...
In psychological science, replicability—repeating a study with a new sample achieving consistent results (Parsons et al., 2022)—is critical for affirming the validity of scientific findings. Despite its importance, replication efforts are few and far between in psychological science with many attempts failing to corroborate past findings. This scar...
Because large-scale studies repeatedly indicated low reading literacy for many students, a need for interventions fostering reading literacy, such as extracurricular tutoring, has often been emphasized. Several reading promoting programs, suitable for extracurricular tutoring, were developed and shown to be effective in recent years. Moreover, thes...
Although item response theory (IRT) models have established psychometric advantages over traditional scoring methods, they remain underutilized in practice. We aim to reevaluate common criticisms of IRT in light of substantive, methodological, and computational advances that have transformed the way psychologists measure, collect, and analyze resea...
In psychological science, replicability—repeating a study with a new sample achieving consistent results (Parsons et al., 2022)—is critical for affirming the validity of scientific findings. Despite its importance, replication efforts are few and far between in psychological science with many attempts failing to corroborate past findings. This scarc...
The Need for Cognition Scale (NCS) is a self-report scale measuring individual differences in the tendency to engage in and enjoy thinking. The shortened version with 18 items (NCS-18; Cacioppo et al., 1984) has widely been administered in research on persuasion, critical thinking, and educational achievement. Whereas most studies advocated for ess...
The Core Self-Evaluations Scale (CSES) measures a broad personality trait reflecting individuals’ self-appraisals of their worth, capabilities, and control of their lives. Although the CSES was designed to capture a single trait, factor analytic studies often found more complex measurement structures. These either referred to different content face...
Artificial intelligence (AI) has become an integral part of many contemporary technologies, such as social media platforms, smart devices, and global logistics systems. At the same time, research on the public acceptance of AI shows that many people feel quite apprehensive about the potential of such technologies—an observation that has been connec...
Computers and other information technologies can support many tasks in psychological assessment or make them possible in the first place. These range from initial interviews that explore the presenting problem via videoconferencing systems to internet-based personality and performance measurements or the recording of biophysiological characteristic...
In recent years, there has been a growing emphasis on the importance of open data and data sharing in scientific research (Nosek et al., 2015; van der Zee & Reich, 2018). However, in the educational field, access to FAIR (findable, accessible, interoperable, and reusable) data remains a significant challenge (Wilkinson et al., 2016). This special c...
Out-of-field teaching is viewed as inferior to subject-specific instruction, but its impact on student outcomes varies depending on the criteria used for evaluation. The present study investigated the consequences of out-of-field teaching in science on different student outcomes on an international scale. Analyses were based on the sixth cycle of P...
The SCHNAPP Spelling Test is a novel screening instrument to identify at-risk children with poor spelling abilities in German at the beginning of primary school. Although originally developed as a computerized test to be administered on tablets, in school settings paper-pencil methods are often still preferred. Therefore, the present study on N = 3...
The product-moment correlation is a central statistic in psychological research including meta-analysis. Unfortunately, it has a rather complex sampling distribution which leads to sample correlations that are biased indicators of the respective population correlations. Moreover, there seems to be some uncertainty on how to properly calculate the s...
Method effects on the item level can be modeled as latent difference variables in longitudinal data. These item-effect variables represent interindividual differences associated with responses to a specific item when assessing a common construct with multi-item scales. In latent variable analyses, their inclusion substantially improves model fits i...
Reading and mathematical competencies are important cognitive prerequisites for children’s educational achievement and later success in society. An ongoing debate pertains to potential transfer effects between both domains and whether reading and mathematics influence each other over time. Therefore, the present study on N = 5185 students from the...
The Need for Cognition Scale (NCS) is a self-report scale measuring individual differences in the tendency to engage in and enjoy thinking. The shortened scale with 18 items (NCS-18; Cacioppo et al., 1984) has widely been administered in research on persuasion, critical thinking, and educational achievement. Whereas most studies advocated for essen...
Disengaged responding poses a severe threat to the validity of educational large-scale assessments, because item responses from unmotivated test-takers do not reflect their actual ability. Existing identification approaches rely primarily on item response times, which bears the risk of misclassifying fast engaged or slow disengaged responses. Proce...
Although the Satisfaction with Life Scale strives to capture a single dimension, describing respondents' satisfaction with life as a whole, individual items might also capture unique aspects of life satisfaction leading to some form of multidimensionality. Such systematic item-specific variance can be viewed as a content-laden secondary trait. Info...
The German National Educational Panel Study investigates individual competences and educational trajectories in a longitudinal multi-cohort study design. The third of the six starting cohorts focuses on paths through lower into upper secondary level and beyond. The representative sample includes about N = 6,112 students from fifth grade attending r...
Background: Spelling performance at the end of first grade predicts subsequent spelling development. To detect spelling difficulties early and to be able to initiate interventions, a measurement instrument is needed that reflects the systematic principles of the writing system. The new digital "SCHNAPP-Rechtschreibtest" (SCHNAPP Spelling Test) is based on word material...
The Brief COPE (Coping Orientation to Problems Experienced) is a frequently used questionnaire assessing 14 theoretically derived coping mechanisms, but psychometric research has suggested inconsistent results concerning its factor structure. The aim of this study was to investigate primary and secondary order factor structures of the Brief COPE du...
Declarative metacognition, use of reading strategies and reading motivation are important predictors of reading literacy. Moreover, reading motivation’s strong links with reading strategy use and declarative metacognition raise questions about whether motivation moderates the effects of the latter on reading literacy and its development during seco...
This article examines the development of reading and mathematical competence in early secondary education and aims at identifying distinct profiles of competence development. Since reading and mathematical competences are highly correlated both cross-sectionally and longitudinally, we expected to find a generalized profile of competence development...
The Core Self-Evaluations Scale (CSES) measures a broad personality trait reflecting individuals’ self-appraisals of their worth, capabilities, and control of their lives. Although the CSES was designed to capture a single trait, factor analytic studies often found more complex measurement structures. These either referred to different content face...
Reading comprehension in bilingual children depends on the extent to which each language is used in daily life. To date, most bilingual studies have focused on children who learn the majority language as their second language (L2 bilingual children). In contrast, bilingual children learning the majority language as their first language (L1 bilingua...
Children with special educational needs in the area of learning (SEN-L) have severe learning disabilities and often exhibit substantial cognitive impairments. Therefore, standard assessment instruments of basic cognitive abilities designed for regular school children are frequently too complex for them and, thus, unable to provide reliable proficie...
This study explores how researchers' analytical choices affect the reliability of scientific findings. Most discussions of reliability problems in science focus on systematic biases. We broaden the lens to emphasize the idiosyncrasy of conscious and unconscious decisions that researchers make during data analysis. We coordinated 161 researchers in...
Alexithymia is defined as the inability of persons to describe their emotional states, to identify the feelings of others, and a utilitarian type of thinking. The most popular instrument to assess alexithymia is the Toronto Alexithymia Scale (TAS-20). Despite its widespread use, an ongoing controversy pertains to its internal structure. The TAS-20...
Response styles (RSs) such as acquiescence represent systematic respondent behaviors in self-report questionnaires beyond the actual item content. They distort trait estimates and contribute to measurement bias in questionnaire-based research. Although various approaches were proposed to correct the influence of RSs, little is known about their rel...
The representation of women in cultural products is a main concern to the social sciences and humanities and a topic that many citizens care about. We analyzed female characters in the 30 worldwide highest grossing movies per year, for the past 40 years, amounting to 1,200 movies. Our study was based on the Bechdel-Wallace test (BWT, Bechdel test):...
Rationale: High body mass and obesity are frequently linked to the use of sedentary media, like television (TV) or non-active video games. Empirical evidence regarding video gaming, however, has been mixed, and theoretical considerations explaining a relationship between general screen time and body mass may not generalize to non-active video gami...
Meta-analytic structural equation modeling (MASEM) combines the strengths of meta-analysis with the flexibility of path models to address multivariate research questions using summary statistics. Because many research questions refer to latent constructs, measurement error can distort effect estimates in MASEMs if the unreliability of study variabl...
Although web-based cognitive assessments have gained increasing attention in recent decades, it is still debated whether unstandardized test settings allow for comparable measurements as compared to proctored testing, particularly for speeded cognitive tests. Therefore, two within-subject experiments (N = 73 and N = 72) compared differences in mean...
Whilst writing this editorial, we are looking back at almost 2 years of crisis due to the COVID-19-pandemic. From a first unprecedented lockdown in March 2020, after the first cases of this new virus disease were detected, to a series of more lockdowns, and hygiene regulations, it seems worthwhile to summarize findings that shed light on the situat...
Valid information on early social-emotional competence is essential to diagnose, treat, and prevent behavioral problems in children and adolescents. Particularly in young children, social-emotional competence is frequently measured using parent and teacher ratings that frequently exhibit low agreement. Therefore, the present study on n = 532 three-...
The product-moment correlation is a central statistic in exploratory and confirmatory research including longitudinal and meta-analytic applications. Unfortunately, it has a rather complex sampling distribution which leads to sample correlations that are biased indicators of the respective population correlations. Moreover, there seems to be some u...
Reading and mathematical competencies are important cognitive prerequisites for children’s educational achievement and later success in society. An ongoing debate pertains to potential transfer effects between both domains and whether reading and mathematics influence each other over time. Therefore, the present study on N = 5,185 students from the...
Method effects on the item level can be modeled as latent difference variables in longitudinal data. These item-effect variables represent interindividual differences associated with responses to a specific item when assessing a common construct with multi-item scales. In latent variable analyses, their inclusion substantially improves model fits...
Although web-based cognitive assessments have gained increasing attention in recent decades, it is still debated whether unstandardized test settings allow for comparable measurements as compared to proctored testing, particularly for speeded cognitive tests. Therefore, two within-subject experiments (N = 73 and 72) compared differences in means, c...
Meta-analyses of treatment effects in randomized control trials are often faced with the problem of missing information required to calculate effect sizes and their sampling variances. Particularly, correlations between pre- and posttest scores are frequently not available. As an ad-hoc solution, researchers impute a constant value for the missing...
In the field of human-robot interaction, the well-known uncanny valley hypothesis proposes a curvilinear relationship between a robot’s degree of human likeness and the observers’ responses to the robot. While low to medium human likeness should be associated with increased positive responses, a shift to negative responses is expected for highly an...
Background
After elementary school, students in Germany are separated into different school tracks (i.e., school types) with the aim of creating homogeneous student groups in secondary school. Consequently, the development of students’ reading achievement diverges across school types. Findings on this achievement gap have been criticized as dependi...
In order to draw pertinent conclusions about persons with low reading skills, it is essential to use validated standard-setting procedures by which they can be assigned to their appropriate level of proficiency. Since there is no standard-setting procedure without weaknesses, external validity studies are essential. Traditionally, studies have asse...
In the field of human-robot interaction, the well-known uncanny valley hypothesis proposes a curvilinear relationship between a robot’s degree of human likeness and the observers’ responses to the robot. While low to medium human likeness should be associated with increasingly positive responses, a shift to negative responses is expected for highly...
The registered report was targeted at identifying latent profiles of competence development in reading and mathematics among N = 15,012 German students in upper secondary education sampled in a multi-stage stratified cluster design across German schools. These students were initially assessed in grade 9 and provided competence assessments on three...
In large-scale educational assessments, interviewers should ensure standardized settings for all participants. However, in practice many interviewers do not strictly adhere to standardized field protocols. Therefore, systematic interviewer effects for the measurement of mathematical competence were examined in a representative sample of N = 5,139 G...
Perceptual speed is a basic component of cognitive functioning that allows people to efficiently process novel visual stimuli and quickly react to them. In educational studies, tests measuring perceptual speed are frequently developed using students from regular schools without considering students with special educational needs. Therefore, it is u...
In the context of item response theory (IRT), linking the scales of two measurement points is a prerequisite to examine a change in competence over time. In educational large-scale assessments, non-identical test forms sharing a number of anchor-items are frequently scaled and linked using two- or three-parametric item response models. However, if...
The paper reports findings from a crowdsourced replication. Eighty-four replicator teams attempted to verify results reported in an original study by running the same models with the same data. The replication involved an experimental condition. A “transparent” group received the original study and code, and an “opaque” group received the same unde...
Alexithymia is defined as the inability of persons to describe their emotional states, to identify the feelings of others, and a utilitarian type of thinking. The most popular instrument to assess alexithymia is the Toronto Alexithymia Scale (TAS-20). Despite its widespread use, an ongoing controversy pertains to its internal structure. The TAS-20...
Findings from 162 researchers in 73 teams testing the same hypothesis with the same data reveal a universe of unique analytical possibilities leading to a broad range of results and conclusions. Surprisingly, the outcome variance mostly cannot be explained by variations in researchers’ modeling decisions or prior beliefs. Each of the 1,261 test mod...
This registered report protocol elaborates on the theory, methods, and material of a study to identify latent profiles of competence development in reading and mathematics among German students in upper secondary education. It is expected that generalized (reading and mathematical competence develop similarly) and specialized (one of the domains de...
Information and communication technology (ICT) literacy represents an essential skill for adolescents to efficiently participate in a modern society. Previous research reported conflicting findings regarding gender differences in ICT literacy. Therefore, the aim of the present study was the exploration of cross-sectional and longitudinal gender eff...
In large-scale social surveys, respondents are typically interviewed on different days of the week. Because previous research established systematic daily fluctuations of people’s mood, it was hypothesized that subjective well-being ratings might be similarly affected by the day the interview takes place. Therefore, an individual-participant meta-a...
The Positive and Negative Affect Schedule (PANAS; Watson et al., 1988) is a popular self-report questionnaire that is administered all over the world. Though originally developed to measure two independent factors, different models have been proposed in the literature. Comparisons among alternative models as well as analyses concerning their robus...
An Erratum to this paper has been published: https://doi.org/10.1007/s11618-020-00980-8
Background
Human Papillomavirus (HPV) is associated with development of oropharyngeal cancer. Aim of this review was to assess airborne transmission risk of infectious particles from HPV lesions to airway mucosa of medical staff during established ablation procedures.
Methods
A systematic review of human and animal studies, published before 09/202...
Societies have socially shared assumptions about what constitutes typically male or female attributes. Language can contribute to gender inequality by transmitting gender stereotypes. This study examines whether gender-stereotypical connotations in stimulus texts within a reading competence test might serve as a nuisance factor distorting reading c...
Educational large-scale studies typically adopt highly standardized settings to collect cognitive data on large samples of respondents. Increasing costs alongside dwindling response rates in these studies necessitate exploring alternative assessment strategies such as unsupervised web-based testing. Before respective assessment modes can be impleme...
Red color supposedly affects cognitive functioning in achievement situations and impairs test performance. Although this has been shown for different cognitive domains in different populations and cultural contexts, recent studies including close replications failed to corroborate this effect. Reported here is a random-effects meta-analysis of 67 e...
Within just a few decades, the Internet has drastically changed numerous areas of life. This contribution examines the corresponding effects on psychological research and discusses the possibilities of qualitative research projects using internet-based approaches. In particular, the specifics of qualitative online interview...
Careless responding is considered a bias in survey responses without regard to the actual item content which constitutes a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that directly assess test-taking behavior (...