Steffen Zitzmann
MSH Medical School Hamburg – University of Applied Sciences and Medical University
Dr. phil. habil., Dipl.-Psych.
Publications: 87
Reads: 14,286
Citations: 1,159
Introduction
I am a professor of quantitative methods at the Medical School Hamburg, Germany. My main interests are Bayesian statistics, multilevel analysis, structural equation modeling, meta-analysis, and the analysis of quasi-experiments.
Publications (87)
This journal recently published a systematic review of simulation studies on the performance of Bayesian approaches for estimating latent variable models in small samples. The authors of this review highlighted that Bayesian approaches can perform poorly (i.e., by exhibiting bias) when the prior distributions are not thoughtfully constructed on the...
In the analysis of hierarchical data, multilevel structural equation modeling (multilevel SEM) has become the standard in the social sciences. To estimate these models, maximum likelihood (ML) approaches have been applied because they are the default in latent variable software. However, one drawback of ML is that it tends to suffer from estimation...
A central question in educational research is how classroom climate variables, such as teaching quality, goal structures, or interpersonal teacher behavior, are related to critical student outcomes, such as students’ achievement and motivation. Student ratings are frequently used to measure classroom climate. When using student ratings to assess cl...
We investigated three different approaches for quantifying individual change and reporting it back to persons: (a) the common change score, which is obtained by first computing scale scores from two consecutive measurements and then subtracting these scores from one another, (b) the ad-hoc approach, which is similar to the former approach but uses reg...
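The common change score described in (a) can be sketched in a few lines. This is only an illustration under stated assumptions: the scale score is taken to be the mean of the item responses, the change is taken as the later score minus the earlier one, and the item responses are hypothetical values, not data from the study.

```python
from statistics import mean

def scale_score(item_responses):
    """Scale score, assumed here to be the mean of a person's item responses."""
    return mean(item_responses)

def change_score(items_t1, items_t2):
    """Common change score: scale score at occasion 2 minus scale score at occasion 1."""
    return scale_score(items_t2) - scale_score(items_t1)

# Hypothetical responses of one person to four items at two consecutive occasions
t1 = [2, 3, 3, 2]
t2 = [3, 4, 3, 4]
print(change_score(t1, t2))
```

Because both scale scores contain measurement error, the difference contains error as well, which is the kind of uncertainty the approaches compared in the abstract are concerned with.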
This article sensitizes clinicians and researchers to the uncertainty inherent in clinical measures and points out the problems that it poses for the assessment of a response and, more importantly, for the computation of responder rates in clinical trials, using the Positive and Negative Syndrome Scale (PANSS) as an example. We offer a new, promising a...
Teachers’ professional development is crucial for effective classroom practice. Due to its advantages, many teachers have participated in online professional development (OPD) in recent years. Numerous studies have investigated first- to 12th-grade in-service teachers’ participation in OPD and its effects on the teacher, classroom prac...
Heterogeneity of variance is more than a statistical nuisance when variance parameters are of substantial interest. In multilevel modeling (e.g. students within classes), for instance, the inclusion of discrete variables at the between-cluster level (e.g. school type) may lead to the detection of differences between variances at the within-cluster...
In this article, we extend the regularized Bayesian estimator of multilevel latent variable models to improve the estimation in terms of the Mean Squared Error (MSE) of the between-group parameter in two-level latent variable models with covariates.
Psychological measures frequently show trait-like properties, and the ontological status of stable psychological traits has been discussed for decades. We argue that these properties can emerge from causal dynamics of time-varying processes, which are omitted from the analysis model, potentially leading to the estimation of traits that are, at leas...
This first-of-its-kind meta-analysis (N = 79 studies; 56,552 students; k = 640 effects) provides a comprehensive assessment of five cultural diversity climate approaches that capture different ways of addressing cultural diversity in K-12 schools. We examined how intergroup contact theory’s optimal contact conditions, multiculturalism climate, colo...
The assessment of individual students is not only crucial in the school setting but also at the core of educational research. Although classical test theory focuses on maximizing insights from student responses, the Bayesian perspective incorporates the assessor's prior belief, thereby enriching assessment with knowledge gained from previous intera...
Background
At the State Europe School of Berlin (SESB) students with different language backgrounds learn together in two languages of instruction: German and one of nine partner languages (English, French, Greek, Italian, Polish, Portuguese, Russian, Spanish, and Turkish).
Aims
This study investigates the reading proficiency trajectories in the mi...
We point out potential drawbacks of some of the ways Leising et al. (2022a) proposed to improve personality science. We argue that it is ill-advised to use only one measure for a concept. Also, we argue that researchers should not refrain from conducting a study when a high level of statistical power is precluded. Then, we go one step furthe...
Several AI-aided screening tools have emerged to tackle the ever-expanding body of literature. These tools employ active learning, where algorithms sort abstracts based on human feedback. However, researchers using these tools face a crucial dilemma: When should they stop screening without knowing the proportion of relevant studies? Although numero...
With the rapid growth of scholarly literature, efficient AI-aided abstract screening tools are becoming increasingly important. This study evaluated nine different machine learning algorithms used in AI-aided screening tools for ordering abstracts according to their estimated relevance. We focused on assessing their performance in terms of the numb...
We propose an optimally regularized Bayesian estimator of multilevel latent variable models that aims to outperform traditional maximum likelihood (ML) estimation in mean squared error (MSE) performance. We focus on the between-group slope in a two-level model with a latent covariate. Our estimator combines prior information with data-driven insigh...
Researchers working with intensive longitudinal designs often encounter the challenge of determining whether to relax the assumption of stationarity in their models. Given that these designs typically involve data from a large number of subjects (N ≫ 1), visually screening all time series can quickly become tedious. Even when conducted by experts, su...
Continuous-time modeling has become a cornerstone in psychological research for analyzing longitudinal data, traditionally focusing on time-stable dynamics. However, the exploration of nonstationary psychological phenomena, which may be reflected by time-varying auto- and cross-effects, remains underdeveloped. In this paper, we extend a widely used...
The relations between perceived fatigue and changes in sustained attention performance during early stages of working on cognitively demanding tasks remain poorly understood. In addition, concerns have been raised that self-ratings of fatigue may be biased by socially desirable response tendencies, confounding the relation between perceived fatigue a...
Small sample sizes pose a severe threat to convergence and accuracy of between-group level parameter estimates in multilevel structural equation modeling (SEM). However, in certain situations, such as pilot studies or when populations are inherently small, increasing sample sizes is not feasible. As a remedy, we propose a two-stage regularized est...
The APA encourages authors to thoroughly report their results, including confidence intervals. However, considerable debate exists regarding the computation of confidence intervals in within-subject designs. Nathoo et al. (2018) recently proposed a Bayesian within-subject credible interval, which has faced criticism for not accounting for the unc...
Research has shown that current mental fatigue and self-control capacity play a crucial role in the goal-directed regulation of emotion, motivation, and cognition. However, whether the emergence of fatigue during the exercise of cognitive performance is indicative of individuals' time-invariant fatigue vulnerability traits is still not well underst...
A two-level data set can be structured in either long format (LF) or wide format (WF), and both have corresponding SEM approaches for estimating multilevel models. Intuitively, one might expect these approaches to perform similarly. However, the two data formats yield data matrices with different numbers of columns and rows, and their cols:rows is...
Item response theory (IRT) has evolved as a standard psychometric approach in recent years, in particular for test construction based on dichotomous (i.e., true/false) items. Unfortunately, large samples are typically needed for item refinement in unidimensional models and even more so in the multidimensional case. However, Bayesian IRT approaches...
Background: How to best operationalize teachers’ autonomy support, an instructional style aiming to satisfy students’ psychological need for autonomy, is unclear because teachers can support the whole class and/or individual students. Students might perceive inequalities concerning the autonomy support they receive relative to classmates, which mig...
Systematic reviews and meta-analyses are crucial for advancing research, yet they are time-consuming and resource-demanding. Although machine learning and natural language processing algorithms may reduce this time and these resources, their performance has not been tested in education and educational psychology, and there is a lack of clear inform...
This paper introduces Dynamic Intellectual Investment Trait and State Theory, an extension of Intellectual Investment Trait Theory. Our theory extension (a) centers on dynamic within-person reciprocal effects between cognitive performance states and intellectual investment personality states (b) integrates within-person dynamics and developmental t...
Multilingualism is often associated with advantages for acquiring additional languages. Theoretical approaches explain these advantages by assuming a Common Underlying Proficiency or a Metalinguistic Awareness. At the State Europe School in Berlin, students from different language backgrounds receive instruction in German and a partner language acc...
To effectively adopt technology during teaching, teachers require knowledge of how to operate technology. Especially first-time technology users need knowledge of how to handle digital devices and software programs as a foundation to use technology in the classroom successfully. This knowledge has so far been assessed mainly using self-reports. How...
This letter serves to remind readers of Schizophrenia Research that like all measures in psychiatry, the Positive and Negative Syndrome Scale (PANSS; Kay et al., 1987) used for assessing symptom severity of patients with schizophrenia is contaminated with error. According to true-score theory (Lord and Novick, 1968; Novick, 1966), the PANSS total s...
Procrastination leads to obstructive consequences for students in higher education. Cross-sectional studies show that procrastination is positively associated with study dissatisfaction and students' intentions to drop out of their university degree program. However, the reciprocal effects between these variables throughout an entire university deg...
Background
Previous research has shown that the more people believe their emotions are controllable and useful (BECU), the less they generally report psychological distress. Psychological distress, in turn, impacts health outcomes, and is among the most frequently reported complaints in psychotherapeutic and psychosomatic practice.
Objective
We ai...
A crucial challenge in Bayesian modeling using Markov chain Monte Carlo (MCMC) estimation is to diagnose the convergence of the chains so that the draws can be expected to closely approximate the posterior distribution on which inference is based. A close approximation guarantees that the MCMC error exhibits only a negligible impact on model estima...
In multilevel nonlinear structural equation modeling via latent moderated structural equations, the homoscedasticity assumption is typically made; that is, it is assumed that the variances within higher-level units are equal across these units. However, this assumption is frequently violated in research, potentially leading to inaccuracies in stand...
Epistemic virtues are character traits conducive to principled ways of thinking, leading to a life of flourishing. Recent years have witnessed an emergence of theoretical accounts describing how they develop. However, few if any studies have conducted rigorous empirical investigation into the mechanisms of intellectual virtue development. In this s...
In random-effects models, hierarchical linear models, or multilevel models, it is typically assumed that the variances within higher-level units are homoscedastic, meaning that they are equal across these units. However, this assumption is often violated in research. Depending on the degree of violation, this can lead to biased standard errors of h...
Theories of cognitive development among emerging adults posit that environmental and age-related influences are responsible for individual differences in complex reasoning abilities. Exposure to and engagement with a diverse set of ideas and perspectives is stipulated to provide a context in which individuals are positioned to coordinate, integrat...
The methods of psychological assessment can be used to address not only questions about the current state of psychological characteristics but also questions about change in these characteristics. This article discusses various approaches to quantifying individual, single-case change. First, ...
This article is a comment on Brady et al. (Educational Psychology Review, 35, 6–37, 2023) with which we largely agree. We add to this important discussion by pointing to the underestimated importance of communicating findings to stakeholders, which is important because recommendations are derived from them, and a correct understanding is essential...
The recently proposed continuous-time latent curve model with structured residuals (CT-LCM-SR) addresses several challenges associated with longitudinal data analysis in the behavioral sciences. First, it provides information about process trends and dynamics. Second, using the continuous-time framework, the CT-LCM-SR can handle unequally spaced me...
The relationship between students’ subject-specific academic self-concept and their academic achievement is one of the most widely researched topics in educational psychology. A large proportion of this research has considered cross-lagged panel models (CLPMs), oftentimes synonymously referred to as reciprocal effects models (REMs), as the gold sta...
To compute factor score estimates, lavaan version 0.6–12 offers the function lavPredict() that can be applied not only in single-level modeling but also in multilevel modeling, where characteristics of higher-level units such as working environments or team leaders are often assessed by ratings of employees. Surprisingly, the function provides res...
One major challenge of longitudinal data analysis is to find an appropriate statistical model that corresponds to the theory of change and the research questions at hand. In the present article, we argue that continuous-time models are well suited to study the continuously developing constructs of primary interest in the education sciences and outl...
Feedback may be given to teachers about how well they teach. Such feedback can inform teachers and stimulate professional development. The feedback is often based on student ratings of teaching quality. A teacher’s true value of their teaching quality can be estimated, for example, by the Maximum Likelihood (ML) estimate or the Expected A Posterior...
The research designs used in research on teaching and learning often have a multilevel structure in which students are nested within higher-level grouping units (e.g., students in classes or schools). Multilevel models, which make it possible to account for this multilevel structure, are used to analyze such data...
The default procedures of the software programs Mplus and lavaan tend to yield an inadmissible solution (also called a Heywood case) when the sample is small or the parameter is close to the boundary of the parameter space. In factor models, negative variance estimates often occur. One strategy to deal with this is fixing the variance to z...
Perceived individualized teacher frame of reference (students’ perception that teacher feedback considers students’ effort and former achievements) presumably has a positive effect on academic self-concept, especially for low-performing students. Following Dimensional Comparison Theory, individualized teacher frame in one school subject might negatively...
Croon and van Veldhoven discussed a model for analyzing micro–macro multilevel designs in which a variable measured at the upper level is predicted by an explanatory variable that is measured at the lower level. Additionally, the authors proposed an approach for estimating this model. In their approach, estimation is carried out by running a regres...
Two-way immersion (TWI) is a variant of the increasingly popular bilingual instruction. Most TWI research lacks longitudinal data or the consideration of background variables to control for possible selection effects. This article examines the development of German reading comprehension of TWI students (N = 984) from fourth to sixth grade compared...
Continuous-time modeling is gaining in popularity as more and more intensive longitudinal data need to be analyzed. Current Bayesian software implementations of continuous-time models suffer from rather high, inadequate run times. Therefore, we apply a model reformulation approach to reduce run time. In a simulation study, we investigate the estima...
The relationship between students’ subject-specific academic self-concept and their academic achievement is one of the most widely researched topics in educational psychology. A large body of this research has considered cross-lagged panel models (CLPMs), oftentimes synonymously referred to as reciprocal effects models (REMs), as a gold standard to...
Bayesian MCMC is a widely used model estimation technique, and software from the BUGS family, such as JAGS, has been popular for over two decades. Recently, Stan entered the market with promises of higher efficiency fueled by advanced and more sophisticated algorithms. With this study, we want to contribute empirical results to the discussion abou...
The concept of respect figures prominently in several theories on intergroup relations. Previous studies suggested that the experience of being respected is primarily related to the feeling of being recognized as an equal, as opposed to social recognition of needs or achievements. Those studies focused, however, on either minority groups or ad hoc...
In their situated expectancy-value theory, Eccles and Wigfield (2020) assume students’ competence and value beliefs to be situation-specific and thereby to be “situative” in nature. Even though motivation research has gradually been developing an understanding of this situative nature, for instance, by disentangling time-consistent and fluctuating...
In many disciplines of the social sciences, comparing a group mean with the total mean is a common but also challenging task. As one solution to this statistical testing problem, we propose using linear regression with weighted effect coding. For random samples, this procedure is straightforward and easy to implement by means of standard s...
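Weighted effect coding can be illustrated with a small least-squares fit. The sketch below is not the authors' implementation. It uses the textbook coding scheme (1 for members of a group, minus the ratio of group size to reference-group size for members of the reference group, 0 otherwise), and both the helper name `weighted_effect_codes` and the data are hypothetical. It covers point estimates only, not the test itself.

```python
import numpy as np

def weighted_effect_codes(groups, reference):
    """Design matrix (intercept plus weighted effect codes) for a grouping variable."""
    groups = np.asarray(groups)
    labels = [g for g in np.unique(groups) if g != reference]
    n_ref = np.sum(groups == reference)
    X = np.ones((groups.size, 1 + len(labels)))
    for j, g in enumerate(labels, start=1):
        n_g = np.sum(groups == g)
        # 1 for members of g, -n_g / n_ref for the reference group, 0 otherwise
        X[:, j] = np.where(groups == g, 1.0,
                           np.where(groups == reference, -n_g / n_ref, 0.0))
    return X, labels

# Hypothetical unbalanced data: three groups of sizes 5, 3, and 2
groups = np.array(["a"] * 5 + ["b"] * 3 + ["c"] * 2)
y = np.array([4.0, 5.0, 6.0, 5.0, 5.0, 7.0, 8.0, 9.0, 1.0, 3.0])

X, labels = weighted_effect_codes(groups, reference="c")
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("intercept (sample mean):", coef[0])
for label, b in zip(labels, coef[1:]):
    print(f"group {label} mean minus sample mean:", b)
```

With this coding, the intercept equals the mean of all observations, and each slope equals the difference between a group's mean and that overall mean, so the regression output addresses the group-versus-total comparison directly.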
Bayesian modeling using Markov chain Monte Carlo (MCMC) estimation requires researchers to decide not only whether estimation has converged but also whether the Bayesian estimates are well-approximated by summary statistics from the chain. On the contrary, software such as the Bayes module in Mplus, which helps researchers check whether convergence...
Our own prior research has demonstrated that respect for disapproved others predicts and might foster tolerance toward them. This means that without giving up their disapproval of others’ way of life, people can tolerate others when they respect them as equals (outgroup respect–tolerance hypothesis). Still, there was considerable variation in the s...
Cross-lagged panel models have been commonly applied to investigate the dynamic interplay of variables. In such discrete-time models, the size of the cross-lagged effects depends on the length of the time interval between the measurement occasions. Continuous-time modeling allows one to explore this interval dependence of cross-lagged effects and thus...
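The interval dependence mentioned above follows from the standard relation between a continuous-time drift matrix and the discrete-time effect matrix, A(Δt) = exp(drift · Δt). The sketch below is an illustration only: the drift values are hypothetical, and the matrix exponential is computed by eigendecomposition, which assumes a diagonalizable drift matrix.

```python
import numpy as np

def discrete_effects(drift, dt):
    """Auto- and cross-lagged effects implied by a drift matrix for interval dt.

    Computes expm(drift * dt) via eigendecomposition (assumes drift is diagonalizable).
    """
    eigvals, eigvecs = np.linalg.eig(drift)
    return (eigvecs @ np.diag(np.exp(eigvals * dt)) @ np.linalg.inv(eigvecs)).real

# Hypothetical drift matrix: negative auto-effects on the diagonal,
# cross-effects off the diagonal
drift = np.array([[-0.5, 0.2],
                  [0.1, -0.4]])

for dt in (0.5, 1.0, 2.0, 4.0):
    A = discrete_effects(drift, dt)
    print(f"dt={dt}: cross-lagged effects {A[0, 1]:.3f} and {A[1, 0]:.3f}")
```

The same underlying process yields different cross-lagged coefficients for different intervals, which is why discrete-time estimates from studies with different spacing are not directly comparable.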
In this commentary, we argue that contemporary psychology can be viewed as similar to democratic societies because they face similar challenges and offer similar solutions to these challenges. After a brief discussion of such challenges, we offer an outlook on a new postmodern methodology that may help psychology overcome these challenges and that...
The present study sought to determine the relative contributions of two aspects of school adjustment to children’s academic progress. We asked if social integration and persistence of effort mediate effects of preschool academic skills, peer problems, and disruptive behavior on Grade 4 achievement. Results based on a German sample of children from...
Bayesian approaches for estimating multilevel latent variable models can be beneficial in small samples. Prior distributions can be used to overcome small sample problems, for example, when priors that increase the accuracy of estimation are chosen. This article discusses two different but not mutually exclusive approaches for specifying priors. Bo...
In (post-)modern, plural societies, consisting of numerous subgroups, mutual respect between groups plays a central role for a constructive social and political life. In this article, we examine whether group members’ perception of being respected by outgroups fosters respect for these outgroups. In Study 1, we employed a panel sample of supporters...
In their situated expectancy-value theory, Eccles et al. (2020) assume students’ competence and value beliefs to be situation-specific and thereby to be “situative” in nature. Even though motivation research has gradually been developing an understanding of this situative nature, for instance, by disentangling time-consistent and fluctuating propor...
Autoregressive modeling has traditionally been concerned with time-series data from one unit (N = 1). For short time series (T < 50), estimation performance problems are well studied and documented. Fortunately, in psychological and social science research, besides T, another source of information is often available for model estimation, that is, t...
Bayesian estimation has become very popular. However, run time of Bayesian models is often unsatisfactorily high. In this illustration, we show how to reduce run time by (a) integrating out nuisance model parameters and by (b) reformulating the model based on covariances and means. The core concept is to use the sample scatter matrix which is in ou...
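The idea of working with means and the sample scatter matrix can be illustrated for a plain multivariate normal model, where the log-likelihood depends on the data only through the sample size, the mean vector, and the scatter matrix. The sketch below is not the authors' reformulation. It only checks this well-known identity numerically, and the data and parameter values are hypothetical.

```python
import numpy as np

def loglik_casewise(Y, mu, sigma):
    """Multivariate normal log-likelihood evaluated from the raw data rows."""
    n, p = Y.shape
    sigma_inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    dev = Y - mu
    quad = np.einsum("ij,jk,ik->", dev, sigma_inv, dev)  # sum of quadratic forms
    return -0.5 * (n * p * np.log(2 * np.pi) + n * logdet + quad)

def loglik_sufficient(n, ybar, S, mu, sigma):
    """Same log-likelihood from the sample size, mean vector, and scatter matrix S."""
    p = ybar.size
    sigma_inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    d = ybar - mu
    return -0.5 * (n * p * np.log(2 * np.pi) + n * logdet
                   + np.trace(sigma_inv @ S) + n * (d @ sigma_inv @ d))

rng = np.random.default_rng(1)
Y = rng.normal(size=(200, 3))            # hypothetical data
ybar = Y.mean(axis=0)
S = (Y - ybar).T @ (Y - ybar)            # scatter matrix around the sample mean

mu = np.array([0.1, -0.2, 0.0])          # hypothetical parameter values
sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.0]])

print(np.isclose(loglik_casewise(Y, mu, sigma),
                 loglik_sufficient(len(Y), ybar, S, mu, sigma)))
```

Once the mean vector and scatter matrix are computed, each likelihood evaluation no longer needs to touch the individual rows, which is one plausible reason such a reformulation can reduce run time.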
Dimensional comparisons, where students compare their achievements in different subjects, have a significant impact on the formation of students’ subject-specific self-concepts. This research examines the influence of five moderators that have been shown in previous research to affect the strength of dimensional comparison effects: (1) the intraind...
Developments concerning report cards have led to a potential shift from reporting traditional grades to reporting multiple competencies within and across subjects. In this study, we analyzed the dimensional structure of the teacher judgments on a competency-based report card on fourth-grade elementary school students (N = 469). With a methodologica...
According to the internal/external frame of reference model, academic achievement has a strong impact on people’s self-concept, both within and between subjects. We conducted a series of meta-analyses of k = 505 data sets containing the six bivariate correlations between achievement and self-concept in two subjects. Negative paths from achievement...
Abstract. Companies and organizations are under constant pressure to change. Especially since PISA 2000, schools in Germany have no longer been able to escape this pressure either. This article examines the extent to which communicative conditions in the change process are relevant to the perceived development of the...
Two-way immersion combines native speakers of the majority language with native speakers of a minority language and provides instruction in both languages. It remains unclear whether students have better reading comprehension in their L1 (native language hypothesis) and/or whether all students, irrespective of their L1, have better reading comprehe...
Most of the software that is available to implement Bayesian approaches uses Markov chain Monte Carlo (MCMC) methods. It is our impression that many researchers are primarily concerned with convergence as assessed by the Potential Scale Reduction (PSR) and that other aspects of MCMC are largely ignored. In this article, we argue that the precision...
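For orientation, the potential scale reduction can be computed from several chains as follows. This sketch uses the classic Gelman-Rubin form. Specific programs may define the PSR somewhat differently, so the formula is an assumption rather than the definition used in the article, and the chains are toy values.

```python
from math import sqrt
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Classic Gelman-Rubin statistic for a list of equally long chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    W = mean(variance(c) for c in chains)   # average within-chain variance
    B = n * variance(chain_means)           # between-chain variance
    var_plus = (n - 1) / n * W + B / n      # pooled estimate of the posterior variance
    return sqrt(var_plus / W)

# Toy chains: the second pair has not mixed (very different chain means)
mixed = [[0.0, 1.0, 2.0, 3.0], [0.5, 1.5, 2.5, 3.5]]
unmixed = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]

print(potential_scale_reduction(mixed))
print(potential_scale_reduction(unmixed))
```

A value close to 1 indicates agreement between chains but says nothing about how precisely posterior summaries are estimated from the draws.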
We conducted two studies to test the hypothesis that respect for disapproved outgroups increases tolerance toward them. In Study 1, we employed a panel sample of supporters of the Tea Party movement in the United States and found that Tea Party supporters’ respect for homosexuals and Muslims as equal fellow citizens positively predicted Tea Party s...
Using longitudinal research designs, we examine the role of politicization in the development of polarization. We conducted research in two different political and national contexts. In Study 1, we employ a panel sample of supporters of the Tea Party movement in the United States and examine the relationship between the strength of their politiciza...
Over the last decade or two, multilevel structural equation modeling (ML-SEM) has become a prominent modeling approach in the social sciences because it allows researchers to correct for sampling and measurement errors and thus to estimate the effects of Level 2 (L2) constructs without bias. Because the latent variable modeling software Mplus uses...