Multivariate Behavioral Research

Published by Taylor & Francis on behalf of the Society of Multivariate Experimental Psychology

Online ISSN: 1532-7906 · Print ISSN: 0027-3171

Disciplines: Behavior; Experimental Psychology; Psychometrics

Journal website · Author guidelines

Top-read articles

68 reads in the past 30 days

Moving Beyond Likert and Traditional Forced-Choice Scales: A Comprehensive Investigation of the Graded Forced-Choice Format

June 2023 · 1,558 Reads · 13 Citations

The graded forced-choice (FC) format has recently emerged as an alternative that may preserve the advantages of dichotomous FC measures while overcoming their issues. The current study presented the first large-scale evaluation of three types of FC measures (FC2, FC4, and FC5, with 2, 4, and 5 response options, respectively) and compared their performance to their Likert (LK) counterparts (LK2, LK4, and LK5) on (1) psychometric properties, (2) respondent reactions, and (3) susceptibility to response styles. Results showed that, compared to LK measures with the same number of response options, the three FC scales provided better support for the hypothesized factor structure, were perceived as more faking-resistant and more cognitively demanding, and were less susceptible to response styles. FC4/5 and LK4/5 demonstrated similarly good reliability, while LK2 provided more reliable scores than FC2. Across the three FC measures, FC4 and FC5 displayed comparable psychometric performance and respondent reactions. FC4 exhibited a moderate presence of extreme response style, while FC5 had a weak presence of both extreme and middle response styles. Based on these findings, the study recommends graded FC over dichotomous FC and LK formats, particularly FC5 when extreme response style is a concern.

20 reads in the past 30 days

Estimated Factor Scores Are Not True Factor Scores

January 2025 · 97 Reads

In this tutorial, we clarify the distinction between estimated factor scores, which are weighted composites of observed variables, and true factor scores, which are unobservable values of the underlying latent variable. Using an analogy with linear regression, we show how predicted values in linear regression share the properties of the most common type of factor score estimates, regression factor scores, computed from single-indicator and multiple indicator latent variable models. Using simulated data from 1- and 2-factor models, we also show how the amount of measurement error affects the reliability of regression factor scores, and compare the performance of regression factor scores with that of unweighted sum scores.
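The distinction the abstract draws can be made concrete with a small simulation. The sketch below is illustrative only — its loadings and sample size are arbitrary choices, not the authors' design: it generates data from a one-factor model with unequal loadings, computes regression factor scores from the model-implied covariance matrix, and compares them with unweighted sum scores against the simulated true factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
loadings = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])  # assumed unequal loadings
uniq = 1 - loadings**2                               # unique variances (standardized)

# simulate data from the 1-factor model: x = lambda * eta + error
eta = rng.standard_normal(n)                         # "true" factor scores
x = np.outer(eta, loadings) + rng.standard_normal((n, len(loadings))) * np.sqrt(uniq)

# regression (Thurstone) factor scores: weights w = Sigma^{-1} lambda,
# with Sigma the model-implied covariance matrix of the indicators
sigma = np.outer(loadings, loadings) + np.diag(uniq)
w = np.linalg.solve(sigma, loadings)
f_hat = x @ w

sum_score = x.sum(axis=1)                            # unweighted sum score

r_fhat = np.corrcoef(f_hat, eta)[0, 1]
r_sum = np.corrcoef(sum_score, eta)[0, 1]
print(f"regression scores r = {r_fhat:.3f}, sum scores r = {r_sum:.3f}")
```

Both correlations fall short of 1, which is the abstract's point: estimated factor scores are fallible composites, not the true scores. With equal loadings the two composites are proportional and tie; the edge of regression scores appears only when loadings differ.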

Aims and scope


Multivariate Behavioral Research publishes research related to multivariate analysis under three main categories: substantive articles, methodological articles, and tutorials.

  • Multivariate Behavioral Research is the flagship journal of the Society of Multivariate Experimental Psychology (SMEP). It is a journal devoted to the dissemination, evaluation, and application of quantitative methods to the behavioral sciences.
  • It aims to publish articles intended to cater to a wide audience of quantitative methodologists and substantive researchers who wish to use advanced methods in their work. Its goal is to have a long-term transformative effect on research in psychology and the behavioral sciences.
  • The journal is the result of the joint effort of a team of Associate Editors and seeks four kinds of contributions...

For a full list of the subject areas this journal covers, please visit the journal website.

Recent articles


Bayesian Growth Curve Modeling with Measurement Error in Time

March 2025 · 3 Reads

Interrater Reliability for Interdependent Social Network Data: A Generalizability Theory Approach

February 2025 · 6 Reads

We propose interrater reliability coefficients for observational interdependent social network data, which are dyadic data from a network of interacting subjects that are observed by external raters. Using the social relations model, dyadic scores of subjects' behaviors during these interactions can be decomposed into actor, partner, and relationship effects. These effects constitute different facets of theoretical interest about which researchers formulate research questions. Based on generalizability theory, we extended the social relations model with rater effects, resulting in a model that decomposes the variance of dyadic observational data into effects of actors, partners, relationships, raters, and their statistical interactions. We used the variances of these effects to define intraclass correlation coefficients (ICCs) that indicate the extent to which the actor, partner, and relationship effects can be generalized across external raters. We proposed Markov chain Monte Carlo estimation of a Bayesian hierarchical linear model to estimate the ICCs, and tested their bias and coverage in a simulation study. The method is illustrated using data on social mimicry.
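The variance-components logic behind such ICCs can be illustrated in a much simpler setting. The sketch below is not the authors' social relations model: it simulates a plain subjects-by-raters design with assumed variances and computes a one-way ICC(1) from ANOVA mean squares, showing how generalizability across raters falls out of variance components.

```python
import numpy as np

rng = np.random.default_rng(2)
n_subj, n_raters = 200, 4

# simulate: score = subject effect + rater effect + residual
subj = rng.normal(0, 1.0, n_subj)[:, None]       # subject SD = 1.0
rater = rng.normal(0, 0.3, n_raters)[None, :]    # rater SD   = 0.3
resid = rng.normal(0, 0.5, (n_subj, n_raters))   # residual SD = 0.5
y = subj + rater + resid

# one-way ANOVA variance components (raters treated as interchangeable)
subj_means = y.mean(axis=1)
msb = n_raters * subj_means.var(ddof=1)                                  # between-subject MS
msw = ((y - subj_means[:, None]) ** 2).sum() / (n_subj * (n_raters - 1))  # within MS
var_subj = (msb - msw) / n_raters

icc1 = var_subj / (var_subj + msw)  # generalizability of a single rater's score
print(f"ICC(1) = {icc1:.3f}")
```

Here rater and residual variance both count as error; the authors' model instead separates rater effects and their interactions with actor, partner, and relationship effects into distinct components.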





Non-Stationarity in Time-Series Analysis: Modeling Stochastic and Deterministic Trends

January 2025 · 28 Reads

Time series analysis is increasingly popular across scientific domains. A key concept in time series analysis is stationarity, the stability of statistical properties of a time series. Understanding stationarity is crucial to addressing frequent issues in time series analysis such as the consequences of failing to model non-stationarity, how to determine the mechanisms generating non-stationarity, and consequently how to model those mechanisms (i.e., by differencing or detrending). However, many empirical researchers have a limited understanding of stationarity, which can lead to the use of incorrect research practices and misleading substantive conclusions. In this paper, we address this problem by answering these questions in an accessible way. To this end, we study how researchers can use detrending and differencing to model trends in time series analysis. We show via simulation the consequences of modeling trends inappropriately, and evaluate the performance of one popular approach to distinguish different trend types in empirical data. We present these results in an accessible way, providing an extensive introduction to key concepts in time series analysis, illustrated throughout with simple examples. Finally, we discuss a number of take-home messages and extensions to standard approaches, which directly address more complex time-series analysis problems encountered by empirical researchers.
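The two trend mechanisms and their remedies can be illustrated in a few lines. This is a toy sketch with assumed parameter values, not the paper's simulation design: detrending removes a deterministic linear trend, while first differencing removes a stochastic (random-walk) trend.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
t = np.arange(T)

det_trend = 0.05 * t + rng.standard_normal(T)    # deterministic trend + stationary noise
stoch_trend = np.cumsum(rng.standard_normal(T))  # stochastic trend (random walk)

# detrending: subtract an OLS-fitted line (appropriate for deterministic trends)
slope, intercept = np.polyfit(t, det_trend, 1)
detrended = det_trend - (slope * t + intercept)

# differencing: first differences (appropriate for unit-root / stochastic trends)
differenced = np.diff(stoch_trend)

print(f"detrended sd = {detrended.std():.2f}, differenced sd = {differenced.std():.2f}")
```

Applying the wrong remedy is the failure mode the paper warns about: differencing a trend-stationary series over-differences it and induces artificial negative lag-1 autocorrelation, while merely detrending a random walk leaves a non-stationary residual.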


Evidence That Growth Mixture Model Results Are Highly Sensitive to Scoring Decisions

January 2025 · 5 Reads

Interest in identifying latent growth profiles to support the psychological and social-emotional development of individuals has translated into the widespread use of growth mixture models (GMMs). In most cases, GMMs are based on scores from item responses collected using survey scales or other measures. Research already shows that GMMs can be sensitive to departures from ideal modeling conditions and that growth model results outside of GMMs are sensitive to decisions about how item responses are scored, but the impact of scoring decisions on GMMs has never been investigated. We start to close that gap in the literature with the current study. Through empirical and Monte Carlo studies, we show that GMM results, including convergence, class enumeration, and latent growth trajectories within class, are extremely sensitive to seemingly arcane measurement decisions. Further, our results make clear that, because GMM latent classes are not known a priori, measurement models used to produce scores for use in GMMs are, almost by definition, misspecified because they cannot account for group membership. Misspecification of the measurement model then, in turn, biases GMM results. Practical implications of these results are discussed. Our findings raise serious concerns that many results in the current GMM literature may be driven, in part or whole, by measurement artifacts rather than substantive differences in developmental trends.


Causal Estimands and Multiply Robust Estimation of Mediated-Moderation

January 2025 · 33 Reads · Full-text available

When studying effect heterogeneity between different subgroups (i.e., moderation), researchers are frequently interested in the mediation mechanisms underlying the heterogeneity, that is, the mediated moderation. For assessing mediated moderation, conventional methods typically require parametric models to define mediated moderation, which has limitations when parametric models may be misspecified and when causal interpretation is of interest. For causal interpretations about mediation, causal mediation analysis is increasingly popular but is underdeveloped for mediated moderation analysis. In this study, we extend the causal mediation literature, and we propose a novel method for mediated moderation analysis. Using the potential outcomes framework, we obtain two causal estimands that decompose the total moderation: (i) the mediated moderation attributable to a mediator and (ii) the remaining moderation unattributable to the mediator. We also develop a multiply robust estimation method for the mediated moderation analysis, which can incorporate machine learning methods in the inference of the causal estimands. We evaluate the proposed method through simulations. We illustrate the proposed mediated moderation analysis by assessing the mediation mechanism that underlies the gender difference in the effect of a preventive intervention on adolescent behavioral outcomes.


MIIVefa: An R Package for a New Type of Exploratory Factor Analysis Using Model-Implied Instrumental Variables

December 2024 · 109 Reads

We present the R package MIIVefa, designed to implement the MIIV-EFA algorithm. This algorithm explores and identifies the underlying factor structure within a set of variables. The resulting model is not a typical exploratory factor analysis (EFA) model because some loadings are fixed to zero and it allows users to include hypothesized correlated errors such as might occur with longitudinal data. As such, it resembles a confirmatory factor analysis (CFA) model. But, unlike CFA, the MIIV-EFA algorithm determines the number of factors and the items that load on these factors directly from the data. We provide both simulation and empirical examples to illustrate the application of MIIVefa and discuss its benefits and limitations.


On the Latent Structure of Responses and Response Times from Multidimensional Personality Measurement with Ordinal Rating Scales

December 2024 · 15 Reads

In this article, we propose latent variable models that jointly account for responses and response times (RTs) in multidimensional personality measurements. We address two key research questions regarding the latent structure of RT distributions through model comparisons. First, we decompose RT into decision and non-decision times by incorporating irreducible minimum shifts in RT distributions, as done in cognitive decision-making models. Second, we investigate whether the speed factor underlying decision times should be multidimensional with the same latent structure as personality traits, or if a unidimensional speed factor suffices. Comprehensive model comparisons across four distinct datasets suggest that a joint model with person-specific parameters to account for shifts in RT distributions and a unidimensional speed factor provides the best account for ordinal responses and RTs. Posterior predictive checks further confirm these findings. Additionally, simulation studies validate the parameter recovery of the proposed models and support the empirical results. Most importantly, failing to account for the irreducible minimum shift in RT distributions leads to systematic biases in other model components and severe underestimation of the nonlinear relationship between responses and RTs.


Evaluating Contextual Models for Intensive Longitudinal Data in the Presence of Noise

December 2024 · 13 Reads · 1 Citation

Research into affect now frequently employs intensive longitudinal data to assess fluctuations in daily emotional experiences. The resulting data are often analyzed with moderated autoregressive models to capture the influence of contextual events on the emotion dynamics. The presence of noise (e.g., measurement error) in the measures of these contextual events, however, is commonly ignored in such models. Disregarding noise in the covariates when it is present may result in biased parameter estimates and wrong conclusions about the underlying emotion dynamics. In a simulation study we evaluate the estimation accuracy, assessed in terms of bias and variance, of different moderated autoregressive models in the presence of noise in the covariate. We show that estimation accuracy decreases as the amount of noise in the covariate increases, and that the bias is magnified by a larger effect of the covariate, a slower switching frequency of the covariate, a discrete rather than a continuous covariate, and constant rather than occasional noise in the covariate. We further show that the bias resulting from a noisy covariate does not decrease when the number of observations increases. We end with a few recommendations for applying moderated autoregressive models based on our simulation.
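A stripped-down version of the bias studied above, with assumed parameter values rather than the authors' simulation design: a binary contextual covariate moderates the autoregressive coefficient, and "noise" is introduced by randomly flipping 20% of the observed covariate codes.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 20_000
phi0, phi1 = 0.3, 0.3   # baseline AR coefficient and moderation effect

z = (rng.random(T) < 0.5).astype(float)   # binary contextual covariate
y = np.zeros(T)
for t in range(1, T):
    y[t] = (phi0 + phi1 * z[t]) * y[t - 1] + rng.standard_normal()

def fit_phi1(z_obs):
    """OLS estimate of the moderation coefficient from the observed covariate."""
    X = np.column_stack([np.ones(T - 1), y[:-1], z_obs[1:] * y[:-1]])
    return np.linalg.lstsq(X, y[1:], rcond=None)[0][2]

# noise: randomly flip 20% of the covariate codes ("measurement error")
flip = rng.random(T) < 0.2
z_noisy = np.where(flip, 1 - z, z)

phi1_clean = fit_phi1(z)
phi1_noisy = fit_phi1(z_noisy)
print(f"clean covariate: {phi1_clean:.3f}, noisy covariate: {phi1_noisy:.3f}")
```

The noisy-covariate estimate is attenuated well below the true value of 0.3, and because the misclassification does not average out, the attenuation does not shrink as T grows — matching the abstract's observation that more observations do not fix a noisy covariate.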


A Gentle Introduction and Application of Feature-Based Clustering with Psychological Time Series

December 2024 · 69 Reads

Psychological researchers and practitioners collect increasingly complex time series data aimed at identifying differences between the developments of participants or patients. Past research has proposed a number of dynamic measures that describe meaningful developmental patterns for psychological data (e.g., instability, inertia, linear trend). Yet, commonly used clustering approaches are often not able to include these meaningful measures (e.g., due to model assumptions). We propose feature-based time series clustering as a flexible, transparent, and well-grounded approach that clusters participants based on the dynamic measures directly using common clustering algorithms. We introduce the approach and illustrate the utility of the method with real-world empirical data that highlight common ESM challenges of multivariate conceptualizations, structural missingness, and non-stationary trends. We use the data to showcase the main steps of input selection, feature extraction, feature reduction, feature clustering, and cluster evaluation. We also provide practical algorithm overviews and readily available code for data preparation, analysis, and interpretation.


Using Projective IRT to Evaluate the Effects of Multidimensionality on Unidimensional IRT Model Parameters

December 2024 · 37 Reads

The application of unidimensional IRT models requires item response data to be unidimensional. Often, however, item response data contain a dominant dimension, as well as one or more nuisance dimensions caused by content clusters. Applying a unidimensional IRT model to multidimensional data causes violations of local independence, which can vitiate IRT applications. To evaluate and, possibly, remedy the problems caused by forcing unidimensional models onto multidimensional data, we consider the creation of a projected unidimensional IRT model, where the multidimensionality caused by nuisance dimensions is controlled for by integrating them out from the model. Specifically, when item response data have a bifactor structure, one can create a unidimensional model based on projecting to the general factor. Importantly, the projected unidimensional IRT model can be used as a benchmark for comparison to a unidimensional model to judge the practical consequences of multidimensionality. Limitations of the proposed approach are detailed.


On the Importance of Considering Concurrent Effects in Random-Intercept Cross-Lagged Panel Modelling: Example Analysis of Bullying and Internalising Problems

November 2024 · 30 Reads

Random-intercept cross-lagged panel models (RI-CLPMs) are increasingly used to investigate research questions about how one variable at one time point affects another variable at the subsequent time point. Due to the implied temporal sequence of events in such research designs, interpretations of RI-CLPMs primarily focus on the longitudinal cross-lagged paths, disregarding concurrent associations by modeling them only as residual covariances. However, this can bias the cross-lagged effects, especially when data collected at the same time point refer to different reference timeframes, creating a temporal sequence of events for constructs measured concurrently. To examine this issue, we conducted a series of empirical analyses, using data from the longitudinal z-proso study, of how modeling or not modeling directional within-time-point associations affects inferences drawn from RI-CLPMs. Results highlight that not considering directional concurrent effects may lead to biased cross-lagged effects. Thus, it is essential to carefully consider potential directional concurrent effects when choosing models to analyze directional associations between variables over time. If temporal sequences of concurrent effects cannot be clearly established, testing multiple models and drawing conclusions based on the robustness of effects across all models is recommended.


Latently Mediating: A Bayesian Take on Causal Mediation Analysis with Structured Survey Data

November 2024 · 13 Reads

In this paper, we propose a Bayesian causal mediation approach to the analysis of experimental data when both the outcome and the mediator are measured through structured questionnaires based on Likert-scaled inquiries. Our estimation strategy builds upon the error-in-variables literature and, specifically, it leverages Item Response Theory to explicitly model the observed surrogate mediator and outcome measures. We employ their elicited latent counterparts in a simple g-computation algorithm, where we exploit the fundamental identifying assumptions of causal mediation analysis to impute all the relevant counterfactuals and estimate the causal parameters of interest. We finally devise a sensitivity analysis procedure to test the robustness of the proposed methods to the restrictive requirement of mediator's conditional ignorability. We demonstrate the functioning of our proposed methodology through an empirical application using survey data from an online experiment on food purchasing intentions and the effect of different labeling regimes.


Why You Should Not Estimate Mediated Effects Using the Difference-in-Coefficients Method When the Outcome is Binary

October 2024 · 63 Reads

Despite previous warnings against the use of the difference-in-coefficients method for estimating the indirect effect when the outcome in the mediation model is binary, the difference-in-coefficients method remains readily used in a variety of fields. The continued use of this method is presumably because of the lack of awareness that this method conflates the indirect effect estimate and non-collapsibility. In this paper, we aim to demonstrate the problems associated with the difference-in-coefficients method for estimating indirect effects for mediation models with binary outcomes. We provide a formula that decomposes the difference-in-coefficients estimate into (1) an estimate of non-collapsibility, and (2) an indirect effect estimate. We use a simulation study and an empirical data example to illustrate the impact of non-collapsibility on the difference-in-coefficients estimate of the indirect effect. Further, we demonstrate the application of several alternative methods for estimating the indirect effect, including the product-of-coefficients method and regression-based causal mediation analysis. The results emphasize the importance of choosing a method for estimating the indirect effect that is not affected by non-collapsibility.
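The non-collapsibility problem can be demonstrated directly. The sketch below uses assumed effect sizes, not the paper's simulation design: even with the mediation model correctly specified and no confounding, the difference-in-coefficients estimate disagrees with the product-of-coefficients estimate, because marginal and conditional logistic coefficients differ.

```python
import numpy as np

def logit_fit(X, y, iters=25):
    """Newton-Raphson logistic regression; X must include an intercept column."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        W = p * (1.0 - p)
        b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return b

rng = np.random.default_rng(3)
n = 100_000
x = rng.standard_normal(n)                  # exposure
m = 0.8 * x + rng.standard_normal(n)        # mediator: a = 0.8
lin = -0.5 + 0.4 * x + 0.6 * m              # logit outcome: direct = 0.4, b = 0.6
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-lin))).astype(float)

ones = np.ones(n)
c_total = logit_fit(np.column_stack([ones, x]), y)[1]   # marginal ("total") coefficient
fit_adj = logit_fit(np.column_stack([ones, x, m]), y)   # mediator-adjusted model
c_direct, b_hat = fit_adj[1], fit_adj[2]
a_hat = np.polyfit(x, m, 1)[0]                          # mediator regression slope

diff_est = c_total - c_direct   # difference-in-coefficients "indirect effect"
prod_est = a_hat * b_hat        # product-of-coefficients estimate
print(f"difference method: {diff_est:.3f}, product method: {prod_est:.3f}")
```

The gap between the two estimates is not sampling error: marginalizing a logistic model over the mediator attenuates the exposure coefficient, so the difference-in-coefficients estimate mixes the indirect effect with this non-collapsibility effect.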


A Causal View on Bias in Missing Data Imputation: The Impact of Evil Auxiliary Variables on Norming of Test Scores

October 2024 · 28 Reads

Among the most important merits of modern missing data techniques such as multiple imputation (MI) and full-information maximum likelihood estimation is the possibility to include additional information about the missingness process via auxiliary variables. During the past decade, the choice of auxiliary variables has been investigated under a variety of different conditions, and more recent research points to the potentially biasing effect of certain auxiliary variables, particularly colliders (Thoemmes & Rose, 2014). In this article, we further examine the biasing mechanisms of certain auxiliary variables considered in previous research, focusing on their effects on individual diagnosis based on norming, in which the whole distribution of a variable is of interest rather than average coefficients (e.g., means). For this, we first provide the theoretical underpinnings of the mechanisms under study and then provide two focused simulations that (i) directly expand on the collider scenario in Thoemmes and Rose (2014, appendix A) by considering outcomes that are relevant to norming and (ii) extend the scenarios under consideration by instrumental variable mechanisms. We illustrate the bias mechanisms for two different norming approaches and exemplify the procedures by means of an empirical example. We end by discussing limitations and implications of our research.


Make Some Noise: Generating Data from Imperfect Factor Models

October 2024 · 8 Reads · 1 Citation

Researchers simulating covariance structure models sometimes add model error to their data to produce model misfit. Presently, the most popular methods for generating error-perturbed data are those by Tucker, Koopman, and Linn (TKL), Cudeck and Browne (CB), and Wu and Browne (WB). Although all of these methods include parameters that control the degree of model misfit, none can generate data that reproduce multiple fit indices. To address this issue, we describe a multiple-target TKL method that can generate error-perturbed data that will reproduce target RMSEA and CFI values either individually or together. To evaluate this method, we simulated error-perturbed correlation matrices for an array of factor analysis models using the multiple-target TKL method, the CB method, and the WB method. Our results indicated that the multiple-target TKL method produced solutions with RMSEA and CFI values that were closer to their target values than those of the alternative methods. Thus, the multiple-target TKL method should be a useful tool for researchers who wish to generate error-perturbed correlation matrices with a known degree of model error. All functions that are described in this work are available in the fungible R library. Additional materials (e.g., R code, supplemental results) are available at https://osf.io/vxr8d/.


Exploring Estimation Procedures for Reducing Dimensionality in Psychological Network Modeling

September 2024 · 19 Reads

To understand psychological data, it is crucial to examine the structure and dimensions of variables. In this study, we examined alternative estimation algorithms to the conventional GLASSO-based exploratory graph analysis (EGA) in network psychometric models to assess the dimensionality structure of the data. The study applied Bayesian conjugate or Jeffreys' priors to estimate the graphical structure and then used the Louvain community detection algorithm to partition and identify groups of nodes, which allowed the detection of the multi- and unidimensional factor structures. Monte Carlo simulations suggested that the two alternative Bayesian estimation algorithms had comparable or better performance when compared with the GLASSO-based EGA and conventional parallel analysis (PA). When estimating the multidimensional factor structure, the analytically based method (i.e., EGA.analytical) showed the best balance between accuracy and mean biased/absolute errors, with the highest accuracy tied with EGA but with the smallest errors. The sampling-based approach (EGA.sampling) yielded higher accuracy and smaller errors than PA; lower accuracy but also lower errors than EGA. Techniques from the two algorithms had more stable performance than EGA and PA across different data conditions. When estimating the unidimensional structure, the PA technique performed the best, followed closely by EGA, and then EGA.analytical and EGA.sampling. Furthermore, the study explored four full Bayesian techniques to assess dimensionality in network psychometrics. The results demonstrated superior performance when using Bayesian hypothesis testing or deriving posterior samples of graph structures under small sample sizes. The study recommends using the EGA.analytical technique as an alternative tool for assessing dimensionality and advocates for the usefulness of the EGA.sampling method as a valuable alternative technique. The findings also indicated encouraging results for extending the regularization-based network modeling EGA method to the Bayesian framework and discussed future directions in this line of work. The study illustrated the practical application of the techniques to two empirical examples in R.


Journal metrics


  • Journal Impact Factor™: 5.3 (2023)
  • Acceptance rate: 24%
  • CiteScore™: 7.6 (2023)
  • Submission to first decision: 70 days
  • Acceptance to publication: 66 days
  • SNIP: 2.069 (2023)
  • SJR: 2.351 (2023)

Editors