About
112
Publications
25,451
Reads
4,589
Citations
Introduction
Current institution
Additional affiliations
September 1997 - August 2002
September 2007 - July 2013
September 2002 - August 2007
Education
September 1993 - August 1996
Publications (112)
We introduce a general framework for latent variable modeling, named Generalized Latent Variable Models for Location, Scale, and Shape parameters (GLVM-LSS). This framework extends the generalized linear latent variable model beyond the exponential family distributional assumption and enables the modeling of distributional parameters other than the...
This paper introduces the generalized Hausman test as a novel method for detecting the non-normality of the latent variable distribution of the unidimensional latent trait model for binary data. The test utilizes the pairwise maximum likelihood estimator for the parameters of the latent trait model, which assumes normality of the latent variable, a...
Peer grading is an educational system in which students assess each other's work. It is commonly applied under Massive Open Online Course (MOOC) and offline classroom settings. With this system, instructors receive a reduced grading workload, and students enhance their understanding of course materials by grading others' work. Peer grading data hav...
This paper discusses estimation and limited‐information goodness‐of‐fit test statistics in factor models for binary data using pairwise likelihood estimation and sampling weights. The paper extends the applicability of pairwise likelihood estimation for factor models with binary data to accommodate complex sampling designs. Additionally, it introdu...
Ensuring fairness in instruments like survey questionnaires or educational tests is crucial. One way to address this is by a Differential Item Functioning (DIF) analysis, which examines if different subgroups respond differently to a particular item, controlling for their overall latent construct level. DIF analysis is typically conducted to assess...
This paper presents the generalized Hausman test to detect non-normality of the latent variable distribution in unidimensional Item Response Theory (IRT) models for binary data. The test is based on the estimators resulting from the two-parameter IRT model, that assumes normality of the latent variable, and the semi-nonparametric IRT model, that as...
A composite likelihood is an inference function derived by multiplying a set of likelihood components. This approach provides a flexible framework for drawing inference when the likelihood function of a statistical model is computationally intractable. While composite likelihood has computational advantages, it can still be demanding when dealing w...
The paper proposes a novel model assessment paradigm aiming to address shortcomings of posterior predictive $p$-values, which provide the default metric of fit for Bayesian structural equation modelling (BSEM). The model framework presented in the paper focuses on the approximate zero approach (Psychological Methods, 17, 2012, 313), which inv...
Measurement invariance across items is key to the validity of instruments like a survey questionnaire or an educational test. Differential item functioning (DIF) analysis is typically conducted to assess measurement invariance at the item level. Traditional DIF analysis methods require knowing the comparison groups (reference and focal groups) and...
The chapter reviews uni- and multidimensional association models (AMs) that provide a parsimonious modelling of interactions between categorical variables. AMs go beyond the standard log-linear modeling framework to model non-saturated models that exist between a saturated and an independence model, including also log-nonlinear models. AMs are pres...
Researchers have widely used exploratory factor analysis (EFA) to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA that they often use to find interpretable loading matrices. In this paper, we propose a new family of oblique rotations based on component-wise [Formula: see...
This paper extends the generalized Hausman test to detect non-normality of the latent variable distribution in unidimensional IRT models for binary data. To build the test, we consider the estimator obtained from the two-parameter IRT model, that assumes normality of the latent variable, and the estimator obtained under a semi-nonparametric framewo...
The paper proposes a new latent variable model for the simultaneous (two-way) detection of outlying individuals and items for item-response-type data. The proposed model is a synergy between a factor model for binary responses and continuous response times that captures normal item response behaviour and a latent class model that captures the outly...
Exploratory factor analysis (EFA) has been widely used to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA that are widely used to find interpretable loading matrices. This paper proposes a new family of oblique rotations based on component-wise $L^p$ loss functions $(0 <...
We develop an efficient Bayesian sequential inference framework for factor analysis models observed via various data types, such as continuous, binary and ordinal data. In the continuous data case, where it is possible to marginalise over the latent factors, the proposed methodology tailors the Iterated Batch Importance Sampling (IBIS) of Chopin (2...
Families extend well beyond households. In particular, connections between parents and their adult offspring are often close and sustained, and transfers may include financial assistance, practical support, or both, provided by either generation to the other. Yet this major engine of welfare production, distribution, and redistribution has only rec...
Objective:
To inform the development of an AGREE II extension specifically tailored for surgical guidelines.
Summary background data:
AGREE II was designed to inform the development, reporting, and appraisal of clinical practice guidelines. Previous research has suggested substantial room for improvement of the quality of surgical guidelines.
Methods:
A p...
In a quantitative synthesis of studies via meta‐analysis, it is possible that some studies provide a markedly different relative treatment effect or have a large impact on the summary estimate and/or heterogeneity. Extreme study effects (outliers) can be detected visually with forest/funnel plots and by using statistical outlying detection methods....
The paper proposes a joint mixture model to model non-ignorable drop-out in longitudinal cohort studies of mental health outcomes. The model combines a (non)-linear growth curve model for the time-dependent outcomes and a discrete-time survival model for the drop-out with random effects shared by the two sub-models. The mixture part of the model ta...
This paper studies the Type I error, false positive rates, and power of four versions of the Lagrange Multiplier test to detect measurement non-invariance in Item Response Theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange Multiplier test computed with the Hessian and cross-product approach, the...
This article studies the power of the Lagrange Multiplier Test and the Generalized Lagrange Multiplier Test to detect measurement non-invariance in Item Response Theory (IRT) models for binary data. We study the performance of these two tests under correct model specification and incorrect distribution of the latent variable. The asymptotic distrib...
Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework and under a missing at random (MAR) mechanism are proposed. Under a full information likelihood estimation framework and MAR, ignorability of the missing data mechanism does not lead to biased e...
We propose a generalised framework for Bayesian Structural Equation Modelling (SEM) that can be applied to a variety of data types. The introduced framework focuses on the approximate zero approach, according to which a hypothesised structure is formulated with approximate rather than exact zero. It extends previously suggested models by \citeA{MA1...
Penalized factor analysis is an efficient technique that produces a factor loading matrix with many zero elements thanks to the introduction of sparsity-inducing penalties within the estimation process. However, sparse solutions and stable model selection procedures are only possible if the employed penalty is non-differentiable, which poses certai...
Various global health initiatives are currently advocating the elimination of schistosomiasis within the next decade. Schistosomiasis is a highly debilitating tropical infectious disease with severe burden of morbidity and thus operational research accurately evaluating diagnostics that quantify the epidemic status for guiding effective strategies...
The likelihood ratio test (LRT) is widely used for comparing the relative fit of nested latent variable models. Following Wilks’ theorem, the LRT is conducted by comparing the LRT statistic with its asymptotic distribution under the restricted model, a $\chi^2$ distribution with degrees of freedom equal to the difference in the number of fre...
The likelihood ratio test (LRT) is widely used for comparing the relative fit of nested latent variable models. Following Wilks' theorem, the LRT is conducted by comparing the LRT statistic with its asymptotic distribution under the restricted model, a $\chi^2$-distribution with degrees of freedom equal to the difference in the number of free param...
This paper was inspired by the presentations and discussions from the panel “Successful Careers in Academia and Industry and What We Can Learn from Them” that took place at the IMPS meeting in 2019. In this paper, we discuss what makes a career successful in academia and industry and we provide examples from the past to the present. We include educ...
Partial association refers to the relationship between variables $Y_1, Y_2, \ldots, Y_K$ while adjusting for a set of covariates $X = \{X_1, \ldots, X_p\}$. To assess such an association when the $Y_k$’s are recorded on ordinal scales, a classical approach is to use partial correlation between the latent continuous variables. This so-called polychoric correlation is inadequate, as i...
We propose a multilevel structural equation model to investigate the interrelationships between childhood socio‐economic circumstances, partnership formation and stability, and mid‐life health, using data from the 1958 British birth cohort. The structural equation model comprises latent class models that characterize the patterns of change in four...
Tests are a building block of our modern education system. Many tests are high-stake, such as admission, licensing, and certification tests, that can significantly change one's life trajectory. For this reason, ensuring fairness in educational tests is becoming an increasingly important problem. This paper concerns the issue of item preknowledge in...
Pairwise likelihood is a limited information estimation method that has also been used for estimating the parameters of latent variable and structural equation models. Pairwise likelihood is a special case of composite likelihood methods that uses lower-order conditional or marginal log-likelihoods instead of the full log-likelihood. The composite...
Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework are proposed. In confirmatory factor analysis (CFA) with categorical observed variables and data being missing at random, multiple imputation followed by the three-stage weighted least squares i...
The problem of penalized maximum likelihood (PML) for an exploratory factor analysis (EFA) model is studied in this paper. An EFA model is typically estimated using maximum likelihood and then the estimated loading matrix is rotated to obtain a sparse representation. Penalized maximum likelihood simultaneously fits the EFA model and produces a spar...
This chapter presents the most widely known factor analysis models for handling categorical responses. The models specify the probability of response in each category as a function of parameters attributed to items and of a latent value attributed to the respondent. The chapter reviews the estimation methods that have been widely used in estimating...
The 3-step approach has been recently advocated over the simultaneous 1-step approach to model a distal outcome predicted by a latent categorical variable. We generalize the 3-step approach to situations where the distal outcome is predicted by multiple and possibly associated latent categorical variables. Although the simultaneous 1-step approach...
The aim of the present study was to identify the sexual behavior, attitudes, beliefs, and knowledge on sexually transmitted infections (STIs) focused on human papilloma virus (HPV) in the Greek adolescent population. The participants were 4547 adolescents, a representative sample for Greek territory with a mean age of 17 years. After written permis...
Correlated multivariate ordinal data can be analysed with structural equation models. Parameter estimation has been tackled in the literature using limited-information methods including three-stage least squares and pseudo-likelihood estimation methods such as pairwise maximum likelihood estimation. In this paper, two likelihood ratio test statisti...
When considering data from many trials, it is likely that some of them present a markedly different intervention effect or exert an undue influence on the summary results. We develop a forward search algorithm for identifying outlying and influential studies in meta-analysis models. The forward search algorithm starts by fitting the hypothesized mo...
When missing data are produced by a non-ignorable nonresponse mechanism, analysis of the observed data should include a model for the probabilities of responding. In this paper we propose such models for nonresponse in survey questions which are treated as multiple-item measures of latent constructs and analysed using latent variable models. The no...
Composite likelihood estimation has been proposed in the literature for handling intractable likelihoods. In particular, pairwise likelihood estimation has been recently proposed to estimate models with latent variables and random effects that involve high dimensional integrals. Pairwise estimators are asymptotically consistent and normally distrib...
In studies of multiple groups of respondents, such as cross-national surveys and cross-cultural assessments in psychological or educational testing, an important methodological consideration is the comparability or “equivalence” of measurement across the groups. Ideally full equivalence would hold, but very often it does not. If nonequivalence of m...
IRT has been increasingly utilized in psychiatry for the purpose of describing the relationship among items in psychiatric disorder symptom batteries hypothesized to be indicators of an underlying latent continuous variable representing the severity of the psychiatric disorder. It is common to find zero-inflated data such that a large proportion of the samp...
The article provides an overview of latent variable models including the classical factor analysis model, factor models for categorical manifest variables, structural equation models and more recent extensions for mixed (categorical and continuous) manifest and latent variables. Emphasis is placed on model specification, estimation methods, goodness...
Longitudinal data are collected for studying changes across time. We consider multivariate longitudinal data where multiple observed variables, measured at each time point, are used as indicators for theoretical constructs (latent variables) of interest. A common problem in longitudinal studies is dropout, where subjects exit the study prematurely....
Models with random effects/latent variables are widely used for capturing unobserved heterogeneity in multilevel/hierarchical data and account for associations in multivariate data. The estimation of those models becomes cumbersome as the number of latent variables increases due to high-dimensional integrations involved. Composite likelihood is a p...
Regular treatment with praziquantel (PZQ) is the strategy for human schistosomiasis control aiming to prevent morbidity in later life. With the recent resolution on schistosomiasis elimination by the 65th World Health Assembly, appropriate diagnostic tools to inform interventions are keys to their success. We present a discrete Markov chains modell...
In latent variable models the parameter estimation can be implemented by using the joint or the marginal likelihood, based on independence or conditional independence assumptions. The same dilemma occurs within the Bayesian framework with respect to the estimation of the Bayesian marginal (or integrated) likelihood, which is the main tool for model...
Responses to a set of indicators, or items, or variables are often used in social sciences for measuring unobserved constructs such as attitudes. Latent variable models, which are also known as factor analysis models, are used for linking the observed responses to the latent constructs. Often, some respondents provide random responses to the items. We d...
Based on a longtime course for master's level students at the London School of Economics and Politics, where the authors are based, this text concentrates on the multivariate methods so useful to social science problems involving correlational rather than causal relationships. Chapters with application examples and further readings cover data preli...
The marginal likelihood can be notoriously difficult to compute, and particularly so in high-dimensional problems. Chib and Jeliazkov employed the local reversibility of the Metropolis–Hastings algorithm to construct an estimator in models where full conditional densities are not available analytically. The estimator is free of distributional assum...
In disease control or elimination programs, diagnostics are essential for assessing the impact of interventions, refining treatment strategies, and minimizing the waste of scarce resources. Although high-performance tests are desirable, increased accuracy is frequently accompanied by a requirement for more elaborate infrastructure, which is often n...
Purpose:
To investigate the dimensionality, construct validity in the form of factorial, convergent, discriminant, and known-groups validity, as well as scale reliability of the fifteen dimensional (15D) instrument.
Methods:
15D data were collected from a large Greek general population sample (N = 3,268) which was randomly split into two halves....
A pairwise maximum likelihood (PML) estimation method is developed for factor analysis models with ordinal data and fitted both in an exploratory and confirmatory set-up. The performance of the method is studied via simulations and comparisons with full information maximum likelihood (FIML) and three-stage limited information estimation methods, na...
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate ordina...
Diagnosis of urogenital schistosomiasis in chronically infected adults is challenging but important, especially because long term infection of the bladder and urinary tract can have dire consequences. We evaluated three tests for viable infection: detection of parasite specific DNA Dra1 fragments, haematuria and presence of parasite eggs for sensit...
The statistical problem; The basic idea; Two examples; A broader theoretical view; Illustration of an alternative approach; An overview of special cases; Principal components; The historical context; Closely related fields in statistics
Latent Variable Models and Factor Analysis provides a comprehensive and unified approach to factor analysis and latent variable modeling from a statistical perspective. This book presents a general framework to enable the derivation of the commonly used models, along with updated numerical examples. Nature and interpretation of a latent variable is...
The basic aim of this paper is to investigate the impact that educational level of individuals and participation in training programmes (apprenticeship, intra-firm training, continuing vocational training, popular training) have on their job prospects in the two most populated Greek regions, Attica and Central Macedonia, during the implementation o...
In this article we implement a forward search algorithm for identifying atypical subjects/observations in factor analysis models for binary data. Forward plots of goodness-of-fit statistics, residuals, and parameter estimates help us identify aberrant observations and detect deviations from the hypothesized model. Methods to initialize, progress, a...
We consider a general type of model for analyzing ordinal variables with covariate effects and 2 approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We compare these 2 approaches on the basis of 2 examples, 1 involving only covariate effects directly on the ordinal variables a...
The paper proposes a full information maximum likelihood estimation method for modelling multivariate longitudinal ordinal variables. Two latent variable models are proposed that account for dependencies among items within time and between time. One model fits item-specific random effects which account for the between time points correlations and t...
Latent variable models with observed ordinal variables are particularly useful for analyzing survey data. Typical ordinal variables express attitudinal statements with response alternatives like “strongly disagree”, “disagree”, “strongly agree” or “very dissatisfied”, “dissatisfied”, “satisfied” and “very satisfied”.
In this article we extend and implement the forward search algorithm for identifying atypical subjects/observations in factor analysis models. The forward search has been mainly developed for detecting aberrant observations in regression models (Atkinson, 1994) and in multivariate methods such as cluster and discriminant analysis (Atkinson, Riani,...
Drawing on the authors’ varied experiences working and teaching in the field, Analysis of Multivariate Social Science Data, Second Edition enables a basic understanding of how to use key multivariate methods in the social sciences. With updates in every chapter, this edition expands its topics to include regression analysis, confirmatory factor anal...
The paper reviews recent work on latent variable models for ordinal longitudinal variables and factor models with non-linear terms. The model for longitudinal data has been recently proposed by Cagnone, Moustaki and Vasdekis (2008). The model allows for time-dependent latent variables to explain the associations among ordinal variables within time...
Based on a longtime course for master's level students at the London School of Economics and Politics, where the authors are based, this text concentrates on the multivariate methods so useful to social science problems involving correlational rather than causal relationships. Chapters with application examples and further readings cover d...
This chapter discusses the goodness-of-fit measures for latent variable models for binary responses. Goodness-of-fit tests are used to evaluate how well a proposed model fits or predicts a particular data set. Usually, test statistics compute deviations between the observed data and the predictions from the model. Overall goodness-of-fit statistics...
Until recently, item response models such as the factor analysis model for metric responses, the two-parameter logistic model for binary responses and the multinomial model for nominal responses considered only the main effects of latent variables without allowing for interaction or polynomial latent variable effects. However, non-linear relationsh...
Parameter constraints in generalized linear latent variable models are discussed. Both linear equality and inequality constraints are considered. Maximum likelihood estimators for the parameters of the constrained model and corrected standard errors are derived. A significant reduction in the dimension of the optimization problem is achieved with t...
Item response theory (IRT) deals with the statistical analysis of data in which responses of each of a number of respondents to each of a number of items or trials are assigned to defined mutually exclusive categories. This chapter presents the best known item response models specifying the probability of response in each category as a function...
Latent variable models are used for analyzing multivariate data. Recently, generalized linear latent variable models for categorical, metric, and mixed-type responses estimated via maximum likelihood (ML) have been proposed. Model deviations, such as data contamination, are shown analytically, using the influence function and through a simulation s...
Until recently, latent variable models such as the factor analysis model for metric responses, the two-parameter logistic model for binary responses, the multinomial model for nominal responses considered only main effects of latent variables without allowing for interaction or polynomial latent variable effects. However, nonlinear relationships am...
In this article, we discuss a latent variable model with continuous latent variables for manifest variables that are a mixture of categorical and survival outcomes. Models for censored and uncensored survival data are discussed. The model allows for covariate effects both on the manifest variables (direct effects) and on the latent variable(s) (ind...
Latent class models are used in social sciences for classifying individuals or objects into distinct groups/classes based on responses to a set of observed indicators. The latent class model for mixed binary and metric variables (Br. J. Math. Statist. Psych. 49 (1996) 313) is extended to accommodate any type of data (including ordinal and nominal)...
This paper proposes a robust estimator for a general class of linear latent variable models (GLLVM) (Moustaki and Knott 2000, Bartholomew and Knott 1999). It is based on a weighted score function that is simple to implement numerically and is made consistent using the basic idea of indirect inference. The need of a robust estimator for these models...
Previous work on a general class of multidimensional latent variable models for analysing ordinal manifest variables is extended here to allow for direct covariate effects on the manifest ordinal variables and covariate effects on the latent variables. A full maximum likelihood estimation method is used to estimate all the model parameters simultan...
This edited volume features cutting-edge topics from the leading researchers in the areas of latent variable modeling. Content highlights include coverage of approaches dealing with missing values, semi-parametric estimation, robust analysis, hierarchical data, factor scores, multi-group analysis, and model testing. New methodological topics are il...
The paper discusses the effect of model deviations such as data contamination on the maximum likelihood estimator (MLE) for a general class of latent trait models (Moustaki and Knott 2000). This is done with the use of the influence function (Hampel 1968, 1974), a mathematical tool to assess the robustness properties of any statistic, such as an estimator....
Theory and methodology for exploratory factor analysis have been well developed for continuous variables. In practice, observed or measured variables are often ordinal. However, ordinality is most often ignored and numbers such as 1, 2, 3, 4, representing ordered categories, are treated as numbers having metric properties, a procedure which is inco...