Ir Moustaki

  • PhD in Statistics
  • Professor at London School of Economics and Political Science

About

112
Publications
25,451
Reads
4,589
Citations
Current institution
London School of Economics and Political Science
Current position
  • Professor
Additional affiliations
September 1997 - August 2002
London School of Economics and Political Science
Position
  • Lecturer
September 2007 - July 2013
London School of Economics and Political Science
Position
  • Reader in Social Statistics
September 2002 - August 2007
Athens University of Economics and Business
Position
  • Professor (Associate)
Education
September 1993 - August 1996

Publications

Article
Full-text available
We introduce a general framework for latent variable modeling, named Generalized Latent Variable Models for Location, Scale, and Shape parameters (GLVM-LSS). This framework extends the generalized linear latent variable model beyond the exponential family distributional assumption and enables the modeling of distributional parameters other than the...
Article
This paper introduces the generalized Hausman test as a novel method for detecting the non-normality of the latent variable distribution of the unidimensional latent trait model for binary data. The test utilizes the pairwise maximum likelihood estimator for the parameters of the latent trait model, which assumes normality of the latent variable, a...
Preprint
Peer grading is an educational system in which students assess each other's work. It is commonly applied under Massive Open Online Course (MOOC) and offline classroom settings. With this system, instructors receive a reduced grading workload, and students enhance their understanding of course materials by grading others' work. Peer grading data hav...
Article
This paper discusses estimation and limited‐information goodness‐of‐fit test statistics in factor models for binary data using pairwise likelihood estimation and sampling weights. The paper extends the applicability of pairwise likelihood estimation for factor models with binary data to accommodate complex sampling designs. Additionally, it introdu...
Article
Full-text available
Ensuring fairness in instruments like survey questionnaires or educational tests is crucial. One way to address this is by a Differential Item Functioning (DIF) analysis, which examines if different subgroups respond differently to a particular item, controlling for their overall latent construct level. DIF analysis is typically conducted to assess...
Conference Paper
Full-text available
This paper presents the generalized Hausman test to detect non-normality of the latent variable distribution in unidimensional Item Response Theory (IRT) models for binary data. The test is based on the estimators resulting from the two-parameter IRT model, that assumes normality of the latent variable, and the semi-nonparametric IRT model, that as...
Preprint
A composite likelihood is an inference function derived by multiplying a set of likelihood components. This approach provides a flexible framework for drawing inference when the likelihood function of a statistical model is computationally intractable. While composite likelihood has computational advantages, it can still be demanding when dealing w...
Article
Full-text available
The paper proposes a novel model assessment paradigm aiming to address shortcomings of posterior predictive $p$-values, which provide the default metric of fit for Bayesian structural equation modelling (BSEM). The model framework presented in the paper focuses on the approximate zero approach (Psychological Methods, 17, 2012, 313), which inv...
Preprint
Full-text available
Measurement invariance across items is key to the validity of instruments like a survey questionnaire or an educational test. Differential item functioning (DIF) analysis is typically conducted to assess measurement invariance at the item level. Traditional DIF analysis methods require knowing the comparison groups (reference and focal groups) and...
Chapter
The chapter reviews uni- and multidimensional association models (AMs) that provide a parsimonious modelling of interactions between categorical variables. AMs go beyond the standard log-linear modeling framework to model non-saturated models that exist between a saturated and an independence model, including also log-nonlinear models. AMs are pres...
Article
Full-text available
Researchers have widely used exploratory factor analysis (EFA) to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA that they often use to find interpretable loading matrices. In this paper, we propose a new family of oblique rotations based on component-wise [Formula: see...
Chapter
Full-text available
This paper extends the generalized Hausman test to detect non-normality of the latent variable distribution in unidimensional IRT models for binary data. To build the test, we consider the estimator obtained from the two-parameter IRT model, that assumes normality of the latent variable, and the estimator obtained under a semi-nonparametric framewo...
Conference Paper
This paper extends the generalized Hausman test to detect nonnormality of the latent variable distribution in unidimensional IRT models for binary data. To build the test, we consider the estimator obtained from the two-parameter IRT model, that assumes normality of the latent variable, and the estimator obtained under a semi-nonparametric framewor...
Article
The paper proposes a new latent variable model for the simultaneous (two-way) detection of outlying individuals and items for item-response-type data. The proposed model is a synergy between a factor model for binary responses and continuous response times that captures normal item response behaviour and a latent class model that captures the outly...
Preprint
Full-text available
Exploratory factor analysis (EFA) has been widely used to learn the latent structure underlying multivariate data. Rotation and regularised estimation are two classes of methods in EFA that are widely used to find interpretable loading matrices. This paper proposes a new family of oblique rotations based on component-wise $L^p$ loss functions $(0 <...
Preprint
We develop an efficient Bayesian sequential inference framework for factor analysis models observed via various data types, such as continuous, binary and ordinal data. In the continuous data case, where it is possible to marginalise over the latent factors, the proposed methodology tailors the Iterated Batch Importance Sampling (IBIS) of Chopin (2...
Article
Full-text available
Families extend well beyond households. In particular, connections between parents and their adult offspring are often close and sustained, and transfers may include financial assistance, practical support, or both, provided by either generation to the other. Yet this major engine of welfare production, distribution, and redistribution has only rec...
Article
Full-text available
Objective: To inform the development of an AGREE II extension specifically tailored for surgical guidelines. Summary background data: AGREE II was designed to inform the development, reporting, and appraisal of clinical practice guidelines. Previous research has suggested substantial room for improvement of the quality of surgical guidelines. Methods: A p...
Article
Full-text available
In a quantitative synthesis of studies via meta‐analysis, it is possible that some studies provide a markedly different relative treatment effect or have a large impact on the summary estimate and/or heterogeneity. Extreme study effects (outliers) can be detected visually with forest/funnel plots and by using statistical outlying detection methods....
Article
Full-text available
The paper proposes a joint mixture model to model non-ignorable drop-out in longitudinal cohort studies of mental health outcomes. The model combines a (non)-linear growth curve model for the time-dependent outcomes and a discrete-time survival model for the drop-out with random effects shared by the two sub-models. The mixture part of the model ta...
Article
Full-text available
This paper studies the Type I error, false positive rates, and power of four versions of the Lagrange Multiplier test to detect measurement non-invariance in Item Response Theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange Multiplier test computed with the Hessian and cross-product approach, the...
Book
This article studies the power of the Lagrange Multiplier Test and the Generalized Lagrange Multiplier Test to detect measurement non-invariance in Item Response Theory (IRT) models for binary data. We study the performance of these two tests under correct model specification and incorrect distribution of the latent variable. The asymptotic distrib...
Article
Full-text available
Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework and under a missing at random (MAR) mechanism are proposed. Under a full information likelihood estimation framework and MAR, ignorability of the missing data mechanism does not lead to biased e...
Preprint
Full-text available
We propose a generalised framework for Bayesian Structural Equation Modelling (SEM) that can be applied to a variety of data types. The introduced framework focuses on the approximate zero approach, according to which a hypothesised structure is formulated with approximate rather than exact zero. It extends previously suggested models by \citeA{MA1...
Chapter
Full-text available
This article studies the power of the Lagrange Multiplier Test and the Generalized Lagrange Multiplier Test to detect measurement non-invariance in Item Response Theory (IRT) models for binary data. We study the performance of these two tests under correct model specification and incorrect distribution of the latent variable. The asymptotic distrib...
Article
Full-text available
Penalized factor analysis is an efficient technique that produces a factor loading matrix with many zero elements thanks to the introduction of sparsity-inducing penalties within the estimation process. However, sparse solutions and stable model selection procedures are only possible if the employed penalty is non-differentiable, which poses certai...
Article
Full-text available
Various global health initiatives are currently advocating the elimination of schistosomiasis within the next decade. Schistosomiasis is a highly debilitating tropical infectious disease with severe burden of morbidity and thus operational research accurately evaluating diagnostics that quantify the epidemic status for guiding effective strategies...
Article
Full-text available
The likelihood ratio test (LRT) is widely used for comparing the relative fit of nested latent variable models. Following Wilks’ theorem, the LRT is conducted by comparing the LRT statistic with its asymptotic distribution under the restricted model, a $\chi^2$ distribution with degrees of freedom equal to the difference in the number of fre...
Preprint
The likelihood ratio test (LRT) is widely used for comparing the relative fit of nested latent variable models. Following Wilks' theorem, the LRT is conducted by comparing the LRT statistic with its asymptotic distribution under the restricted model, a $\chi^2$-distribution with degrees of freedom equal to the difference in the number of free param...
Chapter
Full-text available
This paper was inspired by the presentations and discussions from the panel “Successful Careers in Academia and Industry and What We Can Learn from Them” that took place at the IMPS meeting in 2019. In this paper, we discuss what makes a career successful in academia and industry and we provide examples from the past to the present. We include educ...
Article
Partial association refers to the relationship between variables $Y_1, Y_2, \ldots, Y_K$ while adjusting for a set of covariates $X = \{X_1, \ldots, X_p\}$. To assess such an association when the $Y_k$’s are recorded on ordinal scales, a classical approach is to use partial correlation between the latent continuous variables. This so-called polychoric correlation is inadequate, as i...
Article
We propose a multilevel structural equation model to investigate the interrelationships between childhood socio‐economic circumstances, partnership formation and stability, and mid‐life health, using data from the 1958 British birth cohort. The structural equation model comprises latent class models that characterize the patterns of change in four...
Preprint
Tests are a building block of our modern education system. Many tests are high-stake, such as admission, licensing, and certification tests, that can significantly change one's life trajectory. For this reason, ensuring fairness in educational tests is becoming an increasingly important problem. This paper concerns the issue of item preknowledge in...
Article
Full-text available
Pairwise likelihood is a limited information estimation method that has also been used for estimating the parameters of latent variable and structural equation models. Pairwise likelihood is a special case of composite likelihood methods that uses lower-order conditional or marginal log-likelihoods instead of the full log-likelihood. The composite...
Preprint
Full-text available
Methods for the treatment of item non-response in attitudinal scales and in large-scale assessments under the pairwise likelihood (PL) estimation framework are proposed. In confirmatory factor analysis (CFA) with categorical observed variables and data being missing at random, multiple imputation followed by the three-stage weighted least squares i...
Article
Full-text available
The problem of penalized maximum likelihood (PML) for an exploratory factor analysis (EFA) model is studied in this paper. An EFA model is typically estimated using maximum likelihood and then the estimated loading matrix is rotated to obtain a sparse representation. Penalized maximum likelihood simultaneously fits the EFA model and produces a spar...
Chapter
This chapter presents the most widely known factor analysis models for handling categorical responses. The models specify the probability of response in each category as a function of parameters attributed to items and of a latent value attributed to the respondent. The chapter reviews the estimation methods that have been widely used in estimating...
Article
The 3-step approach has been recently advocated over the simultaneous 1-step approach to model a distal outcome predicted by a latent categorical variable. We generalize the 3-step approach to situations where the distal outcome is predicted by multiple and possibly associated latent categorical variables. Although the simultaneous 1-step approach...
Article
Full-text available
The aim of the present study was to identify the sexual behavior, attitudes, beliefs, and knowledge on sexually transmitted infections (STIs) focused on human papilloma virus (HPV) in the Greek adolescent population. The participants were 4547 adolescents, a representative sample for Greek territory with a mean age of 17 years. After written permis...
Article
Correlated multivariate ordinal data can be analysed with structural equation models. Parameter estimation has been tackled in the literature using limited-information methods including three-stage least squares and pseudo-likelihood estimation methods such as pairwise maximum likelihood estimation. In this paper, two likelihood ratio test statisti...
Article
When considering data from many trials, it is likely that some of them present a markedly different intervention effect or exert an undue influence on the summary results. We develop a forward search algorithm for identifying outlying and influential studies in meta-analysis models. The forward search algorithm starts by fitting the hypothesized mo...
Article
When missing data are produced by a non-ignorable nonresponse mechanism, analysis of the observed data should include a model for the probabilities of responding. In this paper we propose such models for nonresponse in survey questions which are treated as multiple-item measures of latent constructs and analysed using latent variable models. The no...
Article
Composite likelihood estimation has been proposed in the literature for handling intractable likelihoods. In particular, pairwise likelihood estimation has been recently proposed to estimate models with latent variables and random effects that involve high dimensional integrals. Pairwise estimators are asymptotically consistent and normally distrib...
Article
Full-text available
In studies of multiple groups of respondents, such as cross-national surveys and cross-cultural assessments in psychological or educational testing, an important methodological consideration is the comparability or “equivalence” of measurement across the groups. Ideally full equivalence would hold, but very often it does not. If nonequivalence of m...
Article
Full-text available
IRT has been increasingly utilized in psychiatry for the purpose of describing the relationship among items in psychiatric disorder symptom batteries hypothesized to be indicators of an underlying latent continuous variable representing the severity of the psychiatric disorder. It is common to find zero-inflated data such that a large proportion of the samp...
Chapter
The article provides an overview of latent variable models including the classical factor analysis model, factor models for categorical manifest variables, structural equation models and more recent extensions for mixed (categorical and continuous) manifest and latent variables. Emphasis is given on model specification, estimation methods, goodness...
Article
Longitudinal data are collected for studying changes across time. We consider multivariate longitudinal data where multiple observed variables, measured at each time point, are used as indicators for theoretical constructs (latent variables) of interest. A common problem in longitudinal studies is dropout, where subjects exit the study prematurely....
Article
Full-text available
Models with random effects/latent variables are widely used for capturing unobserved heterogeneity in multilevel/hierarchical data and account for associations in multivariate data. The estimation of those models becomes cumbersome as the number of latent variables increases due to high-dimensional integrations involved. Composite likelihood is a p...
Article
Full-text available
Regular treatment with praziquantel (PZQ) is the strategy for human schistosomiasis control aiming to prevent morbidity in later life. With the recent resolution on schistosomiasis elimination by the 65th World Health Assembly, appropriate diagnostic tools to inform interventions are keys to their success. We present a discrete Markov chains modell...
Article
Full-text available
In latent variable models the parameter estimation can be implemented by using the joint or the marginal likelihood, based on independence or conditional independence assumptions. The same dilemma occurs within the Bayesian framework with respect to the estimation of the Bayesian marginal (or integrated) likelihood, which is the main tool for model...
Article
Responses to a set of indicators, items, or variables are often used in the social sciences for measuring unobserved constructs such as attitudes. Latent variable models, which are also known as factor analysis models, are used for linking the observed responses to the latent constructs. Often, some respondents provide random responses to the items. We d...
Article
Based on a longtime course for master's level students at the London School of Economics and Political Science, where the authors are based, this text concentrates on the multivariate methods so useful to social science problems involving correlational rather than causal relationships. Chapters with application examples and further readings cover data preli...
Article
Full-text available
The marginal likelihood can be notoriously difficult to compute, and particularly so in high-dimensional problems. Chib and Jeliazkov employed the local reversibility of the Metropolis–Hastings algorithm to construct an estimator in models where full conditional densities are not available analytically. The estimator is free of distributional assum...
Article
Full-text available
In disease control or elimination programs, diagnostics are essential for assessing the impact of interventions, refining treatment strategies, and minimizing the waste of scarce resources. Although high-performance tests are desirable, increased accuracy is frequently accompanied by a requirement for more elaborate infrastructure, which is often n...
Article
Full-text available
Purpose: To investigate the dimensionality, construct validity in the form of factorial, convergent, discriminant, and known-groups validity, as well as scale reliability of the fifteen dimensional (15D) instrument. Methods: 15D data were collected from a large Greek general population sample (N = 3,268) which was randomly split into two halves....
Article
In studies of multiple groups of respondents, such as cross-national surveys and cross-cultural assessments in psychological or educational testing, an important methodological consideration is the comparability or "equivalence" of measurement across the groups. Ideally full equivalence would hold, but very often it does not. If nonequivalence of m...
Article
Full-text available
A pairwise maximum likelihood (PML) estimation method is developed for factor analysis models with ordinal data and fitted both in an exploratory and confirmatory set-up. The performance of the method is studied via simulations and comparisons with full information maximum likelihood (FIML) and three-stage limited information estimation methods, na...
Article
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate ordina...
Article
Full-text available
Diagnosis of urogenital schistosomiasis in chronically infected adults is challenging but important, especially because long term infection of the bladder and urinary tract can have dire consequences. We evaluated three tests for viable infection: detection of parasite specific DNA Dra1 fragments, haematuria and presence of parasite eggs for sensit...
Chapter
The statistical problem; The basic idea; Two examples; A broader theoretical view; Illustration of an alternative approach; An overview of special cases; Principal components; The historical context; Closely related fields in statistics
Book
Latent Variable Models and Factor Analysis provides a comprehensive and unified approach to factor analysis and latent variable modeling from a statistical perspective. This book presents a general framework to enable the derivation of the commonly used models, along with updated numerical examples. Nature and interpretation of a latent variable is...
Article
Full-text available
The basic aim of this paper is to investigate the impact that educational level of individuals and participation in training programmes (apprenticeship, intra-firm training, continuing vocational training, popular training) have on their job prospects in the two most populated Greek regions, Attica and Central Macedonia, during the implementation o...
Article
In this article we implement a forward search algorithm for identifying atypical subjects/observations in factor analysis models for binary data. Forward plots of goodness-of-fit statistics, residuals, and parameter estimates help us identify aberrant observations and detect deviations from the hypothesized model. Methods to initialize, progress, a...
Article
Full-text available
We consider a general type of model for analyzing ordinal variables with covariate effects and 2 approaches for analyzing data for such models, the item response theory (IRT) approach and the PRELIS-LISREL (PLA) approach. We compare these 2 approaches on the basis of 2 examples, 1 involving only covariate effects directly on the ordinal variables a...
Article
The paper proposes a full information maximum likelihood estimation method for modelling multivariate longitudinal ordinal variables. Two latent variable models are proposed that account for dependencies among items within time and between time. One model fits item-specific random effects which account for the between time points correlations and t...
Chapter
Full-text available
Latent variable models with observed ordinal variables are particularly useful for analyzing survey data. Typical ordinal variables express attitudinal statements with response alternatives like “strongly disagree”, “disagree”, “strongly agree” or “very dissatisfied”, “dissatisfied”, “satisfied” and “very satisfied”.
Article
In this article we extend and implement the forward search algorithm for identifying atypical subjects/observations in factor analysis models. The forward search has been mainly developed for detecting aberrant observations in regression models (Atkinson, 1994) and in multivariate methods such as cluster and discriminant analysis (Atkinson, Riani,...
Article
Drawing on the authors’ varied experiences working and teaching in the field, Analysis of Multivariate Social Science Data, Second Edition enables a basic understanding of how to use key multivariate methods in the social sciences. With updates in every chapter, this edition expands its topics to include regression analysis, confirmatory factor anal...
Chapter
The paper reviews recent work on latent variable models for ordinal longitudinal variables and factor models with non-linear terms. The model for longitudinal data has been recently proposed by Cagnone, Moustaki and Vasdekis (2008). The model allows for time-dependent latent variables to explain the associations among ordinal variables within time...
Book
Based on a longtime course for master's level students at the London School of Economics and Political Science, where the authors are based, this text concentrates on the multivariate methods so useful to social science problems involving correlational rather than causal relationships. Chapters with application examples and further readings cover d...
Book
Drawing on the authors' varied experiences working and teaching in the field, Analysis of Multivariate Social Science Data, Second Edition enables a basic understanding of how to use key multivariate methods in the social sciences. With updates in every chapter, this edition expands its topics to include regression analysis, confirmatory factor anal...
Article
This chapter discusses the goodness-of-fit measures for latent variable models for binary responses. Goodness-of-fit tests are used to evaluate how well a proposed model fits or predicts a particular data set. Usually, test statistics compute deviations between the observed data and the predictions from the model. Overall goodness-of-fit statistics...
Article
Full-text available
Until recently, item response models such as the factor analysis model for metric responses, the two-parameter logistic model for binary responses and the multinomial model for nominal responses considered only the main effects of latent variables without allowing for interaction or polynomial latent variable effects. However, non-linear relationsh...
Article
Parameter constraints in generalized linear latent variable models are discussed. Both linear equality and inequality constraints are considered. Maximum likelihood estimators for the parameters of the constrained model and corrected standard errors are derived. A significant reduction in the dimension of the optimization problem is achieved with t...
Article
Item response theory (IRT) deals with the statistical analysis of data in which the responses of each of a number of respondents to each of a number of items or trials are assigned to defined, mutually exclusive categories. This chapter presents the best known item response models specifying the probability of response in each category as a function...
Article
Full-text available
Latent variable models are used for analyzing multivariate data. Recently, generalized linear latent variable models for categorical, metric, and mixed-type responses estimated via maximum likelihood (ML) have been proposed. Model deviations, such as data contamination, are shown analytically, using the influence function and through a simulation s...
Article
Until recently, latent variable models such as the factor analysis model for metric responses, the two-parameter logistic model for binary responses, and the multinomial model for nominal responses considered only main effects of latent variables without allowing for interaction or polynomial latent variable effects. However, nonlinear relationships am...
Article
In this article, we discuss a latent variable model with continuous latent variables for manifest variables that are a mixture of categorical and survival outcomes. Models for censored and uncensored survival data are discussed. The model allows for covariate effects both on the manifest variables (direct effects) and on the latent variable(s) (ind...
Article
Latent class models are used in social sciences for classifying individuals or objects into distinct groups/classes based on responses to a set of observed indicators. The latent class model for mixed binary and metric variables (Br. J. Math. Statist. Psych. 49 (1996) 313) is extended to accommodate any type of data (including ordinal and nominal)...
Article
Full-text available
This paper proposes a robust estimator for a general class of linear latent variable models (GLLVM) (Moustaki and Knott 2000; Bartholomew and Knott 1999). It is based on a weighted score function that is simple to implement numerically and is made consistent using the basic idea of indirect inference. The need for a robust estimator for these models...
Article
Previous work on a general class of multidimensional latent variable models for analysing ordinal manifest variables is extended here to allow for direct covariate effects on the manifest ordinal variables and covariate effects on the latent variables. A full maximum likelihood estimation method is used to estimate all the model parameters simultan...
Article
This edited volume features cutting-edge topics from the leading researchers in the areas of latent variable modeling. Content highlights include coverage of approaches dealing with missing values, semi-parametric estimation, robust analysis, hierarchical data, factor scores, multi-group analysis, and model testing. New methodological topics are il...
Article
The paper discusses the effect of model deviations such as data contamination on the maximum likelihood estimator (MLE) for a general class of latent trait models (Moustaki and Knott 2000). This is done with the use of the influence function (Hampel 1968, 1974), a mathematical tool to assess the robustness properties of any statistic, such as an estimator....
Article
Full-text available
Theory and methodology for exploratory factor analysis have been well developed for continuous variables. In practice, observed or measured variables are often ordinal. However, ordinality is most often ignored and numbers such as 1, 2, 3, 4, representing ordered categories, are treated as numbers having metric properties, a procedure which is inco...
