J. Intell. 2014, 2, 12-15; doi:10.3390/jintelligence2010012
Journal of Intelligence, ISSN 2079-3200
Intelligence Is What the Intelligence Test Measures. Seriously
Han L. J. van der Maas 1,*, Kees-Jan Kan 2 and Denny Borsboom 1
1 Psychological Methods, Department of Psychology, University of Amsterdam, Weesperplein 4,
room 207, Amsterdam 1018 NX, The Netherlands (D.B.)
2 Department of Biological Psychology, VU University, van der Boechorststraat 1,
Amsterdam 1081 BT, The Netherlands (K.-J.K.)
* Author to whom correspondence should be addressed; Tel.: +31-20-525-6678.
Received: 16 January 2014; in revised form: 12 February 2014 / Accepted: 17 February 2014 /
Published: 28 February 2014
Abstract: The mutualism model, an alternative to the g-factor model of intelligence,
implies a formative measurement model in which g is an index variable without a causal
role. If this model is accurate, the search for a genetic or brain instantiation of g is
deemed useless. This also implies that the (weighted) sum score of items of an intelligence
test is just what it is: a weighted sum score. Preference for one index over another is a
pragmatic issue that rests mainly on predictive value.
Keywords: mutualism model; formative model; intelligence test; g-factor; IQ
The first thing psychology students learn about intelligence is that Boring’s 1923 definition of
intelligence as what an IQ-test measures is just silly [1]. In line with the literature on the topic, neither
Johnson [2] nor Hunt and Jaeggi [3] take the definition seriously. However, given recent developments in the
theory of intelligence, we think that there is reason to reconsider our opinion on this topic.
Empirically, the core of intelligence research rests on the positive manifold: the fact that all
intelligence subtests, ranging from scholastic tests to tests of social intelligence, correlate positively.
This correlational pattern can be modeled statistically with principal components analysis or with
factor models. Although principal components analysis and factor analysis are sometimes considered
to be more or less interchangeable, they are very different models [4]. The factor model is a reflective
latent variable model, in which the factor is a hypothesized entity posited to explain the positive
manifold [5]. The principal components model is a formative model, in
which the components are conveniently weighted total scores; these are composites of the observed
data, which do not provide an explanation of the positive manifold, but rather inherit their structure
entirely from the data [6].
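The contrast can be made concrete with a small simulation. In a reflective model, one latent variable causes all subtest scores, and the positive manifold follows automatically. The sketch below (all loadings and sample sizes are hypothetical illustration values, not taken from the article) generates data from a single common factor and checks that every pairwise subtest correlation comes out positive:

```python
import random

random.seed(0)

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_reflective(n_subjects=500, loadings=(0.8, 0.7, 0.6, 0.5)):
    """Reflective model: each subtest score = loading * g + noise,
    so the latent g is the common cause of all subtest scores."""
    scores = [[] for _ in loadings]
    for _ in range(n_subjects):
        g = random.gauss(0, 1)  # the latent common cause
        for k, lam in enumerate(loadings):
            noise_sd = (1 - lam ** 2) ** 0.5  # unit-variance subtests
            scores[k].append(lam * g + random.gauss(0, noise_sd))
    return scores

scores = simulate_reflective()
n_tests = len(scores)
correlations = [pearson(scores[i], scores[j])
                for i in range(n_tests) for j in range(i + 1, n_tests)]
```

Because every subtest depends on the same g, all six pairwise correlations are positive (each approximately the product of the two loadings), which is exactly the positive manifold the reflective model is built to explain.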
Thus, the factor model embodies the idea that there is a common cause “out there” that we “detect”
using factor analysis, and that should have an independently ascertainable identity in the form of, say,
a variable defined on some biological substrate [7]. The principal components model does not say
anything about the nature of the correlations in the positive manifold. It does not posit testable
restrictions on the data, and is therefore better thought of as a data reduction model than as an
explanatory model. Importantly, in formative models, the nature of the constructed components is
fixed by the subtests used to determine them: a different choice of subtests yields conceptually different
components (even though these may be highly correlated; see also [8]). In contrast, the latent variable
in the factor model is not specifically tied to any set of subtests: if the model is true, the latent variable
can in principle be measured by any suitable set of indicators that depends on it and fulfills relevant
model requirements. Although different choices of such indicators may change the precision of the
measurements, they need not change the nature of the latent variable measured.
Clearly, the classical g model, as for instance discussed by Jensen [9], is a reflective model:
whatever g is, it is thought to explain scores on tests, correlations between tests, and individual
differences between subjects or groups of subjects. In other words, the g-factor is posited as the
common cause of the correlations in the positive manifold. Recently, however, an alternative
explanation for the positive manifold has been proposed in the form of the mutualism model [10]. In
this model, the correlations between test scores are not explained through the dependence on a
common latent variable, but as a result of reciprocal positive interactions between abilities and
processes that play key roles in cognitive development, like memory, spatial ability, and language
skills. The model explains key findings in intelligence research, such as the hierarchical factor
structure of intelligence, the low predictability of intelligence from early childhood performance, the
age differentiation effect, and the increase in heritability of g; it is also consistent with current
explanations of the Flynn effect [10].
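The core mechanism of the mutualism model can also be sketched in a few lines. In the sketch below, each subject's abilities grow according to coupled logistic equations with purely positive interactions; the limited resources K are drawn independently per ability, so no common cause links them. All parameter values (coupling strength, growth time, sample size) are hypothetical simplifications of the model in [10], chosen only to illustrate the mechanism:

```python
import random

random.seed(1)

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_mutualism(n_subjects=300, n_abilities=6, c=0.1,
                       dt=0.02, steps=1000):
    """Euler-integrate, per subject, the coupled logistic system
        dx_i/dt = x_i * (1 - x_i/K_i + c * sum_{j != i} x_j / K_i),
    where the resources K_i are independent across abilities, so there
    is no common cause; only the mutualistic coupling c links abilities."""
    finals = [[] for _ in range(n_abilities)]
    for _ in range(n_subjects):
        K = [random.uniform(0.5, 1.5) for _ in range(n_abilities)]
        x = [0.05] * n_abilities          # small initial ability levels
        for _ in range(steps):
            total = sum(x)
            x = [xi + dt * xi * (1 - xi / Ki + c * (total - xi) / Ki)
                 for xi, Ki in zip(x, K)]
        for i in range(n_abilities):
            finals[i].append(x[i])
    return finals

finals = simulate_mutualism()
manifold = [pearson(finals[i], finals[j])
            for i in range(6) for j in range(i + 1, 6)]
```

Although the growth parameters are uncorrelated across abilities, the coupling transmits each subject's advantages between abilities, so all pairwise correlations among the final ability levels come out positive: a positive manifold without a g factor.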
It is interesting to inquire what the status of g should be if such a mutualism model were true. Of
course, in this situation, one does not measure a common latent variable through IQ-tests, for there is
no such latent variable. Rather, the mutualism model would support a typical formative model [11].
Such a formative model is also implied by a much older alternative for the g model, sampling
theory [12]. In a formative model, the factor score estimates that result from applications of g factor
models represent just another weighted sum of test scores and should be interpreted as index statistics
instead of as a latent variable. Index statistics, such as the Dow Jones Industrial Average, the Ocean
Health Index and physical health indexes, evidently do not cause economic growth or healthy
behaviors. Instead, they result from, or supervene on, them [6].
Traditionally, the principal components model has been seen as the weak sister of the factor model,
which was thought to give the better approach to modeling IQ subtest scores [13]. However, under the
mutualism model, the situation is reversed. The principal components model in fact yields as good a
composite as any other model. The use of the factor model, in this view, amounts to cutting butter with
a razor: it represents an extraordinarily complicated and roundabout way of constructing composite
scores that are in fact no different from principal component scores. In particular, factor score estimates
do not yield measurements of a latent variable that leads an independent life in the form of a biological
substrate or the like. They are just weighted sum scores.
Thus, the mutualism model explains the positive manifold but at the same time denies the existence
of a realistic g. As a result, it motivates a formative interpretation of the factor analytic model. This has
many implications. First, if g is not a causal source of the positive manifold, the search for a gene or
brain area “for g” will be fruitless [14]. Again, the comparison with health is instructive. There are no
specific genes “for health”, and health has no specific location in the body. Note that this line of
reasoning does not apply to genetic and brain research on components of intelligence (for instance
working memory) as these components often do have a realistic reflective interpretation. Working
memory capacity may very well be based on specific and independently identifiable brain processes,
even if g is not.
The implications of a mutualism model for approaches to measurement are likewise
significant [4,14,15]. One crucial difference between reflective and formative models, for instance,
concerns the role of the indicator variables (items or subtests). As noted above, in a reflective model
these indicators are exchangeable. Therefore, different working memory tests, with different factor
loadings, could be added to a test battery without changing the nature of the measured g factor. Also,
measurements of g can be improved simply by adding more relevant tests. Tests can
also be ordered in how well they measure g, for instance by looking at patterns of factor loadings or by
computing indices of measurement precision.
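This property of the reflective model, that more indicators yield better measurement of the latent variable, can be illustrated with a simulation (all loadings and sample sizes are hypothetical illustration values): the correlation between a unit-weighted composite and the latent g that generated the indicators rises as parallel tests are added.

```python
import random

random.seed(2)

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def composite_validity(n_tests, n_subjects=2000, loading=0.6):
    """Correlation between a unit-weighted composite of n_tests parallel
    indicators (each = loading * g + noise) and the latent g itself."""
    noise_sd = (1 - loading ** 2) ** 0.5
    g_vals, composites = [], []
    for _ in range(n_subjects):
        g = random.gauss(0, 1)
        tests = [loading * g + random.gauss(0, noise_sd)
                 for _ in range(n_tests)]
        g_vals.append(g)
        composites.append(sum(tests) / n_tests)
    return pearson(composites, g_vals)

r_two = composite_validity(2)      # short battery
r_sixteen = composite_validity(16) # long battery
```

Averaging over more indicators cancels more of the test-specific noise, so the long battery tracks the latent variable more closely than the short one; under a formative reading, by contrast, adding indicators changes what the composite is, rather than measuring the same thing better.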
In formative models, however, indicators are not exchangeable, unless they are completely
equivalent. There is no universally optimal way to compute the composite scores that function as an
index; instead, this choice rests on pragmatic grounds. For example, the justification for the choice of
index may lie in its predictive value. The best test is simply the test that best predicts educational,
societal or job success. The choice of indicators may however also depend on our “cognitive”
environment. When a certain cognitive capacity, say computational thinking, is valued to a greater
extent in the current society, intelligence tests may be adapted to reflect that. In an extended
mutualistic model, in which reciprocal effects take place via the environment (a gene-environment
model; see [16,17]), intelligence testing could even be extended to the assessment of features of the
environment that play a positively reinforcing role in promoting the mutualistic processes that produce
the positive manifold. On this viewpoint, the number of books somebody owns might very well be
included in the construction of composites under an index model of intelligence.
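The pragmatic logic of choosing an index by predictive value can be made concrete. In the sketch below (subtests, outcome, and all weights are hypothetical), two candidate formative indices, a unit-weighted sum and a differently weighted sum, are compared solely on how well they predict an external criterion; under the formative reading, neither weighting is the “true” one, and the better predictor is simply preferred.

```python
import random

random.seed(3)

def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical data: three subtest scores and an external outcome
# (say, school success) that, by construction, depends mostly on
# the first two subtests.
n = 1000
t1 = [random.gauss(0, 1) for _ in range(n)]
t2 = [random.gauss(0, 1) for _ in range(n)]
t3 = [random.gauss(0, 1) for _ in range(n)]
outcome = [0.6 * a + 0.6 * b + 0.1 * c + random.gauss(0, 1)
           for a, b, c in zip(t1, t2, t3)]

def index(weights):
    """A formative index: nothing more than a weighted sum of subtests."""
    return [sum(w * t for w, t in zip(weights, row))
            for row in zip(t1, t2, t3)]

unit_weighted = index((1, 1, 1))
tuned = index((0.6, 0.6, 0.1))  # weights matched to the outcome
r_unit = pearson(unit_weighted, outcome)
r_tuned = pearson(tuned, outcome)
```

The tuned weighting predicts the outcome better than the unit weighting, yet nothing makes it a better measure of an underlying attribute; the choice between the two composites is a pragmatic one, exactly as the formative interpretation implies.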
Finally, in a formative interpretation of IQ test scores there really is no such thing as a separate
latent variable that we could honor with the term “intelligence”, and it is questionable whether one
should in fact use the phrase “intelligence measurement” at all in such a situation [18]. However, if one
insists on keeping the terminology of measurement around, there is little choice except to bite the
bullet: Interpreted as an index, intelligence is whatever IQ-tests measure. Seriously.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Boring, E.G. Intelligence as the tests test it. New Repub. 1923, 36, 35–37.
2. Johnson, W. Whither intelligence research? J. Intell. 2013, 1, 25–35.
3. Hunt, E.; Jaeggi, S.M. Challenges for research on intelligence. J. Intell. 2013, 1, 36–54.
4. Edwards, J.R.; Bagozzi, R.P. On the nature and direction of relationships between constructs and
measures. Psychol. Methods 2000, 5, 155–174.
5. Haig, B.D. Exploratory factor analysis, theory generation, and scientific method.
Multivar. Behav. Res. 2005, 40, 303–329.
6. Markus, K.; Borsboom, D. Frontiers of Validity Theory: Measurement, Causation, and Meaning;
Routledge: New York, NY, USA, 2013.
7. Kievit, R.A.; Romeijn, J.W.; Waldorp, L.J.; Wicherts, J.M.; Scholte, H.S.; Borsboom, D. Mind
the gap: A psychometric approach to the reduction problem. Psychol. Inq. 2011, 22, 1–21.
8. Howell, R.D.; Breivik, E.; Wilcox, J.B. Is formative measurement really measurement?
Psychol. Methods 2007, 12, 238–245.
9. Jensen, A.R. The g Factor: The Science of Mental Ability; Praeger: Westport, CT, USA, 1999.
10. Van der Maas, H.L.J.; Dolan, C.V.; Grasman, R.P.P.P.; Wicherts, J.M.; Huizenga, H.M.;
Raijmakers, M.E.J. A dynamical model of general intelligence: The positive manifold of
intelligence by mutualism. Psychol. Rev. 2006, 113, 842–861.
11. Schmittmann, V.D.; Cramer, A.O.; Waldorp, L.J.; Epskamp, S.; Kievit, R.A.; Borsboom, D.
Deconstructing the construct: A network perspective on psychological phenomena.
New Ideas Psychol. 2013, 31, 43–53.
12. Thomson, G.H. A hierarchy without a general factor. Br. J. Psychol. 1916, 8, 271–281.
13. Bartholomew, D.J. Measuring Intelligence: Facts and Fallacies; Cambridge University Press:
Cambridge, UK, 2004.
14. Chabris, C.F.; Hebert, B.M.; Benjamin, D.J.; Beauchamp, J.; Cesarini, D.; van der Loos, M.;
Laibson, D. Most reported genetic associations with general intelligence are probably false
positives. Psychol. Sci. 2012, 23, 1314–1323.
15. Bollen, K.A.; Lennox, R. Conventional wisdom on measurement: A structural equation
perspective. Psychol. Bull. 1991, 110, 305–314.
16. Dickens, W.T.; Flynn, J.R. Heritability estimates versus large environmental effects: The IQ
paradox resolved. Psychol. Rev. 2001, 108, 346–369.
17. Kan, K.J.; Wicherts, J.M.; Dolan, C.V.; van der Maas, H.L.J. On the nature and nurture of
intelligence and specific cognitive abilities the more heritable, the more culture dependent.
Psychol. Sci. 2013, 24, 2420–2428.
18. Howell, R.D.; Breivik, E.; Wilcox, J.B. Is formative measurement really measurement?
Psychol. Methods 2007, 12, 238–245.
© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article
distributed under the terms and conditions of the Creative Commons Attribution license