
Evidence for a Collective Intelligence Factor in the Performance of Human Groups


Meeting of Minds
The performance of humans across a range of different kinds of cognitive tasks has been encapsulated as a common statistical factor called g, or general intelligence factor. What intelligence actually is remains unclear and hotly debated, yet there is a reproducible association of g with performance outcomes, such as income and academic achievement. Woolley et al. (p. 686, published online 30 September) report a psychometric methodology for quantifying a factor termed “collective intelligence” (c), which reflects how well groups perform on a similarly diverse set of group problem-solving tasks. The primary contributors to c appear to be the g factors of the group members, along with a propensity toward social sensitivity: in essence, how well individuals work with others.
Anita Williams Woolley et al., Science 330, 686 (2010). DOI: 10.1126/science.1193147
Evidence for a Collective Intelligence Factor in the Performance of Human Groups

Anita Williams Woolley,1* Christopher F. Chabris,2,3 Alex Pentland,3,4 Nada Hashmi,3,5 Thomas W. Malone3,5

Psychologists have repeatedly shown that a single statistical factor, often called “general intelligence,” emerges from the correlations among people's performance on a wide variety of cognitive tasks. But no one has systematically examined whether a similar kind of “collective intelligence” exists for groups of people. In two studies with 699 people, working in groups of two to five, we find converging evidence of a general collective intelligence factor that explains a group's performance on a wide variety of tasks. This “c factor” is not strongly correlated with the average or maximum individual intelligence of group members but is correlated with the average social sensitivity of group members, the equality in distribution of conversational turn-taking, and the proportion of females in the group.

As research, management, and many other kinds of tasks are increasingly accomplished by groups, working both face-to-face and virtually (1–3), it is becoming ever more important to understand the determinants of group performance. Over the past century, psychologists made considerable progress in defining and systematically measuring intelligence in individuals (4). We have used the statistical approach they developed for individual intelligence to systematically measure the intelligence of groups. Even though social psychologists and others have studied for decades how well groups perform specific tasks (5, 6), they have not attempted to measure group intelligence in the same way individual intelligence is measured: by assessing how well a single group can perform a wide range of different tasks and using that information to predict how that same group will perform other tasks in the future. The goal of the research reported here was to test the hypothesis that groups, like individuals, do have characteristic levels of intelligence, which can be measured and used to predict the group's performance on a wide variety of tasks.

Although controversy has surrounded it, the concept of measurable human intelligence is based on a fact that is still as remarkable as it was to Spearman when he first documented it in 1904 (7): People who do well on one mental task tend to do well on most others, despite large variations in the tests' contents and methods of administration (4). In principle, performance on cognitive tasks could be largely uncorrelated, as one might expect if each relied on a specific set of capacities that was not used by other tasks (8). It could even be negatively correlated, if practicing to improve one task caused neglect of others (9). The empirical fact of general cognitive ability as first demonstrated by Spearman is now, arguably, the most replicated result in all of psychology (4).

1 Carnegie Mellon University, Tepper School of Business, Pittsburgh, PA 15213, USA.
2 Union College, Schenectady, NY 12308, USA.
3 Massachusetts Institute of Technology (MIT) Center for Collective Intelligence, Cambridge, MA 02142, USA.
4 MIT Media Lab, Cambridge, MA 02139, USA.
5 MIT Sloan School of Management, Cambridge, MA 02142, USA.
*To whom correspondence should be addressed. E-mail: awoolley@cmu.edu

29 OCTOBER 2010 VOL 330 SCIENCE www.sciencemag.org 686

Evidence of general intelligence comes from the observation that the average correlation among individuals' performance scores on a relatively diverse set of cognitive tasks is positive, the first factor extracted in a factor analysis of these scores generally accounts for 30 to 50% of the variance, and subsequent factors extracted account for substantially less variance. This first factor extracted in an analysis of individual intelligence tests is referred to as general cognitive ability, or g, and it is the main factor that intelligence tests measure. What makes intelligence tests of substantial practical (not just theoretical) importance is that intelligence can be measured in an hour or less, and is a reliable predictor of a very wide range of important life outcomes over a long span of time, including grades in school, success in many occupations, and even life expectancy (4).

By analogy with individual intelligence, we define a group's collective intelligence (c) as the general ability of the group to perform a wide variety of tasks. Empirically, collective intelligence is the inference one draws when the ability of a group to perform one task is correlated with that group's ability to perform a wide range of other tasks. This kind of collective intelligence is a property of the group itself, not just the individuals in it. Unlike previous work that examined the effect on group performance of the average intelligence of individual group members (10), one of our goals is to determine whether the collective intelligence of the group as a whole has predictive power above and beyond what can be explained by knowing the abilities of the individual group members.

The first question we examined was whether collective intelligence, in this sense, even exists. Is there a single factor for groups, a c factor, that functions in the same way for groups as general intelligence does for individuals? Or does group performance, instead, have some other correlational structure, such as several equally important but independent factors, as is typically found in research on individual personality (11)?

To answer this question, we randomly assigned individuals to groups and asked them to perform a variety of different tasks (12). In Study 1, 40 three-person groups worked together for up to 5 hours on a diverse set of simple group tasks plus a more complex criterion task. To guide our task sampling, we drew tasks from all quadrants of the McGrath Task Circumplex (6, 12), a well-established taxonomy of group tasks based on the coordination processes they require. Tasks included solving visual puzzles, brainstorming, making collective moral judgments, and negotiating over limited resources. At the beginning of each session, we measured team members' individual intelligence. And, as a criterion task at the end of each session, each group played checkers against a standardized computer opponent.

The results support the hypothesis that a general collective intelligence factor (c) exists in groups. First, the average inter-item correlation for group scores on different tasks is positive (r = 0.28) (Table 1). Next, factor analysis of team scores yielded one factor with an initial eigenvalue accounting for more than 43% of the variance (in the middle of the 30 to 50% range typical in individual intelligence tests), whereas the next factor accounted for only 18%. Confirmatory factor analysis supported the fit of a single latent factor model with the data [χ² = 1.66, P = 0.89, df = 5; comparative fit index (CFI) = 0.99, root mean square error of approximation (RMSEA) = 0.01]. Furthermore, when the factor loadings for different tasks on the first general factor are used to calculate a c score for each group, this score strongly predicts performance on the criterion task (r = 0.52, P = 0.01). Finally, the average and maximum intelligence scores of individual group members are not significantly correlated with c [r = 0.19, not significant (ns); r = 0.27, ns, respectively] and not predictive of criterion task performance (r = 0.18, ns; r = 0.13, ns, respectively). In a regression using both individual intelligence and c to predict performance on the criterion task, c has a significant effect (β = 0.51, P = 0.001), but average individual intelligence (β = 0.08, ns) and maximum individual intelligence (β = 0.01, ns) do not (Fig. 1).
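
The first two checks described above (a positive average inter-item correlation and a dominant first factor) can be illustrated directly from the five task intercorrelations printed in Table 1. The sketch below is illustrative only: it applies a plain eigendecomposition of the correlation matrix with NumPy, which is an assumption and not necessarily the extraction method used in the study, so the variance shares it prints need not match the reported figures exactly. The variable names are hypothetical.

```python
import numpy as np

# Intercorrelations among the five Study 1 group tasks, copied from Table 1.
# Order: brainstorming, matrix reasoning, moral reasoning, shopping trip, typing.
R = np.array([
    [1.00, 0.30, 0.12, 0.21, 0.13],
    [0.30, 1.00, 0.27, 0.38, 0.50],
    [0.12, 0.27, 1.00, 0.18, 0.25],
    [0.21, 0.38, 0.18, 1.00, 0.43],
    [0.13, 0.50, 0.25, 0.43, 1.00],
])

# Average inter-item correlation: mean of the entries below the diagonal.
lower = R[np.tril_indices_from(R, k=-1)]
mean_r = lower.mean()

# Share of total variance carried by each component of the correlation matrix.
eigenvalues = np.linalg.eigvalsh(R)[::-1]  # eigvalsh is ascending; reverse it
variance_share = eigenvalues / eigenvalues.sum()

print(f"average inter-item r = {mean_r:.2f}")
print(f"first component share = {variance_share[0]:.2f}")
print(f"second component share = {variance_share[1]:.2f}")
```

The ten off-diagonal correlations sum to 2.77, so the average is 0.277, which rounds to the reported r = 0.28.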

In Study 2, we used 152 groups ranging from two to five members. Our goal was to replicate these findings in groups of different sizes, using a broader sample of tasks and an alternative measure of individual intelligence. As expected, this study replicated the findings of Study 1, yielding a first factor explaining 44% of the variance and a second factor explaining only 20%. In addition, a confirmatory factor analysis suggests an excellent fit of the single-factor model with the data (χ² = 5.85, P = 0.32, df = 5; CFI = 0.98, NFI = 0.89, RMSEA = 0.03).

Table 1. Correlations among group tasks and descriptive statistics for Study 1. n = 40 groups; *P ≤ 0.05; **P ≤ 0.001.

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 Collective intelligence (c) | | | | | | | | | |
| 2 Brainstorming | 0.38* | | | | | | | | |
| 3 Group matrix reasoning | 0.86** | 0.30* | | | | | | | |
| 4 Group moral reasoning | 0.42* | 0.12 | 0.27 | | | | | | |
| 5 Plan shopping trip | 0.66** | 0.21 | 0.38* | 0.18 | | | | | |
| 6 Group typing | 0.80** | 0.13 | 0.50** | 0.25* | 0.43* | | | | |
| 7 Avg member intelligence | 0.19 | 0.11 | 0.19 | 0.12 | 0.06 | 0.22 | | | |
| 8 Max member intelligence | 0.27 | 0.09 | 0.33* | 0.05 | 0.04 | 0.28 | 0.73** | | |
| 9 Video game | 0.52* | 0.17 | 0.38* | 0.37* | 0.39* | 0.44* | 0.18 | 0.13 | |
| Minimum | −2.67 | 9 | 2 | 32 | 10.80 | 148 | 4.00 | 8.00 | 26 |
| Maximum | 1.56 | 55 | 17 | 81 | 82.40 | 1169 | 12.67 | 15.67 | 96 |
| Mean | 0 | 28.33 | 11.05 | 57.35 | 46.92 | 596.13 | 8.92 | 11.67 | 61.80 |
| SD | 1.00 | 11.36 | 3.02 | 10.96 | 19.64 | 263.74 | 1.82 | 1.69 | 17.56 |

Fig. 1. Standardized regression coefficients for collective intelligence (c) and average individual member intelligence when both are regressed together on criterion task performance in Studies 1 and 2 (controlling for group size in Study 2). Coefficient for maximum member intelligence is also shown for comparison, calculated in a separate regression because it is too highly correlated with individual member intelligence to incorporate both in a single analysis (r = 0.73 and 0.62 in Studies 1 and 2, respectively). Error bars, mean ± SE.

In addition, for a subset of the groups in Study 2, we included five additional tasks, for a total of ten. The results from analyses incorporating all ten tasks were also consistent with the hypothesis that a general c factor exists (see Fig. 2). The scree test (13) clearly suggests that a one-factor model is the best fit for the data from both studies [Akaike Information Criterion (AIC) = 0.00 for single-factor solution]. Furthermore, parallel analysis (13) suggests that only factors with an eigenvalue above 1.38 should be retained, and there is only one such factor in each sample. These conclusions are supported by examining the eigenvalues both before and after principal axis extraction, which yields a first factor explaining 31% of the variance in Study 1 and 35% of the variance in Study 2. Multiple-group confirmatory factor analysis suggests that the factor structures of the two studies are invariant (χ² = 11.34, P = 0.66, df = 14; CFI = 0.99, RMSEA = 0.01). Taken together, these results provide strong support for the existence of a single dominant c factor underlying group performance.
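
Parallel analysis sets the retention threshold by asking how large eigenvalues become when the data contain no common factor at all. The following is a minimal sketch of that idea, assuming normally distributed random scores, a 95th-percentile cutoff, and hypothetical sample dimensions. None of these settings is specified in the text, so the threshold it produces is not expected to equal the reported 1.38.

```python
import numpy as np

def parallel_analysis_threshold(n_groups, n_tasks, n_sims=500, percentile=95, seed=0):
    """Percentile of the largest eigenvalue of correlation matrices of pure-noise data."""
    rng = np.random.default_rng(seed)
    largest = np.empty(n_sims)
    for i in range(n_sims):
        # Scores with no underlying factor: independent standard normal columns.
        noise = rng.standard_normal((n_groups, n_tasks))
        corr = np.corrcoef(noise, rowvar=False)
        largest[i] = np.linalg.eigvalsh(corr)[-1]  # eigvalsh sorts ascending
    return float(np.percentile(largest, percentile))

# Hypothetical dimensions chosen for illustration: 40 groups, 5 tasks.
threshold = parallel_analysis_threshold(n_groups=40, n_tasks=5)
print(f"retain factors with eigenvalue above {threshold:.2f}")
```

An observed factor is retained only if its eigenvalue exceeds what noise alone produces at the chosen percentile; smaller samples give higher thresholds because chance correlations are larger.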

The criterion task used in Study 2 was an architectural design task modeled after a complex research and development problem (14). We had a sample of 63 individuals complete this task working alone, and under these circumstances, individual intelligence was a significant predictor of performance on the task (r = 0.33, P = 0.009). When the same task was done by groups, however, the average individual intelligence of the group members was not a significant predictor of group performance (r = 0.18, ns). When both individual intelligence and c are used to predict group performance, c is a significant predictor (β = 0.36, P = 0.0001), but average group member intelligence (β = 0.05, ns) and maximum member intelligence (β = 0.12, ns) are not (Fig. 1).
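
Standardized coefficients such as the β values above are obtained by regressing a z-scored outcome on z-scored predictors. The sketch below shows the mechanics on synthetic data. The data-generating numbers, the variable names, and the use of ordinary least squares through NumPy are all assumptions made for illustration; this is not the study data or the authors' analysis code.

```python
import numpy as np

def standardized_betas(predictors, outcome):
    """OLS coefficients after z-scoring every variable (so no intercept is needed)."""
    X = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0, ddof=1)
    y = (outcome - outcome.mean()) / outcome.std(ddof=1)
    betas, *_ = np.linalg.lstsq(X, y, rcond=None)
    return betas

# Synthetic groups: the criterion depends on c, not directly on average intelligence.
rng = np.random.default_rng(1)
n_groups = 152
c_score = rng.standard_normal(n_groups)
avg_iq = 0.2 * c_score + rng.standard_normal(n_groups)
criterion = 0.6 * c_score + rng.standard_normal(n_groups)

beta_c, beta_iq = standardized_betas(np.column_stack([c_score, avg_iq]), criterion)
print(f"beta for c = {beta_c:.2f}, beta for average intelligence = {beta_iq:.2f}")
```

Because both predictors enter the same model, each coefficient reflects the contribution of that predictor with the other held constant, which is the comparison Fig. 1 reports.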

If c exists, what causes it? Combining the findings of the two studies, the average intelligence of individual group members was moderately correlated with c (r = 0.15, P = 0.04), and so was the intelligence of the highest-scoring team member (r = 0.19, P = 0.008). However, for both studies, c was still a much better predictor of group performance on the criterion tasks than the average or maximum individual intelligence (Fig. 1).

We also examined a number of group and individual factors that might be good predictors of c. We found that many of the factors one might have expected to predict group performance, such as group cohesion, motivation, and satisfaction, did not.

However, three factors were significantly correlated with c. First, there was a significant correlation between c and the average social sensitivity of group members, as measured by the “Reading the Mind in the Eyes” test (15) (r = 0.26, P = 0.002). Second, c was negatively correlated with the variance in the number of speaking turns by group members, as measured by the sociometric badges worn by a subset of the groups (16) (r = −0.41, P = 0.01). In other words, groups where a few people dominated the conversation were less collectively intelligent than those with a more equal distribution of conversational turn-taking.
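
The turn-taking measure is the variance of speaking-turn counts across the members of a group, so lower values mean a more even conversation. A small illustration with made-up turn counts follows. The numbers and the function name are hypothetical, and the text does not say whether the sample or the population variance was used; this sketch uses the sample variance.

```python
from statistics import variance

def turn_variance(turn_counts):
    """Sample variance of the number of speaking turns taken by each member."""
    return variance(turn_counts)

balanced_group = [20, 22, 18]   # everyone speaks about equally often
dominated_group = [50, 6, 4]    # one member takes most of the turns

print(turn_variance(balanced_group))
print(turn_variance(dominated_group))
```

Both hypothetical groups take 60 turns in total, so the difference in variance reflects only how the turns are distributed, not how much the group talked.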

Finally, c was positively and significantly correlated with the proportion of females in the group (r = 0.23, P = 0.007). However, this result appears to be largely mediated by social sensitivity (Sobel z = 1.93, P = 0.03), because (consistent with previous research) women in our sample scored better on the social sensitivity measure than men [t(441) = 3.42, P = 0.001]. In a regression analysis with the groups for which all three variables (social sensitivity, speaking turn variance, and percent female) were available, all had similar predictive power for c, although only social sensitivity reached statistical significance (β = 0.33, P = 0.05) (12).
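
The Sobel statistic tests whether an indirect path (here, proportion of females → social sensitivity → c) differs from zero. It divides the product of the two path coefficients by an approximate standard error of that product. The sketch below implements the standard formula; the path coefficients and standard errors passed to it are hypothetical, since the text reports only the resulting z and P.

```python
from math import erfc, sqrt

def sobel_z(a, se_a, b, se_b):
    """Sobel z for the indirect effect a*b.

    a: effect of the predictor on the mediator, with standard error se_a.
    b: effect of the mediator on the outcome, with standard error se_b.
    """
    return (a * b) / sqrt(b**2 * se_a**2 + a**2 * se_b**2)

def one_tailed_p(z):
    """Upper-tail probability of a standard normal variate."""
    return 0.5 * erfc(z / sqrt(2))

# Hypothetical path coefficients and standard errors, for illustration only.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"Sobel z = {z:.2f}, one-tailed p = {one_tailed_p(z):.4f}")

# Upper-tail probability for the z value reported in the text.
print(f"p for z = 1.93: {one_tailed_p(1.93):.3f}")
```

The upper-tail probability for z = 1.93 is about 0.027, which is consistent with the reported P = 0.03 under a one-tailed convention; the text itself does not state which convention was used.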

These results provide substantial evidence for the existence of c in groups, analogous to a well-known similar ability in individuals. Notably, this collective intelligence factor appears to depend both on the composition of the group (e.g., average member intelligence) and on factors that emerge from the way group members interact when they are assembled (e.g., their conversational turn-taking behavior) (17, 18).

These findings raise many additional questions. For example, could a short collective intelligence test predict a sales team's or a top management team's long-term effectiveness? More importantly, it would seem to be much easier to raise the intelligence of a group than an individual. Could a group's collective intelligence be increased by, for example, better electronic collaboration tools?

Many previous studies have addressed questions like these for specific tasks, but by measuring the effects of specific interventions on a group's c, one can predict the effects of those interventions on a wide range of tasks. Thus, the ability to measure collective intelligence as a stable property of groups provides both a substantial economy of effort and a range of new questions to explore in building a science of collective performance.

References and Notes
1. S. Wuchty, B. F. Jones, B. Uzzi, Science 316, 1036 (2007).
2. T. Gowers, M. Nielsen, Nature 461, 879 (2009).
3. J. R. Hackman, Leading Teams: Setting the Stage for Great Performances (Harvard Business School Press, Boston, 2002).
4. I. J. Deary, Looking Down on Human Intelligence: From Psychometrics to the Brain (Oxford Univ. Press, New York, 2000).
5. J. R. Hackman, C. G. Morris, in Small Groups and Social Interaction, Volume 1, H. H. Blumberg, A. P. Hare, V. Kent, M. Davies, Eds. (Wiley, Chichester, UK, 1983), pp. 331–345.
6. J. E. McGrath, Groups: Interaction and Performance (Prentice-Hall, Englewood Cliffs, NJ, 1984).
7. C. Spearman, Am. J. Psychol. 15, 201 (1904).
8. C. F. Chabris, in Integrating the Mind: Domain General Versus Domain Specific Processes in Higher Cognition, M. J. Roberts, Ed. (Psychology Press, Hove, UK, 2007), pp. 449–491.
9. C. Brand, The g Factor (Wiley, Chichester, UK, 1996).
10. D. J. Devine, J. L. Philips, Small Group Res. 32, 507 (2001).
11. R. R. McCrae, P. T. Costa Jr., J. Pers. Soc. Psychol. 52, 81 (1987).
12. Materials and methods are available as supporting material on Science Online.
13. R. B. Cattell, Multivariate Behav. Res. 1, 245 (1966).
14. A. W. Woolley, Organ. Sci. 20, 500 (2009).
15. S. Baron-Cohen, S. Wheelwright, J. Hill, Y. Raste, I. Plumb, J. Child Psychol. Psychiatry 42, 241 (2001).
16. A. Pentland, Honest Signals: How They Shape Our World (Bradford Books, Cambridge, MA, 2008).
17. L. K. Michaelsen, W. E. Watson, R. H. Black, J. Appl. Psychol. 74, 834 (1989).
18. R. S. Tindale, J. R. Larson, J. Appl. Psychol. 77, 102 (1992).
19. This work was made possible by financial support from the National Science Foundation (grant IIS-0963451), the Army Research Office (grant 56692-MA), the Berkman Faculty Development Fund at Carnegie Mellon University, and Cisco Systems, Inc., through their sponsorship of the MIT Center for Collective Intelligence. We would especially like to thank S. Kosslyn for his invaluable help in the initial conceptualization and early stages of this work and I. Aggarwal and W. Dong for substantial help with data collection and analysis. We are also grateful for comments and research assistance from L. Argote, E. Anderson, J. Chapman, M. Ding, S. Gaikwad, C. Huang, J. Introne, C. Lee, N. Nath, S. Pandey, N. Peterson, H. Ra, C. Ritter, F. Sun, E. Sievers, K. Tenabe, and R. Wong. The hardware and software used in collecting sociometric data are the subject of an MIT patent application and will be provided for academic research via a not-for-profit arrangement through A.P. In addition to the affiliations listed above, T.W.M. is also a member of the Strategic Advisory Board at InnoCentive, Inc.; a director of Seriosity, Inc.; and chairman of Phios Corporation.

Supporting Online Material
www.sciencemag.org/cgi/content/full/science.1193147/DC1
Materials and Methods
Tables S1 to S4
References
2 June 2010; accepted 10 September 2010
Published online 30 September 2010;
10.1126/science.1193147
Include this information when citing this paper.

Fig. 2. Scree plot demonstrating the first factor from each study accounting for more than twice as much variance as subsequent factors. Factor analysis of items from the Wonderlic Personnel Test of individual intelligence administered to 642 individuals is included as a comparison.
... Research has shown psychological safety predicts both team learning and individual learning, especially from failure (Bresman and Zellmer-Bruhn, 2013;Edmondson and Lei, 2014;Newman et al., 2017;Edmondson, 2019). Psychologically safe practices, such as social sensitivity and even or equitable turn-taking, are important conditions that can help build collective intelligence, which can support team learning (Woolley et al., 2010). ...
... Iteration of ideas in response to frequent and diverse feedback is known to aid innovation (Ulibarri et al., 2019). Teams are generally more effective when they leverage the strengths of the entire group, not just a few individuals (Woolley et al., 2010;Edmondson, 2019). When one individual is struggling to complete their task and does not ask for help, this may ultimately set back the entire team by weakening or delaying the scientific project. ...
... One often overlooked practice is even turn-taking to ensure that one person does not dominate the conversation (Woolley et al. 2010;Duhigg, 2016). While this might sound simple, "collaborative" research efforts where only one or two people talk most of the time are far too common, especially among groups of mixed career stage. ...
Article
Full-text available
Science is increasingly dependent on large teams working well together. Co-creating knowledge in this way, usually across disciplines and institutions, requires team members to feel comfortable taking interpersonal risks with each other; in other words, to have what is known as “psychological safety”. Although the importance of psychological safety for team functioning is increasingly well understood, the behaviours necessary to foster psychological safety are harder to define. We suggest that science facilitation expertise offers a path forward for scientific teams—particularly through the integration of outside facilitators or team members taking on the facilitation role—to identify dynamics that can promote or curtail psychological safety, interpret those dynamics accurately, and intervene appropriately to shift a group towards greater psychological safety. We describe how specific practices can support this cycle of observation, interpretation, and action to promote psychological safety across the team process and at key moments. We conclude with ideas for how research teams might embed these facilitation practices into their work, and how institutions can drive more widespread recognition and development of the expertise needed to cultivate psychologically safe scientific teams.
... In the context of LLMs, this theory suggests that different LLM architectures (e.g., GPT-4, Claude-3, LLaMA, Gemini) offer varied approaches to problem-solving, potentially leading to more robust solutions Hong and Page [2004]. It informs our approach to combining individual model outputs through structured consensus formation Woolley et al. [2010]. Additionally, each model's independent processing of questions helps maintain solution diversity and reduces cascading errors Vercammen et al. [2019]. ...
... Integrating collective intelligence, distributed cognition, and consensus formation frameworks provides a comprehensive foundation for analyzing collaborative AI systems Hutchins [1995], Woolley et al. [2010]. This integration can be formally expressed as: ...
Preprint
Full-text available
We explore the collaborative dynamics of an innovative language model interaction system involving advanced models such as GPT-4-0125-preview, Meta-LLaMA-3-70B-Instruct, Claude-3-Opus, and Gemini-1.5-Flash. These models generate and answer complex, PhD-level statistical questions without exact ground-truth answers. Our study investigates how inter-model consensus enhances the reliability and precision of responses. By employing statistical methods such as chi-square tests, Fleiss' Kappa, and confidence interval analysis, we evaluate consensus rates and inter-rater agreement to quantify the reliability of collaborative outputs. Key results reveal that Claude and GPT-4 exhibit the highest reliability and consistency, as evidenced by their narrower confidence intervals and higher alignment with question-generating models. Conversely, Gemini and LLaMA show more significant variability in their consensus rates, as reflected in wider confidence intervals and lower reliability percentages. These findings demonstrate that collaborative interactions among large language models (LLMs) significantly improve response reliability, offering novel insights into autonomous, cooperative reasoning and validation in AI systems.
... Amalio A. Rey en "El libro de la inteligencia colectiva: ¿qué ocurre cuando hacemos cosas juntos?" define el término de inteligencia colectiva como aquella que surge de las personas que hacen cosas juntas. Otra definición más precisa nos la proporciona Woolley, A.C quien la define como "el tipo de Inteligencia que surge cuando un elevado número de individuos trabaja de manera colaborativa en un mismo esfuerzo intelectual" (Woolley et al., 2010). Para ello se tienen que dar tres condiciones: un grupo (dos o más personas que interaccionen), una agregación de mecanismos que combinen las contribuciones individuales para convertirlas en un resultado ...
Conference Paper
Full-text available
The lack of references and sustainability strategies in urban transformation projects in vulnerable fabrics highlights the need to create a good practice manual. In view of this, a pilot experience is presented whose ultimate aim will be to create a collective imaginary: a catalogue of concepts and tools for the improvement of the urban habitat in these fabrics. To this end, the experimental work carried out by undergraduate students (Projects 1 and 2) in the first four-month period will be incorporated into the Master's classrooms as intuitive approaches. The different strategies proposed, grouped into labels and applied to the different places of work (Favela Rocinha in Brazil and Barrio de San Bernardo in Bogotá), make up the sought-after manual of good practices. All of this is supported by the use of analogue and digital tools, as well as by a classroom model based on the workshop structure. La falta de referencias para estrategias de sostenibilidad en los proyectos de transformación urbana sobre tejidos vulnerables, pone de manifiesto la necesidad de elaborar un manual de buenas prácticas. En este artículo presentamos una experiencia piloto cuyo último fin será crear un imaginario colectivo: un catálogo de conceptos y herramientas para la mejora del hábitat urbano en estos tejidos. A partir del trabajo experimental realizado por los alumnos de Grado en Proyectos 1 y 2 en el primer cuatrimestre, este trabajo se incorporará a las aulas del Máster como aproximaciones intuitivas. Las diferentes estrategias propuestas en ambos talleres, agrupadas en etiquetas y aplicadas a los diferentes lugares de trabajo (Favela Rocinha en Río y Barrio de San Bernardo en Bogotá), conformarán la base del manual de buenas prácticas completado con el empleo de herramientas análogicas y digitales, así como por un modelo de aula basado en la estructura del taller.
... And how can we prevent colleagues from feeling angry or hurt, and avoid damaged ties or even retaliation, when we prioritize synchrony/reciprocity with others over them? Thus, our paper answers calls for research to explain how employees collectively attend to one another in ways that are productive for them, for their relationships, and for the organization (Gupta & Woolley, 2021; Marks, DeChurch, Mathieu, Panzer, & Alonso, 2005; Ocasio, 1997; O'Leary et al., 2011; Woolley, Chabris, Pentland, Hashmi, & Malone, 2010; Zaccaro et al., 2012). We build on substantial micro- and macro-level research on attention but focus our theorizing on the meso level and bridge it to other relevant work. ...
Article
Full-text available
As rapid organizational and technological change makes boundaries within workplaces more permeable, employees are gaining unprecedented access to new people and information. This both increases opportunities for collaboration and heightens the risk of attention overload. While scholars have investigated overload with respect to “what” employees attend to, little research has examined the challenges concerning “whom” to attend to, resulting in ambiguity that can undermine collaborative relationships. In this paper, we integrate and advance insights from organizational control and selective-attention research, building on those macro- and micro-level theories to better conceptualize collective attention when the potential target is a colleague (human) rather than information (nonhuman)—which we conceptualize as relational attention, i.e., attention-to-whom. Further, we propose a separate, meso-level theory of transactive control of relational attention, building on concepts of transactive behavior from other fields. By exploring how such transactive control works, we begin to define the conditions organizations need to cultivate—regarding mutually transparent availability, synchronous attentional allocation, and reciprocal attentional allocation—to reduce relational overload without sacrificing productive work relationships or other benefits of more permeable internal boundaries. In addition to shedding light on underexamined attention problems in the workplace, this model contributes to future research by forging multi-level connections between individual meta-attention, transactive control over relational attention, and more traditional forms of organizational control.
Chapter
Sustainability is a catchword of the present generation. The evolution of sustainability through the sustainable development model of the Brundtland report and the Sustainable Development Goals (SDGs) affirms that the present generation is not only focused on its developmental needs but also concerned about the survival needs of future generations. This constructive and progressive vision emerged from the positive mentality of many people, affirming that the existence of collective intelligence (CI) is advantageous to society. CI holds that open dialogue, mutual sharing, and collaborative efforts can work wonders in human existence. Human coexistence becomes more meaningful with CI. It has also shaped the idea of corporate social responsibility (CSR), which stands for firms' responsibility towards society. CSR also tries to share a portion of the economic benefits with the marginalized sections of society. From this perspective, the chapter analyzes the role of CI-guided CSR in establishing a sustainable society.
Article
Full-text available
Objectives: This study aimed to determine the clinical utility of the androgen deprivation therapy (ADT)+docetaxel (DOCE)+androgen receptor-targeted agent (ARTA) triplet therapy in patients with metastatic hormone-sensitive prostate cancer (mHSPC) in the UK.
Design: A modified Delphi method. A steering group of eight UK healthcare professionals experienced in prostate cancer care discussed treatment challenges, developing 39 consensus statements across four topics. Agreement with the statements was tested with a broader panel of professionals within this therapeutic area in the UK through an anonymous survey, using a four-point Likert scale. This was distributed by the steering group members and an independent third party. Following the survey, the steering group convened to discuss the results and formulate recommendations.
Setting: The steering group convened online for discussions. The survey was distributed via email by the clinicians and the independent third party.
Participants: Healthcare professionals involved in the provision of prostate cancer care, working in relevant professional roles (oncology, urology or geriatric consultant, oncology nurse specialist, and hospital pharmacist) within the UK. No patients or members of the public were involved within the study.
Interventions: None.
Primary and secondary outcome measures: Consensus was defined as high (≥75% agreement) and very high (≥90% agreement).
Results: Responses were received from 120 healthcare professionals, including oncologists (n=73), urologists (n=16), geriatricians (n=15), nurse specialists (n=11) and hospital pharmacists (n=5). Consensus was reached for 37 out of 39 (95%) statements, and 27/39 (69%) statements achieved very high agreement ≥90%. Consensus was not reached for 2/39 (5%) statements.
Conclusions: Based on the consensus observed, the steering group developed a set of recommendations for the clinical utility of ADT+DOCE+ARTA in treating patients with mHSPC in the UK. Following these recommendations enables clinicians to identify appropriate patients with mHSPC for triplet treatment, thereby improving patients' outcomes.
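The consensus thresholds stated in this abstract (high at ≥75% agreement, very high at ≥90%) can be illustrated with a minimal sketch. The function name `consensus_level` and the idea of counting "agreeing" responses as a single number per statement are assumptions made for illustration; the abstract does not say how the four-point Likert responses were collapsed into agreement.

```python
def consensus_level(agree: int, total: int,
                    high: float = 0.75, very_high: float = 0.90) -> str:
    """Classify one statement by its share of agreeing respondents.

    The thresholds come from the abstract; how 'agree' is derived from the
    four-point Likert scale is an assumption left to the caller.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= agree <= total:
        raise ValueError("agree must be between 0 and total")
    share = agree / total
    if share >= very_high:
        return "very high"
    if share >= high:
        return "high"
    return "no consensus"


# Hypothetical counts for three statements answered by 120 respondents.
for agree in (114, 96, 70):
    print(agree, consensus_level(agree, 120))
```

With these hypothetical counts, the three statements fall into the very high, high, and no-consensus bands respectively.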
Article
Full-text available
This study reports the results of several meta-analyses examining the relationship between four operational definitions of cognitive ability within teams (highest member score, lowest member score, mean score, standard deviation of scores) and team performance. The three indices associated with level yielded moderate and positive sample-weighted estimates of the population relationship (.21 to .29), but sampling error failed to account for enough variation to rule out moderator variables. In contrast, the index associated with dispersion (i.e., standard deviation of member scores) was essentially unrelated to team performance (-.03), and sampling error provided a plausible explanation for the observed variation across studies. A subgroup analysis revealed that mean cognitive ability was a much better predictor of team performance in laboratory settings (.37) than in field settings (.14). Study limitations, practical implications, and future research directions are discussed.
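The four operational definitions named in this abstract (highest member score, lowest member score, mean score, standard deviation of scores) can be sketched for a single team as follows. The function name, the example scores, and the choice of the sample (n-1) standard deviation are assumptions for illustration; the abstract does not specify which standard deviation formula the underlying studies used.

```python
from statistics import mean, stdev


def team_ability_indices(scores):
    """Return the four team-level cognitive ability indices for one team."""
    if len(scores) < 2:
        raise ValueError("need at least two member scores")
    return {
        "highest": max(scores),   # level index: most able member
        "lowest": min(scores),    # level index: least able member
        "mean": mean(scores),     # level index: team average
        "sd": stdev(scores),      # dispersion index (sample SD, an assumption)
    }


# Hypothetical member scores for a four-person team.
indices = team_ability_indices([98, 105, 112, 121])
print(indices)
```

The first three values describe the level of ability in the team, while the fourth describes its dispersion, which is the distinction the meta-analysis draws when comparing their relationships with team performance.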
Article
Full-text available
Nearly all research on the accuracy of individual versus group decision making has used ad hoc groups, artificial problems, and trivial or nonexistent reward contingencies. These studies have generally concluded that the knowledge base of the most competent group member appears to be the practical upper limit of group performance and that process gains will rarely be achieved. We studied individual versus group decision making by using data from 222 project teams, ranging in size from 3 to 8 members. These teams were engaged in solving contextually relevant and consequential problems and, in direct contrast with previous research, the groups outperformed their most proficient group member 97% of the time. Furthermore, 40% of the process gains could not be explained by either average or most knowledgeable group member scores. Implications for management practice are also discussed.
Article
Full-text available
L. K. Michaelsen et al. (see record 1990-04483-001) argue that, by using experienced groups working on relevant tasks with real rewards, an assembly bonus effect (group performance that is better than the performance of any individual group member or any combination of individual member efforts [B. E. Collins and H. Guetzkow, 1964]) was demonstrated. Using computer simulations based on the Michaelsen et al. findings, the authors argue that it is highly unlikely that an assembly bonus effect was found and that the results are typical of those obtained in standard laboratory experiments on group problem solving.
Article
Part I: Teams
Chapter 1: The Challenge
Part II: Enabling Conditions
Chapter 2: A Real Team
Chapter 3: Compelling Direction
Chapter 4: Enabling Structure
Chapter 5: Supportive Context
Chapter 6: Expert Coaching
Part III: Opportunities
Chapter 7: Imperatives for Leaders
Chapter 8: Thinking Differently About Teams
Article
Michaelsen, Watson, and Black (1989) argued that, by using experienced groups working on relevant tasks with real rewards, they were able to demonstrate an assembly bonus effect (Collins & Guetzkow, 1964): group performance that is better than the performance of any individual group member or any combination of individual member efforts. Using computer simulations based on Michaelsen et al.'s findings and some recent data collected under circumstances similar to those used by Michaelsen et al., we demonstrate that it is highly unlikely that they found an assembly bonus effect and that their results are typical of those obtained in standard laboratory experiments on group problem solving.