Speaking out of turn: How video conferencing reduces vocal synchrony and collective intelligence


Maria Tomprou¹*, Young Ji Kim², Prerna Chikersal³, Anita Williams Woolley¹, Laura A. Dabbish³
1Tepper School of Business, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of
America, 2Department of Communication, University of California, Santa Barbara, Santa Barbara, California,
United States of America, 3Human-Computer Interaction Institute, Carnegie Mellon University, Pittsburgh,
Pennsylvania, United States of America
Abstract

Collective intelligence (CI) is the ability of a group to solve a wide range of problems. Synchrony in nonverbal cues is critically important to the development of CI; however, extant
findings are mostly based on studies conducted face-to-face. Given how much collaboration
takes place via the internet, does nonverbal synchrony still matter and can it be achieved
when collaborators are physically separated? Here, we hypothesize and test the effect of
nonverbal synchrony on CI that develops through visual and audio cues in physically-sepa-
rated teammates. We show that, contrary to popular belief, the presence of visual cues sur-
prisingly has no effect on CI; furthermore, teams without visual cues are more successful in
synchronizing their vocal cues and speaking turns, and when they do so, they have higher
CI. Our findings show that nonverbal synchrony is important in distributed collaboration and
call into question the necessity of video support.
In order to survive, members of social species need to find ways to coordinate and collaborate
with each other [1]. Over a number of decades, scientists have come to study the collaboration
ability of collectives within a framework of collective intelligence, exploring the mechanisms
that enable groups to effectively collaborate to accomplish a wide variety of functions [2–6].
Recent research demonstrates that, like other species, human groups exhibit “collective
intelligence” (CI), defined as a group’s ability to solve a wide range of problems [2,3]. As
humans are a more cerebral species, researchers have thought that their group performance
depends largely on verbal communication and a high investment of time in interpersonal rela-
tionships that foster the development of trust and attachment [7,8]. However, more recent
research on collective intelligence in human groups illustrates that it forms rather quickly [2],
is partially dependent on members’ ability to pick up on subtle, nonverbal cues [9–11], and is
strongly associated with teams’ ability to engage in tacit coordination, or coordination without verbal communication [12].

Citation: Tomprou M, Kim YJ, Chikersal P, Woolley AW, Dabbish LA (2021) Speaking out of turn: How video conferencing reduces vocal synchrony and collective intelligence. PLoS ONE 16(3): e0247655.
Editor: Marcus Perlman, University of Birmingham
Received: August 5, 2020; Accepted: February 10, 2021; Published: March 18, 2021
Peer Review History: PLOS recognizes the benefits of transparency in the peer review process; therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles.
Copyright: © 2021 Tomprou et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data of the study are publicly available.
Funding: This material is based upon work supported by the National Science Foundation under grant numbers CNS-1205539 (L.D.), OAC-1322278 (A.W.), and OAC-1322254 (A.W.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.

This suggests that there is likely a so-called deep structure to CI in
human groups, with nonverbal and physiological underpinnings [12,13], just as is the case in
other social species [14,15].
Existing research suggests that nonverbal cues, and their synchronization, play an important role in human collaboration and CI [10]. Nonverbal cues encompass all the messages other than words that people exchange in interactive contexts. Researchers consider
nonverbal cues more reliable than verbal cues in conveying emotion and relational messages
[16] and find that nonverbal cues are important for regulating the communication pace and
flow between interacting partners [17,18]. The literature on interpersonal coordination
explores many forms of synchrony [19,20], but the common view is that synchrony is
achieved when two or more nonverbal cues or behaviors are aligned [21,22]. Social psychology
researchers traditionally study synchrony in terms of body movements, such as leg movements
[23], body posture sway [24,25], finger tapping [26] and dancing [27]. These forms of syn-
chrony contribute to interpersonal liking, cohesion, and coordination in relatively simple tasks
[28,29]. Synchrony in facial muscle activity [30] and prosodic cues such as vocal pitch and
voice quality [31–33] are of particular importance for the coordination of interacting group
members, as these facilitate both communication and interpersonal closeness. For example,
synchrony in facial cues has been consistently found to indicate partners’ liking for each other
and cohesion [30].
While humans in general tend to synchronize with others, interaction partners also vary in
the level of synchrony they achieve. The level of synchrony in a group can be influenced by the
qualities of existing relationships [34] but can also be influenced by the characteristics of indi-
vidual team members; for instance, individuals who are more prosocial [35] and more atten-
tive to social cues [10,36] are more likely to achieve synchrony and cooperation with
interaction partners. And, consistent with the link between synchrony and cooperation, recent
studies demonstrate that greater synchrony in teams is associated with better performance.

Among the elements that nonverbal cues coordinate is spoken communication, particularly
conversational speaking turns, wherein partners regulate nonverbal cues to signal their inten-
tion to maintain or yield turns [39]. Conversational turn-taking has fairly primitive origins,
being observed in other species and emerging in infants prior to linguistic competence, and is
evident in different spoken languages around the world [40]. The equality with which interaction partners speak varies, however, and partners with more equal speaking turns consistently exhibit higher collective intelligence [2,11]. The negative effect of speaking inequality on collective intelligence has been demonstrated in both face-to-face and online interactions.
The majority of existing studies on synchrony were conducted in face-to-face environments
[20,30,41] and focused on the relationship between synchrony and cohesion. We have a lim-
ited understanding of how synchrony relates to collective intelligence, particularly when group
members are not collocated and collaborate on an ad hoc basis, a form of modern organization
that has become increasingly common [42,43]. Given the exponential growth in the use of
technology to mediate human relationships [44,45], an important question is whether syn-
chrony in common, nonverbal communication cues in face-to-face interaction, such as facial
expression and tone of voice, still plays a role in human problem-solving and collaboration in
mediated contexts, and how the role of different cues changes based on the communication
medium used.
Researchers and managers alike assume that the closer a technology-mediated interaction is
to face-to-face interaction–by including the full range of nonverbal cues (e.g., visual, audio,
physical environment)–the better it will be at fostering high quality collaboration [46–48]. The
idea that having more cues available helps collaborators bridge distance is strongly represented
in both the management literature [49,50] and lay theory [51]. However, some empirical
research suggests that visual cue availability may not always be superior to audio cues alone. In
the absence of visual cues, communicators can effectively compensate, seek social information,
and develop relationships in technology-mediated environments [52–55]. Indeed, in some
cases, task-performing groups find their partners more satisfactory and trustworthy in audio-
only settings than in audiovisual settings [56,57], suggesting that visual cues may serve as dis-
tractors in some conditions.
Purpose of the study and hypotheses
The primary goal of this research is to understand whether physically distributed collaborators
develop nonverbal synchrony, and how variation in audio-visual cue availability during collab-
oration affects nonverbal synchrony and collective intelligence. Specifically, we test whether
nonverbal synchrony–an implicit signal of coordination–is a mechanism regulating the effect
of communication technologies on collective intelligence. Previous research defines nonverbal
synchrony as any type of synchronous movement and vocalization that involves the matching
of actions in time with others [23]. This study focuses on two types of nonverbal synchrony
that are particularly relevant to the quality of communication and are available through virtual
collaboration and interaction–namely, facial expression and prosodic synchrony. We hypothe-
size that in environments where people have access to both visual and audio cues, collective
intelligence will develop through facial expression synchrony as a coordination mechanism.
When visual cues are absent, however, we anticipate that interacting partners will reach higher
levels of collective intelligence through prosodic synchrony. It will also be interesting to see if
facial expression synchrony develops and affects collective intelligence even in the absence of
visual cues; if this occurs, it would suggest that this type of synchrony forms, at least in part,
based on similarity in partners’ internal reactions to shared experiences, versus simply as reac-
tions to partner’s facial expressions. If facial expression synchrony is important for CI only
when partners see each other, it would suggest that the expressions play a predominantly social
communication role under those conditions, and the joint attention of partners to these signals
is an indicator of the quality of their communication. To explore these predictions, we con-
ducted an experiment where we utilized two different conditions of distributed collaboration,
one with no video access to collaboration partners (Condition 1) and one with video access
(Condition 2) to disentangle how the types of cues available affect the type of synchrony that
forms and its implications for collective intelligence.
Materials and methods

Participant recruitment and data collection
Our sample included 198 individuals (99 dyads; 49 in Condition 1 and 50 in Condition 2). We recruited 292 individuals from a research participation pool of a northeastern university in the United States and randomly assigned them into 146 dyads (59 in Condition 1 and 87 in Condition 2).
Due to technical problems with audio recording, ten dyads had missing audio data in Condition 1 and 37 dyads in Condition 2, resulting in 62% valid responses. To test for possible bias
introduced by missing data, we conducted independent sample t-tests to assess any differences
in demographics between the dyads retained and those we excluded due to technical difficulties; no differences were detected (see S1 Appendix). All participants signed an informed consent form.
The average age in the sample was 24.82 years (SD = 7.18 years); ninety-six participants (48.7%) were female. The ethnic composition of our sample was diverse: 6.6% from mixed or other races, 50% Asian or Pacific Islander, 33% White or Caucasian, 7% Black or African American,
2.5% Latino or Hispanic. Carnegie Mellon University’s Institutional Review Board approved all
materials and procedures in our study. The participant shown in Fig 1 provided written informed consent to publish their case details.
The procedure was the same in both conditions, except that in Condition 1 there was no
camera and participants could only hear each other through an audio connection. In Condi-
tion 2, participants could also see each other through a video connection. Both conditions had
approximately equal numbers of dyads in terms of gender composition (i.e., no-female, one-female, and all-female dyads). Each session lasted about 30 minutes. Members of each dyad were
seated in two separate rooms. After participants completed the pre-test survey independently,
they initiated a conference call with their partner. Participants logged onto the Platform for Online Group Studies (POGS), a web browser-based platform supporting synchronous multiplayer interaction, to complete the Test of Collective Intelligence (TCI) with
their partner [2,11]. The TCI contained six tasks ranging from 2 to 6 minutes each, and
instructions were displayed before each task for 15 seconds to 1.5 minutes. At the end of the
test, participants were instructed to sign off the conference call. Participants were then compensated and debriefed. A laboratory protocol for the study has been published with a DOI.
Collective intelligence. Collective intelligence was measured using the Test of Collective
Intelligence (TCI) completed by dyads working together. The TCI is an online version of the
collective intelligence battery of tests used by [2], which contains a wide range of group tasks
[11,58]. The TCI was adapted into an online tool to allow researchers to administer the test in
a standardized way, even when participants are not collocated. Participants completed six
tasks representing a variety of group processes (e.g., generating, deciding, executing, remem-
bering) in a sequential order (see study’s protocol). To obtain collective intelligence scores for
all dyads, we first scored each of the six tasks and then standardized the raw task scores. We
then computed an unweighted mean of the six standardized scores, a method adapted from prior research on collective intelligence [58]. Cronbach’s alpha for the reliability of the TCI scores was .81.

Fig 1. This flowchart illustrates the methodology used to transform the raw data of each participant into individual signals or measures from which synchrony and spoken communication features are calculated.
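The scoring procedure just described (standardize each task’s raw scores across dyads, then take an unweighted mean per dyad) can be sketched as follows. This is an illustrative reconstruction, not the authors’ code, and the raw scores below are invented:

```python
import statistics

def collective_intelligence_scores(raw_scores):
    """raw_scores: one row per dyad, one column per task.
    Standardize each task's scores across dyads (z-scores), then return
    each dyad's unweighted mean of its standardized task scores."""
    n_tasks = len(raw_scores[0])
    z_columns = []
    for t in range(n_tasks):
        col = [dyad[t] for dyad in raw_scores]
        mean, sd = statistics.mean(col), statistics.pstdev(col)
        z_columns.append([(score - mean) / sd for score in col])
    # CI score = unweighted mean of a dyad's standardized task scores
    return [statistics.mean(z_columns[t][d] for t in range(n_tasks))
            for d in range(len(raw_scores))]
```

By construction, the resulting CI scores have mean zero across dyads, matching the descriptive statistics reported in Table 1.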
Facial expressions. We used OpenFace [59] to automatically detect facial movements in
each frame, based on the Facial Action Coding System (FACS). We categorized these facial movements as positive (AU12, lip corner puller, with or without AU6, cheek raiser), negative (AU15, lip corner depressor, together with AU1, inner brow raiser, and/or AU4, brow lowerer), or other expressions (everything else, occurring at low frequency and possibly at random).
Facial expression synchrony of the dyad is a variable encoding the synchrony between the
coded facial expression signals of the partners.
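The categorization can be read as a simple per-frame mapping from active Action Units to expression labels. The sketch below is one interpretation of that rule (in particular, the pairing of AU15 with AU1 and/or AU4 for negative expressions is our reading), not the authors’ implementation:

```python
def code_expression(aus):
    """Map the set of active Action Units (ints) in one video frame to a
    coarse expression label: AU12 (lip corner puller), with or without
    AU6 (cheek raiser), codes positive; AU15 (lip corner depressor)
    together with AU1 (inner brow raiser) and/or AU4 (brow lowerer)
    codes negative; everything else codes other."""
    if 12 in aus:
        return "positive"
    if 15 in aus and (1 in aus or 4 in aus):
        return "negative"
    return "other"
```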
Prosodic features. Prosodic characteristics of speech contribute to linguistic functions
such as intonation, tone, stress, and rhythm. We used OpenSMILE [60] to extract 16 prosodic
features over time from the audio recording of each participant. These features included pitch,
loudness, and voice quality, as well as the frame-to-frame differences (deltas) between them.
We conducted principal components analysis with varimax rotation and used the first factor
extracted, which accounted for 55.87% of the variance in the data. The first factor included
four prosodic features: pitch, jitter, shimmer, and harmonics-to-noise ratio. Pitch is the funda-
mental frequency (or F0); jitter, shimmer, and harmonics-to-noise ratio are the three features
that index voice quality [61]. Jitter describes pitch variation in voice, which is perceived as
sound roughness. Shimmer describes the fluctuation of loudness in the voice. Harmonics-to-
noise ratio captures perceived hoarseness. Previous research has also identified these features
as important in predicting quality in social interactions [62]. All features were normalized
using z-scores to account for individual differences in range. Speaker diarization was not
needed, as the speech of each participant was recorded in separate files.
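A minimal sketch of this normalization-and-reduction step, assuming a per-participant matrix of frame-level prosodic features: each feature is z-scored and the frames are projected onto the first principal component. For brevity this uses a plain eigendecomposition and omits the varimax rotation applied in the study:

```python
import numpy as np

def first_prosodic_component(frames):
    """frames: (n_frames, n_features) array of prosodic features for one
    participant (e.g., pitch, jitter, shimmer, harmonics-to-noise ratio).
    Z-score each feature to remove individual differences in range, then
    project the frames onto the first principal component."""
    z = (frames - frames.mean(axis=0)) / frames.std(axis=0)
    cov = np.cov(z, rowvar=False)            # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    pc1 = eigvecs[:, -1]                     # direction of maximum variance
    explained = eigvals[-1] / eigvals.sum()  # share of variance explained
    return z @ pc1, explained
```

In the study, the first (varimax-rotated) factor over 16 OpenSMILE features accounted for 55.87% of the variance; this sketch returns the analogous unrotated quantity.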
Nonverbal synchrony. Fig 1 illustrates how the raw data of each participant was trans-
formed to derive individual signals or measures. These individual signals or measures were then
used to calculate dyadic synchrony in facial expressions and prosodic features, speaking turn
inequality, and amount of overall communication. We computed synchrony in facial expres-
sions (coded as positive, negative, and other in each frame) and prosodic features between part-
ners for each dyad, using Dynamic Time Warping (DTW). DTW takes two signals and warps
them in a nonlinear manner to match them with each other and adjust to different speeds. It
then returns the distance between the warped signals. The lower this distance, the higher the
synchrony between members of the dyad. Hence, we reversed the signs of the DTW distance
measure to facilitate its interpretation as a measure of synchrony. We use DTW instead of other
distance metrics such as the Pearson correlation or simple Euclidean distance because DTW is
able to match similar behaviors of different duration that occur a few seconds apart, which better
captures the responsive, social nature of these expressions (see the comparison in Fig 2). For both
facial expressions and prosodic features, we calculated synchrony across the six tasks of the TCI.
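The DTW computation can be sketched with the standard dynamic-programming recurrence; this is a generic textbook implementation, not the authors’ code. Lower distance means higher synchrony, which is why the sign is reversed in the analysis:

```python
import math

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D signals.
    The time axes are warped nonlinearly so that similar shapes
    unfolding at different speeds, or a few steps apart, still match."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

For example, a signal and a slightly lagged copy of it have a DTW distance of zero, whereas their pointwise Euclidean distance is large, which is the property Fig 2 illustrates.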
Spoken communication. We computed two features of spoken communication: speaking
turn inequality and the amount of overall spoken communication in the dyad. In order to
compute features related to the number of speaking turns, we first identified speaking turns in
audio recordings of each dyad. All audio frames for which Covarep [63] returned a voicing
probability over .80 were considered to contain speech. We extracted turns using the following
process [64]. First, only one person can hold a turn at a given time. Each turn passes from per-
son A to person B if person A stops speaking before person B starts. If person B interrupts per-
son A, then the turn only passes from A to B if A stops speaking before B stops. If person A
pauses for longer than one second, A’s turn ends. When both participants are silent for greater
than one second, no one holds the turn. We heuristically chose the threshold of one second,
since the pauses between most words in English are less than one second [64]. To measure
speaking turn inequality, we computed the absolute difference between the total number of
turns of both partners in the dyad. To measure the amount of overall spoken communication,
we summed the total number of samples of speech (i.e., the amount of time each person spoke
with voicing probability >.80) of both partners in the dyad.
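The turn rules above can be sketched as follows, assuming speech has already been segmented into (speaker, start, end) intervals via the voicing-probability threshold; this is a simplified reading of the rules, not the authors’ implementation:

```python
def count_turns(segments, pause=1.0):
    """Count speaking turns per speaker from (speaker, start, end)
    speech intervals. Simplified rules: only one person holds the turn
    at a time; a same-speaker gap of at most `pause` seconds continues
    the current turn; another speaker takes the turn only if the holder
    has already stopped, or stops before the interrupter does."""
    turns = {}
    holder, holder_end = None, None
    for speaker, start, end in sorted(segments, key=lambda s: s[1]):
        if speaker == holder and start - holder_end <= pause:
            holder_end = max(holder_end, end)       # same turn continues
        elif speaker == holder:
            turns[speaker] += 1                     # long pause: new turn
            holder_end = end
        elif holder is None or start > holder_end or end > holder_end:
            holder, holder_end = speaker, end       # turn passes
            turns[speaker] = turns.get(speaker, 0) + 1
        # else: failed interruption; the holder keeps the turn
    return turns
```

Speaking turn inequality for a dyad is then the absolute difference of the two counts, and the amount of overall spoken communication is the total duration (or frame count) of speech across both partners.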
Social perceptiveness. At the beginning of the session, each participant completed the
Reading the Mind in the Eyes (RME) test to assess the participant’s social perceptiveness [65].
This characteristic gauges individuals’ ability to draw inferences about how others think or feel
based on subtle nonverbal cues. Previous research has shown that social perceptiveness
enhances interpersonal coordination [66] and collective intelligence [2,11]. The test consists
of 36 images of the eye region of individual faces. Participants were asked to choose among
possible mental states to describe what the person pictured was feeling or thinking. The
options were complex mental states (e.g., guilt) rather than simple emotions (e.g., anger). Indi-
vidual participants’ scores were averaged for each dyad. We controlled for social perceptiveness
in our analyses predicting CI, because it is a consistent predictor of collective intelligence in
prior work.
Demographics. We also collected demographic attributes such as race, age, education,
and gender for each participant. As our level of analysis was the dyad, we calculated race simi-
larity, age and education distance, and number of females in the dyad.
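As a minimal illustration, dyad-level demographic features of this kind can be derived from two participants’ attributes as follows; the field names and coding are hypothetical:

```python
def dyad_features(p1, p2):
    """Aggregate two participants' attributes (dicts) to the dyad level:
    race similarity (1 if same), absolute age and education distances,
    and the number of females in the dyad."""
    return {
        "race_similarity": int(p1["race"] == p2["race"]),
        "age_distance": abs(p1["age"] - p2["age"]),
        "education_distance": abs(p1["education"] - p2["education"]),
        "female_number": (p1["gender"] == "female")
                         + (p2["gender"] == "female"),
    }
```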
Results

Table 1 provides bi-variate correlations among study variables and descriptive statistics. We
first examined whether collective intelligence differs as a function of video availability.

Fig 2. Dynamic Time Warping (DTW) is a better measure of behavioral synchrony than Euclidean distance because it is able to match similar behaviors of different duration that occur a few seconds apart.

An
independent samples t-test comparing our two experimental conditions (no video vs. video) revealed that there was not a significant difference in the observed level of collective intelligence (M_video = -.07, SD_video = .64; M_no video = .08, SD_no video = .53; t(97) = -1.23, p = .22). Further, and surprisingly, the level of synchrony in facial expressions was also not significantly different between the two conditions; dyads with access to video did not synchronize facial expressions more than dyads without access to video (M_video = -7614.80, SD_video = 3472.92; M_no video = -7248.58, SD_no video = 3167.11; t(97) = -.55, p = .56). By contrast, the difference in prosodic synchrony between the two conditions was significant; prosodic synchrony was significantly higher in dyads without access to video (M_video = -.32, SD_video = 1.18; M_no video = .26, SD_no video = .72; t(97) = -2.95, p = .004).
Finally, partners’ numbers of speaking turns were significantly less equally distributed in dyads with video than in dyads with no video (speaking turn inequality: M_video = 26.31, SD_video = 22.96; M_no video = 9.14, SD_no video = 5.63; t(97) = 5.13, p < .001).
We further examined whether synchrony affects CI differently depending on the availability
of video. Though collective intelligence did not differ with access to video, nor did the level of
facial expression synchrony achieved, we found that synchrony in facial expressions positively predicted collective intelligence only in the video condition (see Fig 3; unstandardized coefficient for the conditional effect = .0001, t = 2.70, p = .01; bias-corrected bootstrap confidence interval between .0000 and .0001), suggesting that when video was available, facial expressions played more of a social role and partners jointly attended to them. Furthermore, social perceptiveness significantly predicted facial expression synchrony in the video condition (r = .31, p = .03), consistent with previous research [10], but not in the no-video condition (r = -.17, p = .25).
In addition, in the sample overall we found a main effect of prosodic synchrony on CI; controlling for covariates, prosodic synchrony significantly and positively predicted CI (b = .29, p = .003). We wondered why prosodic synchrony was higher in the no-video condition, so we explored other qualities of the dyads’ speaking patterns, particularly the distribution in
Table 1. Correlation matrix for study variables and descriptive statistics.

Variable                           1       2        3      4      5           6      7      8      9     10    11
1. Collective intelligence
2. Facial expression synchrony    .16
3. Prosodic synchrony             .29**   .02
4. Speaking turn inequality      -.13     .10     -.35**
5. Overall spoken communication  -.24*   -.05     -.10   -.11
6. Video condition               -.12    -.05     -.28**  .46** -.16
7. Social perceptiveness          .33**   .08      .02    .03    .02       -.04
8. Female number                  .15     .04      .07    .00   -.09        .00   .20*
9. Age distance                  -.15    -.04     -.04    .16   -.06        .36* -.18   -.12
10. Ethnic similarity            -.02    -.09      .00   -.02    .08        .05  -.22*  -.00  -.03
11. Education distance           -.18     .10     -.19    .05   -.08        .05  -.19   -.00   .25*  .09
Minimum                         -1.64  -27428    -3.26   0      214221      0    17.5   0     0     0     0
Maximum                          1.35   -1617     1.63  82    16575414      1    32.5   2    49     4     4
Mean                              .00   -7789.28  0     17.47  6765098.17   -    26.25  .98   5.64  .36  1.25
SD                                .58    4206.59  1     18.44  3520702.91   -     2.78  .83   7.59  .48  1.14

* p < .05; ** p < .01. N = 99 dyads.
speaking turns which, as discussed earlier, is an aspect of communication shown to be an
important predictor of CI in prior studies [2,11]. Speaking turn inequality negatively predicted prosodic synchrony, controlling for covariates (b = -.35, p = .001). Mediation analyses showed that speaking turn inequality mediated the relationship between video condition and prosodic synchrony (effect size = .26; bias-corrected bootstrap confidence interval between .05 and .44). To test the causal pathway from video access to speaking turn inequality to prosodic synchrony to collective intelligence, we formally tested a serial mediation model. The serial mediation was significant (effect size = .05; bias-corrected bootstrap confidence interval between -.09 and -.018; see Fig 4).
That is, video access leads to greater speaking turn inequality and, in turn, decreases the
dyad’s prosodic synchrony, which then decreases the dyad’s collective intelligence (see also
Table 2). Note here that an analysis of reverse causality, predicting the speaking turn inequality
from prosodic synchrony, was not supported as an alternative explanation.
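A bare-bones sketch of such a serial mediation test: the indirect effect is the product of the X→M1, M1→M2 (controlling X), and M2→Y (controlling X and M1) paths, with a percentile bootstrap for the confidence interval. This simplified version omits the covariates and the bias correction used in the reported analysis:

```python
import numpy as np

def serial_indirect(x, m1, m2, y, n_boot=2000, seed=0):
    """Percentile-bootstrap estimate of the serial indirect effect
    X -> M1 -> M2 -> Y, computed as the product of three OLS paths:
    a1 (X predicting M1), d21 (M1 predicting M2, controlling X), and
    b2 (M2 predicting Y, controlling X and M1)."""
    x, m1, m2, y = map(np.asarray, (x, m1, m2, y))
    n = len(x)
    rng = np.random.default_rng(seed)

    def indirect(idx):
        ones = np.ones(len(idx))
        a1 = np.linalg.lstsq(
            np.column_stack([ones, x[idx]]), m1[idx], rcond=None)[0][1]
        d21 = np.linalg.lstsq(
            np.column_stack([ones, x[idx], m1[idx]]), m2[idx],
            rcond=None)[0][2]
        b2 = np.linalg.lstsq(
            np.column_stack([ones, x[idx], m1[idx], m2[idx]]), y[idx],
            rcond=None)[0][3]
        return a1 * d21 * b2

    estimate = indirect(np.arange(n))               # full-sample estimate
    boots = [indirect(rng.integers(0, n, n)) for _ in range(n_boot)]
    lower, upper = np.percentile(boots, [2.5, 97.5])
    return estimate, (lower, upper)
```

The effect is deemed significant when the bootstrap confidence interval excludes zero, as in the result reported above.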
Fig 3. Interaction effects of facial expression synchrony and video access condition on collective intelligence.
Fig 4. Serial mediation analysis of the effect of video access on collective intelligence.
Discussion

We explored what role, if any, video access to partners plays in facilitating collaboration when
partners are not collocated. Though we found no direct effects of video access on collective
intelligence or facial expression synchrony, we did find that in the video condition, facial
expression synchrony predicts collective intelligence. This result suggests that when visual
cues are available it is important that interaction partners attend to them. Furthermore, when
video was available, social perceptiveness predicted facial synchrony, reinforcing the role this
individual characteristic plays in heightening attention to available cues. We also found that
prosodic synchrony improves collective intelligence in physically separated collaborators
whether or not they had access to video. An important precursor to prosodic synchrony is the
equality in speaking turns that emerges among collaborators, which enhances prosodic syn-
chrony and, in turn, collective intelligence. Surprisingly, our findings suggest that video access
may, in fact, impede the development of prosodic synchrony by creating greater speaking turn
inequality, countering some prevailing assumptions about the importance of richer media to
facilitate distributed collaboration.
Our findings build on existing research demonstrating that synchrony improves coordina-
tion [30,33] by showing that it also improves cognitive aspects of a group, such as joint
Table 2. Summary of regression analyses for serial mediation.

Dependent variable: Speaking turn inequality   coefficient    se       t      p    95% CI [lower, upper]
Constant                                          -.88        .91     -.97   .33   [-2.69, .92]
Social perceptiveness                              .02        .03      .59   .55   [-.04, .08]
Female number                                     -.01        .11     -.14   .88   [-.24, .20]
Overall spoken communication                      -.03        .09     -.38   .69   [-.22, .15]
Video condition                                    .92        .18     4.95   .00   [.55, 1.29]
R² = .21, F(4, 94) = 6.53, p = .001

Dependent variable: Prosodic synchrony         coefficient    se       t      p    95% CI [lower, upper]
Constant                                          -.79        .94     -.83   .40   [-2.67, 1.08]
Social perceptiveness                              .00        .03      .16   .87   [-.06, .07]
Female number                                      .06        .11      .54   .58   [-.16, .29]
Overall spoken communication                      -.16        .09    -1.67   .09   [-.35, .03]
Video condition                                   -.36        .21    -1.70   .09   [-.79, .06]
Speaking turn inequality                          -.28        .10    -2.63   .00   [-.49, -.07]
R² = .17, F(5, 93) = 3.85, p = .003

Dependent variable: Collective intelligence    coefficient    se       t      p    95% CI [lower, upper]
Constant                                         -1.90        .52    -3.63   .00   [-2.95, -.86]
Social perceptiveness                              .06        .01     3.51   .00   [.02, .10]
Female number                                      .02        .06      .45   .64   [-.09, .15]
Overall spoken communication                      -.14        .05    -2.58   .01   [-.25, -.03]
Video condition                                   -.06        .12     -.54   .58   [-.30, .17]
Speaking turn inequality                          -.03        .06     -.63   .52   [-.16, .08]
Prosodic synchrony                                 .12        .05     2.23   .02   [.01, .24]
R² = .25, F(6, 92) = 5.23, p = .001

Note. N = 99 dyads; video condition coded as 1, no-video condition coded as 0.
problem-solving and collective intelligence in distributed collaboration. Much of the previous
research on synchrony has been conducted in face-to-face settings. We offer evidence that
nonverbal synchrony can occur and is important to the level of collective intelligence in dis-
tributed collaboration. Furthermore, we demonstrate different pathways through which differ-
ent types of cues can affect nonverbal synchrony and, in turn, collective intelligence. For
example, prosodic synchrony and speaking turn equality seem to be important means for regulating collaboration. Speaking turns are a key communication mechanism in social interaction, regulating the pace at which communication proceeds, and are governed by a set of interaction rules such as yielding, requesting, or maintaining turns [18]. These rules are often subtly communicated through nonverbal cues such as eye contact and vocal cues (e.g., back channels, changes in volume and speaking rate) [18]. However, our findings suggest that visual nonverbal cues may also enable some interacting partners to dominate the conversation. By contrast, we show that when interacting partners have audio cues only, the lack of video does not
hinder them from communicating these rules but instead helps them to regulate their conver-
sation more smoothly by engaging in more equal exchange of turns and by establishing
improved prosodic synchrony. Previous research has focused largely on synchrony regulated
by visual cues, such as studies showing that synchrony in facial expressions improves cohesion
in collocated teams [30]. Our study underscores the importance of audio cues, which appear
to be compromised by video access.
Our findings offer several avenues for future research on nonverbal synchrony and human
collaboration. For instance, how can we enhance prosodic synchrony? Some research has
examined the role of interventions to enhance speaking turn equality for decision making
effectiveness [67]. Could regulating conversational behavior increase prosodic synchrony?
Furthermore, does nonverbal synchrony affect collective intelligence similarly in larger
groups? For example, as group size increases, a handful of team members tend to dominate the
conversation [68] with implications for spoken communication, nonverbal synchrony, and
ultimately collective intelligence. Our results also underscore the importance of using behav-
ioral measures to index the quality of collaboration to augment the dominant focus on self-
report measures of attitudes and processes in the social sciences, because collaborators may
not always report better collaborations despite exhibiting increased synchrony and collective
intelligence [2,10]. Our study has limitations, which offer opportunities for future research.
For example, our findings were observed in newly formed and non-recurring dyads in the lab-
oratory. It remains to be seen whether our findings will generalize to teams that are ongoing or
in which there is greater familiarity among members, as in the case of distributed teams in
organizations. We encourage future research to test these findings in the field within organiza-
tional teams.
Overall, our findings enhance our understanding of the nonverbal cues that people rely on
when collaborating with a distant partner via different communication media. As distributed
collaboration increases as a form of work (e.g., virtual teams, crowdsourcing), this study sug-
gests that collective intelligence will be a function of subtle cues and available modalities.
Extrapolating from our results, one can argue that limited access to video may promote better
communication and social interaction during collaborative problem solving, as there are fewer
stimuli to distract collaborators. Consequently, we may achieve better problem solving if new
technologies offer fewer distractions and fewer visual stimuli.
Supporting information
S1 Appendix. t-test results comparing cases with valid and missing data.
Acknowledgments
We thank research assistants Thomas Rasmussen, Brian Hall, and Mikahla Vicino for their
help with data collection. We are also grateful to Ella Glickson and Rosalind Chow for
providing valuable feedback on earlier versions of this manuscript.
Author Contributions
Conceptualization: Maria Tomprou, Young Ji Kim, Prerna Chikersal, Anita Williams Wool-
ley, Laura A. Dabbish.
Data curation: Prerna Chikersal.
Formal analysis: Maria Tomprou, Young Ji Kim.
Funding acquisition: Anita Williams Woolley, Laura A. Dabbish.
Investigation: Maria Tomprou, Prerna Chikersal.
Methodology: Maria Tomprou, Prerna Chikersal.
Project administration: Laura A. Dabbish.
Resources: Anita Williams Woolley, Laura A. Dabbish.
Software: Prerna Chikersal, Laura A. Dabbish.
Supervision: Anita Williams Woolley, Laura A. Dabbish.
Writing – original draft: Maria Tomprou, Young Ji Kim, Prerna Chikersal, Anita Williams Woolley.
Writing – review & editing: Maria Tomprou, Young Ji Kim, Prerna Chikersal, Anita Williams Woolley, Laura A. Dabbish.
References
1. Bear A, Rand DG. Intuition, deliberation, and the evolution of cooperation. Proceedings of the National Academy of Sciences. 2016; 113(4):936–941.
2. Woolley AW, Chabris CF, Pentland A, Hashmi N, Malone TW. Evidence for a collective intelligence factor in the performance of human groups. Science. 2010; 330(6004):686–688. https://doi.org/10.1126/science.1193147 PMID: 20929725
3. Bernstein E, Shore J, Lazer D. How intermittent breaks in interaction improve collective intelligence. Proceedings of the National Academy of Sciences. 2018; p. 8734–8739. https://doi.org/10.1073/pnas.1802407115 PMID: 30104371
4. Bonabeau E, Dorigo M, Theraulaz G. Inspiration for optimization from social insect behaviour. Nature. 2000; 406(6791):39–42. PMID: 10894532
5. Hong L, Page SE. Groups of diverse problem solvers can outperform groups of high-ability problem solvers. Proceedings of the National Academy of Sciences. 2004; 101(46):16385–16389. https://doi.org/10.1073/pnas.0403723101 PMID: 15534225
6. Kittur A, Kraut RE. Harnessing the wisdom of crowds in Wikipedia: quality through coordination. In: Proceedings of the 2008 ACM conference on Computer supported cooperative work. ACM; 2008. p. 37–
7. Dirks KT. The effects of interpersonal trust on work group performance. Journal of Applied Psychology. 1999; 84(3):445. PMID: 10380424
8. Lindskold S. Trust development, the GRIT proposal, and the effects of conciliatory acts on conflict and cooperation. Psychological Bulletin. 1978; 85(4):772.
9. Carney DR, Harrigan JA. It takes one to know one: Interpersonal sensitivity is related to accurate assessments of others’ interpersonal sensitivity. Emotion. 2003; 3(2):194–200. https://doi.org/10.1037/1528-3542.3.2.194 PMID: 12899418
10. Chikersal P, Tomprou M, Kim YJ, Woolley AW, Dabbish L. Deep Structures of Collaboration: Physiolog-
ical Correlates of Collective Intelligence and Group Satisfaction. In: Proceedings of the 2017 ACM con-
ference on Computer supported cooperative work; 2017. p. 873–888.
11. Engel D, Woolley AW, Jing LX, Chabris CF, Malone TW. Reading the mind in the eyes or reading between the lines? Theory of mind predicts collective intelligence equally well online and face-to-face. PLoS ONE. 2014; 9(12):e115212. PMID: 25514387
12. Aggarwal I, Woolley AW, Chabris CF, Malone TW. The impact of cognitive style diversity on implicit
learning in teams. Frontiers in Psychology. 2019; 10:112.
PMID: 30792672
13. Akinola M, Page-Gould E, Mehta PH, Lu JG. Collective hormonal profiles predict group performance. Proceedings of the National Academy of Sciences. 2016; 113(35):9774–9779. https://doi.org/10.1073/pnas.1603443113 PMID: 27528679
14. Berdahl A, Torney CJ, Ioannou CC, Faria JJ, Couzin ID. Emergent sensing of complex environments by
mobile animal groups. Science. 2013; 339(6119):574–576.
PMID: 23372013
15. Gordon DM. Collective wisdom of ants. Scientific American. 2016; 314(2):44–47. https://doi.org/10.1038/scientificamerican0216-44 PMID: 26930827
16. Guerrero LK, DeVito JA, Hecht ML. The nonverbal communication reader: Classic and contemporary readings. Waveland Press, Prospect Heights, IL; 1999.
17. Duncan S. Some signals and rules for taking speaking turns in conversations. Journal of Personality
and Social Psychology. 1972; 23(2):283–292.
18. Knapp ML, Hall JA, Horgan TG. Nonverbal communication in human interaction. Cengage Learning;
19. Bernieri FJ, Davis JM, Rosenthal R, Knee CR. Interactional synchrony and rapport: Measuring syn-
chrony in displays devoid of sound and facial affect. Personality and Social Psychology Bulletin. 1994;
20. Vacharkulksemsuk T, Fredrickson BL. Strangers in sync: Achieving embodied rapport through shared movements. Journal of Experimental Social Psychology. 2012; 48(1):399–402. https://doi.org/10.1016/j.jesp.2011.07.015 PMID: 22389521
21. Miles LK, Griffiths JL, Richardson MJ, Macrae CN. Too late to coordinate: Contextual influences on
behavioral synchrony. European Journal of Social Psychology. 2010; 40(1):52–60.
22. Konvalinka I, Xygalatas D, Bulbulia J, Schjødt U, Jegindø EM, Wallot S, et al. Synchronized arousal between performers and related spectators in a fire-walking ritual. Proceedings of the National Academy of Sciences. 2011; 108(20):8514–8519.
23. Wiltermuth SS, Heath C. Synchrony and cooperation. Psychological Science. 2009; 20(1):1–5. PMID: 19152536
24. Lakens D. Movement synchrony and perceived entitativity. Journal of Experimental Social Psychology.
2010; 46(5):701–708.
25. Valdesolo P, Ouyang J, DeSteno D. The rhythm of joint action: Synchrony promotes cooperative ability.
Journal of Experimental Social Psychology. 2010; 46(4):693–695.
26. Oullier O, De Guzman GC, Jantzen KJ, Lagarde J, Scott Kelso JA. Social coordination dynamics: Measuring human bonding. Social Neuroscience. 2008; 3(2):178–192. https://doi.org/10.1080/17470910701563392 PMID: 18552971
27. Kirschner S, Tomasello M. Joint music making promotes prosocial behavior in 4-year-old children. Evo-
lution and Human Behavior. 2010; 31(5):354–364.
28. Baimel A, Birch SA, Norenzayan A. Coordinating bodies and minds: Behavioral synchrony fosters men-
talizing. Journal of Experimental Social Psychology. 2018; 74:281–290.
29. Vicaria IM, Dickens L. Meta-analyses of the intra-and interpersonal outcomes of interpersonal
coordination. Journal of Nonverbal Behavior. 2016; 40(4):335–361.
30. Mønster D, Håkonsson DD, Eskildsen JK, Wallot S. Physiological evidence of interpersonal dynamics in a cooperative production task. Physiology & Behavior. 2016; 156:24–34.
31. Coulston R, Oviatt S, Darves C. Amplitude convergence in children’s conversational speech with ani-
mated personas. In: Seventh International Conference on Spoken Language Processing; 2002.
32. Lubold N, Pon-Barry H. A comparison of acoustic-prosodic entrainment in face-to-face and remote col-
laborative learning dialogues. In: Spoken Language Technology Workshop (SLT), 2014 IEEE. IEEE;
2014. p. 288–293.
33. Lubold N, Pon-Barry H. Acoustic-prosodic entrainment and rapport in collaborative learning dialogues.
In: Proceedings of the 2014 ACM workshop on Multimodal Learning Analytics Workshop and Grand
Challenge. ACM; 2014. p. 5–12.
34. Julien D, Brault M, Chartrand É, Bégin J. Immediacy behaviours and synchrony in satisfied and dissatisfied couples. Canadian Journal of Behavioural Science/Revue canadienne des sciences du comportement. 2000; 32(2):84.
35. Lumsden J, Miles LK, Richardson MJ, Smith CA, Macrae CN. Who syncs? Social motives and interper-
sonal coordination. Journal of Experimental Social Psychology. 2012; 48(3):746–751.
36. Krych-Appelbaum M, Law JB, Jones D, Barnacz A, Johnson A, Keenan JP. I think I know what you mean: The role of theory of mind in collaborative communication. Interaction Studies. 2007; 8(2):267–
37. Curhan JR, Pentland A. Thin slices of negotiation: Predicting outcomes from conversational dynamics within the first 5 minutes. Journal of Applied Psychology. 2007; 92(3):802–811. https://doi.org/10.1037/0021-9010.92.3.802 PMID: 17484559
38. Riedl C, Woolley AW. Teams vs. crowds: A field test of the relative contribution of incentives, member
ability, and emergent collaboration to crowd-based problem solving performance. Academy of Manage-
ment Discoveries. 2017; 3(4):382–403.
39. Wiemann JM, Knapp ML. Turn-taking in conversations. Journal of Communication. 1975; 25(2):75–92.
40. Levinson SC. Turn-taking in human communication – origins and implications for language processing. Trends in Cognitive Sciences. 2016; 20(1):6–14.
41. Van Baaren RB, Holland RW, Kawakami K, Van Knippenberg A. Mimicry and prosocial behavior. Psychological Science. 2004; 15(1):71–74.
42. Valentine MA, Retelny D, To A, Rahmati N, Doshi T, Bernstein MS. Flash organizations: Crowdsourcing
complex work by structuring crowds as organizations. In: Proceedings of the 2017 CHI Conference on
Human Factors in Computing Systems. ACM; 2017. p. 3523–3537.
43. Lodato TJ, DiSalvo C. Issue-oriented hackathons as material participation. New Media & Society. 2016;
44. O’Mahony S, Barley SR. Do digital telecommunications affect work and organization? The state of our knowledge. Research in Organizational Behavior. 1999; 21:125–161.
45. Johnson DW, Johnson RT. Cooperation and the use of technology. Handbook of research for educa-
tional communications and technology: A project of the Association for Educational Communications
and Technology. 1996; p. 1017–1044.
46. Culnan MJ, Markus ML. Information technologies. Sage Publications, Inc; 1987.
47. Daft RL, Lengel RH. Organizational information requirements, media richness and structural design.
Management Science. 1986; 32(5):554–571.
48. Short J, Williams E, Christie B. The social psychology of telecommunications. John Wiley and Sons
Ltd; 1976.
49. Marlow SL, Lacerenza C, Salas E. Communication in virtual teams: A conceptual framework and
research agenda. Human Resource Management Review. 2017; 27(4):575–589.
50. Schulze J, Krumm S. The virtual team player: A review and initial model of knowledge, skills, abilities,
and other characteristics for virtual collaboration. Organizational Psychology Review. 2017; 7(1):66–95.
51. Forbes Insights Team. Optimizing Team Performance: How and Why Video Conferencing Trumps Audio. Forbes Insights; 2017.
52. Ramirez A Jr, Walther JB, Burgoon JK, Sunnafrank M. Information-seeking strategies, uncertainty, and
computer-mediated communication: Toward a conceptual model. Human Communication Research.
2002; 28(2):213–228.
53. Walther JB. Interpersonal effects in computer-mediated interaction: A relational perspective. Communication Research. 1992; 19(1):52–90.
54. Walther JB. Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interac-
tion. Communication Research. 1996; 23(1):3–43.
55. Walther JB, Burgoon JK. Relational communication in computer-mediated interaction. Human Commu-
nication Research. 1992; 19(1):50–88.
56. Burgoon JK, Bonito JA, Ramirez A Jr, Dunbar NE, Kam K, Fischer J. Testing the interactivity principle:
Effects of mediation, propinquity, and verbal and nonverbal modalities in interpersonal interaction. Jour-
nal of Communication. 2002; 52(3):657–677.
57. Chillcoat Y, DeWine S. Teleconferencing and interpersonal communication perception. Journal of
Applied Communication Research. 1985; 13(1):14–32.
58. Engel D, Woolley AW, Aggarwal I, Chabris CF, Takahashi M, Nemoto K, et al. Collective intelligence in
computer-mediated collaboration emerges in different contexts and cultures. In: Proceedings of the
33rd annual ACM conference on human factors in computing systems. ACM; 2015. p. 3769–3778.
59. Amos B, Ludwiczuk B, Satyanarayanan M, et al. OpenFace: A general-purpose face recognition library with mobile applications. CMU School of Computer Science. 2016.
60. Eyben F, Weninger F, Gross F, Schuller B. Recent developments in openSMILE, the Munich open-source multimedia feature extractor. In: Proceedings of the 21st ACM international conference on Multimedia. ACM; 2013. p. 835–838.
61. Levitan R, Gravano A, Willson L, Benus S, Hirschberg J, Nenkova A. Acoustic-prosodic entrainment
and social behavior. In: Proceedings of the 2012 Conference of the North American Chapter of the
Association for Computational Linguistics: Human language technologies. Association for Computa-
tional Linguistics; 2012. p. 11–19.
62. Apple W, Streeter LA, Krauss RM. Effects of pitch and speech rate on personal attributions. Journal of
Personality and Social Psychology. 1979; 37(5):715.
63. Degottex G, Kane J, Drugman T, Raitio T, Scherer S. COVAREP – A collaborative voice analysis repository for speech technologies. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2014. p. 960–964.
64. Pedott PR, Bacchin LB, Cáceres-Assenço AM, Befi-Lopes DM. Does the duration of silent pauses differ between words of open and closed class? Audiology-Communication Research. 2014; 19(2):153–157.
65. Baron-Cohen S, Wheelwright S, Hill J, Raste Y, Plumb I. The “Reading the Mind in the Eyes” test revised version: A study with normal adults, and adults with Asperger syndrome or high-functioning autism. Journal of Child Psychology and Psychiatry. 2001; 42(2):241–251. https://doi.org/10.1111/1469-7610.00715 PMID: 11280420
66. Curry O, Chesters MJ. Putting Ourselves in the Other Fellow’s Shoes: The Role of Theory of Mind in Solving Coordination Problems. Journal of Cognition and Culture. 2012; 12(1-2):147–159.
67. DiMicco JM, Hollenbach KJ, Bender W. Using visualizations to review a group’s interaction dynamics.
In: CHI’06 extended abstracts on Human factors in computing systems. ACM; 2006. p. 706–711.
68. Shaw ME. Group dynamics: The psychology of small group behavior. McGraw Hill; 1971.
... While studies drawing on communication accommodation or entrainment have used a variety of research paradigms, disciplines, and communication media (e.g., face to face, phone, online), most research in sociolinguistic settings involves dyadic interactions or what we would consider individuallevel entrainment-an individual's entrainment toward others (e.g., immigrants and their speech patterns; Giles et al., 1991). Entrainment has been examined on a variety of dimensions including nonverbal (e.g., smiles, gaze), linguistic and lexical (e.g., specific words), and acoustic-prosodic signals such as loudness (e.g., Anzalone et al., 2015;Chartrand & Lakin, 2013;Giles et al., 1991;Lubold et al., 2018;Tomprou et al., 2021;Van Swol & Kane, 2019). Moment-by-moment entrainment between infants and mothers on a variety of nonverbal aspects is considered important for secure attachment, and entrainment between adults can promote perceptions of being a social unit and build rapport (Delaherche et al., 2012). ...
... While suggestions for practice are premature from our case study findings, we provide some potential practical insights for small group researchers in their study of entrainment. Most literature on entrainment has been dominated by examinations of lexical measures (see Van Swol & Kane, 2019 for a review), and studies of acoustic-prosodic entrainment have largely focused on dyadic entrainment or on individual tendencies to entrain (e.g., Gessinger et al., 2021;Lubold & Pon-Barry, 2014;Tomprou et al., 2021). Our study suggests the utility of expanding beyond lexical measures and dyads to acoustic-prosodic measures and teams. ...
This study introduces the concept of acoustic-prosodic entrainment (ways people speak similarly). We review prior research on entrainment theory and methods from computational linguistics, and then apply this concept to team research by examining the relationship between team personality composition and subsequent entrainment in an exploratory case study. With 62 teams playing a cooperative board game, team average Agreeableness and team Agreeableness diversity positively, and Openness to Experience diversity negatively, preceded different kinds of entrainment. This study suggests entrainment is not a singular construct. Small group researchers could leverage technological, methodological, and conceptual advances in computational linguistics to study emergent team processes.
... With the advances in communication technology has come wider availability of videoconferencing, which many view as providing better opportunities for collaboration and knowledge sharing by providing near face-to-face experiences. However, research has demonstrated that the impact of richer communication channels is not always beneficial (Eisenberg et al., 2021;Glikson et al., 2019;Tomprou et al., 2021). For instance, recent research has shown that the use of rich media, such as videoconferencing, can reduce vocal synchrony among collaborators which leads to a decrease in collective intelligence (Tomprou et al., 2021). ...
... However, research has demonstrated that the impact of richer communication channels is not always beneficial (Eisenberg et al., 2021;Glikson et al., 2019;Tomprou et al., 2021). For instance, recent research has shown that the use of rich media, such as videoconferencing, can reduce vocal synchrony among collaborators which leads to a decrease in collective intelligence (Tomprou et al., 2021). ...
The internet has enabled an increasing amount of collaboration to occur via virtual teamwork, including more complex forms where individuals are working on multiple teams simultaneously. We argue that the environmental complexity teams face requires they be designed for collective intelligence, a capability enabling groups to accomplish goals across a wide range of environments. We describe the transactive systems model of collective intelligence, which articulates how individual memory, attention, and reasoning give rise to the emergence and mutual adaptation of the transactive memory, attention, and reasoning processes underlying collective intelligence. Furthermore, as artificial intelligence develops more capabilities to facilitate human interaction, we see how it might augment human cognition in ways that will enhance collective intelligence. Developing trust in AI will be essential for enabling higher levels of collective intelligence with tremendous benefits for organizations and society.
... Individual differences are large, but the absence of a video connection makes it easier not to participate and more difficult for others to observe the emotional stance or attentiveness. On the other hand, there is some laboratory evidence (Tomprou et al., 2021) that audio-only communication can enhance pairs' equality of turns and, given that, help reach higher collective intelligence, as measured in a computer-mediated test. However, solving real-life collaborative tasks requires complex meaning-making in which the social space is starkly different to that of a laboratory setting or assessment by a test. ...
Full-text available
Building on social constructivist theory, this case study analyzed how pre-service secondary teachers co-constructed knowledge and expressed socioemotional interaction in online breakout rooms during a collaborative task. Video data was analyzed by content and interaction analysis. There was more higher-level knowledge construction than in most studies from asynchronous settings. Active listening and humor were thoroughly present. Talk about personal experiences occurred at both lower and higher levels of thinking. The teacher educator’s visits to the breakout rooms and purposeful dissonance affected knowledge co-construction and socioemotional interaction. The findings will help in designing high-quality online and blended teacher education.
... For the feature extraction, there are existing packages, for example, COVAREP [28] which is written in MATLAB and can extract 12 sets of MFCC features, Librosa [6] which is written in Python and can extract MFCC, Mel-Spectrogram, Chroma features, etc., and the last one Open-Smile, which is implemented by [29] using C + + , can extract signal energy, loudness, Mel-spectra, MFCC, PLP-CC features, etc. These feature extraction packages have been utilized by the authors in [30][31][32][33][34][35][36][37] for their SER studies. [9,38] noted that feature engineering is important in speech emotion recognition, and [39] cited low accuracy in model prediction due to features not being extracted adequately. ...
Full-text available
Speech emotion recognition (SER), which has gained greater attention in recent years, is a key aspect of the human–computer interaction process. However, a wide range of strategies has been offered in SER, and these approaches have yet to increase performance. In this study, a deep neural network model for classifying voice emotions is suggested. It is divided into three stages: feature extraction, normalization, and emotion recognition. The Librosa Python Toolkit is used to acquire the MFCC, Mel-Spectrogram Frequency, Chroma, and Poly Features during feature extraction. Data augmentation for the minority class using SMOTE (synthetic minority oversampling technique) and the Min–Max scaler for the normalization process were used. The model was evaluated on three frequently used languages: German, English, and French, using the Berlin Emotional Speech Database (EMODB), Surrey Audio-Visual Expressed Emotion Dataset (SAVEE), and the Canadian French Emotional (CaFE) speech datasets. The recognition rates of unweighted accuracy of 95% on EMODB, 90% on SAVEE, and 92% on CaFE are gained in speaker-dependent experiments. The results show that the suggested method is capable of efficiently recognizing emotions and outperformed the other approaches utilized for comparison in terms of performance indicators.
... Moreover, this kind of research could add scientific guidance to current theoretical work about what are the best practices in the use of various online communication platforms in the context of affective regulation. For instance, work by Tomprou et al. (2021) has highlighted that in the absence of visual cues in virtual environments people are better able to synchronize vocal cues and turn-taking, and consequently do better in tests of collective intelligence. Could similar studies be designed to test for effects on emotional intelligence or interpersonal affectivity? ...
Full-text available
Recent theorizing argues that online communication technologies provide powerful, although precarious, means of emotional regulation. We develop this understanding further. Drawing on subjective reports collected during periods of imposed social restrictions under COVID-19, we focus on how this precarity is a source of emotional dysregulation. We make our case by organizing responses into five distinct but intersecting dimensions wherein the precarity of this regulation is most relevant: infrastructure, functional use, mindful design (individual and social), and digital tact. Analyzing these reports, along with examples of mediating technologies (i.e., self-view) and common interactive dynamics (e.g., gaze coordination), we tease out how breakdowns along these dimensions are sources of affective dysregulation. We argue that the adequacy of available technological resources and competencies of various kinds matter greatly to the types of emotional experiences one is likely to have online. Further research into online communication technologies as modulators of both our individual and collective well-being is urgently needed, especially as the echoes of the digital push that COVID-19 initiated are set to continue reverberating into the future.
Full-text available
The COVID-19 pandemic led to social restrictions that often prevented us from hugging the ones we love. This absence helped some realize just how important these interactions are to our sense of care and connection. Many turned to digitally mediated social interactions to address these absences, but often unsatisfactorily. Some theorists might blame this on the disembodied character of our digital spaces, e.g., that interpersonal touch is excluded from our lives online. However, others continued to find care and connection in their digitally mediated interactions despite not being able to touch. Inspired by such contrasting cases, we ask if ‘digital hugs’ can work? We use the Mixed Reality Interaction Matrix to examine hugging as a social practice. This leads us to several claims about the nature of our embodied social interactions and their digital mediation: (1) all social interaction is mediated; (2) all virtual experiences are embodied; (3) technology has become richer and more supportive of embodiment; and (4) expertise plays a role. These claims help make the case that quality social connections online are substantially dependent upon the dynamic skilful resourcing of multiple mediating components, what we term digital tact . By introducing and developing this concept, we hope to contribute to a better understanding of our digital embodied sociality and the possibilities for caring connections online.
The widespread adoption of video-conferencing has not only transformed communication at scale, but also increased feelings of Zoom fatigue among workers around the world. Although Zoom fatigue is well-documented, it is still unclear what aspects of video-conferencing contribute to this sense of exhaustion. This paper leveraged theory on computer-mediated communication (CMC) to investigate the causes of Zoom fatigue in an online convenience sample of 9787 participants. We provide empirical evidence that Zoom fatigue is influenced by the dynamics of individuals' video-conferencing usage and their psychological experience of the meeting. Specifically, our results support Bailenson's theory of nonverbal overload (2021) that video-conferences are exhausting because maintaining the nonverbal communication cues required in video-based calls (e.g., making eye contact with many people at once) can be draining. We found that people who used video-conferencing more frequently, for longer, and with fewer breaks reported more Zoom fatigue. However, people also experienced more Zoom fatigue when they experienced (1) mirror anxiety from seeing their self-image, (2) hyper-gaze from feeling watched by many faces, (3) feeling physically trapped, and challenges in (4) effort in producing nonverbal cues, and (5) effort in monitoring others' nonverbal cues, even when controlling for differences in usage dynamics. Relative to men, women also reported greater Zoom fatigue after video-conferencing because they experienced the above nonverbal mechanisms to a greater extent. This work advances theory on CMC by reflecting on how video-conferencing can recreate and reconfigure nonverbal cues present in face-to-face communication. We discuss practical strategies to combat Zoom fatigue to improve digital well-being.
Full-text available
Organizations are increasingly looking for ways to reap the benefits of cognitive diversity for problem solving. A major unanswered question concerns the implications of cognitive diversity for longer-term outcomes such as team learning, with its broader effects on organizational learning and productivity. We study how cognitive style diversity in teams—or diversity in the way that team members encode, organize and process information—indirectly influences team learning through collective intelligence, or the general ability of a team to work together across a wide array of tasks. Synthesizing several perspectives, we predict and find that cognitive style diversity has a curvilinear—inverted U-shaped—relationship with collective intelligence. Collective intelligence is further positively related to the rate at which teams learn, and is a mechanism guiding the indirect relationship between cognitive style diversity and team learning. We test the predictions in 98 teams using ten rounds of the minimum-effort tacit coordination game. Overall, this research advances our understanding of the implications of cognitive diversity for organizations and why some teams demonstrate high levels of team learning in dynamic situations while others do not.
Full-text available
Significance Many human endeavors—from teams and organizations to crowds and democracies—rely on solving problems collectively. Prior research has shown that when people interact and influence each other while solving complex problems, the average problem-solving performance of the group increases, but the best solution of the group actually decreases in quality. We find that when such influence is intermittent it improves the average while maintaining a high maximum performance. We also show that storing solutions for quick recall is similar to constant social influence. Instead of supporting more transparency, the results imply that technologies and organizations should be redesigned to intermittently isolate people from each other’s work for best collective performance in solving complex problems.
Full-text available
Collective intelligence (CI), a group's capacity to perform a wide variety of tasks, is a key factor in successful collaboration. Group composition, particularly diversity and member social perceptiveness, are consistent predictors of CI, but we have limited knowledge about the mechanisms underlying their effects. To address this gap, we examine how physiological synchrony, as an indicator of coordination and rapport, relates to CI in computer-mediated teams, and if synchrony might serve as a mechanism explaining the effect of group composition on CI. We present results from a laboratory experiment where 60 dyads completed the Test of Collective Intelligence (TCI) together online and rated their group satisfaction, while wearing physiological sensors. We find that synchrony in facial expressions (indicative of shared experience) was associated with CI and synchrony in electrodermal activity (indicative of shared arousal) with group satisfaction. Furthermore, various forms of synchrony mediated the effect of member diversity and social perceptiveness on CI and group satisfaction. Our results have important implications for online collaborations and distributed teams.
Significance: Past research has focused primarily on demographic and psychological characteristics of group members without taking into consideration the biological make-up of groups. Here we introduce a different construct—a group's collective hormonal profile—and find that a group's biological profile predicts its standing across groups and that the particular profile supports a dual-hormone hypothesis. Groups with a collective hormonal profile characterized by high testosterone and low cortisol exhibit the highest performance. The current work provides a neurobiological perspective on factors determining group behavior and performance that are ripe for further exploration.
Behavioral synchrony, physically keeping together in time with others, is a widespread feature of human cultural practices. Emerging evidence suggests that the physical coordination involved in synchronizing one's behavior with another engages the cognitive systems involved in reasoning about others' mental states (i.e., mentalizing). In three experiments (N = 959), we demonstrate that physically moving in synchrony with others fosters some features of mentalizing – a core feature of human social cognition. In small groups, participants moved synchronously or asynchronously with others in a musical performance task. In Experiment 1, we found that synchrony, as compared to asynchrony, increased self-reported tendencies and abilities for considering others' mental states. In Experiment 2, we replicated this finding, but found that this effect did not extend to accuracy in mental state recognition. In Experiment 3, we tested synchrony's effects on diverse mentalizing measures and compared performance to both asynchrony and a no-movement control condition. Results indicated that synchrony decreased mental state attribution to socially non-relevant targets, and increased mental state attribution to specifically those with whom participants had synchronized. These results provide novel evidence for how synchrony, a common feature of cultural practices and day-to-day interpersonal coordination, shapes our sociality by engaging mentalizing capacities.
This paper introduces flash organizations: crowds structured like organizations to achieve complex and open-ended goals. Microtask workflows, the dominant crowdsourcing structures today, only enable goals that are so simple and modular that their path can be entirely pre-defined. We present a system that organizes crowd workers into computationally-represented structures inspired by those used in organizations - roles, teams, and hierarchies - which support emergent and adaptive coordination toward open-ended goals. Our system introduces two technical contributions: 1) encoding the crowd's division of labor into de-individualized roles, much as movie crews or disaster response teams use roles to support coordination between on-demand workers who have not worked together before; and 2) reconfiguring these structures through a model inspired by version control, enabling continuous adaptation of the work and the division of labor. We report a deployment in which flash organizations successfully carried out open-ended and complex goals previously out of reach for crowdsourcing, including product design, software development, and game production. This research demonstrates digitally networked organizations that flexibly assemble and reassemble themselves from a globally distributed online workforce to accomplish complex work.
As virtual teams are becoming more frequently implemented within organizations, research examining the effect of virtual tool use on team functioning has correspondingly expanded. One primary focus of this literature is the impact of virtuality on team communication. However, findings remain mixed. Specifically, the impact of virtuality on the mechanisms linking communication and performance, as well as the simultaneous moderating effect of contextual factors on this relationship, remains to be fully examined. One reason for this lack of clarity stems from ambiguity regarding the elements that constitute communication. To address this gap, this paper delineates which aspects of communication are most influential and should, consequently, be the primary focus of future research efforts. An overarching framework of the communication process with accompanying research propositions is also described to inform future research and the practice of virtual teams.
Organizations are increasingly turning to crowdsourcing to solve difficult problems. This is often driven by the desire to find the best subject matter experts, strongly incentivize them, and engage them, with as little coordination cost as possible, to pool their knowledge. A growing number of authors, however, are calling for increased collaboration in crowdsourcing settings, hoping to draw upon the advantages of teamwork observed in traditional settings. The question is how to effectively incorporate team-based collaboration in a setting that has traditionally been individual-based. We report on a large field experiment of team collaboration on an online platform, in which incentives and team membership were randomly assigned, to evaluate the influence of exogenous inputs (member skills and incentives) and emergent collaboration processes on performance of crowd-based teams. Building on advances in machine learning and complex systems, we leverage new measurement techniques to examine the content and timing of team collaboration. We find that temporal "burstiness" of team activity and the diversity of information exchanged among team members are strong predictors of performance, even when inputs such as incentives and member skills are controlled. We discuss implications for research on crowdsourcing and team collaboration.
In spite of the increasing demand for virtual cooperation, still relatively little is known about the knowledge, skills, abilities, and other characteristics (KSAOs) individuals need for virtual teamwork. Thus, the current paper aims at synthesizing the existing literature into a comprehensive model of virtual teamwork KSAOs. To this end, we review (a) existing frameworks of KSAO requirements for virtual teamwork, (b) challenges posed by different facets of virtuality, and (c) KSAOs particularly relevant for meeting the identified challenges. The results of this review are integrated into a holistic model of virtual teamwork KSAOs with distal characteristics (personality, experience) and more proximal qualities (knowledge, skills, and motivation). Research gaps as well as avenues for future research will be outlined and applications for virtual team staffing and training will be discussed.