Synopsis: Multimodal signaling is common in communication systems. Depending on the species, individual signal components may be produced synchronously as a result of physiological constraint (fixed) or each component may be produced independently (fluid) in time. For animals that rely on fixed signals, a basic prediction is that asynchrony between the components should degrade the perception of signal salience, reducing receiver response. Male túngara frogs, Physalaemus pustulosus, produce a fixed multisensory courtship signal by vocalizing with two call components (whines and chucks) and inflating a vocal sac (visual component). Using a robotic frog, we tested female responses to variation in the temporal arrangement between acoustic and visual components. When the visual component lagged a complex call (whine + chuck), females largely rejected this asynchronous multisensory signal in favor of the complex call absent the visual cue. When the chuck component was removed from one call, but the robofrog inflation lagged the complex call, females responded strongly to the asynchronous multimodal signal. When the chuck component was removed from both calls, females reversed preference and responded positively to the asynchronous multisensory signal. When the visual component preceded the call, females responded as often to the multimodal signal as to the call alone. These data show that asynchrony of a normally fixed signal does reduce receiver responsiveness. The magnitude and overall response, however, depend on specific temporal interactions between the acoustic and visual components. The sensitivity of túngara frogs to lagging visual cues, but not leading ones, and the influence of acoustic signal content on the perception of visual asynchrony is similar to those reported in human psychophysics literature. Virtually all acoustically communicating animals must conduct auditory scene analyses and identify the source of signals. 
Our data suggest that some basic audiovisual neural integration processes may be at work in the vertebrate brain.
Perceived Synchrony of Frog Multimodal Signal Components
Is Influenced by Content and Order
Ryan C. Taylor,
Rachel A. Page,
Barrett A. Klein,
Michael J. Ryan
and Kimberly L. Hunter
*Department of Biological Sciences, Salisbury University, 1101 Camden Avenue, Salisbury, MD 21801, USA;
Smithsonian Tropical Research Institute, Balboa Ancon, 56292 Panama, Republic of Panama;
Department of Biology,
University of Wisconsin—La Crosse, La Crosse, WI 54601, USA;
Department of Integrative Biology, University of Texas
at Austin, Austin, TX 78712, USA
From the symposium “Integrating Cognitive, Motivational and Sensory Biases Underlying Acoustic and Multimodal
Mate Choice” presented at the annual meeting of the Society for Integrative and Comparative Biology, January 4–8, 2017
at New Orleans, Louisiana.
Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology 2017.
This work is written by US Government employees and is in the public domain in the US.
Integrative and Comparative Biology, pp. 1–8
doi:10.1093/icb/icx027
Introduction
Animal signals are complex, often consisting of individual components transmitted and received through multiple sensory channels (Hebets and Papaj 2005; Higham and Hebets 2013; Hebets et al. 2016). Signal complexity has been an area of intense research for more than 15 years (Partan and Marler 1999), yet we understand little about how a signal component in one sensory channel influences the perception and corresponding behavioral response to a component in another channel. In animal courtship signals, for example, do individual components in the auditory and visual channels combine to increase female responses in an additive fashion? Alternatively, does the addition of a visual component induce an exponentially stronger response in receivers or even reduce their response relative to the acoustic signal alone? Some recent work in animal communication indicates that the perception and subsequent behavioral response to multisensory signals is not additive or easily predicted (Taylor and Ryan 2013; Rubi and Stephens 2016; Stange et al. 2016). To date, the most comprehensive work on audiovisual integration and non-additive effects has been done in cats and primates, including work in human psychophysics (for review see Stein 2012).
The human psychophysical work has been critical
for informing us about how the senses are integrated
and how this integration allows individuals to make
sense of a complex world around them. In particular
the recruitment of additional senses, such as vision,
is one mechanism that humans use to locate and
recognize acoustic signals, increasing the efficacy of
our auditory scene analyses (Sumby and Pollack
1954). Psychophysical techniques have been applied
to a number of taxa, but frogs are especially amenable
to these methods, allowing us to address questions
about the perception of complex signals (Bee and
Micheyl 2008;Bee 2015). Male frogs produce stereo-
typed advertisement (mating) calls and their neural
systems are “tuned” to properties of these calls
(Ryan 2001). In most species, females search out call-
ing males and approach them to initiate mating. If
the mating signals deviate too far from their species-
specific properties, female receivers fail to perceive
them as appropriate mating signals (Phelps et al.
2005). Because females readily respond to acoustic
playbacks of male signals, and engage in mate search-
ing behavior, they are easy to manipulate in behav-
ioral tests of signal perception. These perceptions are
directly relevant to understanding how communica-
tion signals evolve. In frogs, the sex ratio is typically
highly skewed and male reproductive success is like-
wise skewed. Therefore, female mate choice generates
strong selection on male signal evolution.
The túngara frog, Physalaemus pustulosus, is a small frog found from northern South America through southern Mexico. Like many frog species, they breed in ephemeral pools of water and males produce a conspicuous acoustic signal, the advertisement call. In túngara frogs, this advertisement call consists of two components. The first is the “whine”
and the second is the “chuck.” The whine is necessary
and sufficient for mate attraction and males always
produce this component. The chuck is neither neces-
sary nor sufficient for mate attraction but males can
facultatively append up to seven chucks onto the end
of the whine (usually one to three). Chucks make the
whine more attractive to females, and always follow
the whine as a result of morphological constraint
(Ryan and Guerra 2014). The advertisement call is
also accompanied by the synchronous inflation of a
conspicuous vocal sac that has been shown to make
the call more attractive (Taylor et al. 2008).
Thus, females assess both the call and the vocal sac
inflation as part of a multimodal signal.
The visual cue of an inflating vocal sac increases
the attractiveness of a call when it is added, but its
effect can easily be overridden by an alternative call
that contains more attractive properties. Thus, the
acoustic signal component has primacy for female
mate choice. The temporal arrangement of the call
and vocal sac movement is critically important,
however. If the vocal sac inflation is delayed, such
that it lags the call in time, females strongly reject
this asynchronous multisensory signal (Taylor et al.
2011). Alternatively, temporally sandwiching the vo-
cal sac movement between the whine and chuck can
restore the saliency of the overall mating signal
(Taylor and Ryan 2013). For an individual male,
temporal delays between the call and vocal sac move-
ment are impossible due to morphological con-
straints. Our previous experimental data show that
females strongly attend to temporal synchrony of the
signal components, yet are flexible about how they
perceive and respond to temporal variation.
Our current understanding of the túngara frog system suggests that two simple rules may govern female
choice for multisensory signals. First, if the vocal sac
inflates following a call, then reject the signal. Second,
if the whine and chuck “book end” the vocal sac, then
accept the signal. Despite these data, we still have a
largely incomplete understanding of how all three
components—whine, chuck, and vocal sac—interact
to influence perception and female mate choice.
In this study we further probed how females re-
spond to asynchronous signals. Specifically, we asked
two questions. First, we asked if acoustic content
matters. Do females find an asynchronous multi-
modal signal aversive, when one or more of the calls
lack a chuck? This question is important because it
helps to shed additional light on the cognitive/per-
ceptual system that governs how the frog audiovisual
system processes complex signals. Second, we asked
if there is a syntactical order effect. That is, does a
vocal sac that leads a call in time influence female
choice as it would if it lags the call? This question is
intriguing because for males, order of call compo-
nents is fixed; vocal sac inflations always coincide
with the call and chucks always follow whines.
Females, however, show permissiveness for temporal
arrangement of chuck placement in tests with no
visual cue (Wilczynski et al. 1999).
Methods
We collected mated pairs of túngara frogs at choruses within 4 h after sunset. The frogs were collected
at breeding sites near the Smithsonian Tropical
Research Institute, Gamboa, Republic of Panama.
We placed individual frog pairs into plastic bags
and stored the frogs in a light-safe cooler (total dark-
ness) for a minimum of 1 h prior to testing. This
ensured that the frogs’ eyes were dark-adapted after
collection using flashlights. After testing, the frogs
were toe-clipped, following guidelines of the
American Association of Ichthyologists and
Herpetologists, which allowed us to avoid using re-
captures on subsequent nights. We released all col-
lected frogs at their sites of capture at the end of the
night, ensuring that they could breed in the wild. All
procedures were approved by STRI IACUC (2011-0825-2014-02) and conducted under permit No. SE/A-30-12 from Panama’s Autoridad Nacional del Ambiente (ANAM, now the Ministry of the Environment).
We conducted phonotaxis experiments in a hemi-anechoic chamber (Acoustic Systems, ETS-Lindgren, Austin, TX, USA) measuring 2.7 m × 1.8 m × 2 m. For the behavioral tests, we used a restraining funnel placed in the center of the chamber. The funnel kept the females equidistant (80 cm) from the two speakers (Mirage Nanosat, Klipsch Audio, Indianapolis, IN, USA) used to broadcast the male calls (Fig. 1a). The two speakers were separated by 80 cm and formed a triangle with ca. 60° separation relative to the female’s release point. To generate a multisensory signal, we placed a robotic frog (robofrog) with
an inflatable vocal sac in front of one speaker. We
inflated the vocal sac of the robotic frog remotely
using a pneumatic pump that was triggered by the
computer producing the acoustic stimulus. By using
a sound file to trigger the robofrog vocal sac infla-
tion, we were able to precisely control the timing of
the robofrog’s inflation/deflation sequence relative to
the calls produced at the speaker. Because the
speaker broadcast the call from the same location
as the robofrog, this closely matched the spatial lo-
cation of the natural visual and acoustic signal com-
ponents (Taylor et al. 2008;Klein et al. 2012).
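The arena geometry described above can be checked with a short calculation. This is only an illustrative sketch: it assumes the release point is exactly 80 cm from each speaker and that the speakers are exactly 80 cm apart, and the function name is hypothetical.

```python
import math

def subtended_angle_deg(speaker_separation_cm: float, distance_cm: float) -> float:
    """Angle between two speakers as seen from a point equidistant from both."""
    half_separation = speaker_separation_cm / 2.0
    return math.degrees(2.0 * math.asin(half_separation / distance_cm))

# Values from the text: speakers 80 cm apart, each 80 cm from the release point.
angle = subtended_angle_deg(80.0, 80.0)
print(f"{angle:.1f} degrees")
```

With equal side lengths the three points form an equilateral triangle, which is consistent with the ca. 60° separation reported.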
We illuminated the test chamber with a single GE
nightlight (ca. 2.27 10
, model no.
55507; Fairfield, CT, USA). The spectrum and inten-
sity of light at nocturnal breeding sites varies tre-
mendously with location (forest cover vs. open),
moon phase, and cloud cover. The light environment
we provided was well within the range of what frogs
naturally experience (Cummings et al. 2008;Taylor
et al. 2008). For each trial, we placed a female under
the funnel and broadcast digitally synthesized male
vocalizations (see Ryan et al. [2003] for details on
call synthesis). The robofrog vocal sac was also
activated to inflate/deflate asynchronously with the
call broadcast at the speaker (for more details see
Taylor et al. 2011). These playbacks were broadcast
for 2 min, which allowed the female to acclimatize to
the playbacks while under the funnel. For all experiments, we used a synthetic simple (whine) or complex (whine + one chuck) call broadcast at 82 dB SPL (re. 20 μPa; RMS, fast, C weighting) measured at
the point of release for the females. We used Adobe
Audition software (ver. 3.0) for playbacks and each
call was played once every 3 s.
After the acclimation period, we lifted the funnel
so the female was free to move around the test arena.
We recorded a choice when a female approached to
within 5 cm of a speaker or speaker/robofrog com-
bination and remained there for 5 s. The 5 s rule
avoided false positives or negatives caused by females
simply walking by a stimulus. To control for side
bias, we systematically alternated the sides on which
the robofrog and calls were presented between trials.
If a female did not move for 2 min after the funnel
was raised or failed to enter a choice zone within
10 min, we discarded the trial from the data set
due to a lack of motivation. Response rates by fe-
males were typically around 65% each night. We
recorded female behavior using an infrared sensitive
camera (Everfocus EHD500IR, Everfocus Electronics,
Duarte, CA, USA) mounted on the ceiling of the
chamber. A video feed allowed us to view the female’s behavior in real time from outside the sound chamber while simultaneously recording video.
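The choice criteria above (approach within 5 cm, remain 5 s, discard after 2 min without movement or 10 min without entering a choice zone) can be expressed as a scoring routine. The sketch below is hypothetical and is not the procedure used in the study, where choices were scored by observers. It assumes a position track sampled from video, treats each stimulus as a single point, and uses an assumed 1 cm threshold (`move_eps_cm`) to decide whether a female has moved.

```python
import math

def score_trial(track, targets, radius_cm=5.0, dwell_s=5.0,
                move_limit_s=120.0, trial_limit_s=600.0, move_eps_cm=1.0):
    """Return the chosen target name, or None if the trial would be discarded.

    track:   list of (t_s, x_cm, y_cm), sorted by time, t = 0 when the funnel lifts
    targets: dict mapping a stimulus name to its (x_cm, y_cm) position
    """
    if not track:
        return None
    _, x0, y0 = track[0]
    moved = False
    zone, entered_at = None, None
    for t, x, y in track:
        # Stop once the 10 min limit has passed and she is not in a choice zone.
        if t > trial_limit_s and zone is None:
            break
        if not moved:
            if math.hypot(x - x0, y - y0) > move_eps_cm:
                moved = True
            elif t >= move_limit_s:
                return None  # no movement within 2 min of release
        here = next((name for name, (tx, ty) in targets.items()
                     if math.hypot(x - tx, y - ty) <= radius_cm), None)
        if here is None:
            zone, entered_at = None, None      # left the zone: reset the dwell timer
        elif here != zone:
            zone, entered_at = here, t         # just entered a zone
        elif t - entered_at >= dwell_s:
            return here                        # stayed long enough: score a choice
    return None
```

Resetting the timer whenever the female leaves a zone is what keeps a walk-by from being scored as a choice.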
Following these general procedures, we conducted
three experiments. In Experiment 1, we presented
females with a complex call (whine plus one chuck,
hereafter “WC”) versus a simple call (the whine
alone, hereafter “W”). The WC had the visual com-
ponent of a robofrog with inflating vocal sac added,
but the vocal sac inflation lagged the call. The call
and robofrog inflation were 100% out of phase such
that the inflation began immediately following the
terminus of the call (Fig. 1b). The temporal sequence
of this stimulus was: whine, then chuck, then vocal
sac inflation, hereafter abbreviated as (WC-robo). In
Experiment 2, we presented females with the identi-
cal W call at each speaker. To one speaker we also
added a robofrog with inflation following the whine,
hereafter abbreviated as (W-robo). Here also, the
inflation occurred 100% out of phase, immediately
following the call (Fig. 1b). In the third experiment,
we presented females with two identical WC calls,
but one speaker again had a robofrog added. The
robofrog vocal sac inflation preceded the call yielding
a temporal sequence of: vocal sac inflation, then
whine, then chuck, hereafter abbreviated as (robo-
WC). Although the inflation preceded the call, the
inflation still occurred 100% out of phase; the call
began immediately following the deflation of the
robofrog vocal sac (Fig. 1b).
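The three stimulus sequences can be laid out as simple event timelines. The sketch below is illustrative only: the 3 s playback period comes from the text, but the component durations are hypothetical placeholders (the text does not report them), as are all names in the code.

```python
from dataclasses import dataclass

CALL_PERIOD_MS = 3000  # from the text: each call was played once every 3 s

# Hypothetical durations for illustration only; the text does not report them.
DURATIONS_MS = {"whine": 350, "chuck": 50, "robo": 400}

# Temporal order of components in each asynchronous multimodal stimulus.
SEQUENCES = {
    "WC-robo": ("whine", "chuck", "robo"),   # Experiment 1: inflation lags the call
    "W-robo": ("whine", "robo"),             # Experiment 2: inflation lags the whine
    "robo-WC": ("robo", "whine", "chuck"),   # Experiment 3: inflation leads the call
}

@dataclass(frozen=True)
class Event:
    label: str
    start_ms: int
    end_ms: int

def build_cycle(stimulus, durations=DURATIONS_MS, period_ms=CALL_PERIOD_MS):
    """Lay the components end to end so each starts when the previous one stops."""
    events, t = [], 0
    for label in SEQUENCES[stimulus]:
        events.append(Event(label, t, t + durations[label]))
        t += durations[label]
    if t > period_ms:
        raise ValueError("components do not fit in one playback period")
    return events

def visual_overlaps_sound(events):
    """True if the vocal sac movement overlaps any acoustic component in time."""
    robo = [e for e in events if e.label == "robo"]
    sound = [e for e in events if e.label != "robo"]
    return any(r.start_ms < s.end_ms and s.start_ms < r.end_ms
               for r in robo for s in sound)
```

Under these assumptions, each sequence is 100% out of phase in the sense used above: the visual and acoustic components abut but never overlap.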
Statistical analysis
All experiments consisted of a two-choice test, where
females had the option of responding to a unimodal
call (speaker only) or a multimodal signal (speaker
plus the visual cue of a robofrog). The data were
analyzed using a binomial exact test and the mid-P
value (Agresti 2001). We previously showed that
when the robofrog’s vocal sac inflation temporally
lagged the complex WC call by either 50% or
100%, females chose the multisensory signal only
25% of the time (Taylor et al. 2011). The timing of the lagging vocal sac in the current study matched that of the 100% lag treatment in those experiments.
These experiments were later repeated (unpublished
data), confirming the results. Given the repeatable
and robust nature of the female preference function
for a lagging visual component, we set our a priori
expected binomial response to this asynchronous
multisensory signal at 0.25 (Experiments 1 and 2).
In Experiment 3, where the vocal sac inflation led
the call, we had no prior data to suggest how females
would respond to this particular temporal arrangement. Therefore, we set our a priori expected response rate at random choice (0.5).
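The binomial test with a mid-P value can be sketched in a few lines. This is a minimal illustration rather than the authors' analysis code. It assumes the two-sided mid-P is taken as twice the smaller one-sided mid-P tail, which is one of several conventions, and the counts in the usage lines are back-calculated from the reported percentages and sample sizes.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def mid_p_two_sided(k: int, n: int, p0: float) -> float:
    """Two-sided mid-P value: half the probability of the observed count is
    assigned to each tail, and the smaller tail is doubled (an assumed convention)."""
    p_obs = binom_pmf(k, n, p0)
    lower = sum(binom_pmf(i, n, p0) for i in range(0, k)) + 0.5 * p_obs
    upper = sum(binom_pmf(i, n, p0) for i in range(k + 1, n + 1)) + 0.5 * p_obs
    return min(1.0, 2.0 * min(lower, upper))

# Counts derived from the reported percentages (e.g., 40% of 40 females = 16).
p_exp1 = mid_p_two_sided(18, 24, 0.25)  # Experiment 1: 75% of 24, expected 0.25
p_exp2 = mid_p_two_sided(24, 40, 0.25)  # Experiment 2: 60% of 40, expected 0.25
p_exp3 = mid_p_two_sided(16, 40, 0.50)  # Experiment 3: 40% of 40, expected 0.5
print(p_exp1, p_exp2, p_exp3)
```

Under this convention the Experiment 3 value comes out at about 0.21, in line with the reported P value, and the first two fall well below 0.0001.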
Results
In our first experiment, we presented females with a WC versus W call, but the robofrog was added to the speaker playing the WC and the vocal sac was inflated asynchronously following the call (WC-robo). Females chose the WC-robo in 75% of trials (n = 24; binomial test, expected = 0.25; P < 0.0001; Fig. 2). This reversed the general avoidance of the asynchronous multisensory signal seen when the calls broadcast from the alternative speakers were held constant (both WC). This distribution is similar to female behavior in a standard WC versus W experiment when no robofrog is present (Gridi-Papp et al. 2006). Thus, the presence of the chuck in one call was enough to overcome the unattractiveness of the asynchronous signal when the alternative call was just the whine.
In the second experiment, we presented females with two identical calls consisting of the whine only. At the speaker with the robofrog, the vocal sac inflation lagged the call (W-robo). Here, females also did not exhibit an overall aversion to the asynchronous multisensory signal. They chose it 60% of the time, significantly more often than expected (n = 40; binomial test, expected = 0.25; P < 0.0001; Fig. 2).
Fig. 1 (A) Diagram of two-choice test arena. Females could choose between two stimuli, a call only or a call with the asynchronously inflating robofrog placed in front of the speaker. (B) Detail of female choice tests. The asynchronous multimodal signals are depicted on the left side; the calls only are depicted by the sonograms on the right. In Experiment 1, the robofrog vocal sac inflation lagged the call (depicted in the timeframe above the whine–chuck sonogram). The alternative was a whine only. In Experiment 2, the robofrog vocal sac inflation lagged the call (depicted in the timeframe above the whine only sonogram). The alternative was also a whine only. In Experiment 3, the robofrog vocal sac inflation led the call (depicted in the timeframe above the whine–chuck sonogram). The alternative was also a whine–chuck.
In the final experiment, we again presented females with two identical calls consisting of a WC. This time, the robofrog at one speaker inflated before the call (robo-WC). Females chose the asynchronous signal 40% of the time (n = 40; binomial test, expected = 0.5; P = 0.21; Fig. 2). Thus, females did not choose either the visually leading asynchronous multimodal signal or the unimodal call more often than expected from random chance.
Discussion
All else being equal, the presence of a synchronously inflating vocal sac makes a male’s call more attractive to females (Taylor et al. 2008). Further, females tend to reject an asynchronous signal when the vocal sac inflation lags the call (Taylor et al. 2011). Male túngara frogs often call in dense choruses and due to physiological constraints cannot alter the timing of vocal sac inflation and call production. Taylor et al. (2011) suggested that the assessment of the vocal sac by females may provide a means of identifying individual callers within a chorus, much like a human reads lips at noisy parties (Sumby and Pollack 1954).
In this study, we show that the acoustic and visual signal components of the túngara frog’s mating signal interact in complex ways to influence female
choice. Since males cannot alter the timing of their
audiovisual signals in nature, it seems intuitive that
females would recognize any incongruency and
adopt a simple rule that rejects any combination
that does not match the natural template.
Interestingly, there does not appear to be a set “rule”
that governs a simple template recognition of signal
synchrony by females (Taylor and Ryan 2013).
In our first experiment, where we played an asynchronous multimodal WC versus a unimodal W, females showed virtually no aversion to the asynchronous signal and responded to the WC almost as strongly as in the same experiment without the visual component (85% preference in Gridi-Papp et al. 2006; 75% in this study). This indicates that although the asynchronous audio-visual signal is generally aversive, if one call contains a chuck, the asynchronous signal is still more attractive than an isolated whine.
In nature, chucks always follow whines. Wilczynski et al. (1999), however, showed that female túngara frogs are permissive of the temporal order of whines and chucks. In particular, they found that females were as attracted to stimuli in which a chuck artificially preceded the whine as to stimuli in which it followed in the natural position. Given
the difficult task that females have assigning calls to
their source when many males are calling within a
small area, one prediction might be that females use
the chucks to determine when a call is finished. This
should improve a female’s ability to assign calls to
their source. The data from Wilczynski et al. (1999)
suggest that this is not true, at least when a female is
presented with only two, spatially separated calling
males. So for the acoustic component of the signal,
syntax for female receivers is flexible. Farris and Ryan (2011, 2017) also demonstrated that female túngara frogs make relative comparisons when identifying callers acoustically. In a series of experiments,
they showed that females perceptually group whines
and chucks that are temporally and spatially sepa-
rated, effectively responding as if the disparate com-
ponents belong to the same source. Here again, the
females show permissiveness for signal variation in
time and space. They showed that females more
readily group calls that have a smaller spatial sepa-
ration and non-natural sequence relative to calls with
a greater spatial separation but natural sequence
(Farris and Ryan 2017). Although females perceptu-
ally weight spatial cues more, when multiple cues
become available, females integrate these into their
perceptual and decision making processes (Farris and
Ryan 2011).
When the visual component is added to the signal,
syntax becomes more important. In our second ex-
periment where we removed the chucks altogether
and just presented females with whines in the acous-
tic domain, the asynchronous multisensory signal
(W-robo) was no longer aversive, and females chose
this signal more often than expected. In the absence
of the chuck, females are less likely to be influenced
by the incongruency. This may indicate that when
females are simultaneously evaluating acoustic and
visual components, the chuck indicates that the call
is finished, and any vocal sac inflation following this
call does not belong. Thus, like relative comparisons within the auditory domain (Farris and Ryan 2011, 2017), female túngara frogs also appear to make relative comparisons when integrating visual and
acoustic cues (for other cross-modal comparisons,
see also Halfwerk et al. 2014).
In our final experiment, we presented females with
a pair of identical WCs, but at one speaker, the
robofrog inflation preceded the call (robo-WC).
Females responded to the asynchronous signal statis-
tically as often as the unimodal call only. This sug-
gests that females do not recognize the temporal
asynchrony or that their perception of the leading
visual signal is less aversive than when it lags the call.
Interestingly, this behavior coincides with audiovi-
sual discrepancy detection in human listeners.
Human listeners, like many vertebrates, integrate au-
ditory and visual signals and generate perceptions of
synchrony as part of their overall auditory scene anal-
ysis (Stein 2012;Farris and Ryan 2017). Humans
more easily detect asynchrony when a visual cue lags
an acoustic signal versus one that leads (Dixon and
Spitz 1980). Given that light travels dramatically faster
than sound, audiovisual discrepancies occur in nature
with increasing communication distances. Specifically,
since sound naturally lags a visual stimulus, it might
be expected that receivers, humans or otherwise, are
somewhat permissive of lagging sound. For example,
Navarra et al. (2009) showed that human listeners
increased reaction times to audio signals that lagged
the visual cue, but were unable to do this for lagging
visual signals. They suggested this effect may result
from auditory processing plasticity that can compen-
sate for the normal temporal lag that occurs in nature,
thereby improving the ability of the brain to bind
relevant audiovisual cues into a coherent stimulus
(also see Sugita and Suzuki 2003). Given stimulus
transmission and neural transduction speeds, commu-
nication distances need to exceed about 10 m before
audio signals begin to perceptually lag visual signals
(Pöppel and Artin 1988) and human listeners remain
unaware of asynchronies until the audio stimulus lags
the visual by about 250 ms (Dixon and Spitz 1980).
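The physical part of this argument, the delay of sound relative to light over a given distance, is easy to quantify. The sketch below is a rough illustration that assumes a nominal speed of sound of 343 m/s (dry air at about 20 °C; the value at a warm, humid breeding site will differ slightly). It covers arrival times only and ignores the neural transduction delays that also shape perceived synchrony.

```python
SPEED_OF_SOUND_M_S = 343.0          # assumption: dry air at roughly 20 °C
SPEED_OF_LIGHT_M_S = 299_792_458.0  # defined value in vacuum; air is negligibly slower

def acoustic_lag_ms(distance_m: float) -> float:
    """Arrival delay of sound relative to light from the same source, in ms."""
    sound_s = distance_m / SPEED_OF_SOUND_M_S
    light_s = distance_m / SPEED_OF_LIGHT_M_S
    return (sound_s - light_s) * 1000.0

def distance_for_lag_m(lag_ms: float) -> float:
    """Distance at which the physical sound lag reaches lag_ms (light delay ignored)."""
    return lag_ms / 1000.0 * SPEED_OF_SOUND_M_S

for d in (1.0, 10.0, 50.0):
    print(f"{d:5.1f} m -> {acoustic_lag_ms(d):6.2f} ms")
```

At 10 m the physical lag is roughly 29 ms, far below the approximately 250 ms audio lag that human listeners need before noticing asynchrony, a lag that would require a source more than 80 m away under these assumptions.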
Our results have important implications for our
understanding of sensory ecology, perception, and
multimodal signal evolution. First, for nocturnally
communicating frogs that use multimodal signals,
the evaluation distance is nearly always less than 10
m (personal observation). Thus, a female receiver is
unlikely to experience a noticeable audiovisual asyn-
chrony produced by a particular calling male. In
light of this, there is no ecological reason why female
frogs should be more sensitive to a lagging visual
signal versus a lagging audio signal. Our data show
that they are, however. One explanation may be that
neural integration of auditory and visual signals, par-
ticularly the perception of synchrony, is a conserved
process across many vertebrate taxa. In particular, if
the vertebrate auditory processing is more plastic
than the visual system (Navarra et al. 2009), then
this may constrain receivers to be permissive of lag-
ging audio signals, irrespective of whether they ex-
perience them in nature.
The second implication of our results is that con-
textual aspects of audiovisual integration may be as
important as temporal structure per se. For túngara frogs, the chuck component of the call must be accompanied by the whine in order for females to even recognize it as a salient signal (Ryan 1985). Even so, once the context is set (e.g., the presence of the whine), the chuck strongly modulates female attraction, making the complex call five times as attractive as the whine only (Gridi-Papp et al. 2006).
Fig. 2 Proportion of females choosing an asynchronous multimodal signal (audio + visual) versus an alternative unimodal signal (call only). The far left experiment, separated by a vertical line, is from Taylor et al. (2011) and was used to set the prior expectation of asynchrony response at 0.25 (horizontal line). For Experiment 3 on the far right, the expected response was set at 0.5 (horizontal line). The x-axis legends refer to the temporal sequence of the stimuli. WC-robo = whine, then chuck, then robotic frog inflation. W-robo = whine, then robotic frog inflation. Robo-WC = robotic frog inflation, then whine, then chuck. The graphic of the timing of the robofrog inflation/sonogram follows from Fig. 1b.
The presence of the chuck also overrides the aversive
nature of the lagging visual signal. Likewise, when the
chucks are removed completely, females are no longer
averse to the temporal asynchrony. In sum, females
are permissive to variation in call syntax when pre-
sented with a call only (e.g., chuck precedes whine)
and they are permissive of multisensory asynchrony
when chucks are absent. The presence of the chuck,
however, alerts females to the asynchrony of the mul-
tisensory signal (when the visual cue lags a standard
complex call), and modulates their behavior.
We suggest that future studies of multimodal sig-
naling should include experiments that are not only
signal isolation tests (sensu Partan and Marler 2005),
but also explore how different arrangements of both
context and timing influence receiver behavior.
Doing so is likely to reveal the full range of multi-
sensory space over which receivers recognize and re-
spond to conspecific signals (Smith and Evans 2013),
including variations that don’t naturally occur. This
will provide insights into how neural integration and
sensory perception can promote or constrain the
evolution of complex signal design.
Acknowledgments
Joey Stein and Moey Inc. developed the robotic frog
control system. We thank Nic Stange, Kyle Wilhite,
and Kelsey Mitchell for help with data collection. We
are grateful to the Smithsonian Tropical Research
Institute for logistical support. Constructive criticism
from one anonymous reviewer improved the quality
of the manuscript. The work was conducted under
STRI IACUC protocol No. 2011-0825-2014-02 and
collecting permit from Panama’s Autoridad
Nacional del Ambiente (ANAM).
Funding
This work was supported by a National Science
Foundation grant [IOS 1120031 to R.C.T., M.J.R.,
and R.A.P.].
References
Agresti A. 2001. Exact inference for categorical data: recent ad-
vances and continuing controversies. Stat Med 20:2709–22.
Bee MA. 2015. Treefrogs as animal models for research on
auditory scene analysis and the cocktail party problem. Int
J Psychophysiol 95:216–37.
Bee MA, Micheyl C. 2008. The cocktail party problem: what
is it? How can it be solved? And why should animal be-
haviorists study it? J Comp Psychol 122:235–251.
Cummings ME, Bernal XE, Reynaga R, Rand AS, Ryan MJ.
2008. Visual sensitivity to a conspicuous male cue varies by
reproductive state in Physalaemus pustulosus females. J Exp
Biol 211:1203–10.
Dixon NF, Spitz L. 1980. The detection of auditory visual
desynchrony. Perception 9:719–21.
Farris HE, Ryan MJ. 2011. Relative comparisons of call param-
eters enable auditory grouping in frogs. Nat Commun 2:410.
Farris HE, Ryan MJ. 2017. Schema vs. primitive perceptual
grouping: the relative weighting of sequential vs. spatial
cues during an auditory grouping task in frogs. J Comp
Physiol A 203:175–82.
Gridi-Papp M, Rand AS, Ryan MJ. 2006. Animal communi-
cation: complex call production in the túngara frog. Nature
Halfwerk W, Page RA, Taylor RC, Wilson PS, Ryan MJ. 2014.
Crossmodal comparisons of signal components allow for
relative-distance assessment. Curr Biol 24:1751–5.
Hebets EA, Barron AB, Balakrishnan CN, Hauber ME, Mason
PH, Hoke KL. 2016. A systems approach to animal com-
munication. Proc R Soc B Biol Sci 283:20152889.
Hebets EA, Papaj DR. 2005. Complex signal function: devel-
oping a framework of testable hypotheses. Behav Ecol
Sociobiol 57:197–214.
Higham JP, Hebets EA. 2013. An introduction to multimodal
communication. Behav Ecol Sociobiol 67:1381–8.
Klein BA, Stein J, Taylor RC. 2012. Robots in the service of
animal behavior. Commun Integr Biol 5:466–72.
Navarra J, Hartcher-O’Brien J, Piazza E, Spence C. 2009.
Adaptation to audiovisual asynchrony modulates the speeded
detection of sound. Proc Natl Acad Sci U S A 106:9169–73.
Partan S, Marler P. 1999. Behavior–communication goes mul-
timodal. Science 283:1272–3.
Partan SR, Marler P. 2005. Issues in the classification of mul-
timodal communication signals. Am Nat 166:231–45.
Phelps SM, Rand AS, Ryan MJ. 2005. A cognitive framework
for mate choice and species recognition. Am Nat 167:28–42.
Pöppel E, Artin TT. 1988. Mindworks: time and conscious
experience. Harcourt Brace Jovanovich.
Rubi TL, Stephens DW. 2016. Why complex signals matter,
sometimes. In: Bee MA, Miller C, editors. Psychological
mechanisms in animal communication. New York (NY):
Springer. p. 119–35.
Ryan MJ. 1985. The túngara frog: a study in sexual selection
and communication. Chicago: University of Chicago Press.
Ryan MJ. 2001. Anuran communication. Washington, DC:
Smithsonian Institution Press.
Ryan MJ, Guerra MA. 2014. The mechanism of sound pro-
duction in túngara frogs and its role in sexual selection and
speciation. Curr Opin Neurobiol 28:54–9.
Ryan MJ, Rand W, Hurd PL, Phelps SM, Rand AS. 2003.
Generalization in response to mate recognition signals.
Am Nat 161:380–94.
Smith CL, Evans CS. 2013. A new heuristic for capturing the
complexity of multimodal signals. Behav Ecol Sociobiol
Stange N, Page RA, Ryan MJ, Taylor RC. 2016. Interactions
between complex multisensory signal components result in
unexpected mate choice responses. Anim Behav 116:83–7.
Stein BE. 2012. The new handbook of multisensory process-
ing. Cambridge (MA): MIT Press.
Sugita Y, Suzuki Y. 2003. Audiovisual perception: implicit
estimation of sound-arrival time. Nature 421:911.
Sumby WH, Pollack I. 1954. Visual contribution to speech
intelligibility in noise. J Acoust Soc Am 26:212–5.
Taylor RC, Klein BA, Stein J, Ryan MJ. 2008. Faux frogs:
multimodal signalling and the value of robotics in animal
behaviour. Anim Behav 76:1089–97.
Taylor RC, Klein BA, Stein J, Ryan MJ. 2011. Multimodal
signal variation in space and time: how important is
matching a signal with its signaler? J Exp Biol 214:815–20.
Taylor RC, Ryan MJ. 2013. Interactions of multisensory com-
ponents perceptually rescue túngara frog mating signals.
Science 341:273–4.
Wilczynski W, Rand AS, Ryan MJ. 1999. Female preferences
for temporal order of call components in the túngara frog:
a Bayesian analysis. Anim Behav 58:841–51.
... This phenomenon is seen in the McGurk effect in human speech perception, whereby an auditory component is integrated with a visual component, forming a novel percept of the auditory component (McGurk and Macdonald 1976). For speech perception, temporal synchronization of visual and acoustic components is often necessary for multisensory integration (McGurk and Macdonald 1976; Ghazanfar et al. 2005), and temporal asynchrony can drastically reduce the attractiveness of a signal (e.g., Taylor et al. 2011, 2017). ...
... The túngara frog is an excellent candidate for studying multimodal signaling and noise; previous studies have documented multimodal interactions between male advertisement calls, vocal sac inflations, and female mate choice (e.g., Taylor et al. 2008, 2011; Taylor and Ryan 2013; Taylor et al. 2017; James et al. 2021). Vocal sac inflations, combined with advertisement calls, make túngara frog signals more attractive to females (Taylor et al. 2008). ...
... In the brush-legged wolf spider (Schizocosa ocreata), females are significantly less receptive to multimodal signaling components (visual and vibratory) when they are asynchronous compared to when they are synchronous (e.g., Kozak and Uetz 2016). In túngara frogs, females respond less to asynchronous multimodal signals if the vocal sac inflates after the end of the call but will respond similarly to an acoustic-only signal if it inflates before the beginning of the call, demonstrating not only the importance of synchrony but also sequence of individual signal components (e.g., Taylor et al. 2017). In the presence of multimodal noise, a female's perception of one male's display may be altered by calls and vocal sac movements from nearby frogs. ...
Females of many species choose mates using multiple sensory modalities. Multimodal noise may arise, however, in dense aggregations of animals communicating via multiple sensory modalities. Some evidence suggests multimodal signals may not always improve receiver decision-making performance. When sensory systems process input from multimodal signal sources, multimodal noise may arise and potentially complicate decision-making due to the demands on cognitive integration tasks. We tested female túngara frog, Physalaemus (=Engystomops) pustulosus, responses to male mating signals in noise from multiple sensory modalities (acoustic and visual). Noise treatments were partitioned into three categories: acoustic, visual, and multimodal. We used natural calls from conspecifics and heterospecifics for acoustic noise. Robotic frogs were employed as either visual signal components (synchronous vocal sac inflation with call) or visual noise (asynchronous vocal sac inflation with call). Females expressed a preference for the typically more attractive call in the presence of unimodal noise. However, during multimodal signal and noise treatments (robofrogs employed with background noise), females failed to express a preference for the typically attractive call in the presence of conspecific chorus noise. We found that social context and temporal synchrony of multimodal signaling components are important for multimodal communication. Our results demonstrate that multimodal signals have the potential to increase the complexity of the sensory scene and reduce the efficacy of female decision making.
... One way of investigating the importance of the association between modalities is to experimentally disrupt their spatial or temporal relatedness (Halfwerk et al. 2019). For instance, a robotic male túngara frog (Physalaemus pustulosus) has been used to present females with different temporal combinations of visual (inflated vocal sac) and auditory (whine and chuck) courtship signals, showing that female response was reduced when calls and sac inflation were temporally interleaved (Taylor et al. 2017). Another study in the same species showed that females did not prefer a synchronized over a unimodal signal, but would strongly reject an asynchronous one (Taylor et al. 2011). ...
... To formally test this hypothesis, we would need to compare female response to unimodal acoustic and visual stimuli versus multimodal stimuli in a future playback experiment. In túngara frogs, females preferred asynchronous multimodal signals over unimodal signals (Taylor et al. 2017) and this could be true for doves as well. ...
Some multimodal signals ‐ i.e. occurring in more than one sensory modality ‐ appear to carry additional information which is not present when component signals are presented separately. To understand the function of male ring dove's (Streptopelia risoria) multimodal courtship, we used audiovisual playback of male displays to investigate female response to stimuli differing in their audiovisual timing. From natural courtship recordings, we created a shifted stimulus where audio was shifted relative to video by a fixed value, and a jittered stimulus where each call was moved randomly along the visual channel. We presented three groups of females with the same stimulus type, i.e. control, shifted, and jittered, for seven days. We recorded their behavior and assessed pre‐ and post‐test blood estradiol concentration. We found that playback exposure increased estradiol levels, confirming that this technique can be efficiently used to study doves’ sexual communication. Additionally, chasing behavior (indicating sexual stimulation) increased over experimental days only in the control condition, suggesting a role of multimodal timing on female response. This stresses the importance of signal configuration in multimodal communication, as additional information is likely to be contained in the temporal association between modalities. This article is protected by copyright. All rights reserved
... They hypothesized that this occurs because the displaced vocal sac inflation results in perceptual continuity between the whine and the chuck similar to the phenomenon of auditory continuity that Bregman (1994) and others have shown in humans. Interestingly, pure auditory continuity was not verified when tested in túngara frogs (Baugh, Ryan, Bernal, Rand, & Bee, 2016; Taylor, Page, Klein, Ryan, & Hunter, 2017). ...
... The interaction between these signal components is not a simple linear relationship, and signal perception is more complex than a simple template match (either the signal matches a prewired template or it does not). Taylor et al. (2017) further teased apart this relationship by using asynchronous multimodal signals that cannot occur in nature. Generally, if the vocal sac inflation follows the call, females reject the signal. ...
Choosing a mate is one of the most important decisions an animal can make. The fitness consequences of mate choice have been analysed extensively, and its mechanistic bases have provided insights into how animals make such decisions. Less attention has been given to higher-level cognitive processes. The assumption that animals choose mates predictably and rationally is an important assumption in both ultimate and proximate analyses of mate choice. It is becoming clear, however, that irrational decisions and unpredictable nonlinearities often characterize mate choice. Here we review studies in which cognitive analyses seem to play an important role in the following contexts: auditory grouping; Weber's law; competitive decoys; multimodal communication; and, perceptual rescue. The sum of these studies suggest that mate choice decisions are more complex than they might seem and suggest some caution in making assumptions about evolutionary processes and simplistic mechanisms of mate choice.
... We hypothesize that these processes occurred in our experiments with túngara frogs, where cross-modal stimuli prime females to temporal and spatial aspects of the acoustic stimuli. Indeed, previous research on multi-sensory preferences in female túngara frogs has found that the temporal and spatial alignment of the visual and acoustic stimuli are important for whether females prefer or even recognize the visual stimulus [38,44,45]. ...
Stimulation in one sensory modality can affect perception in a separate modality, resulting in diverse effects including illusions in humans. This can also result in cross-modal facilitation, a process where sensory performance in one modality is improved by stimulation in another modality. For instance, a simple sound can improve performance in a visual task in both humans and cats. However, the range of contexts and underlying mechanisms that evoke such facilitation effects remain poorly understood. Here, we demonstrated cross-modal stimulation in wild-caught túngara frogs, a species with well-studied acoustic preferences in females. We first identified that a combined visual and seismic cue (vocal sac movement and water ripple) was behaviourally relevant for females choosing between two courtship calls in a phonotaxis assay. We then found that this combined cross-modal stimulus rescued a species-typical acoustic preference in the presence of background noise that otherwise abolished the preference. These results highlight how cross-modal stimulation can prime attention in receivers to improve performance during decision-making. With this, we provide the foundation for future work uncovering the processes and conditions that promote cross-modal facilitation effects.
... The distance between the two video playback areas also was 1 m, resulting in a 60° angle between monitors with respect to the marked position. This allowed females to easily see the vocal sac and body of the male on both screens (Taylor et al., 2008;Taylor et al., 2017). We observed female behavior on a monitor using a video system with an infrared light source. ...
Diverse animal species use multimodal communication signals to coordinate reproductive behavior. Despite active research in this field, the brain mechanisms underlying multimodal communication remain poorly understood. Similar to humans and many mammalian species, anurans often produce auditory signals accompanied by conspicuous visual cues (e.g., vocal sac inflation). In this study, we used video playbacks to determine the role of vocal-sac inflation in little torrent frogs (Amolops torrentis). Then we exposed females to blank, visual, auditory, and audiovisual stimuli and analyzed whole brain tissue gene expression changes using RNA-seq. The results showed that both auditory cues (i.e., male advertisement calls) and visual cues were attractive to female frogs, although auditory cues were more attractive than visual cues. Females preferred simultaneous bimodal cues to unimodal cues. The hierarchical clustering of differentially expressed genes showed a close relationship between neurogenomic states and momentarily expressed sexual signals. We also found that the Gene Ontology terms and KEGG pathways involved in energy metabolism were mostly increased in blank contrast versus visual, acoustic, or audiovisual stimuli, indicating that brain energy use may play an important role in response to these stimuli. In sum, behavioral and neurogenomic responses to acoustic and visual cues are correlated in female little torrent frogs. © 2021, Asiatic Herpetological Research Society. All rights reserved.
... Moreover, SM advertising can give consumers with large timely, comprehensive, up-to-date knowledge in a large convenient way from the consumer's expectations (R. C. Taylor, Page, Klein, Ryan, & Hunter, 2017). Accordingly, consumers are more responsive to control time and try in the information research process (Roper, Logan, & Tierney, 2000). In the matched literature, many studies have helped the role of Informativeness, such as (Bale et al., 2016). ...
Social media is continuously used as a platform for marketing and advertising. Firms have spent a lot of seasons, cash and property on Social media ads. However, it is all the time stimulating how Firms can prepare Social media advertising to fortunately engage and inspire a consumer to purchase their brands. The purpose of this research is consequently to describe and check the key elements of Social media advertising that force anticipate the buy intention. The theoretical model was expected on the foundation of three factors from the expansion of the Unified Theory Acceptance and Use of Technology (UTAUT2) (Performance expectancy, Hedonic motivation and Habit) along with Interactivity, Informativeness and Perceived relevance. The data was composed using a questionnaire survey of 260 participants. The most important results of structural equation modelling (SEM) mainly sustained the validity of the current model and the significant impact of Performance expectancy, Hedonic motivation and Interactivity, Informativeness, and Perceived relevance on purchase intentions. Confidently, this study will produce a set of theoretical and practical instruction on how marketers can successfully plan and apply their ads through Social media platforms.
... While some work has been done in studying interactions between robots and land or amphibious animals, e.g. squirrels [14,15], guinea fowl [16], frogs [17], lizards [18,19], and others, fish have become a staple for studying what factors might be responsible for successful social interaction between robots and biological organisms. For example, several studies have looked at the effects of robot tail-beat frequency and other factors on interactions between fish facsimiles and Zebrafish [20][21][22][23][24], Golden Shiners [25,26], Mackerel [27], and Elephantfish [28]. ...
This paper presents the design, construction, operation, and validation of a robotic gantry platform specifically designed for studying fish-robot interaction. The platform has five degrees of freedom to manipulate the three-dimensional position, yaw angle, and the pitch of a lure. Additionally, it has a four-conductor slip ring that allows power and data to be transmitted to the lure for the operation of fins and other actuators that increase realism or act as stimuli to focal fish during an ethorobotic experiment. The design is open-source, low-cost, and includes purpose-built electronics, software, and hardware to make it extensible and customizable for a number of applications with varying requirements.
... In Túngara frogs (Physalaemus pustulosus), females reject the courtship when the multimodal elements of the mating signal lack synchrony (Taylor, Klein, Stein, & Ryan, 2011). In the same species, synchronized visual and acoustic displays are more attractive to females than asynchronous signals (Taylor, Page, Klein, Ryan, & Hunter, 2017). However, an asynchronous multimodal signal is still more attractive than a unimodal signal. ...
Courtship displays are behaviours aimed to facilitate attraction and mating with the opposite sex and are very common across the animal kingdom. Most courtship displays are multimodal, meaning that they are composed of concomitant signals occurring in different sensory modalities. Although courtship often strongly influences reproductive success, the question of why and how males use multimodal courtship to increase their fitness has not yet received much attention. Very little is known about the role of different components of male courtship and their relative importance for females. Indeed, most of the work on courtship displays have focused on effects on female choice, often neglecting other possible roles. Additionally, a number of scientists have recently stressed the importance of considering the complexity of a display and the interactions between its different components in order to grasp all the information contained in those multimodal signals. Unfortunately, these methods have not yet been extensively adapted in courtship studies. The aim of this study was to review what is currently known about the functional significance of courtship displays, particularly about the role of multimodality in the courtship communication context. Emphasis is placed on those cases where a complete picture of the communication system can only be assessed by taking complexity and interaction between different modalities into account.
... The tight synchronization of horizontal velocity, sound and iridescent color display (to ~300 ms) raises the question of whether the putative visual and acoustic signals in the hummingbird's dive are 'fixed' or 'fluid', that is, whether they are produced synchronously due to physiological constraint (fixed) or whether their timing is independent of constraints (fluid) [50]. Because of the geometry of the U-shaped dive (Fig. 1c), the timing of gorget visibility and the accompanying color shift (along with any other visual components arising from the gorget) are likely to be fixed to the nadir. ...
Many animal signals are complex, often combining multimodal components with dynamic motion. To understand the function and evolution of these displays, it is vital to appreciate their spatiotemporal organization. Male broad-tailed hummingbirds (Selasphorus platycercus) perform dramatic U-shaped courtship dives over females, appearing to combine rapid movement and dive-specific mechanical noises with visual signals from their iridescent gorgets. To understand how motion, sound and color interact in these spectacular displays, we obtained video and audio recordings of dives performed by wild hummingbirds. We then applied a multi-angle imaging technique to estimate how a female would perceive the male’s iridescent gorget throughout the dive. We show that the key physical, acoustic and visual aspects of the dive are remarkably synchronized—all occurring within 300 milliseconds. Our results highlight the critical importance of accounting for motion and orientation when investigating animal displays: speed and trajectory affect how multisensory signals are produced and perceived.
Multimodal communication signals consist of two or more distinct components produced in different sensory modalities and transduced by receivers using multiple sensory systems. One evolutionary trajectory by which incipient multimodal signals may arise is when receivers are selected to attend both to a well-established signal and a cue in a different sensory modality associated with that signal's production. Previous studies of frogs suggest movement of the male's vocal sac, which is inextricably tied to vocal production in most species, functions as the dynamic visual component of a multimodal mate attraction signal that modulates female responses to sexually advertising males. Most of this work, however, has presented multimodal stimuli using video playbacks or artificially illuminated robots in laboratory settings, which leaves open the question of whether the vocal sac functions in multimodal signalling under more natural nocturnal illumination. In this study of Cope's grey treefrog, Hyla chrysoscelis, a nocturnally breeding species, we tested the hypothesis that vocal sacs are a dynamic visual component of a multimodal mate attraction signal that influences female responses to sexually advertising males. Using robotic frogs as stimuli, we performed multimodal playback experiments outdoors under nocturnal illumination. We found no evidence that vocal sacs were attractive to females or that they influenced the responses of females when acoustic information was rendered less certain due to a degraded signal structure or background noise. While these negative results may reflect genuine species differences, they also corroborate a negative result from one of the only previous studies conducted under natural nocturnal illumination to investigate frog vocal sacs as the visual component of a putative multimodal mate attraction signal (Taylor et al., 2007, Animal Behaviour, 74, 1753–1763). 
We consider possible proximate and ultimate explanations for our results and critically review previous research on multimodal mate attraction in nocturnal frogs.
Perceptually, grouping sounds based on their sources is critical for communication. This is especially true in túngara frog breeding aggregations, where multiple males produce overlapping calls that consist of an FM 'whine' followed by harmonic bursts called 'chucks'. Phonotactic females use at least two cues to group whines and chucks: whine-chuck spatial separation and sequence. Spatial separation is a primitive cue, whereas sequence is schema-based, as chuck production is morphologically constrained to follow whines, meaning that males cannot produce the components simultaneously. When one cue is available, females perceptually group whines and chucks using relative comparisons: components with the smallest spatial separation or those closest to the natural sequence are more likely grouped. By simultaneously varying the temporal sequence and spatial separation of a single whine and two chucks, this study measured between-cue perceptual weighting during a specific grouping task. Results show that whine-chuck spatial separation is a stronger grouping cue than temporal sequence, as grouping is more likely for stimuli with smaller spatial separation and non-natural sequence than those with larger spatial separation and natural sequence. Compared to the schema-based whine-chuck sequence, we propose that spatial cues have less variance, potentially explaining their preferred use when grouping during directional behavioral responses.
Multimodal (multisensory) signalling is common in many species and often facilitates communication. How receivers integrate individual signal components of multisensory displays, especially with regard to variance in signal complexity, has received relatively little attention. In nature, male túngara frogs, Physalaemus pustulosus, produce multisensory courtship signals by vocalizing and presenting their inflating and deflating vocal sac as a visual cue. Males can produce a simple call (whine only) or a complex call (whine + one or more chucks). In a series of two-choice experiments, we tested female preferences for variation in acoustic call complexity and amplitude (unimodal signals). We then tested preferences for the same calls when a dynamic robotic frog was added to one call, generating a multimodal stimulus. Females preferred a complex call to a simple call; when both calls contained at least one chuck, additional numbers of chucks did not further increase attractiveness. When calls contained zero or one chuck, the visual stimulus of the robofrog increased call attractiveness. When calls contained multiple chucks, however, the visual component failed to enhance call attractiveness. Females also preferred higher amplitude calls and the addition of the visual component to a lower amplitude call did not alter this preference. At relatively small amplitude differences, however, the visual signal increased overall discrimination between the calls. These results indicate that the visual signal component does not provide simple enhancement of call attractiveness. Instead, females integrate multisensory components in a nonlinear fashion. The resulting perception and behavioural response to complex signals probably evolved in response to animals that communicate in noisy environments.
Why animal communication displays are so complex and how they have evolved are active foci of research with a long and rich history. Progress towards an evolutionary analysis of signal complexity, however, has been constrained by a lack of hypotheses to explain similarities and/or differences in signalling systems across taxa. To address this, we advocate incorporating a systems approach into studies of animal communication—an approach that includes comprehensive experimental designs and data collection in combination with the implementation of systems concepts and tools. A systems approach evaluates overall display architecture, including how components interact to alter function, and how function varies in different states of the system. We provide a brief overview of the current state of the field, including a focus on select studies that highlight the dynamic nature of animal signalling. We then introduce core concepts from systems biology (redundancy, degeneracy, pluripotentiality, and modularity) and discuss their relationships with system properties (e.g. robustness, flexibility, evolvability). We translate systems concepts into an animal communication framework and accentuate their utility through a case study. Finally, we demonstrate how consideration of the system-level organization of animal communication poses new practical research questions that will aid our understanding of how and why animal displays are so complex.
Animals have multiple senses through which they detect their surroundings and often integrate sensory information across different modalities to generate perceptions [1 and 2]. Animal communication, likewise, often consists of signals containing stimuli processed by different senses [3, 4, 5 and 6]. Stimuli with different physical forms (i.e., from different sensory modalities) travel at different speeds [7]. As a consequence, multimodal stimuli simultaneously emitted at a source can arrive at a receiver at different times. Such differences in arrival time can provide unique information about the distance to the source [8 and 9]. Male túngara frogs (Physalaemus pustulosus) call from ponds to attract females and to repel males. Production of the sound incidentally creates ripples on the water surface, providing a multimodal cue [10]. We tested whether male frogs attend to distance-dependent cues created by a calling rival and whether their response depends on crossmodal comparisons. In a first experiment, we showed distance-dependent changes in vocal behavior: males responded more strongly with decreasing distance to a mimicked rival. In a second experiment, we showed that males can discriminate between relatively near and far rivals by using a combination of unimodal cues, specifically amplitude changes of sound and water waves, as well as crossmodal differences in arrival time. Our data reveal that animals can compare the arrival time of simultaneously emitted multimodal cues to obtain information on relative distance to a source. We speculate that communicative benefits from crossmodal comparison may have been an important driver of the evolution of elaborate multimodal displays [11 and 12].
Though it has long been known that animal communication is complex, recent years have seen growing interest in understanding the extent to which animals give multicomponent signals in multiple modalities, and how the different types of information extracted by receivers are interpreted and integrated in animal decision-making. This interest has culminated in the production of the present special issue on multimodal communication, which features both theoretical and empirical studies from leading researchers in the field. Reviews, comparative analyses, and species-specific empirical studies include manuscripts on taxa as diverse as spiders, primates, birds, lizards, frogs, and humans. The present manuscript serves as both an introduction to this special issue, as well as an introduction to multimodal communication more generally. We discuss the history of the study of complexity in animal communication, issues relating to defining and classifying multimodal signals, and particular issues to consider with multimodal (as opposed to multicomponent unimodal) communication. We go on to discuss the current state of the field, and outline the contributions contained within the issue. We finish by discussing future avenues for research, in particular emphasizing that ‘multimodal’ is more than just ‘bimodal’, and that more integrative frameworks are needed that incorporate more elements of efficacy, such as receiver sensory ecology and the environment.
Mating decisions contribute to both the fitness of individuals and the emergence of evolutionary diversity, yet little is known about their cognitive architecture. We propose a simple model that describes how preferences are translated into decisions and how seemingly disparate patterns of preference can emerge from a single perceptual process. The model proposes that females use error‐prone estimates of attractiveness to select mates based on a simple decision rule: choose the most attractive available male that exceeds some minimal criterion. We test the model in the túngara frog, a well‐characterized species with an apparent dissociation between mechanisms of mate choice and species recognition. As suggested by our model results, we find that a mate attraction feature alters assessments of species status. Next, we compare female preferences in one‐choice and two‐choice tests, contexts thought to emphasize species recognition and mate choice, respectively. To do so, we use the model to generate maximum‐likelihood estimators of preference strengths from empirical data. We find that a single representation of preferences is sufficient to explain response probabilities in both contexts across a wide range of stimuli. In this species, mate choice and species recognition are accurately and simply summarized by our model. While the findings resolve long‐standing anomalies, they also illustrate how models of choice can bridge theoretical and empirical treatments of animal decisions. The data demonstrate a remarkable congruity of perceptual processes across contexts, tasks, and taxa.
The basic building blocks of communication are signals, assembled in various sequences and combinations, and used in virtually all inter- and intra-specific interactions. While signal evolution has long been a focus of study, there has been a recent resurgence of interest and research in the complexity of animal displays. Much past research on signal evolution has focused on sensory specialists, or on single signals in isolation, but many animal displays involve complex signaling, or the combination of more than one signal or related component, often serially and overlapping, frequently across multiple sensory modalities. Here, we build a framework of functional hypotheses of complex signal evolution based on content-driven (ultimate) and efficacy-driven (proximate) selection pressures (sensu Guilford and Dawkins 1991). We point out key predictions for various hypotheses and discuss different approaches to uncovering complex signal function. We also differentiate a category of hypotheses based on inter-signal interactions. Throughout our review, we hope to make three points: (1) a complex signal is a functional unit upon which selection can act, (2) both content and efficacy-driven selection pressures must be considered when studying the evolution of complex signaling, and (3) individual signals or components do not necessarily contribute to complex signal function independently, but may interact in a functional way.
Sexual communication can evolve in response to sexual selection, and it can also cause behavioral reproductive isolation between populations and thus drive speciation. Anurans are an excellent system to investigate these links between behavior and evolution because we have detailed knowledge of how neural mechanisms generate behavioral preferences for calls and how these preferences then generate selection on call variation. But we know far less about the physical mechanisms of call production, especially how different laryngeal morphologies generate call variation. Here we review studies of a group of species that differ in the presence of a secondary call component that evolved under sexual selection. We discuss how the larynx produces this call component, and how laryngeal morphology generates sexual selection and can contribute to speciation.
The perceptual analysis of acoustic scenes involves binding together sounds from the same source and separating them from other sounds in the environment. In large social groups, listeners experience increased difficulty performing these tasks due to high noise levels and interference from the concurrent signals of multiple individuals. While a substantial body of literature on these issues pertains to human hearing and speech communication, few studies have investigated how nonhuman animals may be evolutionarily adapted to solve biologically analogous communication problems. Here, I review recent and ongoing work aimed at testing hypotheses about perceptual mechanisms that enable treefrogs in the genus Hyla to communicate vocally in noisy, multi-source social environments. After briefly introducing the genus and the methods used to study hearing in frogs, I outline several functional constraints on communication posed by the acoustic environment of breeding "choruses". Then, I review studies of sound source perception aimed at uncovering how treefrog listeners may be adapted to cope with these constraints. Specifically, this review covers research on the acoustic cues used in sequential and simultaneous auditory grouping, spatial release from masking, and dip listening. Throughout the paper, I attempt to illustrate how broad-scale, comparative studies of carefully considered animal models may ultimately reveal an evolutionary diversity of underlying mechanisms for solving cocktail-party-like problems in communication.