Research Note
Order Affects Clear and Loud Speech Response
Jason A. Whitfield,^a Adam M. Fullenkamp,^b and Zoe Kriegel^c
^a Department of Communication Sciences and Disorders, Bowling Green State University, OH
^b School of Human Movement, Sport, and Leisure Studies, Bowling Green State University, OH
^c Division of Communication Disorders, University of Wyoming, Laramie
ARTICLE INFO
Article History:
Received January 12, 2023
Revision received May 19, 2023
Accepted June 30, 2023
Editor-in-Chief: Cara E. Stepp
Editor: Kathleen F. Nagle
https://doi.org/10.1044/2023_JSLHR-23-00028
Correspondence to Jason A. Whitfield: jawhitf@bgsu.edu. Disclosure:
The first and second authors are employed by Bowling Green State
University. The third author is employed by the University of
Wyoming. There are no other financial or nonfinancial conflicts of
interest to disclose.
ABSTRACT
Purpose: The purpose of this investigation was to examine the impact of
instruction order on the speech production response when adopting higher
effort speaking styles, specifically loud and clear speech.
Method: Speech intensity, lip aperture range, and speech rate data were col-
lected from 24 talkers who repeated the utterance “Buy Bobby a puppy” using
habitual, clear, and loud speech. Participants were assigned in quasi-random
fashion to one of two groups: a Clear–Loud Group (11 participants; order:
habitual-clear-loud) or a Loud–Clear Group (13 participants; order: habitual-
loud-clear).
Results: Talkers in the Clear–Loud Group exhibited higher speech intensity
during the loud style compared with those in the Loud–Clear Group.
Furthermore, talkers in the Clear–Loud Group retained the increases in lip
aperture range and reductions in speech rate associated with the clear style
when producing the loud style. Conversely, talkers in the Loud–Clear Group exhibited
significant increases in lip aperture range between the habitual and loud styles
and between the loud and clear styles. Additionally, the Loud–Clear Group
exhibited a reduction in speech rate only during the clear style, as no differ-
ences in speech rate were observed between the habitual and loud styles.
Conclusions: These findings may suggest that producing a higher effort style
leads to carry-over effects in subsequent styles. Future research should investi-
gate factors that moderate the degree of order effects for both research and
clinical purposes. For instance, if generalizable, the compounding carry-over
effects may prove advantageous for certain clinical populations.
Instructions to speak clearer or louder than usual
result in relatively reliable modifications to the speech
signal (e.g., Kearney et al., 2017; Lam & Tjaden, 2016;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Tjaden et al., 2013; Tjaden & Wilding, 2004; Whitfield
et al., 2021). Clear speech, for example, can be naturally
elicited when a listener has difficulty hearing or understanding.
Loud speech can be elicited in noisy environments
or when a talker is interacting with interlocutors
located at further than typical distances (e.g., Koenig &
Fuchs, 2019; Picheny et al., 1986; Smiljanić & Gilbert,
2017; Weisser & Buchholz, 2019). Additionally, clear and
loud speech forms can be explicitly elicited in labora-
tory or clinical environments by instructing talkers to
speak clearer or louder than usual (e.g., Lam et al., 2012;
Smiljanić & Gilbert, 2017; Whitfield et al., 2018). The
speech production changes, typically associated with
adopting a clear or loud speech style, enhance the under-
standability of the speech signal, making these styles com-
mon talker-oriented strategies utilized in the treatment of
dysarthria (e.g., Kearney et al., 2017; Lam & Tjaden,
2016; Ramig et al., 2018; Tjaden et al., 2013).
Instructions to speak clearer than usual tend to yield
robust changes in acoustic and kinematic measures of
articulation and speaking rate that reflect the larger artic-
ulatory displacements and slower syllable rates (e.g., 10%–
50% reduction in syllable rate when adopting a clear
speech style; Kearney et al., 2017; Lam & Tjaden, 2016;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Smiljanić & Gilbert, 2017; Tjaden et al., 2013; Whitfield
Journal of Speech, Language, and Hearing Research 1–11 Copyright © 2023 American Speech-Language-Hearing Association 1
Downloaded from: https://pubs.asha.org Bowling Green State University on 09/12/2023, Terms of Use: https://pubs.asha.org/pubs/rights_and_permissions
et al., 2021; Whitfield & Mehta, 2019). Instructions to
speak louder than usual yield robust increases in speech
intensity (e.g., 8–15 dB increase in speech intensity when
adopting a louder speech style; Huber & Chandrasekaran,
2006; Kearney et al., 2017; Mefferd, 2017; Tjaden et al.,
2013; Whitfield et al., 2021). Therefore, existing data sug-
gest that talkers employ instruction-specific adjustments
when producing higher effort speaking styles.
However, data from these investigations also suggest
that adopting a higher effort speaking style is associated
with an overall increase in speech motor drive, leading to
changes across multiple speech subsystems, including res-
piration, phonation, articulation, and prosody. For exam-
ple, in addition to robust changes in articulatory kinemat-
ics and speaking rate, adopting a clearer-than-normal
speech style may lead to phonatory adjustments and
increases in speech intensity (e.g., 1.5–6 dB increase in
speech intensity; Kearney et al., 2017; Mefferd, 2017;
Whitfield et al., 2021). Additionally, adopting a louder-
than-comfortable intensity leads to changes in articulatory
kinematics characterized by larger articulatory displace-
ments, in addition to robust increases in speech intensity
(e.g., Huber & Chandrasekaran, 2006; Kearney et al.,
2017; Mefferd, 2017; Mefferd & Green, 2010; Whitfield
et al., 2018, 2021). Therefore, although specific instructions
to speak clearer or louder than usual produce
robust changes in the targeted systems, general and
system-wide changes are also observed for these higher
effort speaking styles (e.g., Huber & Chandrasekaran,
2006; Kearney et al., 2017; Lam et al., 2012; Mefferd,
2017; Whitfield et al., 2018, 2021).
Many authors have used a within-subject design that
involves performing several different speech style modifi-
cations in succession when examining these higher effort
speech styles (e.g., Huber & Chandrasekaran, 2006;
Kearney et al., 2017; Lam & Tjaden, 2016; Lam et al.,
2012; Mefferd, 2017; Mefferd & Green, 2010; Smiljanić &
Gilbert, 2017; Tjaden et al., 2013; Tjaden & Wilding,
2004; Whitfield et al., 2021). Although these studies report
consistent differences among higher effort styles, there is
little uniformity in the methodological choices used to
address the potential carry-over effects that may arise
from producing multiple speech styles in succession.
Although all authors typically begin with the habitual,
conversational, or plain style, some proceed through a
fixed instruction order to elicit subsequent speaking styles
(e.g., Mefferd, 2017; Smiljanić & Gilbert, 2017), others
adopt a randomized order (e.g., Kearney et al., 2017), and
others counterbalance the order of higher effort styles
among participants (e.g., Huber & Chandrasekaran, 2006;
Whitfield et al., 2021). For instance, in a study by Lam
et al. (2012), participants initially produced sentences in
their habitual style and then were instructed to read the
stimuli “while speaking clearly.” Following these instruc-
tions, the researchers counter-balanced instructions to
“speak as if speaking to someone who has a hearing
impairment” and “overenunciate each word” to examine
the impact of more descriptive and explicit clear speech
instructions. The researcher engaged participants in con-
versation between each speaking style to minimize carry-
over effects (Lam et al., 2012). However, despite the dif-
ferent approaches used to address the potential carry-
over effects of performing multiple higher effort speaking
styles, no investigation has examined the extent of the
carry-over phenomenon to date.
In summary, higher effort speaking styles, such as
clear or loud speech styles, yield robust changes in speech
production that vary to some extent with the specific
instruction used to elicit the speaking style (e.g., Lam
et al., 2012; Mefferd, 2017; Whitfield et al., 2021). However,
adopting a clear or loud speech style also yields
system-wide changes that impact speech motor subsystems
that are not the direct target of the instruction (e.g.,
Huber & Chandrasekaran, 2006; Kearney et al., 2017;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Whitfield et al., 2018, 2021). Thus, carry-over effects are
likely when a talker performs multiple higher effort styles
in succession. Although authors have employed various
methodological choices that attempt to minimize carry-
over effects, little published data are available to deter-
mine the extent to which instruction order impacts the
response to subsequently performed speaking styles.
The current investigation aimed to examine the
extent to which instruction order impacts changes in
speech and voice production associated with adopting
clear and loud speech. Acoustic and kinematic measures
extracted from repetitions of the sentence, “Buy Bobby a
puppy,” were analyzed to address the aim of the study.
The key outcome variables used to examine adjustments
in phonation, articulation, and speech timing associated
with adopting these higher effort styles were speech intensity,
lip aperture range, and speech rate. Participants were
sorted into two approximately equal counter-balanced groups in a
quasi-random fashion. Both talker groups repeated the sentence
using a habitual speaking style first. Next, one group pro-
duced the sentence using a clear style and then a loud
style, whereas the other group produced the sentence using
a loud style and then a clear style.
Based on the production–orientation adjustments
associated with clear and loud speech reported in prior
studies, we hypothesized that speech production changes
observed in one style would carry over to the subsequent
style. For instance, we expected that participants who per-
formed clear speech before loud speech would retain the
reductions in speech rate and increases in lip aperture
range when transitioning to the loud style. Therefore, we
anticipated that participants who performed the habitual-
clear-loud order would exhibit slower articulation rates
and greater lip aperture ranges when producing the loud
style compared with those who followed the habitual-loud-clear
order. With respect to speech intensity, we predicted
that increases in vocal loudness would be greater
in the loud style than in the clear style. This expectation
is primarily based on the instruction used to elicit clear
speech, which emphasizes articulatory changes rather than
changes in phonation (e.g., Lam et al., 2012). Addition-
ally, we hypothesized that performing the habitual-clear-
loud order would result in greater intensity in the loud
style compared with the habitual-loud-clear order, as par-
ticipants may experience a constructive or additive effect,
leading to an increase in vocal loudness across each suc-
cessive style performed. Alternatively, performing the loud
style before the clear style might lead participants to carry
over the increases in vocal loudness to the clear speech
style. However, increasing vocal loudness may not be the
primary goal of clear speech, as some studies have
reported only slight-to-modest increases in intensity when
adopting clear speech (e.g., Kearney et al., 2017; Lam
et al., 2012; Searl & Evitts, 2013; Tjaden et al., 2014).
Method
Participants
Twenty-five student volunteers (23 women and two
men) were recruited from the Bowling Green State Uni-
versity (BGSU) student body to serve as participants in
this study. All protocols were approved by the Institu-
tional Review Board at BGSU.
Participants were assigned to one of two groups, a
Loud–Clear Group or Clear–Loud Group, in a quasi-
random fashion so that the groups were relatively even.
However, kinematic data from one participant were cor-
rupted, which yielded a final sample of 24 participants
that included 13 participants in the Loud–Clear Group
and 11 in the Clear–Loud Group. Participants ranged in
age from 19 to 23 years old (M = 21.1 years; SD =
0.97 years). All participants indicated they were healthy at
the time of recording. Additionally, participants self-
reported a history free of speech-language-hearing impair-
ment and did not have any neurological diagnoses that
can affect speech. During scheduling, participants were
instructed to arrive for testing with a freshly shaven face
or short facial hair (where applicable), and they were
instructed to arrive with minimal makeup or lotion on
their faces. These instructions were intended to facilitate
improved motion capture marker adhesion during data
collection. Each participant completed a single data collec-
tion session. Upon arrival for the session, an informed con-
sent document was provided to the participant for review.
Instrumentation
Participants were seated at a table in a sound-
attenuated booth during data collection. Speech intensity
(i.e., sound pressure level, SPL) was measured using a
Brüel & Kjær Type 2250-S sound level meter (SLM) that
was mounted to the table approximately 0.5 m in front
of the participant (Brüel & Kjær Engineering Co.).
Speech audio was captured using a head-mounted microphone
(Shure Beta 53) and a high-quality microphone
pre-amplifier (Millennia HV-3D). A DC signal that was
proportional to the fast-weighted dB-A from the SLM
and the pre-amplified microphone signal were both
recorded using the ADInstruments PowerLab A-to-D
converter and LabChart software (ADInstruments, 2016).
Speech kinematics were captured via a seven-camera,
passive optical Qualisys Miqus motion capture system
(Qualisys AB, Göteborg, Sweden; Sampling Rate: 80 Hz).
The camera system was calibrated to track 4-mm spherical
markers covered with retroreflective tape (3M Manufactur-
ing Co.) within a volume approximately 3 ft wide × 3 ft deep
× 3 ft high. Seven passive reflective markers were placed on
the upper lip, lower lip, each oral commissure, and chin
(three markers) prior to data collection to track lip and jaw
movement. A local skull-based reference coordinate system
was created from three markers affixed to the midfrontal
ridge and each zygomatic arch to account for the influence
of head and body movements when tracking lip and jaw
kinematics (Whitfield et al., 2021). Three additional markers
were placed on the end of the SLM so that the exact mouth-
to-SLM distance could be derived from each sampling inter-
val. Figure 1 depicts a schematic of the marker placement.
The acoustic and kinematic data were synchronized
using a MOTU Audio Express sound card (Mark of the
Unicorn Co.), which streamed a common, analog Society
of Motion Picture and Television Engineers (SMPTE)
time-stamp signal to both the acoustic and motion capture
recording systems. The motion capture marker positions
and the analog SMPTE signal were recorded directly
through the motion capture system, which included a ded-
icated, synchronized analog-to-digital (A/D) device, and
the analog SLM and SMPTE signals were recorded using
a separate ADInstruments A/D board (ADInstruments
Co.). Motion capture marker positions were recorded at a
sampling frequency of 80 Hz, SLM signals were recorded
at a sampling frequency of 20 kHz (i.e., 250× motion cap-
ture), and SMPTE signals were recorded on each system
at a sampling frequency of 20 kHz to facilitate temporal
alignment of the recorded data.
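
The 250:1 relationship between the two sampling rates can be sketched as follows. This is only an illustration of the ratio stated above. The function names, the `offset_samples` parameter, and the choice to average the SLM samples that fall within each motion capture frame are assumptions made for the example and do not describe the custom alignment software used in the study.

```python
SLM_RATE_HZ = 20_000  # SLM and SMPTE sampling frequency reported in the text
MOCAP_RATE_HZ = 80    # motion capture sampling frequency reported in the text


def samples_per_frame(slm_rate_hz=SLM_RATE_HZ, mocap_rate_hz=MOCAP_RATE_HZ):
    """Number of SLM samples spanned by one motion capture frame."""
    if slm_rate_hz % mocap_rate_hz != 0:
        raise ValueError("this sketch assumes an integer rate ratio")
    return slm_rate_hz // mocap_rate_hz


def slm_window_for_frame(frame_index, offset_samples=0):
    """Half-open [start, stop) range of SLM samples for one mocap frame.

    offset_samples is a hypothetical lag between the two recordings,
    for example one recovered from the shared SMPTE time stamp.
    """
    n = samples_per_frame()
    start = offset_samples + frame_index * n
    return start, start + n


def frame_mean(slm_signal, frame_index, offset_samples=0):
    """Average the SLM samples that fall within one mocap frame."""
    start, stop = slm_window_for_frame(frame_index, offset_samples)
    window = slm_signal[start:stop]
    if not window:
        raise ValueError("frame lies outside the SLM recording")
    return sum(window) / len(window)
```

With the default rates, each motion capture frame corresponds to 250 consecutive SLM samples.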
Whitfield et al.: Order Affects Clear and Loud Speech 3
Figure 1. Schematic diagram showing the marker placement
configuration.
Protocol
During testing, participants repeated the sentence
“Buy Bobby a puppy” using habitual, loud, and clear
speech. The sentence was selected because it is short,
grammatically simple, and contains a sequence of bilabial
consonants and various vowels, making it ideal for
examining changes to lip and jaw movements associated
with adopting different speaking styles. For the habitual
style, participants were instructed to recite the phrase at
a “comfortable rate and loudness, as if you were speak-
ing to someone seated across the table.” For the clear
speech style, participants were instructed to “over-
enunciate each word in the phrase.” The over-enunciate
instruction was selected to elicit clear speech based on
research by Lam et al. (2012), which shows that an
instruction to over-enunciate seems to elicit changes in
speech articulation. For the loud speech style, participants
were instructed to “speak twice as loud as comfortable.”
All participants repeated the phrase in the habitual
style first.
Following the habitual style, participants completed
one of two different orders based on group assignment.
Participants in the Clear–Loud Group performed the repe-
tition task in the clear style and then in the loud style.
Those assigned to the Loud–Clear Group performed the
repetition task in the loud style and then in the clear
style. Group assignment was quasi-random, with 11 par-
ticipants in the Clear–Loud Group and 13 participants in
the Loud–Clear Group. Under each of the three speech
style conditions, participants repeated the sentence consis-
tently in the specified style for approximately 15–20 s until
the experimenter asked them to stop. Given the differences
in speech rate between speakers, each talker produced a
different number of utterances. For this investigation, the
first 10 repetitions produced in each style were analyzed
to ensure a balanced number of tokens. Therefore, speech
intensity, lip aperture range, and speech rate (described
below) were measured for 30 tokens per talker, 10 in each
of the specified styles.
Data Processing
Custom software was developed using LabVIEW
2018 (National Instruments Co., 2018) to analyze the
common SMPTE time stamp recorded on each system
and to temporally align the SLM and motion capture
data. The average voltage of the fast dBA-weighted signal
from the SLM captured during the spoken phrase was
used to estimate speech intensity (SPL_mean). The average
distance between the virtual midpoint of the upper lip and
lower lip markers and the SLM, d, was computed for each
utterance produced. This distance, d, was used to correct the
SPL_mean estimates to a constant virtual distance of 0.5 m
using the following equation:

$$\mathrm{Level} = L_{p@d} - 20\log_{10}(0.5/d),$$

where $L_{p@d}$ was the average dB value, and d was the computed
distance between the SLM and the mouth (e.g., Švec
& Granqvist, 2018; Whitfield et al., 2021). During the spoken
utterance, the average mouth-to-SLM distance was relatively
stable, as the average of the standard deviation in d
for each participant was 1.957 mm (SD = 1.130 mm).
Across all tokens, the mean mouth-to-SLM distance, d, for
each participant was 0.627 m (SD = 0.077 m; Range =
0.423 to 0.806 m). Descriptive statistics suggested that the
average distance, d, for each participant was not different
across the habitual (M = 0.627, SD = 0.081), loud (M =
0.630, SD = 0.087), or clear (M = 0.624, SD = 0.073) style.
The mean absolute difference between the original and cor-
rected values was 3.353% (SD = 1.576) or 2.012 dB SPL
(SD = 0.888).
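
The distance correction described in this paragraph can be sketched in a few lines. The function name `correct_spl` and its argument names are hypothetical and chosen for illustration. The arithmetic follows the equation given above, with the 0.5 m reference distance taken from the text.

```python
import math

REFERENCE_DISTANCE_M = 0.5  # constant virtual mouth-to-SLM distance from the text


def correct_spl(level_at_d_db, distance_m, reference_m=REFERENCE_DISTANCE_M):
    """Apply Level = L_p@d - 20*log10(reference / d).

    level_at_d_db: average dB value measured at the actual distance d
    distance_m:    measured mouth-to-SLM distance d, in meters
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return level_at_d_db - 20.0 * math.log10(reference_m / distance_m)
```

For a talker at the reported mean distance of 0.627 m, the correction raises the measured level by roughly 2 dB, which is in line with the mean absolute difference reported above.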
The motion capture data were also analyzed using
custom NI LabVIEW 2018 software to evaluate speech
kinematics during each trial. All motion capture time
series data were first processed using a low-pass, second-
order Butterworth filter with phase correction and a
20 Hz cutoff frequency. Although this filtering approach
is conventional for human movement data processing,
additional visual inspection of the motion capture spectra
was conducted prior to filtering to verify that there was
no significant frequency response above 10 Hz. The three
motion capture markers affixed to the upper skull (i.e.,
midfrontal ridge and two zygomatic arches) were used to
develop a unique three-dimensional (3D) coordinate sys-
tem for the head, which would serve as a local spatial ref-
erence for movements of the markers surrounding the
mouth. Without this consideration, movements of the lip
and jaw are combined with global movements of the skull,
obfuscating speech-specific kinematics and conflating head
mannerisms with functional movements of the mouth. The
3D positions of the markers surrounding the mouth (i.e.,
upper and lower lips and right and left mouth corners)
were rotated via matrix transformation into the head local
coordinate system, so that all speech kinematic measures
were relative to the skull. Next, lip aperture range of
motion (LA_ROM) was determined for each trial by calculating
the range of distances between the upper and lower lip
markers (i.e., lip aperture_max − lip aperture_min).
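
The filtering and range-of-motion steps can be sketched as below. Several assumptions are made for illustration. "Phase correction" is read as forward-backward (zero-phase) filtering. The filter state is initialized to the first sample to limit edge transients. All function names are hypothetical. The head-local transform is omitted because a Euclidean distance between two markers is unchanged by a rigid rotation and translation. The study used custom LabVIEW software, so this is not the authors' implementation.

```python
import math


def butter2_lowpass_coeffs(cutoff_hz, sample_rate_hz):
    """Normalized (b, a) for a second-order Butterworth low-pass biquad."""
    if not 0.0 < cutoff_hz < sample_rate_hz / 2.0:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2Q) with Q = 1/sqrt(2)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b0 = (1.0 - cos_w0) / (2.0 * a0)
    return (b0, 2.0 * b0, b0), (1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)


def _filter_once(x, b, a):
    """Single causal pass; state starts as if x[0] had been held constant."""
    if not x:
        return []
    x1 = x2 = y1 = y2 = x[0]
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out


def zero_phase_lowpass(x, cutoff_hz=20.0, sample_rate_hz=80.0):
    """Forward pass, then a pass over the reversed output, to cancel phase lag."""
    b, a = butter2_lowpass_coeffs(cutoff_hz, sample_rate_hz)
    forward = _filter_once(list(x), b, a)
    return _filter_once(forward[::-1], b, a)[::-1]


def filter_marker(xyz_series, **kwargs):
    """Filter each axis of a list of (x, y, z) marker positions."""
    filtered = [zero_phase_lowpass(axis, **kwargs) for axis in zip(*xyz_series)]
    return list(zip(*filtered))


def lip_aperture_rom(upper_lip_xyz, lower_lip_xyz):
    """LA_ROM: maximum minus minimum upper-to-lower lip marker distance."""
    apertures = [math.dist(u, l) for u, l in zip(upper_lip_xyz, lower_lip_xyz)]
    return max(apertures) - min(apertures)
```

Running a second-order filter in both directions squares its magnitude response, so the combined response is 6 dB down at the nominal cutoff. The text does not state whether the cutoff was adjusted for this.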
Speech rate was derived from the acoustic signal
using the waveform display in LabChart. The duration in
seconds of each utterance produced was extracted. The
speech rate of each utterance was computed as the num-
ber of syllables produced during each utterance of “Buy
Bobby a puppy” (i.e., 6) divided by the measured utter-
ance duration in seconds (i.e., syl/s).
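
As a minimal worked example of this computation (the function name `speech_rate` is hypothetical; the syllable count of 6 comes from the text):

```python
SYLLABLES_IN_UTTERANCE = 6  # "Buy Bobby a puppy"


def speech_rate(duration_s, n_syllables=SYLLABLES_IN_UTTERANCE):
    """Speech rate in syllables per second for one utterance."""
    if duration_s <= 0:
        raise ValueError("utterance duration must be positive")
    return n_syllables / duration_s
```

An utterance lasting 1.5 s would therefore be scored at 4 syl/s, and one lasting 2.0 s at 3 syl/s.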
Statistical Analysis
Three linear mixed model (LMM) analyses were
conducted to evaluate global changes in the primary
dependent variables associated with speech style and
order. The use of LMM analyses is preferable to tradi-
tional statistical analyses because of reduced sensitivity to
sphericity and homogeneity of variance (Quené & Van
den Bergh, 2004). Additionally, these models are better
suited for repeated-measures analysis of nested data struc-
tures than general linear models. Models were constructed
using SPL_mean, LA_ROM, and speech rate as the dependent
variables. For each model, the independent variables of
Group (Clear–Loud vs. Loud–Clear) and Speech Style
(habitual, loud, and clear) were specified as fixed effects,
including the interaction between Group and Style. Style
was specified as a random slope term, and participant was
specified as a random intercept. The intercept for each
model was initially mapped to the habitual speaking style
condition performed by the Loud–Clear Group. Relevel-
ing involves mapping the intercept to a different group or
style to examine a particular contrast. For this study, we
decided to relevel the models to quantify potential differ-
ences between groups and the pattern of change between
speaking styles. Random effects were examined, and indi-
vidual means were plotted to characterize individual varia-
tion in the pattern of results for each dependent variable.
Statistical analyses for this study were conducted using the
lmer function (lme4 package) and the lmerTest package in R (Version 4.2.1).
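
One way to write the model described in this section is given below. It is offered as a reading of the text under stated assumptions, not as the authors' exact specification. The indicator variables $G_i$ (1 for the Clear–Loud Group, 0 for the Loud–Clear Group), $L_{ij}$ (1 for the loud style), and $C_{ij}$ (1 for the clear style) are introduced here for illustration. The habitual style and the Loud–Clear Group serve as reference levels, and $i$ indexes participants while $j$ indexes tokens.

```latex
\[
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 G_i + \beta_2 L_{ij} + \beta_3 C_{ij}
          + \beta_4 G_i L_{ij} + \beta_5 G_i C_{ij} \\
       &\quad + u_{0i} + u_{1i} L_{ij} + u_{2i} C_{ij} + \varepsilon_{ij}
\end{aligned}
\]
```

Under this coding, $\beta_1$ is the group difference in the habitual style, $\beta_2$ and $\beta_3$ are the loud and clear effects for the Loud–Clear Group, and $\beta_4$ and $\beta_5$ are the Group × Style interaction terms. The terms $u_{0i}$, $u_{1i}$, and $u_{2i}$ are the participant-level random intercept and random style slopes. Releveling the style factor so that loud is the reference makes the group coefficient equal to $\beta_1 + \beta_4$, that is, the group difference within the loud style.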
Prior to constructing the statistical models, the data
were screened for skewness and outliers. Measurement
observations that were greater than three standard devia-
tions from the mean were removed to ensure they did not
influence the model. Across the three dependent variables,
less than 3% of observations were excluded. After constructing
each LMM, Q-Q and residual plots were examined
and confirmed an adequate fit of the model. An
alpha level of .05 and the 95% confidence intervals of
the estimates were used to guide statistical interpretation.
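
The three-standard-deviation screening rule can be sketched as follows. The text does not state whether the mean and standard deviation were computed per variable, per style, or per participant, nor which standard deviation estimator was used. The single-list input, the use of the sample standard deviation, and the function name `screen_outliers` are therefore assumptions.

```python
import statistics


def screen_outliers(values, n_sd=3.0):
    """Split values into (kept, removed) using a mean +/- n_sd * SD rule."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption, see lead-in
    if sd == 0:
        return list(values), []
    kept = [v for v in values if abs(v - mean) <= n_sd * sd]
    removed = [v for v in values if abs(v - mean) > n_sd * sd]
    return kept, removed
```

Because the mean and standard deviation are themselves computed from data that include the extreme value, a single pass of this rule is conservative in small samples.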
Results
SPL
To determine the extent to which style order
affected the speech intensity response, an LMM was constructed
using SPL_mean as the dependent variable. Parameter
estimates and associated model elements are depicted
in Figure 2a. The model revealed no differences between
the Clear–Loud Group and Loud–Clear Group for the
habitual style (Est. = 1.472 dB SPL, SE = 1.537, t[22.0] =
.958, p = .349). As expected, talkers in the Loud–Clear
Group exhibited increases in speech intensity for both the
loud (Est. = 8.772 dB SPL, SE = 1.201, t[22.0] = 7.302,
p < .001) and clear speech (Est. = 5.832 dB SPL, SE =
1.153, t[22.0] = 4.668, p < .001) styles compared with the
habitual style. Parameter estimates revealed that the rela-
tive effects in the Clear–Loud Group did not statistically
differ (p > .05). Releveling the model to the loud style
revealed that participants in the Clear–Loud Group exhib-
ited greater speech intensity for the loud style than partici-
pants in the Loud–Clear Group (Est. = 4.261 dB SPL,
SE = 1.468, t[22.0] = 2.903, p = .008). Finally, the rele-
veled LMM revealed that the intensity difference between
the loud and the clear style was greater for the partici-
pants in the Clear–Loud Group than those in the Loud–
Clear Group (Est.= −4.400 dB SPL, SE = 1.26, t[22.0] =
−3.483, p = .002).
This finding is confirmed in the percent difference
calculations presented in Table 1, which reports the aver-
age percent difference between the habitual and loud style,
habitual and clear style, and loud and clear style for each
group. Note that talkers in the Clear–Loud Group exhib-
ited a greater difference in dB SPL between the clear
and loud styles compared with talkers in the Loud–Clear
Group. Random effects estimates revealed that there was
significant variation in the style-related change between
participants (p < .001). The mean intensity for each partici-
pant in each style is plotted in Figure 3. This plot high-
lights the between-participant variability in these trends. In
general, the speech intensity values for the Clear–Loud
Group were more tightly clustered than in the Loud–Clear
Group. All talkers in the Clear–Loud Group exhibited the
greatest speech intensity in the loud style, which was per-
formed last. Additionally, the difference in speech intensity
between the loud and clear styles was greater for talkers
in the Clear–Loud Group compared with the Loud–Clear
Group. The majority of participants in the Loud–Clear
Group exhibited a greater speech intensity in the loud
compared with the habitual style. However, two participants
in the Loud–Clear Group exhibited speech intensity
in the clear style that was comparable with or greater
than their intensity in the loud style.
Figure 2. Estimated means and 95% confidence intervals for (a)
the mean of the fast, A-weighted sound pressure level (SPL_AF),
(b) lip aperture (LA) range of motion in mm, and (c) speech
rate in syllables per second (syl/s) for the habitual, loud, and
clear speech styles performed by the Loud–Clear Group (order:
habitual-loud-clear) and the Clear–Loud Group (order:
habitual-clear-loud).
These trends, along with fixed effects estimates, sug-
gest that the gains in speech intensity were greater for par-
ticipants who were instructed to overenunciate and then
speak twice as loud as comfortable compared with those
who received the instruction to speak twice as loud as
comfortable, followed by the instruction to overenunciate.
Individual level data confirmed this overall trend but also
indicated that two talkers in the Loud–Clear Group exhibited
speech intensity in the clear style that was similar to or higher
than that in the loud style.
Lip Aperture Range of Motion
An LMM analysis was completed using LA_ROM as
the dependent variable to examine changes in articulatory
range of motion between styles. Parameter estimates are
plotted in Figure 2b. Fixed effects estimates revealed that
there was no difference in LA_ROM between groups for the
habitual style (p = .492). As expected, LA_ROM was significantly
affected by speaking style. The Loud–Clear Group
exhibited greater LA_ROM for the loud (Est. = 4.554 mm,
SE = 1.139, t[22.0] = 3.998, p < .001) and clear speech
(Est. = 9.640 mm, SE = 1.656, t[22.0] = 5.821, p < .001)
styles compared with the habitual style. Parameter estimates
revealed that the relative changes between the habitual
and clear style for the Clear–Loud Group were statistically
similar (p > .05). Releveling the model revealed
that the talkers in the Loud–Clear Group exhibited a significant
increase in LA_ROM from the loud to clear speech
style (Est. = 5.086 mm, SE = 1.345, t[22.0] = 3.780, p =
.001). Conversely, talkers in the Clear–Loud Group exhibited
no differences in LA_ROM between the clear and loud
styles (p = .517).
This finding is generally reflected in Table 1, which
indicates that the percent difference between the clear and
loud styles was larger, on average, for the Loud–Clear
Group compared with the Clear–Loud Group, although
between-participant variation was high. Random effect
estimates confirmed there was significant variation in the
style-related change in LA_ROM between participants (p <
.001). The mean LA_ROM for each participant in each style
is plotted in Figure 3 to highlight between-participant variability
in these trends. For all but one participant in the
Loud–Clear Group, LA_ROM increased or was similar
between the loud and clear styles. The pattern in the
Clear–Loud Group was more variable than in the Loud–Clear
Group. One participant in the Clear–Loud Group
exhibited little-to-no change in LA_ROM across styles, and
one exhibited a reduction in LA_ROM between the habitual
and loud styles. Seven of the 11 participants in the Clear–Loud
Group exhibited greater or comparable LA_ROM in
the clear compared with the loud style. In comparison,
four exhibited greater LA_ROM in the loud style.
Table 1. M and SD of the percent difference between each speaking style produced.

Measure               Habitual vs. Loud   Habitual vs. Clear   Clear vs. Loud
dB SPL
  Loud–Clear Group    15.7 (9.27)         9.65 (8.39)          −5.09 (4.74)
  Clear–Loud Group    19.81 (7.28)        6.52 (6.65)          −11.02 (4.14)
LA_ROM
  Loud–Clear Group    27.83 (16.35)       58.85 (45)           23.42 (25.39)
  Clear–Loud Group    31.19 (32.94)       34.71 (28.51)        4.53 (17.09)
Speech rate
  Loud–Clear Group    −2.15 (8.19)        −29.9 (17.87)        −27.96 (18.25)
  Clear–Loud Group    −15.26 (14.16)      −22.3 (17.54)        −9.17 (10.87)

Note. The Loud–Clear Group performed the task in the habitual-loud-clear order, and the Clear–Loud Group performed the task in the
habitual-clear-loud order. SPL = sound pressure level; LA_ROM = lip aperture range of motion.
In summary, talkers in the Loud–Clear Group, who
were instructed to produce the sentence first in the habitual
style, then in the loud style, and finally in the clear
style, exhibited significant increases in LA_ROM with each
successive style. Parameter estimates indicated that talkers
in the Clear–Loud Group, who produced the phrase first
in the habitual style, then in the clear style, and finally in
the loud style, may have retained the increases in LA_ROM
associated with clear speech in the loud style. However,
examination of individual participant data indicated that
participants in the Clear–Loud Group exhibited a more
variable pattern, with some exhibiting the largest LA_ROM
in the loud style and others in the clear style.
Speech Rate
A final LMM analysis was conducted using speech
rate as the dependent variable to quantify changes in
speech timing. Parameter estimates are plotted in Figure
2c. No differences in speech rate were observed between
groups for the habitual style (p = .722). As expected,
speech rate significantly differed with speaking style. For
the Loud–Clear Group, a significant reduction in speech
rate was observed during the clear style (Est. = −1.47 syl/s,
SE = 0.270, t[22.0] = −5.466, p < .001), compared with
the habitual style. No differences in speech rate were
observed between the loud and habitual styles (p = .530)
for participants in the Loud–Clear Group. This trend was
different from the Clear–Loud Group, who exhibited a
significantly slower speech rate in the loud style compared
with the Loud–Clear Group (Est. = −0.692 syl/s, SE =
0.265, t[22.0] = −2.611, p = .016). Releveling the model
revealed that compared with the habitual style, speech rates
produced by the Clear–Loud Group were slower for both
the clear (Est. = −1.154 syl/s, SE = 0.293, t[22.0] = −3.934,
p < .001) and the loud (Est. = −0.806 syl/s, SE = 0.195,
t[22.0] = −4.135, p < .001) styles.
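
As a plausibility check that is not part of the original analysis, each t value reported above should approximately equal the parameter estimate divided by its standard error, which is the conventional form of a t statistic for a fixed effect. The Python sketch below recomputes that ratio from the rounded values in the text. The row labels and the helper name `wald_t` are illustrative, and small discrepancies are expected because the published Est. and SE values are rounded.

```python
# (label, estimate in syl/s, standard error, reported t), copied from the text above.
reported = [
    ("Loud-Clear Group: clear vs. habitual", -1.47, 0.270, -5.466),
    ("Clear-Loud vs. Loud-Clear Group: loud style", -0.692, 0.265, -2.611),
    ("Clear-Loud Group: clear vs. habitual", -1.154, 0.293, -3.934),
    ("Clear-Loud Group: loud vs. habitual", -0.806, 0.195, -4.135),
]


def wald_t(estimate, standard_error):
    """Ratio of a parameter estimate to its standard error."""
    return estimate / standard_error


for label, est, se, t_reported in reported:
    t = wald_t(est, se)
    print(f"{label}: Est/SE = {t:.3f}, reported t = {t_reported:.3f}")
```

All four recomputed ratios fall within about 0.03 of the reported t values, which is consistent with rounding of the published estimates.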
This finding is confirmed in Table 1, which indicates that the
percent decrease in speech rate between the clear and loud styles
was larger for the Loud–Clear Group compared with the Clear–Loud
Group. Random effect estimates revealed there was significant
variation in the style-related change in speech rate between
participants (p < .001). The mean speech rate for each participant
in each style is plotted in Figure 3c to highlight
between-participant variability in these trends. Seven of 13
participants in the Loud–Clear Group exhibited speaking rates in
the loud style that were within ±2% of their habitual speech rates,
and one exhibited a substantially faster rate in the loud compared
with the habitual style. The remainder exhibited slower speaking
rates in the loud style compared with the habitual style.
Additionally, 12 of 13 participants in the Loud–Clear Group
exhibited a substantially slower speech rate in the clear compared
with the loud style. In contrast, all but one participant in the
Clear–Loud Group exhibited a speaking rate in the loud condition
that was slower than their rate in the habitual style.
Additionally, the speaking rate differences between the loud and
clear styles were smaller for the Clear–Loud Group compared with
the Loud–Clear Group.
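
The individual-level tallies above rest on a simple rule: a talker's loud-style rate is treated as comparable to the habitual rate when it falls within ±2% of it. A minimal Python sketch of such a tally is given below. Only the ±2% criterion comes from the text; the function name, the category labels, and the talker values are hypothetical and are not data from the study.

```python
from collections import Counter


def classify_rate_change(habitual, loud, tolerance_pct=2.0):
    """Label a talker's loud-style speech rate relative to the habitual rate."""
    change_pct = (loud - habitual) / habitual * 100.0
    if abs(change_pct) <= tolerance_pct:
        return "comparable"
    return "faster" if change_pct > 0 else "slower"


# Hypothetical (habitual, loud) speech rates in syl/s; illustrative only.
hypothetical_talkers = {
    "T1": (5.00, 5.05),  # about 1% faster, within the tolerance
    "T2": (5.00, 4.25),  # about 15% slower
    "T3": (4.00, 4.40),  # about 10% faster
}

labels = {
    talker: classify_rate_change(habitual, loud)
    for talker, (habitual, loud) in hypothetical_talkers.items()
}
counts = Counter(labels.values())
print(labels)
print(dict(counts))
```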
In summary, talkers who were first instructed to
speak twice as loud as comfortable and then over-
enunciate exhibited only a slight reduction, on average, in
speech rate from the habitual to the loud style and a sub-
stantial reduction in speech rate between the loud and
clear styles. Conversely, talkers who were first instructed
to over-enunciate and then to speak twice as loud as com-
fortable exhibited significantly slower rates in the loud
style, likely reflecting a carry-over effect in the rate reduc-
tion associated with the clear style. Although there was
some intertalker variability, most participants followed
these trends.
Figure 3. Individual trends showing changes in (a) the mean of the
fast, A-weighted sound pressure level (SPL_AF), (b) lip aperture (LA)
range of motion in mm, and (c) speech rate in syllables per second
(syl/s) for the habitual, loud, and clear speech styles for each
participant in the Loud–Clear Group (order: habitual-loud-clear) and
the Clear–Loud Group (order: habitual-clear-loud).
Discussion
The purpose of this investigation was to determine
the extent to which the order of speaking style perfor-
mance impacts the speech production response associated
with adopting higher effort speaking styles, namely, clear
and loud speech. In the current investigation, talkers were
assigned to two experimental groups. Participants assigned
to the Loud–Clear Group were first instructed to repeat
the phrase “Buy Bobby a puppy” at a comfortable rate
and loudness (the habitual style). The habitual style was
followed by an instruction to repeat the phrase twice as
loud as comfortable (the loud style) and, finally, to over-
enunciate each word (the clear style). Participants assigned
to the Clear–Loud Group first repeated the phrase in the
habitual style and were then instructed to repeat the
phrase and “over-enunciate” each word and, finally, to
repeat the phrase twice as loud as comfortable. The
groups exhibited significant differences in the pattern of
response as measured by speech intensity, articulatory
range of motion, and speech rate.
Specifically, group-level data suggest that talkers who produced
the habitual-clear-loud order exhibited greater speech intensity
during the loud style than participants who produced the
habitual-loud-clear order. Relative to articulation, talkers who
produced the habitual-loud-clear order exhibited significant
increases in lip aperture range of motion between the habitual and
loud styles and between the loud and clear styles. Conversely,
talkers who produced the habitual-clear-loud order seemed to retain
the increases in lip aperture range of motion associated with the
clear style when performing the loud style. Relative to rate,
participants in both groups exhibited statistically similar
reductions in speaking rate between the habitual and clear styles.
Participants who performed the habitual-loud-clear order exhibited
similar speaking rates in the habitual and loud styles. Conversely,
participants who performed the habitual-clear-loud order exhibited
a significantly slower speaking rate in the loud than in the
habitual style, likely indicating that participants retained the
decreases in speaking rate associated with the clear style when
performing the loud style. Overall, the pattern of results suggests
that talkers carry over salient features of a specific higher
effort style into subsequent speaking styles.
The carry-over effects in this study were observed
regardless of which higher effort style was performed after
the habitual style. For example, the habitual-clear-loud
order led to a greater speech intensity in the loud condi-
tion than did the habitual-loud-clear order. Additionally,
the articulatory features associated with clear speech were
retained when the loud style was produced after the clear
style. Therefore, the carry-over effects observed seemed to
be constructive in nature, meaning that components of the
previously performed style added to the features of the
subsequent style rather than negating any style-related
effects of the subsequent speech style. In support of this
claim, carry-over effects generally enhanced the changes in
speech intensity, lip and jaw displacements, and speaking
rate associated with the higher effort styles in comparison
to the habitual style.
Thus, although we do not suggest that carry-over
effects negate the findings reported in prior studies that
have examined several speaking styles using a within-
subject design, it is possible that speaking styles performed
later in the sequence exhibited additive carry-over effects
associated with the previously performed high-effort styles.
For example, Lam et al. (2012) found that hearing
impaired and over-enunciate conditions led to the greatest
acoustic changes in the phonatory and articulatory sys-
tems, respectively, compared with a generic instruction to
speak clearly. Although this is likely the case due to the
specificity of these instructions, it is possible that carry-
over effects from the generic clear speech condition influ-
enced the effects measured in the hearing impaired and
over-enunciate conditions, which were both produced
after the habitual and generic clear speech instruction.
Likewise, Mefferd (2017) examined changes in jaw and
tongue displacements between habitual, slow, loud, and
clear styles elicited in that order. Contrary to the hypoth-
esis outlined in the study, the loud style yielded similar
jaw displacements to the slow style, and the clear style
yielded the largest jaw displacements. Although differ-
ences in tongue- and jaw-specific patterns were observed
between styles, it is possible that carry-over effects made
these differences less clear. In both studies, the authors
mention order effects as a potential limitation and pro-
vide reasonable interpretations of the data. Data from
this investigation provide evidence of an order effect but
do not negate the findings of prior work. Instead, the
current data suggest that prior instruction may modulate
the degree of style-related response observed following
subsequent instruction.
Individual Variability
The models revealed significant variation in the
style-related changes between participants for measures of
speech intensity, lip aperture range, and speech rate.
Although this was not the primary aim of the study, con-
tinuing to document individual variability associated with
adopting these higher effort styles remains an important
finding and ideological project in the literature (e.g.,
Ferguson, 2004; Ferguson & Kewley-Port, 2007; Koenig
& Fuchs, 2019; Perkell et al., 2002; Smiljanić & Bradlow,
2005; Whitfield et al., 2018). In this study, individual
trend data may qualify the nature of the group-level
carry-over effects observed for some talkers. For example,
individual trends indicate that participants who performed
the habitual-clear-loud order exhibited variable differences
in lip aperture between the loud and clear styles, with some
showing the largest range of motion in the loud style and
others in the clear style.
Another example of individual variation can be
observed in the speech intensity data. The fixed effect
parameter estimates revealed that performing the habitual-
clear-loud order led to increases in vocal loudness with
each successive style, yielding a greater intensity difference
between the loud and clear styles for this group compared
with the participants who performed the habitual-loud-
clear order. Although there were no between-group differ-
ences in speech intensity for the clear style, two talkers
who performed the habitual-loud-clear order exhibited
similar or higher dB SPL in the clear style compared with
the loud style. This may suggest that these talkers retained
the gains in speech intensity associated with loud speech
when performing the clear style. As in this study, authors
often report significant between-talker variation when
examining clear and loud speech, suggesting that talkers
adopt individual strategies when performing these higher
effort styles (e.g., Ferguson & Kewley-Port, 2007; Perkell
et al., 2002; Smiljanić & Bradlow, 2005; Whitfield et al.,
2018).
Caveats, Limitations, and
Potential Implications
There are potential caveats and limitations to the
current investigation. A key factor that must be consid-
ered when interpreting the current results is that the par-
ticipant sample was primarily composed of women. It is
possible that the magnitudes and patterns observed in the
current sample may differ across a range of socio-
demographic factors, including the full range of gender
identity and expression. Future work should adequately
include and report data on talkers from a range of socio-
demographic backgrounds.
Another consideration was that the protocol was
intended to maximize potential carry-over effects. Specifi-
cally, participants transitioned between successive speech
styles with little-to-no breaks. Additionally, a simple sen-
tence repetition task was used to examine speech produc-
tion. These methodological choices were made intention-
ally to determine the maximum degree of carry-over
effects that would be expected for this type of study.
Other authors have varied the speech stimuli and insti-
tuted protocols to account for potential carry-over effects
(e.g., Huber & Chandrasekaran, 2006; Kearney et al.,
2017; Lam et al., 2012; Whitfield et al., 2021). For exam-
ple, Lam et al. (2012) engaged participants in a
conversation between each successive speaking style to
minimize carry-over effects. Therefore, one might expect
the degree of carry-over between successive styles to be
lower in an investigation that employs distractor proto-
cols. Future studies should examine the impact of such
mitigating strategies on carry-over effects.
Additionally, data from this study may have some
initial, although limited, clinical utility. Specifically, this
study suggests that the use of successive instructions to
speak clearer and louder than usual has a compounding
carry-over effect that may be advantageous to clients with
hypophonia, articulatory imprecision, and a faster-than-
normal rate, which are often observed in talkers with
hypokinetic dysarthria. Future studies should continue to
systematically examine the utility of combining several
higher effort speaking styles during treatment sessions that
address multiple speech motor impairments (e.g., Lam &
Tjaden, 2016; Ramig et al., 2018).
Data Availability Statement
The datasets generated and/or analyzed during the
current study are not publicly available but are available
from the corresponding author on reasonable request.
Acknowledgments
Portions of this work were supported by internal
funding from Bowling Green State University received by
Jason Whitfield.
References
ADInstruments. (2016). LabChart (Version 7) [Computer
software].
Ferguson, S. H. (2004). Talker differences in clear and conversa-
tional speech: Vowel intelligibility for normal-hearing lis-
teners. The Journal of the Acoustical Society of America,
116(4, Pt. 1), 2365–2373. https://doi.org/10.1121/1.1788730
Ferguson, S. H., & Kewley-Port, D. (2007). Talker differences in
clear and conversational speech: Acoustic characteristics of
vowels. Journal of Speech, Language, and Hearing Research,
50(5), 1241–1255. https://doi.org/10.1044/1092-4388(2007/087)
Huber, J. E., & Chandrasekaran, B. (2006). Effects of increasing
sound pressure level on lip and jaw movement parameters
and consistency in young adults. Journal of Speech, Language,
and Hearing Research, 49(6), 1368–1379. https://doi.org/10.
1044/1092-4388(2006/098)
Kearney, E., Giles, R., Haworth, B., Faloutsos, P., Baljko, M., &
Yunusova, Y. (2017). Sentence-level movements in Parkinson’s
disease: Loud, clear, and slow speech. Journal of Speech, Lan-
guage, and Hearing Research, 60(12), 3426–3440. https://doi.
org/10.1044/2017_JSLHR-S-17-0075
Koenig, L. L., & Fuchs, S. (2019). Vowel formants in normal and
loud speech. Journal of Speech, Language, and Hearing Research,
62(5), 1278–1295. https://doi.org/10.1044/2018_JSLHR-S-18-0043
Lam, J., & Tjaden, K. (2016). Clear speech variants: An acoustic
study in Parkinson’s disease. Journal of Speech, Language,
and Hearing Research, 59(4), 631–646. https://doi.org/10.1044/
2015_JSLHR-S-15-0216
Lam, J., Tjaden, K., & Wilding, G. (2012). Acoustics of clear
speech: Effect of instruction. Journal of Speech, Language,
and Hearing Research, 55(6), 1807–1821. https://doi.org/10.
1044/1092-4388(2012/11-0154)
Mefferd, A. S. (2017). Tongue- and jaw-specific contributions to
acoustic vowel contrast changes in the diphthong /ai/ in
response to slow, loud, and clear speech. Journal of Speech,
Language, and Hearing Research, 60(11), 3144–3158. https://
doi.org/10.1044/2017_JSLHR-S-17-0114
Mefferd, A. S., & Green, J. R. (2010). Articulatory-to-acoustic
relations in response to speaking rate and loudness manipula-
tions. Journal of Speech, Language, and Hearing Research, 53(5),
1206–1219. https://doi.org/10.1044/1092-4388(2010/09-0083)
National Instruments Co. (2018). LabVIEW (Version 2018)
[Computer software].
Perkell, J. S., Zandipour, M., Matthies, M. L., & Lane, H.
(2002). Economy of effort in different speaking conditions. I.
A preliminary study of intersubject differences and modeling
issues. The Journal of the Acoustical Society of America,
112(4), 1627–1641. https://doi.org/10.1121/1.1506369
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1986). Speaking
clearly for the hard of hearing II: Acoustic characteristics
of clear and conversational speech. Journal of Speech and
Hearing Research, 29(4), 434–446. https://doi.org/10.1044/jshr.
2904.434
Quené, H., & Van den Bergh, H. (2004). On multi-level modeling
of data from repeated measures designs: A tutorial. Speech
Communication, 43(1–2), 103–121. https://doi.org/10.1016/j.
specom.2004.02.004
Ramig, L., Halpern, A., Spielman, J., Fox, C., & Freeman, K.
(2018). Speech treatment in Parkinson’s disease: Randomized
controlled trial (RCT). Movement Disorders, 33(11), 1777–
1791. https://doi.org/10.1002/mds.27460
Searl, J., & Evitts, P. M. (2013). Tongue–palate contact pressure,
oral air pressure, and acoustics of clear speech. Journal of
Speech, Language, and Hearing Research, 56(3), 826–839.
https://doi.org/10.1044/1092-4388(2012/11-0337)
Smiljanić, R., & Bradlow, A. R. (2005). Production and percep-
tion of clear speech in Croatian and English. The Journal of
the Acoustical Society of America, 118(3), 1677–1688. https://
doi.org/10.1121/1.2000788
Smiljanić, R., & Gilbert, R. C. (2017). Intelligibility of noise-
adapted and clear speech in child, young adult, and older
adult talkers. Journal of Speech, Language, and Hearing
Research, 60(11), 3069–3080. https://doi.org/10.1044/2017_
JSLHR-S-16-0165
Švec, J. G., & Granqvist, S. (2018). Tutorial and guidelines on
measurement of sound pressure level in voice and speech.
Journal of Speech, Language, and Hearing Research, 61(3),
441–461. https://doi.org/10.1044/2017_JSLHR-S-17-0095
Tjaden, K., Lam, J., & Wilding, G. (2013). Vowel acoustics in
Parkinson’s disease and multiple sclerosis: Comparison of
clear, loud, and slow speaking conditions. Journal of Speech,
Language, and Hearing Research, 56(5), 1485–1502. https://
doi.org/10.1044/1092-4388(2013/12-0259)
Tjaden, K., Sussman, J. E., & Wilding, G. E. (2014). Impact of
clear, loud, and slow speech on scaled intelligibility and
speech severity in Parkinson’s disease and multiple sclero-
sis. Journal of Speech, Language, and Hearing Research,
57(3), 779–792. https://doi.org/10.1044/2014_JSLHR-S-12-
0372
Tjaden, K., & Wilding, G. E. (2004). Rate and loudness
manipulations in dysarthria: Acoustic and perceptual findings.
Journal of Speech, Language, and Hearing Research, 47(4),
766–783. https://doi.org/10.1044/1092-4388(2004/058)
Weisser, A., & Buchholz, J. M. (2019). Conversational speech
levels and signal-to-noise ratios in realistic acoustic conditions.
The Journal of the Acoustical Society of America, 145(1), 349–
360. https://doi.org/10.1121/1.5087567
Whitfield, J. A., Dromey, C., & Palmer, P. (2018). Examining
acoustic and kinematic measures of articulatory working
space: Effects of speech intensity. Journal of Speech, Lan-
guage, and Hearing Research, 61(5), 1104–1117. https://doi.
org/10.1044/2018_JSLHR-S-17-0388
Whitfield, J. A., Holdosh, S. R., Kriegel, Z., Sullivan, L. E., &
Fullenkamp, A. M. (2021). Tracking the costs of clear and
loud speech: Interactions between speech motor control and
concurrent visuomotor tracking. Journal of Speech, Language,
and Hearing Research, 64(6S), 2182–2195. https://doi.org/10.
1044/2020_JSLHR-20-00264
Whitfield, J. A., & Mehta, D. D. (2019). Examination of clear
speech in Parkinson disease using measures of working
vowel space. Journal of Speech, Language, and Hearing
Research, 62(7), 2082–2098.
https://doi.org/10.1044/2019_JSLHR-S-MSC18-18-0189