Research Note
Order Affects Clear and Loud Speech Response
Jason A. Whitfield (a), Adam M. Fullenkamp (b), and Zoe Kriegel (c)
(a) Department of Communication Sciences and Disorders, Bowling Green State University, OH
(b) School of Human Movement, Sport, and Leisure Studies, Bowling Green State University, OH
(c) Division of Communication Disorders, University of Wyoming, Laramie
ARTICLE INFO
Article History:
Received January 12, 2023
Revision received May 19, 2023
Accepted June 30, 2023
Editor-in-Chief: Cara E. Stepp
Editor: Kathleen F. Nagle
https://doi.org/10.1044/2023_JSLHR-23-00028
Correspondence to Jason A. Whitfield: jawhitf@bgsu.edu. Disclosure:
The first and second authors are employed by Bowling Green State
University. The third author is employed by the University of
Wyoming. There are no other financial or nonfinancial conflicts of
interest to disclose.
ABSTRACT
Purpose: The purpose of this investigation was to examine the impact of
instruction order on the speech production response when adopting higher
effort speaking styles, specifically loud and clear speech.
Method: Speech intensity, lip aperture range, and speech rate data were
collected from 24 talkers who repeated the utterance "Buy Bobby a puppy" using
habitual, clear, and loud speech. Participants were assigned in quasi-random
fashion to one of two groups: a Clear–Loud Group (11 participants; order:
habitual-clear-loud) or a Loud–Clear Group (13 participants; order:
habitual-loud-clear).
Results: Talkers in the Clear–Loud Group exhibited higher speech intensity
during the loud style compared with those in the Loud–Clear Group.
Furthermore, talkers in the Clear–Loud Group retained the increases in lip aperture
range and reductions in speech rate associated with the clear style when
producing the loud style. Conversely, talkers in the Loud–Clear Group exhibited
significant increases in lip aperture range between the habitual and loud styles
and between the loud and clear styles. Additionally, the Loud–Clear Group
exhibited a reduction in speech rate only during the clear style, as no
differences in speech rate were observed between the habitual and loud styles.
Conclusions: These findings may suggest that producing a higher effort style
leads to carry-over effects in subsequent styles. Future research should
investigate factors that moderate the degree of order effects for both research and
clinical purposes. For instance, if generalizable, the compounding carry-over
effects may prove advantageous for certain clinical populations.
Instructions to speak clearer or louder than usual
result in relatively reliable modifications to the speech
signal (e.g., Kearney et al., 2017; Lam & Tjaden, 2016;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Tjaden et al., 2013; Tjaden & Wilding, 2004; Whitfield
et al., 2021). Clear speech, for example, can be naturally
elicited when a listener has difficulty hearing or understanding.
Loud speech can be elicited in noisy environments
or when a talker is interacting with interlocutors
located at further than typical distances (e.g., Koenig &
Fuchs, 2019; Picheny et al., 1986; Smiljanić & Gilbert,
2017; Weisser & Buchholz, 2019). Additionally, clear and
loud speech forms can be explicitly elicited in labora-
tory or clinical environments by instructing talkers to
speak clearer or louder than usual (e.g., Lam et al., 2012;
Smiljanić & Gilbert, 2017; Whitfield et al., 2018). The
speech production changes, typically associated with
adopting a clear or loud speech style, enhance the under-
standability of the speech signal, making these styles com-
mon talker-oriented strategies utilized in the treatment of
dysarthria (e.g., Kearney et al., 2017; Lam & Tjaden,
2016; Ramig et al., 2018; Tjaden et al., 2013).
Journal of Speech, Language, and Hearing Research, 1–11. Copyright © 2023 American Speech-Language-Hearing Association
Instructions to speak clearer than usual tend to yield
robust changes in acoustic and kinematic measures of
articulation and speaking rate that reflect larger articulatory
displacements and slower syllable rates (e.g., 10%–50%
reduction in syllable rate when adopting a clear
speech style; Kearney et al., 2017; Lam & Tjaden, 2016;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Smiljanić & Gilbert, 2017; Tjaden et al., 2013; Whitfield
et al., 2021; Whitfield & Mehta, 2019). Instructions to
speak louder than usual yield robust increases in speech
intensity (e.g., 8–15 dB increase in speech intensity when
adopting a louder speech style; Huber & Chandrasekaran,
2006; Kearney et al., 2017; Mefferd, 2017; Tjaden et al.,
2013; Whitfield et al., 2021). Therefore, existing data sug-
gest that talkers employ instruction-specific adjustments
when producing higher effort speaking styles.
However, data from these investigations also suggest
that adopting a higher effort speaking style is associated
with an overall increase in speech motor drive, leading to
changes across multiple speech subsystems, including res-
piration, phonation, articulation, and prosody. For exam-
ple, in addition to robust changes in articulatory kinemat-
ics and speaking rate, adopting a clearer-than-normal
speech style may lead to phonatory adjustments and
increases in speech intensity (e.g., 1.5–6 dB increase in
speech intensity; Kearney et al., 2017; Mefferd, 2017;
Whitfield et al., 2021). Additionally, adopting a louder-
than-comfortable intensity leads to changes in articulatory
kinematics characterized by larger articulatory displace-
ments, in addition to robust increases in speech intensity
(e.g., Huber & Chandrasekaran, 2006; Kearney et al.,
2017; Mefferd, 2017; Mefferd & Green, 2010; Whitfield
et al., 2018, 2021). Therefore, although specific instructions
to speak clearer or louder than usual produce
robust changes in the targeted systems, general and
system-wide changes are also observed for these higher
effort speaking styles (e.g., Huber & Chandrasekaran,
2006; Kearney et al., 2017; Lam et al., 2012; Mefferd,
2017; Whitfield et al., 2018, 2021).
Many authors have used a within-subject design that
involves performing several different speech style modifi-
cations in succession when examining these higher effort
speech styles (e.g., Huber & Chandrasekaran, 2006;
Kearney et al., 2017; Lam & Tjaden, 2016; Lam et al.,
2012; Mefferd, 2017; Mefferd & Green, 2010; Smiljanić &
Gilbert, 2017; Tjaden et al., 2013; Tjaden & Wilding,
2004; Whitfield et al., 2021). Although these studies report
consistent differences among higher effort styles, there is
little uniformity in the methodological choices used to
address the potential carry-over effects that may arise
from producing multiple speech styles in succession.
Although all authors typically begin with the habitual,
conversational, or plain style, some proceed through a
fixed instruction order to elicit subsequent speaking styles
(e.g., Mefferd, 2017; Smiljanić & Gilbert, 2017), others
adopt a randomized order (e.g., Kearney et al., 2017), and
others counterbalance the order of higher effort styles
among participants (e.g., Huber & Chandrasekaran, 2006;
Whitfield et al., 2021). For instance, in a study by Lam
et al. (2012), participants initially produced sentences in
their habitual style and then were instructed to read the
stimuli while speaking clearly. Following these instructions,
the researchers counterbalanced instructions to
"speak as if speaking to someone who has a hearing
impairment" and "overenunciate each word" to examine
the impact of more descriptive and explicit clear speech
instructions. The researcher engaged participants in conversation
between each speaking style to minimize carry-over
effects (Lam et al., 2012). However, despite the different
approaches used to address the potential carry-over
effects of performing multiple higher effort speaking
styles, no investigation to date has examined the extent of
the carry-over phenomenon.
In summary, higher effort speaking styles, such as
clear or loud speech styles, yield robust changes in speech
production that vary to some extent with the specific
instruction used to elicit the speaking style (e.g., Lam
et al., 2012; Mefferd, 2017; Whitfield et al., 2021). However,
adopting a clear or loud speech style also yields
system-wide changes that impact speech motor subsystems
that are not the direct target of the instruction (e.g.,
Huber & Chandrasekaran, 2006; Kearney et al., 2017;
Lam et al., 2012; Mefferd, 2017; Mefferd & Green, 2010;
Whitfield et al., 2018, 2021). Thus, carry-over effects are
likely when a talker performs multiple higher effort styles
in succession. Although authors have employed various
methodological choices that attempt to minimize carry-
over effects, little published data are available to deter-
mine the extent to which instruction order impacts the
response to subsequently performed speaking styles.
The current investigation aimed to examine the
extent to which instruction order impacts changes in
speech and voice production associated with adopting
clear and loud speech. Acoustic and kinematic measures
extracted from repetitions of the sentence "Buy Bobby a
puppy" were analyzed to address the aim of the study.
The key outcome variables used to examine adjustments
in phonation, articulation, and speech timing associated
with adopting these higher effort styles were speech intensity,
lip aperture range, and speech rate. Participants were
sorted into two approximately equal, counterbalanced groups in a
quasi-random fashion. Both talker groups repeated the sentence
using a habitual speaking style first. Next, one group pro-
duced the sentence using a clear style and then a loud
style, whereas the other group produced the sentence using
a loud style and then a clear style.
Based on the production–orientation adjustments
associated with clear and loud speech reported in prior
studies, we hypothesized that speech production changes
observed in one style would carry over to the subsequent
style. For instance, we expected that participants who per-
formed clear speech before loud speech would retain the
reductions in speech rate and increases in lip aperture
range when transitioning to the loud style. Therefore, we
anticipated that participants who performed the habitual-
clear-loud order would exhibit slower articulation rates
and greater lip aperture ranges when producing the loud
style compared with those who followed the habitual-loud-clear
order. With respect to speech intensity, we predicted
that increases in vocal loudness would be greater
in the loud style than in the clear style. This expectation
is primarily based on the instruction used to elicit clear
speech, which emphasizes articulatory changes rather than
changes in phonation (e.g., Lam et al., 2012). Addition-
ally, we hypothesized that performing the habitual-clear-
loud order would result in greater intensity in the loud
style compared with the habitual-loud-clear order, as par-
ticipants may experience a constructive or additive effect,
leading to an increase in vocal loudness across each suc-
cessive style performed. Alternatively, performing the loud
style before the clear style might lead participants to carry
over the increases in vocal loudness to the clear speech
style. However, increasing vocal loudness may not be the
primary goal of clear speech, as some studies have
reported only slight-to-modest increases in intensity when
adopting clear speech (e.g., Kearney et al., 2017; Lam
et al., 2012; Searl & Evitts, 2013; Tjaden et al., 2014).
Method
Participants
Twenty-five student volunteers (23 women and two
men) were recruited from the Bowling Green State Uni-
versity (BGSU) student body to serve as participants in
this study. All protocols were approved by the Institu-
tional Review Board at BGSU.
Participants were assigned to one of two groups, a
Loud–Clear Group or Clear–Loud Group, in a quasi-random
fashion so that the groups were relatively even.
However, kinematic data from one participant were cor-
rupted, which yielded a final sample of 24 participants
that included 13 participants in the Loud–Clear Group
and 11 in the Clear–Loud Group. Participants ranged in
age from 19 to 23 years old (M = 21.1 years; SD =
0.97 years). All participants indicated they were healthy at
the time of recording. Additionally, participants self-
reported a history free of speech-language-hearing impair-
ment and did not have any neurological diagnoses that
can affect speech. During scheduling, participants were
instructed to arrive for testing with a freshly shaven face
or short facial hair (where applicable), and they were
instructed to arrive with minimal makeup or lotion on
their faces. These instructions were intended to facilitate
improved motion capture marker adhesion during data
collection. Each participant completed a single data collec-
tion session. Upon arrival for the session, an informed con-
sent document was provided to the participant for review.
Instrumentation
Participants were seated at a table in a sound-
attenuated booth during data collection. Speech intensity
(i.e., sound pressure level, SPL) was measured using a
Brüel & Kjær Type 2250-S sound level meter (SLM) that
was mounted to the table approximately 0.5 m in front
of the participant (Brüel & Kjær Engineering Co.).
Speech audio was captured using a head-mounted microphone
(Shure Beta 53) and a high-quality microphone
pre-amplifier (Millennia HV-3D). A DC signal that was
proportional to the fast-weighted dB-A from the SLM
and the pre-amplified microphone signal were both
recorded using the ADInstruments PowerLab A-to-D
converter and LabChart software (ADInstruments, 2016).
Speech kinematics were captured via a seven-camera,
passive optical Qualisys Miqus motion capture system
(Qualisys AB, Göteborg, Sweden; Sampling Rate: 80 Hz).
The camera system was calibrated to track 4-mm spherical
markers covered with retroreflective tape (3M Manufactur-
ing Co.) within a volume approximately 3 ft wide × 3 ft deep
× 3 ft high. Seven passive reflective markers were placed on
the upper lip, lower lip, each oral commissure, and chin
(three markers) prior to data collection to track lip and jaw
movement. A local skull-based reference coordinate system
was created from three markers affixed to the midfrontal
ridge and each zygomatic arch to account for the influence
of head and body movements when tracking lip and jaw
kinematics (Whitfield et al., 2021). Three additional markers
were placed on the end of the SLM so that the exact mouth-to-SLM
distance could be derived for each sampling interval.
Figure 1 depicts a schematic of the marker placement.
The acoustic and kinematic data were synchronized
using a MOTU Audio Express sound card (Mark of the
Unicorn Co.), which streamed a common, analog Society
of Motion Picture and Television Engineers (SMPTE)
time-stamp signal to both the acoustic and motion capture
recording systems. The motion capture marker positions
and the analog SMPTE signal were recorded directly
through the motion capture system, which included a ded-
icated, synchronized analog-to-digital (A/D) device, and
the analog SLM and SMPTE signals were recorded using
a separate ADInstruments A/D board (ADInstruments
Co.). Motion capture marker positions were recorded at a
sampling frequency of 80 Hz, SLM signals were recorded
at a sampling frequency of 20 kHz (i.e., 250× motion cap-
ture), and SMPTE signals were recorded on each system
at a sampling frequency of 20 kHz to facilitate temporal
alignment of the recorded data.
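The sampling rates above imply 250 SLM samples for every motion capture frame. The following Python sketch illustrates that relationship only; it assumes the two streams have already been aligned to a common starting sample (the SMPTE-based alignment itself is not shown), and the function and constant names are hypothetical rather than taken from the authors' software.

```python
# Illustrative sketch: names are hypothetical, and both streams are assumed
# to start at the same instant after SMPTE-based alignment.
MOCAP_RATE_HZ = 80        # motion capture sampling rate reported in the text
SLM_RATE_HZ = 20_000      # SLM sampling rate reported in the text
SAMPLES_PER_FRAME = SLM_RATE_HZ // MOCAP_RATE_HZ  # 250 SLM samples per frame


def slm_window_for_frame(frame_index: int) -> range:
    """Return the SLM sample indices that fall within one motion capture frame."""
    start = frame_index * SAMPLES_PER_FRAME
    return range(start, start + SAMPLES_PER_FRAME)


def mean_slm_per_frame(slm_samples: list[float]) -> list[float]:
    """Average the SLM signal over each complete motion capture frame."""
    n_frames = len(slm_samples) // SAMPLES_PER_FRAME
    means = []
    for k in range(n_frames):
        window = slm_window_for_frame(k)
        chunk = slm_samples[window.start:window.stop]
        means.append(sum(chunk) / SAMPLES_PER_FRAME)
    return means
```

Averaging per frame is only one possible way to bring the two signals onto a common time base; the source does not state whether any resampling was performed.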
Figure 1. Schematic diagram showing the marker placement
configuration.
Protocol
During testing, participants repeated the sentence
"Buy Bobby a puppy" using habitual, loud, and clear
speech. The sentence was selected because it is short,
grammatically simple, and contains a sequence of bilabial
consonants and various vowels, making it ideal for
examining changes to lip and jaw movements associated
with adopting different speaking styles. For the habitual
style, participants were instructed to recite the phrase "at
a comfortable rate and loudness, as if you were speaking
to someone seated across the table." For the clear
speech style, participants were instructed to "over-enunciate
each word in the phrase." The over-enunciate
instruction was selected to elicit clear speech based on
research by Lam et al. (2012), which shows that an
instruction to over-enunciate seems to elicit changes in
speech articulation. For the loud speech style, participants
were instructed to "speak twice as loud as comfortable."
All participants repeated the phrase in the habitual
style first.
Following the habitual style, participants completed
one of two different orders based on group assignment.
Participants in the Clear–Loud Group performed the repetition
task in the clear style and then in the loud style.
Those assigned to the Loud–Clear Group performed the
repetition task in the loud style and then in the clear
style. Group assignment was quasi-random, with 11 participants
in the Clear–Loud Group and 13 participants in
the Loud–Clear Group. Under each of the three speech
style conditions, participants repeated the sentence consistently
in the specified style for approximately 15–20 s until
the experimenter asked them to stop. Given the differences
in speech rate between speakers, each talker produced a
different number of utterances. For this investigation, the
first 10 repetitions produced in each style were analyzed
to ensure a balanced number of tokens. Therefore, speech
intensity, lip aperture range, and speech rate (described
below) were measured for 30 tokens per talker, 10 in each
of the specified styles.
Data Processing
Custom software was developed using LabVIEW
2018 (National Instruments Co., 2018) to analyze the
common SMPTE time stamp recorded on each system
and to temporally align the SLM and motion capture
data. The average voltage of the fast-weighted dBA signal
from the SLM captured during the spoken phrase was
used to estimate speech intensity (SPL_mean). The average
distance between the virtual midpoint of the upper lip and
lower lip markers and the SLM, d, was computed for each
utterance produced. This distance, d, was used to correct the
SPL_mean estimates to a constant virtual distance of 0.5 m
using the following equation: $\text{Level} = L_{p@d} - 20 \log_{10}(0.5/d)$,
where $L_{p@d}$ was the average dB value, and d was the computed
distance in meters between the SLM and the mouth (e.g., Švec
& Granqvist, 2018; Whitfield et al., 2021). During the spoken
utterance, the average mouth-to-SLM distance was relatively
stable, as the average of the standard deviation in d
for each participant was 1.957 mm (SD = 1.130 mm).
Across all tokens, the mean mouth-to-SLM distance, d,
across participants was 0.627 m (SD = 0.077 m; range =
0.423 to 0.806 m). Descriptive statistics suggested that the
average distance, d, did not differ
across the habitual (M = 0.627 m, SD = 0.081 m), loud (M =
0.630 m, SD = 0.087 m), or clear (M = 0.624 m, SD = 0.073 m) style.
The mean absolute difference between the original and cor-
rected values was 3.353% (SD = 1.576) or 2.012 dB SPL
(SD = 0.888).
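As a minimal illustration of the distance correction described above, the following Python sketch applies the standard free-field relation to refer a level measured at distance d to the 0.5 m reference. The function and parameter names are hypothetical, and the sketch is not the authors' LabVIEW implementation.

```python
import math

REFERENCE_DISTANCE_M = 0.5  # constant virtual distance named in the text


def correct_spl_to_reference(level_db: float, distance_m: float,
                             reference_m: float = REFERENCE_DISTANCE_M) -> float:
    """Refer a level measured at distance_m to reference_m.

    Implements Level = L_p@d - 20 * log10(reference / d). A talker seated
    farther than the reference distance gets a positive correction.
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return level_db - 20.0 * math.log10(reference_m / distance_m)


# Example with the mean mouth-to-SLM distance reported in the text (0.627 m):
corrected = correct_spl_to_reference(70.0, 0.627)
```

At the reported mean distance of 0.627 m, the correction adds roughly 2 dB, which is in line with the mean absolute difference between original and corrected values reported above.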
The motion capture data were also analyzed using
custom NI LabVIEW 2018 software to evaluate speech
kinematics during each trial. All motion capture time
series data were first processed using a low-pass, second-
order Butterworth filter with phase correction and a
20 Hz cutoff frequency. Although this filtering approach
is conventional for human movement data processing,
additional visual inspection of the motion capture spectra
was conducted prior to filtering to verify that there was
no significant frequency response above 10 Hz. The three
motion capture markers affixed to the upper skull (i.e.,
midfrontal ridge and two zygomatic arches) were used to
develop a unique three-dimensional (3D) coordinate sys-
tem for the head, which would serve as a local spatial ref-
erence for movements of the markers surrounding the
mouth. Without this consideration, movements of the lip
and jaw are combined with global movements of the skull,
obfuscating speech-specific kinematics and conflating head
mannerisms with functional movements of the mouth. The
3D positions of the markers surrounding the mouth (i.e.,
upper and lower lips and right and left mouth corners)
were rotated via matrix transformation into the head local
coordinate system, so that all speech kinematic measures
were relative to the skull. Next, lip aperture range of
motion (LA_ROM) was determined for each trial by calculating
the range of distances between the upper and lower lip
markers (i.e., lip aperture_max − lip aperture_min).
Speech rate was derived from the acoustic signal
using the waveform display in LabChart. The duration in
seconds of each utterance produced was extracted. The
speech rate of each utterance was computed as the number
of syllables produced during each utterance of "Buy
Bobby a puppy" (i.e., 6) divided by the measured utterance
duration in seconds (i.e., syl/s).
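The speech rate computation amounts to a single division, shown here as a small Python sketch. The function name is hypothetical, and the example duration is made up for illustration rather than taken from the data.

```python
SYLLABLES_IN_SENTENCE = 6  # "Buy Bobby a puppy", as stated in the text


def speech_rate_syl_per_s(duration_s: float,
                          n_syllables: int = SYLLABLES_IN_SENTENCE) -> float:
    """Return speech rate in syllables per second for one utterance."""
    if duration_s <= 0:
        raise ValueError("utterance duration must be positive")
    return n_syllables / duration_s


# Hypothetical example: an utterance lasting 1.5 s gives 4 syl/s.
example_rate = speech_rate_syl_per_s(1.5)
```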
Statistical Analysis
Three linear mixed model (LMM) analyses were
conducted to evaluate global changes in the primary
dependent variables associated with speech style and
order. The use of LMM analyses is preferable to tradi-
tional statistical analyses because of reduced sensitivity to
sphericity and homogeneity of variance (Quené & Van
den Bergh, 2004). Additionally, these models are better
suited for repeated-measures analysis of nested data structures
than general linear models. Models were constructed
using SPL_mean, LA_ROM, and speech rate as the dependent
variables. For each model, the independent variables of
Group (Clear–Loud vs. Loud–Clear) and Speech Style
(habitual, loud, and clear) were specified as fixed effects,
including the interaction between Group and Style. Style
was specified as a random slope term, and participant was
specified as a random intercept. The intercept for each
model was initially mapped to the habitual speaking style
condition performed by the Loud–Clear Group. Releveling
involves mapping the intercept to a different group or
style to examine a particular contrast. For this study, we
decided to relevel the models to quantify potential differ-
ences between groups and the pattern of change between
speaking styles. Random effects were examined, and indi-
vidual means were plotted to characterize individual varia-
tion in the pattern of results for each dependent variable.
Statistical analyses for this study were conducted using the
lmer function (lme4 package) and the lmerTest package in R (Version 4.2.1).
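One way to write out the model structure described above is given below. This is a sketch under stated assumptions: dummy (treatment) coding with the habitual style and the Loud–Clear Group as the reference levels, and an unstructured covariance among the random effects, neither of which is confirmed by the source. Here y_ij is the j-th observation for participant i, G_i equals 1 for the Clear–Loud Group and 0 otherwise, and L_ij and C_ij indicate the loud and clear styles.

```latex
y_{ij} = \beta_0 + \beta_1 G_i + \beta_2 L_{ij} + \beta_3 C_{ij}
       + \beta_4 G_i L_{ij} + \beta_5 G_i C_{ij}
       + u_{0i} + u_{1i} L_{ij} + u_{2i} C_{ij} + \varepsilon_{ij}
```

Under this coding, β0 is the habitual-style mean for the Loud–Clear Group and β1 is the group difference in the habitual style. Releveling to the loud style recodes the indicators so that β1 becomes the group difference in the loud style. In lme4 syntax, a model of this form would correspond roughly to y ~ Group * Style + (Style | Participant), where the variable names are assumed.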
Prior to constructing the statistical models, the data
were screened for skewness and outliers. Measurement
observations that were greater than three standard devia-
tions from the mean were removed to ensure they did not
influence the model. Across the three dependent variables,
less than 3% of observations were excluded. After constructing
each LMM, Q-Q and residual plots were examined
and confirmed an adequate fit of the model. An
alpha level of .05 and the 95% confidence intervals of
the estimates were used to guide statistical interpretation.
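The three-standard-deviation screening rule described above can be sketched in Python as follows. The source does not state whether the mean and standard deviation were computed over all observations or within each talker or style, so the grouping is left to the caller; the function name is hypothetical.

```python
import statistics


def screen_outliers(values: list[float], n_sd: float = 3.0) -> list[float]:
    """Drop observations more than n_sd sample SDs from the mean.

    Assumption: the mean and SD are computed over the values passed in;
    the grouping used in the study is not specified in the source.
    """
    if len(values) < 2:
        return list(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mean) <= n_sd * sd]


# Hypothetical example: one extreme value among 20 identical observations.
screened = screen_outliers([10.0] * 20 + [100.0])
```

The example data are invented to show the rule removing a single extreme value; they do not represent measurements from the study.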
Results
SPL
To determine the extent to which style order
affected the speech intensity response, an LMM was constructed
using SPL_mean as the dependent variable. Parameter
estimates and associated model elements are depicted
in Figure 2a. The model revealed no differences between
the Clear–Loud Group and Loud–Clear Group for the
habitual style (Est. = 1.472 dB SPL, SE = 1.537, t[22.0] =
.958, p = .349). As expected, talkers in the Loud–Clear
Group exhibited increases in speech intensity for both the
loud (Est. = 8.772 dB SPL, SE = 1.201, t[22.0] = 7.302,
p < .001) and clear speech (Est. = 5.832 dB SPL, SE =
1.153, t[22.0] = 4.668, p < .001) styles compared with the
habitual style. Parameter estimates revealed that the relative
effects in the Clear–Loud Group did not statistically
differ (p > .05). Releveling the model to the loud style
revealed that participants in the Clear–Loud Group exhibited
greater speech intensity for the loud style than participants
in the Loud–Clear Group (Est. = 4.261 dB SPL,
SE = 1.468, t[22.0] = 2.903, p = .008). Finally, the releveled
LMM revealed that the intensity difference between
the loud and the clear style was greater for the participants
in the Clear–Loud Group than those in the Loud–Clear
Group (Est. = 4.400 dB SPL, SE = 1.26, t[22.0] =
3.483, p = .002).
This finding is confirmed in the percent difference
calculations presented in Table 1, which reports the average
percent difference between the habitual and loud style,
habitual and clear style, and loud and clear style for each
group. Note that talkers in the Clear–Loud Group exhibited
a greater difference in dB SPL between the clear
and loud styles compared with talkers in the Loud–Clear
Group. Random effects estimates revealed that there was
significant variation in the style-related change between
participants (p < .001). The mean intensity for each participant
in each style is plotted in Figure 3. This plot highlights
the between-participant variability in these trends. In
general, the speech intensity values for the Clear–Loud
Group were more tightly clustered than in the Loud–Clear
Group. All talkers in the Clear–Loud Group exhibited the
greatest speech intensity in the loud style, which was performed
last. Additionally, the difference in speech intensity
between the loud and clear styles was greater for talkers
in the Clear–Loud Group compared with the Loud–Clear
Group. The majority of participants in the Loud–Clear
Group exhibited a greater speech intensity in the loud
compared with the habitual style. However, two participants
in the Loud–Clear Group exhibited speech intensity
in the clear style that was comparable with or greater
than their intensity in the loud style.
Figure 2. Estimated means and 95% confidence intervals for (a)
the mean of the fast, A-weighted sound pressure level (SPL_AF),
(b) lip aperture (LA) range of motion in mm, and (c) speech
rate in syllables per second (syl/s) for the habitual, loud, and
clear speech styles performed by the Loud–Clear Group (order:
habitual-loud-clear) and the Clear–Loud Group (order:
habitual-clear-loud).
These trends, along with fixed effects estimates, suggest
that the gains in speech intensity were greater for participants
who were instructed to over-enunciate and then
speak twice as loud as comfortable compared with those
who received the instruction to speak twice as loud as
comfortable, followed by the instruction to over-enunciate.
Individual level data confirmed this overall trend but also
indicated that two talkers in the Loud–Clear Group exhibited
speech intensity in the clear style that was similar to or
higher than their intensity in the loud style.
Lip Aperture Range of Motion
An LMM analysis was completed using LA_ROM as
the dependent variable to examine changes in articulatory
range of motion between styles. Parameter estimates are
plotted in Figure 2b. Fixed effects estimates revealed that
there was no difference in LA_ROM between groups for the
habitual style (p = .492). As expected, LA_ROM was significantly
affected by speaking style. The Loud–Clear Group
exhibited greater LA_ROM for the loud (Est. = 4.554 mm,
SE = 1.139, t[22.0] = 3.998, p < .001) and clear speech
(Est. = 9.640 mm, SE = 1.656, t[22.0] = 5.821, p < .001)
styles compared with the habitual style. Parameter estimates
revealed that the relative changes between the habitual
and clear style for the Clear–Loud Group were statistically
similar (p > .05). Releveling the model revealed
that the talkers in the Loud–Clear Group exhibited a significant
increase in LA_ROM from the loud to clear speech
style (Est. = 5.086 mm, SE = 1.345, t[22.0] = 3.780, p =
.001). Conversely, talkers in the Clear–Loud Group exhibited
no differences in LA_ROM between the clear and loud
styles (p = .517).
This finding is generally reflected in Table 1, which
indicates that the percent difference between the clear and
loud styles was larger, on average, for the Loud–Clear
Group compared with the Clear–Loud Group, although
between-participant variation was high. Random effect
estimates confirmed there was significant variation in the
style-related change in LA_ROM between participants (p <
.001). The mean LA_ROM for each participant in each style
is plotted in Figure 3 to highlight between-participant variability
in these trends. For all but one participant in the
Loud–Clear Group, LA_ROM increased or was similar
between the loud and clear styles. The pattern in the
Clear–Loud Group was more variable than in the Loud–Clear
Group. One participant in the Clear–Loud Group
exhibited little-to-no change in LA_ROM across styles, and
one exhibited a reduction in LA_ROM between the habitual
and loud styles. Seven of the 11 participants in the Clear–Loud
Group exhibited greater or comparable LA_ROM in
the clear compared with the loud style. In comparison,
four exhibited greater LA_ROM in the loud style.
Table 1. M and SD of the percent difference between each speaking style produced.
Measure               Habitual vs. Loud   Habitual vs. Clear   Clear vs. Loud
dB SPL
  Loud-Clear Group    15.7 (9.27)         9.65 (8.39)          5.09 (4.74)
  Clear-Loud Group    19.81 (7.28)        6.52 (6.65)          11.02 (4.14)
LA_ROM
  Loud-Clear Group    27.83 (16.35)       58.85 (45)           23.42 (25.39)
  Clear-Loud Group    31.19 (32.94)       34.71 (28.51)        4.53 (17.09)
Speech rate
  Loud-Clear Group    2.15 (8.19)         29.9 (17.87)         27.96 (18.25)
  Clear-Loud Group    15.26 (14.16)       22.3 (17.54)         9.17 (10.87)
Note. Values are M (SD). The Loud-Clear Group performed the task in the habitual-loud-clear order, and the Clear-Loud Group performed the task in the habitual-clear-loud order. SPL = sound pressure level; LA_ROM = lip aperture range of motion.
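For readers who wish to reproduce summary statistics of the kind shown in Table 1 from participant-level data, the calculation can be sketched as follows. This is a minimal illustration in Python, not the authors' analysis code. It assumes that percent difference is computed per talker as the change relative to the first-named style, and the talker identifiers and LA_ROM values are hypothetical.

```python
from statistics import mean, stdev


def percent_difference(reference, comparison):
    """Percent change of `comparison` relative to `reference` (assumed definition)."""
    return (comparison - reference) / reference * 100.0


# Hypothetical LA_ROM values in mm for three talkers (not study data).
talkers = {
    "P01": {"habitual": 10.0, "loud": 13.0, "clear": 16.0},
    "P02": {"habitual": 12.0, "loud": 15.0, "clear": 15.0},
    "P03": {"habitual": 8.0, "loud": 10.0, "clear": 14.0},
}


def summarize(data, reference, comparison):
    """Return the mean and sample SD of per-talker percent differences."""
    diffs = [
        percent_difference(styles[reference], styles[comparison])
        for styles in data.values()
    ]
    return mean(diffs), stdev(diffs)


m, sd = summarize(talkers, "habitual", "loud")
print(f"Habitual vs. Loud: M = {m:.2f}, SD = {sd:.2f}")
```

Under this assumed definition, each cell of the table would correspond to one call to `summarize` for a given measure, group, and pair of styles.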
In summary, talkers in the Loud-Clear Group, who were instructed to produce the sentence first in the habitual style, then in the loud style, and finally in the clear style, exhibited significant increases in LA_ROM with each successive style. Parameter estimates indicated that talkers in the Clear-Loud Group, who produced the phrase first in the habitual style, then in the clear style, and finally in the loud style, may have retained the increases in LA_ROM associated with clear speech in the loud style. However, examination of individual participant data indicated that participants in the Clear-Loud Group exhibited a more variable pattern, with some exhibiting the largest LA_ROM in the loud style and others in the clear style.
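The releveling step used in the models above can be illustrated with a simplified sketch. In a treatment-coded model, each style coefficient is a contrast against the reference level, so changing the reference style or group changes which pairwise comparison the coefficient tests. The Python example below assumes a saturated group-by-style model fit to balanced data, in which the fixed-effect coefficients reduce to differences between cell means. It omits the random effects of the actual LMM, and the cell means are hypothetical values chosen for illustration.

```python
# Hypothetical cell means of LA_ROM in mm (illustrative only, not study data).
cell_means = {
    ("Loud-Clear", "habitual"): 16.0,
    ("Loud-Clear", "loud"): 20.5,
    ("Loud-Clear", "clear"): 25.5,
    ("Clear-Loud", "habitual"): 15.0,
    ("Clear-Loud", "clear"): 21.0,
    ("Clear-Loud", "loud"): 20.0,
}


def simple_effects(means, ref_group, ref_style):
    """Style coefficients of a saturated, treatment-coded model.

    Each coefficient is the cell mean of a style minus the cell mean of the
    reference style, both taken within the reference group.
    """
    base = means[(ref_group, ref_style)]
    styles = sorted(s for (g, s) in means if g == ref_group and s != ref_style)
    return {s: means[(ref_group, s)] - base for s in styles}


# Default coding: Loud-Clear Group and habitual style as reference levels.
default_fit = simple_effects(cell_means, "Loud-Clear", "habitual")
# {'clear': 9.5, 'loud': 4.5}

# Releveled style: the 'clear' coefficient now tests clear vs. loud.
releveled_style = simple_effects(cell_means, "Loud-Clear", "loud")
# {'clear': 5.0, 'habitual': -4.5}

# Releveled group and style: contrasts within the Clear-Loud Group.
releveled_group = simple_effects(cell_means, "Clear-Loud", "clear")
# {'habitual': -6.0, 'loud': -1.0}
```

The fitted values do not change when a model is releveled. Only the comparisons expressed by the coefficients change, which is why the loud-versus-clear contrast becomes directly testable after releveling.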
Speech Rate
A final LMM analysis was conducted using speech rate as the dependent variable to quantify changes in speech timing. Parameter estimates are plotted in Figure 2c. No differences in speech rate were observed between groups for the habitual style (p = .722). As expected, speech rate significantly differed with speaking style. For the Loud-Clear Group, a significant reduction in speech rate was observed during the clear style (Est. = 1.47 syl/s, SE = 0.270, t[22.0] = 5.466, p < .001), compared with the habitual style. No differences in speech rate were observed between the loud and habitual styles (p = .530) for participants in the Loud-Clear Group. This trend was different from the Clear-Loud Group, who exhibited a significantly slower speech rate in the loud style compared with the Loud-Clear Group (Est. = 0.692 syl/s, SE = 0.265, t[22.0] = 2.611, p = .016). Releveling the model revealed that compared with the habitual style, speech rates produced by the Clear-Loud Group were slower for both the clear (Est. = 1.154 syl/s, SE = 0.293, t[22.0] = 3.934, p < .001) and the loud (Est. = 0.806 syl/s, SE = 0.195, t[22.0] = 4.135, p < .001) styles.
This finding is confirmed in Table 1, which indicates that the percent decrease in speech rate between the clear and loud styles was larger for the Loud-Clear Group compared with the Clear-Loud Group. Random effect estimates revealed there was significant variation in the style-related change in speech rate between participants (p < .001). The mean speech rate for each participant in each style is plotted in Figure 3c to highlight between-participant variability in these trends. Seven of 13 participants in the Loud-Clear Group exhibited speaking rates in the loud style that were within ±2% of their habitual speech rates, and one exhibited a substantially faster rate in the loud compared with the habitual style. The remainder exhibited slower speaking rates in the loud style compared with the habitual style. Additionally, 12 of 13 participants in the Loud-Clear Group exhibited a substantially slower speech rate in the clear compared with the loud style. In contrast, all but one participant in the Clear-Loud Group exhibited a speaking rate in the loud condition that was slower than their rate in the habitual style. Additionally, the speaking rate differences between the loud and clear styles were smaller for the Clear-Loud Group compared with the Loud-Clear Group.
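The individual-level descriptions above can be expressed as a simple classification rule. The Python sketch below is illustrative only. It assumes the similarity criterion is a band of ±2% around each talker's habitual rate, and the function name, talker identifiers, and rates (in syl/s) are hypothetical.

```python
def classify_rate_change(habitual, loud, band_pct=2.0):
    """Label the loud-style rate relative to the habitual rate.

    Rates within +/- band_pct percent of the habitual rate count as
    'similar'; otherwise the label reflects the direction of change.
    """
    change = (loud - habitual) / habitual * 100.0
    if abs(change) <= band_pct:
        return "similar"
    return "slower" if change < 0 else "faster"


# Hypothetical (habitual, loud) speech rates in syl/s, not study data.
rates = {
    "P01": (4.50, 4.45),  # about -1.1%
    "P02": (4.20, 3.60),  # about -14.3%
    "P03": (4.00, 4.40),  # +10%
}

labels = {pid: classify_rate_change(h, l) for pid, (h, l) in rates.items()}
print(labels)
```

Counting the labels within each group would give tallies of the kind reported in the text, such as the number of talkers whose loud-style rate stayed close to their habitual rate.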
In summary, talkers who were first instructed to speak twice as loud as comfortable and then over-enunciate exhibited only a slight reduction, on average, in speech rate from the habitual to the loud style and a substantial reduction in speech rate between the loud and clear styles. Conversely, talkers who were first instructed to over-enunciate and then to speak twice as loud as comfortable exhibited significantly slower rates in the loud
style, likely reflecting a carry-over effect in the rate reduction associated with the clear style. Although there was some intertalker variability, most participants followed these trends.
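As a consistency check on the fixed-effect estimates reported in the Results, each t statistic should approximately equal the estimate divided by its standard error. The Python sketch below applies this check to the values printed for LA_ROM and speech rate. It compares magnitudes only, as printed, and small discrepancies are expected because the printed estimates and standard errors are rounded. The comparison labels are shorthand introduced here for illustration.

```python
# Values as printed in the text:
# (comparison, estimate, standard error, reported t).
reported = [
    ("LA_ROM, Loud-Clear: loud vs. habitual", 4.554, 1.139, 3.998),
    ("LA_ROM, Loud-Clear: clear vs. habitual", 9.640, 1.656, 5.821),
    ("LA_ROM, Loud-Clear: clear vs. loud", 5.086, 1.345, 3.780),
    ("Rate, Loud-Clear: clear vs. habitual", 1.47, 0.270, 5.466),
    ("Rate, loud style: Clear-Loud vs. Loud-Clear", 0.692, 0.265, 2.611),
    ("Rate, Clear-Loud: clear vs. habitual", 1.154, 0.293, 3.934),
    ("Rate, Clear-Loud: loud vs. habitual", 0.806, 0.195, 4.135),
]


def t_ratio(estimate, standard_error):
    """Wald-type t statistic: estimate divided by its standard error."""
    return estimate / standard_error


# Absolute gap between the recomputed ratio and the reported t value.
gaps = {label: abs(t_ratio(est, se) - t) for label, est, se, t in reported}
max_gap = max(gaps.values())

for label, gap in gaps.items():
    print(f"{label}: |Est/SE - t| = {gap:.3f}")
```

The largest gap, roughly 0.02, occurs for the estimate printed to only two decimal places (1.47 syl/s), which is consistent with rounding of the printed values.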
Figure 3. Individual trends showing changes in (a) the mean of the fast, A-weighted sound pressure level (SPL_AF), (b) lip aperture (LA) range of motion in mm, and (c) speech rate in syllables per second (syl/s) for the habitual, loud, and clear speech styles for each participant in the Loud-Clear Group (order: habitual-loud-clear) and the Clear-Loud Group (order: habitual-clear-loud).
Discussion
The purpose of this investigation was to determine the extent to which the order of speaking style performance impacts the speech production response associated with adopting higher effort speaking styles, namely, clear and loud speech. In the current investigation, talkers were assigned to one of two experimental groups. Participants assigned to the Loud-Clear Group were first instructed to repeat the phrase "Buy Bobby a puppy" at a comfortable rate and loudness (the habitual style). The habitual style was followed by an instruction to repeat the phrase twice as loud as comfortable (the loud style) and, finally, to over-enunciate each word (the clear style). Participants assigned to the Clear-Loud Group first repeated the phrase in the habitual style and were then instructed to repeat the phrase and over-enunciate each word and, finally, to repeat the phrase twice as loud as comfortable. The groups exhibited significant differences in the pattern of response as measured by speech intensity, articulatory range of motion, and speech rate.
Specifically, group-level data suggest that talkers who produced the habitual-clear-loud order exhibited greater speech intensity during the loud style than participants who produced the habitual-loud-clear order. Relative to articulation, talkers who produced the habitual-loud-clear order exhibited significant increases in lip aperture range of motion between the habitual and loud styles and between the loud and clear styles. Conversely, talkers who produced the habitual-clear-loud order seemed to retain the increases in lip aperture range of motion associated with the clear style when performing the loud style. Relative to rate, participants in the two groups exhibited statistically similar reductions in speaking rate between the clear and habitual styles. Participants who performed the habitual-loud-clear order exhibited similar speaking rates in the habitual and loud styles. Conversely, participants who performed the habitual-clear-loud order exhibited a significantly slower speaking rate in the loud than the habitual style, likely indicating participants retained the decreases in speaking rate associated with the clear style when performing the loud style. Overall, the pattern of results suggests that talkers carry over salient features of a specific higher effort style into subsequent speaking styles.
The carry-over effects in this study were observed regardless of which higher effort style was performed after the habitual style. For example, the habitual-clear-loud order led to a greater speech intensity in the loud condition than did the habitual-loud-clear order. Additionally, the articulatory features associated with clear speech were retained when the loud style was produced after the clear style. Therefore, the carry-over effects observed seemed to be constructive in nature, meaning that components of the
previously performed style added to the features of the subsequent style rather than negating any style-related effects of the subsequent speech style. In support of this claim, carry-over effects generally enhanced the changes in speech intensity, lip and jaw displacements, and speaking rate associated with the higher effort styles in comparison to the habitual style.
Thus, although we do not suggest that carry-over effects negate the findings reported in prior studies that have examined several speaking styles using a within-subject design, it is possible that speaking styles performed later in the sequence exhibited additive carry-over effects associated with the previously performed high-effort styles. For example, Lam et al. (2012) found that hearing impaired and over-enunciate conditions led to the greatest acoustic changes in the phonatory and articulatory systems, respectively, compared with a generic instruction to speak clearly. Although this is likely the case due to the specificity of these instructions, it is possible that carry-over effects from the generic clear speech condition influenced the effects measured in the hearing impaired and over-enunciate conditions, which were both produced after the habitual and generic clear speech instruction. Likewise, Mefferd (2017) examined changes in jaw and tongue displacements between habitual, slow, loud, and clear styles elicited in that order. Contrary to the hypothesis outlined in the study, the loud style yielded similar jaw displacements to the slow style, and the clear style yielded the largest jaw displacements. Although differences in tongue- and jaw-specific patterns were observed between styles, it is possible that carry-over effects made these differences less clear. In both studies, the authors mention order effects as a potential limitation and provide reasonable interpretations of the data. Data from this investigation provide evidence of an order effect but do not negate the findings of prior work. Instead, the current data suggest that prior instruction may modulate the degree of style-related response observed following subsequent instruction.
Individual Variability
The models revealed significant variation in the style-related changes between participants for measures of speech intensity, lip aperture range, and speech rate. Although this was not the primary aim of the study, continuing to document individual variability associated with adopting these higher effort styles remains an important finding and ideological project in the literature (e.g., Ferguson, 2004; Ferguson & Kewley-Port, 2007; Koenig & Fuchs, 2019; Perkell et al., 2002; Smiljanić & Bradlow, 2005; Whitfield et al., 2018). In this study, individual trend data may qualify the nature of the group-level
carry-over effects observed for some talkers. For example, individual trends indicate that participants who performed the habitual-clear-loud order exhibited variable differences in lip aperture between the loud and clear styles, with some showing the largest range of motion in the loud style and others in the clear style.
Another example of individual variation can be observed in the speech intensity data. The fixed effect parameter estimates revealed that performing the habitual-clear-loud order led to increases in vocal loudness with each successive style, yielding a greater intensity difference between the loud and clear styles for this group compared with the participants who performed the habitual-loud-clear order. Although there were no between-group differences in speech intensity for the clear style, two talkers who performed the habitual-loud-clear order exhibited similar or higher dB SPL in the clear style compared with the loud style. This may suggest that these talkers retained the gains in speech intensity associated with loud speech when performing the clear style. As in this study, authors often report significant between-talker variation when examining clear and loud speech, suggesting that talkers adopt individual strategies when performing these higher effort styles (e.g., Ferguson & Kewley-Port, 2007; Perkell et al., 2002; Smiljanić & Bradlow, 2005; Whitfield et al., 2018).
Caveats, Limitations, and Potential Implications
There are potential caveats and limitations to the current investigation. A key factor that must be considered when interpreting the current results is that the participant sample was primarily composed of women. It is possible that the magnitudes and patterns observed in the current sample may differ across a range of sociodemographic factors, including the full range of gender identity and expression. Future work should adequately include and report data on talkers from a range of sociodemographic backgrounds.
Another consideration was that the protocol was intended to maximize potential carry-over effects. Specifically, participants transitioned between successive speech styles with little-to-no breaks. Additionally, a simple sentence repetition task was used to examine speech production. These methodological choices were made intentionally to determine the maximum degree of carry-over effects that would be expected for this type of study. Other authors have varied the speech stimuli and instituted protocols to account for potential carry-over effects (e.g., Huber & Chandrasekaran, 2006; Kearney et al., 2017; Lam et al., 2012; Whitfield et al., 2021). For example, Lam et al. (2012) engaged participants in a
conversation between each successive speaking style to minimize carry-over effects. Therefore, one might expect the degree of carry-over between successive styles to be lower in an investigation that employs distractor protocols. Future studies should examine the impact of such mitigating strategies on carry-over effects.
Additionally, data from this study may have some initial, although limited, clinical utility. Specifically, this study suggests that the use of successive instructions to speak clearer and louder than usual has a compounding carry-over effect that may be advantageous to clients with hypophonia, articulatory imprecision, and a faster-than-normal rate, which are often observed in talkers with hypokinetic dysarthria. Future studies should continue to systematically examine the utility of combining several higher effort speaking styles during treatment sessions that address multiple speech motor impairments (e.g., Lam & Tjaden, 2016; Ramig et al., 2018).
Data Availability Statement
The datasets generated and/or analyzed during the
current study are not publicly available but are available
from the corresponding author on reasonable request.
Acknowledgments
Portions of this work were supported by internal
funding from Bowling Green State University received by
Jason Whitfield.
References
ADInstruments. (2016). LabChart (Version 7) [Computer software].
Ferguson, S. H. (2004). Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners. The Journal of the Acoustical Society of America, 116(4, Pt. 1), 2365–2373. https://doi.org/10.1121/1.1788730
Ferguson, S. H., & Kewley-Port, D. (2007). Talker differences in clear and conversational speech: Acoustic characteristics of vowels. Journal of Speech, Language, and Hearing Research, 50(5), 1241–1255. https://doi.org/10.1044/1092-4388(2007/087)
Huber, J. E., & Chandrasekaran, B. (2006). Effects of increasing sound pressure level on lip and jaw movement parameters and consistency in young adults. Journal of Speech, Language, and Hearing Research, 49(6), 1368–1379. https://doi.org/10.1044/1092-4388(2006/098)
Kearney, E., Giles, R., Haworth, B., Faloutsos, P., Baljko, M., & Yunusova, Y. (2017). Sentence-level movements in Parkinson's disease: Loud, clear, and slow speech. Journal of Speech, Language, and Hearing Research, 60(12), 3426–3440. https://doi.org/10.1044/2017_JSLHR-S-17-0075
Koenig, L. L., & Fuchs, S. (2019). Vowel formants in normal and loud speech. Journal of Speech, Language, and Hearing Research, 62(5), 1278–1295. https://doi.org/10.1044/2018_JSLHR-S-18-0043
Lam, J., & Tjaden, K. (2016). Clear speech variants: An acoustic study in Parkinson's disease. Journal of Speech, Language, and Hearing Research, 59(4), 631–646. https://doi.org/10.1044/2015_JSLHR-S-15-0216
Lam, J., Tjaden, K., & Wilding, G. (2012). Acoustics of clear speech: Effect of instruction. Journal of Speech, Language, and Hearing Research, 55(6), 1807–1821. https://doi.org/10.1044/1092-4388(2012/11-0154)
Mefferd, A. S. (2017). Tongue- and jaw-specific contributions to acoustic vowel contrast changes in the diphthong /ai/ in response to slow, loud, and clear speech. Journal of Speech, Language, and Hearing Research, 60(11), 3144–3158. https://doi.org/10.1044/2017_JSLHR-S-17-0114
Mefferd, A. S., & Green, J. R. (2010). Articulatory-to-acoustic relations in response to speaking rate and loudness manipulations. Journal of Speech, Language, and Hearing Research, 53(5), 1206–1219. https://doi.org/10.1044/1092-4388(2010/09-0083)
National Instruments Co. (2018). LabVIEW (Version 2018) [Computer software].
Perkell, J. S., Zandipour, M., Matthies, M. L., & Lane, H. (2002). Economy of effort in different speaking conditions. I. A preliminary study of intersubject differences and modeling issues. The Journal of the Acoustical Society of America, 112(4), 1627–1641. https://doi.org/10.1121/1.1506369
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1986). Speaking clearly for the hard of hearing II: Acoustic characteristics of clear and conversational speech. Journal of Speech and Hearing Research, 29(4), 434–446. https://doi.org/10.1044/jshr.2904.434
Quené, H., & Van den Bergh, H. (2004). On multi-level modeling of data from repeated measures designs: A tutorial. Speech Communication, 43(1–2), 103–121. https://doi.org/10.1016/j.specom.2004.02.004
Ramig, L., Halpern, A., Spielman, J., Fox, C., & Freeman, K. (2018). Speech treatment in Parkinson's disease: Randomized controlled trial (RCT). Movement Disorders, 33(11), 1777–1791. https://doi.org/10.1002/mds.27460
Searl, J., & Evitts, P. M. (2013). Tongue–palate contact pressure, oral air pressure, and acoustics of clear speech. Journal of Speech, Language, and Hearing Research, 56(3), 826–839. https://doi.org/10.1044/1092-4388(2012/11-0337)
Smiljanić, R., & Bradlow, A. R. (2005). Production and perception of clear speech in Croatian and English. The Journal of the Acoustical Society of America, 118(3), 1677–1688. https://doi.org/10.1121/1.2000788
Smiljanić, R., & Gilbert, R. C. (2017). Intelligibility of noise-adapted and clear speech in child, young adult, and older adult talkers. Journal of Speech, Language, and Hearing Research, 60(11), 3069–3080. https://doi.org/10.1044/2017_JSLHR-S-16-0165
Švec, J. G., & Granqvist, S. (2018). Tutorial and guidelines on measurement of sound pressure level in voice and speech. Journal of Speech, Language, and Hearing Research, 61(3), 441–461. https://doi.org/10.1044/2017_JSLHR-S-17-0095
Tjaden, K., Lam, J., & Wilding, G. (2013). Vowel acoustics in Parkinson's disease and multiple sclerosis: Comparison of clear, loud, and slow speaking conditions. Journal of Speech, Language, and Hearing Research, 56(5), 1485–1502. https://doi.org/10.1044/1092-4388(2013/12-0259)
Tjaden, K., Sussman, J. E., & Wilding, G. E. (2014). Impact of clear, loud, and slow speech on scaled intelligibility and speech severity in Parkinson's disease and multiple sclerosis. Journal of Speech, Language, and Hearing Research, 57(3), 779–792. https://doi.org/10.1044/2014_JSLHR-S-12-0372
Tjaden, K., & Wilding, G. E. (2004). Rate and loudness manipulations in dysarthria: Acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research, 47(4), 766–783. https://doi.org/10.1044/1092-4388(2004/058)
Weisser, A., & Buchholz, J. M. (2019). Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions. The Journal of the Acoustical Society of America, 145(1), 349–360. https://doi.org/10.1121/1.5087567
Whitfield, J. A., Dromey, C., & Palmer, P. (2018). Examining acoustic and kinematic measures of articulatory working space: Effects of speech intensity. Journal of Speech, Language, and Hearing Research, 61(5), 1104–1117. https://doi.org/10.1044/2018_JSLHR-S-17-0388
Whitfield, J. A., Holdosh, S. R., Kriegel, Z., Sullivan, L. E., & Fullenkamp, A. M. (2021). Tracking the costs of clear and loud speech: Interactions between speech motor control and concurrent visuomotor tracking. Journal of Speech, Language, and Hearing Research, 64(6S), 2182–2195. https://doi.org/10.1044/2020_JSLHR-20-00264
Whitfield, J. A., & Mehta, D. D. (2019). Examination of clear speech in Parkinson disease using measures of working vowel space. Journal of Speech, Language, and Hearing Research, 62(7), 2082–2098. https://doi.org/10.1044/2019_JSLHR-S-MSC18-18-0189
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Purpose Prior work has demonstrated that competing tasks impact habitual speech production. The purpose of this investigation was to quantify the extent to which clear and loud speech are affected by concurrent performance of an attention-demanding task. Method Speech kinematics and acoustics were collected while participants spoke using habitual, loud, and clear speech styles. The styles were performed in isolation and while performing a secondary tracking task. Results Compared to the habitual style, speakers exhibited expected increases in lip aperture range of motion and speech intensity for the clear and loud styles. During concurrent visuomotor tracking, there was a decrease in lip aperture range of motion and speech intensity for the habitual style. Tracking performance during habitual speech did not differ from single-task tracking. For loud and clear speech, speakers retained the gains in speech intensity and range of motion, respectively, while concurrently tracking. A reduction in tracking performance was observed during concurrent loud and clear speech, compared to tracking alone. Conclusions These data suggest that loud and clear speech may help to mitigate motor interference associated with concurrent performance of an attention-demanding task. Additionally, reductions in tracking accuracy observed during concurrent loud and clear speech may suggest that these higher effort speaking styles require greater attentional resources than habitual speech.
Article
Full-text available
Purpose The purpose of the current study was to characterize clear speech production for speakers with and without Parkinson disease (PD) using several measures of working vowel space computed from frequently sampled formant trajectories. Method The 1st 2 formant frequencies were tracked for a reading passage that was produced using habitual and clear speaking styles by 15 speakers with PD and 15 healthy control speakers. Vowel space metrics were calculated from the distribution of frequently sampled formant frequency tracks, including vowel space hull area, articulatory–acoustic vowel space, and multiple vowel space density (VSD) measures based on different percentile contours of the formant density distribution. Results Both speaker groups exhibited significant increases in the articulatory–acoustic vowel space and VSD 10 , the area of the outermost (10th percentile) contour of the formant density distribution, from habitual to clear styles. These clarity-related vowel space increases were significantly smaller for speakers with PD than controls. Both groups also exhibited a significant increase in vowel space hull area; however, this metric was not sensitive to differences in the clear speech response between groups. Relative to healthy controls, speakers with PD exhibited a significantly smaller VSD 90 , the area of the most central (90th percentile), densely populated region of the formant space. Conclusions Using vowel space metrics calculated from formant traces of the reading passage, the current work suggests that speakers with PD do indeed reach the more peripheral regions of the vowel space during connected speech but spend a larger percentage of the time in more central regions of formant space than healthy speakers. 
Additionally, working vowel space metrics based on the distribution of formant data suggested that speakers with PD exhibited less of a clarity-related increase in formant space than controls, a trend that was not observed for perimeter-based measures of vowel space area.
Article
Full-text available
Purpose: This study evaluated how 1st and 2nd vowel formant frequencies (F1, F2) differ between normal and loud speech in multiple speaking tasks to assess claims that loudness leads to exaggerated vowel articulation. Method: Eleven healthy German-speaking women produced normal and loud speech in 3 tasks that varied in the degree of spontaneity: reading sentences that contained isolated /i: a: u:/, responding to questions that included target words with controlled consonantal contexts but varying vowel qualities, and a recipe recall task. Loudness variation was elicited naturalistically by changing interlocutor distance. First and 2nd formant frequencies and average sound pressure level were obtained from the stressed vowels in the target words, and vowel space area was calculated from /i: a: u:/. Results: Comparisons across many vowels indicated that high, tense vowels showed limited formant variation as a function of loudness. Analysis of /i: a: u:/ across speech tasks revealed vowel space reduction in the recipe retell task compared to the other 2. Loudness changes for F1 were consistent in direction but variable in extent, with few significant results for high tense vowels. Results for F2 were quite varied and frequently not significant. Speakers differed in how loudness and task affected formant values. Finally, correlations between sound pressure level and F1 were generally positive but varied in magnitude across vowels, with the high tense vowels showing very flat slopes. Discussion: These data indicate that naturalistically elicited loud speech in typical speakers does not always lead to changes in vowel formant frequencies and call into question the notion that increasing loudness is necessarily an automatic method of expanding the vowel space.
Article
Full-text available
Background As many as 89% of people with Parkinson's disease (PD) develop speech disorders. Objectives This randomized controlled trial evaluated two speech treatments for PD matched in intensive dosage and high‐effort mode of delivery, differing in subsystem target: voice (respiratory‐laryngeal) versus articulation (orofacial‐articulatory). Methods PD participants were randomized to 1‐month LSVT LOUD (voice), LSVT ARTIC (articulation), or UNTXPD (untreated) groups. Speech clinicians specializing in PD delivered treatment. Primary outcome was sound pressure level (SPL) in reading and spontaneous speech, and secondary outcome was participant‐reported Modified Communication Effectiveness Index (CETI‐M), evaluated at baseline, 1, and 7 months. Healthy controls were matched by age and sex. Results At baseline, the combined PD group (n = 64) was significantly worse than healthy controls (n = 20) for SPL (P < 0.05) and CETI‐M (P = 0.0001). At 1 and 7 months, SPL between‐group comparisons showed greater improvements for LSVT LOUD (n = 22) than LSVT ARTIC (n = 20; P < 0.05) and UNTXPD (n = 22; P < 0.05). Sound pressure level differences between LSVT ARTIC and UNTXPD at 1 and 7 months were not significant (P > 0.05). For CETI‐M, between‐group comparisons showed greater improvements for LSVT LOUD and LSVT ARTIC than UNTXPD at 1 month (P = 0.02; P = 0.02). At 7 months, CETI‐M between‐group differences were not significant (P = 0.08). Within‐group CETI‐M improvements for LSVT LOUD were maintained through 7 months (P = 0.0011). Conclusions LSVT LOUD showed greater improvements than both LSVT ARTIC and UNTXPD for SPL at 1 and 7 months. For CETI‐M, both LSVT LOUD and LSVT ARTIC improved at 1 month relative to UNTXPD. Only LSVT LOUD maintained CETI‐M improvements at 7 months. © 2018 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society
Article
Full-text available
Purpose: The purpose of this study was to examine the effect of speech intensity on acoustic and kinematic vowel space measures and conduct a preliminary examination of the relationship between kinematic and acoustic vowel space metrics calculated from continuously sampled lingual marker and formant traces. Method: Young adult speakers produced 3 repetitions of 2 different sentences at 3 different loudness levels. Lingual kinematic and acoustic signals were collected and analyzed. Acoustic and kinematic variants of several vowel space metrics were calculated from the formant frequencies and the position of 2 lingual markers. Traditional metrics included triangular vowel space area and the vowel articulation index. Acoustic and kinematic variants of sentence-level metrics based on the articulatory-acoustic vowel space and the vowel space hull area were also calculated. Results: Both acoustic and kinematic variants of the sentence-level metrics significantly increased with an increase in loudness, whereas no statistically significant differences in traditional vowel-point metrics were observed for either the kinematic or acoustic variants across the 3 loudness conditions. In addition, moderate-to-strong relationships between the acoustic and kinematic variants of the sentence-level vowel space metrics were observed for the majority of participants. Conclusions: These data suggest that both kinematic and acoustic vowel space metrics that reflect the dynamic contributions of both consonant and vowel segments are sensitive to within-speaker changes in articulation associated with manipulations of speech intensity.
Article
Full-text available
Purpose: This study examined intelligibility of conversational and clear speech sentences produced in quiet and in noise by children, young adults, and older adults. Relative talker intelligibility was assessed across speaking styles. Method: Sixty-one young adult participants listened to sentences mixed with speech-shaped noise at -5 dB signal-to-noise ratio. The analyses examined percent correct scores across conversational, clear, and noise-adapted conditions and the three talker groups. Correlation analyses examined whether talker intelligibility is consistent across speaking style adaptations. Results: Noise-adapted and clear speech significantly enhanced intelligibility for young adult listeners. The intelligibility improvement varied across the three talker groups. Notably, intelligibility benefit was smallest for children's speaking style modifications. Listeners also perceived speech produced in noise by older adults to be less intelligible compared to the younger talkers. Talker intelligibility was correlated strongly between conversational and clear speech in quiet, but not for conversational speech produced in quiet and in noise. Conclusions: Results provide evidence that intelligibility variation related to age and communicative barrier has the potential to aid clinical decision making for individuals with speech disorders, particularly dysarthria.
Article
Purpose: This study sought to determine decoupled tongue and jaw displacement changes and their specific contributions to acoustic vowel contrast changes during slow, loud, and clear speech. Method: Twenty typical talkers repeated "see a kite again" 5 times in 4 speech conditions (typical, slow, loud, clear). Speech kinematics were recorded using 3-dimensional electromagnetic articulography. Tongue composite displacement, decoupled tongue displacement, and jaw displacement, as well as the distance between /a/ and /i/ in the F1-F2 vowel space, were examined during the diphthong /ai/ in "kite." Results: Displacements significantly increased during all 3 speech modifications. However, jaw displacements increased significantly more during clear speech than during loud and slow speech, whereas decoupled tongue displacements increased significantly more during slow speech than during clear and loud speech. In addition, decoupled tongue displacements increased significantly more during clear speech than during loud speech. Increases in acoustic vowel contrast tended to be larger during slow speech than during clear speech and were predominantly tongue-driven, whereas those during clear speech were fairly equally accounted for by changes in decoupled tongue and jaw displacements. Increases in acoustic vowel contrast during loud speech were smallest and were predominantly tongue-driven, particularly in men. Conclusions: Findings suggest that both the patterns of decoupled tongue and jaw displacement change and the contributions of the decoupled tongue and jaw to vowel acoustic change are task-specific across these speech modifications. Clinical implications are discussed.
Article
Estimating the basic acoustic parameters of conversational speech in noisy real-world conditions has been an elusive task in hearing research. Nevertheless, these data are essential ingredients for speech intelligibility tests and fitting rules for hearing aids. Previous surveys did not provide clear methodology for their acoustic measurements and setups, were opaque about their samples, or did not control for distance between the talker and listener, even though people are known to adapt their distance in noisy conversations. In the present study, conversations were elicited between pairs of people by asking them to play a collaborative game that required them to communicate. While performing this task, the subjects listened to binaural recordings of different everyday scenes, which were presented to them at their original sound pressure level (SPL) via highly open headphones. Their voices were recorded separately using calibrated headset microphones. The subjects were seated inside an anechoic chamber at 1 and 0.5 m distances. Precise estimates of realistic speech levels and signal-to-noise ratios (SNRs) were obtained for the different acoustic scenes, at broadband and third-octave levels. It is shown that with acoustic background noise above approximately 69 dB SPL at 1 m distance, or 75 dB SPL at 0.5 m, the average SNR can become negative. It is shown through interpolation of the two conditions that if the conversation partners had been allowed to optimize their positions by moving closer to each other, then positive SNRs should be only observed above 75 dB SPL. The implications of the results for speech tests and hearing aid fitting rules are discussed.
Article
Purpose: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to improve their accuracy and reproducibility. Method: Basic information is compiled from standards, the technical, voice, and speech literature, and the practical experience of the authors and is explained for nontechnical readers. Results: Variation of SPL with distance, sound level meters and their accuracy, frequency and time weightings, and background noise topics are reviewed. Several calibration procedures for SPL measurements are described for stand-mounted and head-mounted microphones. Conclusions: SPL of voice and speech should be reported together with the mouth-to-microphone distance so that the levels can be related to vocal power. Sound level measurement settings (i.e., frequency weighting and time weighting/averaging) should always be specified. Classified sound level meters should be used to assure measurement accuracy. Head-mounted microphones placed in close proximity to the mouth improve signal-to-noise ratio and can be used to advantage for voice SPL measurements when calibrated. Background noise levels should be reported in addition to the sound levels of voice and speech.
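To illustrate why the mouth-to-microphone distance matters when comparing reported levels, the sketch below rescales an SPL value from one distance to another. It assumes idealized free-field spreading from a point source (the inverse-distance law, roughly 6 dB per doubling of distance), which is only an approximation for a human talker, particularly very close to the mouth or in reverberant rooms. The function name and the example values are hypothetical.

```python
import math

# Minimal sketch, assuming free-field point-source spreading (inverse-distance law).
# Real talkers and rooms deviate from this, so treat the result as an approximation.

def spl_at_distance(level_db, d_from_m, d_to_m):
    """Estimate the SPL at d_to_m from a level measured at d_from_m (both in metres)."""
    if d_from_m <= 0 or d_to_m <= 0:
        raise ValueError("distances must be positive")
    return level_db - 20.0 * math.log10(d_to_m / d_from_m)


# Hypothetical example: a level of 70 dB SPL measured at 0.3 m, re-expressed at 0.6 m.
measured = 70.0
converted = spl_at_distance(measured, 0.3, 0.6)
print(f"{measured:.1f} dB SPL at 0.3 m is about {converted:.1f} dB SPL at 0.6 m")
```

Under this assumption, two studies reporting the same vocal output at 0.3 m and 0.6 m would differ by about 6 dB, which is why levels reported without a distance are hard to compare.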
Article
Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying speaking styles on these articulatory movements. Method: Twenty-one speakers with PD and 20 healthy controls produced 3 sentences under normal, loud, clear, and slow speaking conditions. Speech intelligibility was rated for each speaker. A 3-dimensional electromagnetic articulograph tracked movements of the articulators. Measures included articulatory working spaces, ranges along the first principal component, average speeds, and sentence durations. Results: Speakers with PD demonstrated significantly smaller jaw movements as well as shorter than normal sentence durations. Between-speaker variation in movement size of the jaw, tongue blade, and tongue dorsum was associated with speech intelligibility. Analysis of speaking conditions revealed similar patterns of change in movement measures across groups and articulators: larger than normal movement sizes and faster speeds for loud speech, increased movement sizes for clear speech, and larger than normal movement sizes and slower speeds for slow speech. Conclusions: Sentence-level measures of articulatory movements are sensitive to both disease-related changes in PD and speaking-style manipulations.