Stephan R. Kuberski’s research while affiliated with Universität Potsdam and other places


Publications (12)


Corrigendum to “Towards a dynamical account of inter-segmental coordination” [J. Phon. 109 (2025) 101392]
  • Article

May 2025

·

8 Reads

Journal of Phonetics

·

Stephan R. Kuberski

·


Fig. 2. Landmarks labeled by the Mview algorithm on the tongue body (TB) and tongue tip (TT) gestures of the word-initial /kl/ in the German word 'Klage'. The first and third panels show the positional values of the horizontal (X) and vertical (Y) dimensions from the TB and TT sensors over time respectively. The second and fourth panels show the non-directional velocity for the two sensors calculated by differentiating the Euclidean distance traveled in the two spatial dimensions. Landmarks are indicated with dashed lines.
Fig. 8. Distributions of the estimated slopes for the effects of the reciprocal of the harmonic stiffness mean by C1 voicing (left) and C1 place (right) grouped by language.
Entropy values (higher entropy means more flexibility) of the ten landmarks both within and across (in column, Overall) languages. The flexibility ranking of each landmark (relative to other landmarks) is indicated by the number in the parentheses.
Towards a dynamical account of inter-segmental coordination
  • Article
  • Full-text available

January 2025

·

88 Reads

Journal of Phonetics

We offer an intrinsic timing account of durations widely used to characterize inter-segmental coarticulation or coproduction patterns cross-linguistically. In this account, measured durations are the result of dynamical properties of the coarticulated segments. Our account is developed on the basis of timing data, registered using electromagnetic articulography (EMA), from stop-lateral clusters in three languages. In C1C2 stop-lateral consonant clusters from these languages, we show that the extent of the consonants’ coproduction (‘overlap’) is controlled by a synergy between the dynamical parameters of C1 opening and C2 closing stiffness, the two movements most relevant in the C1-to-C2 transition. The specific form of the overlap-stiffness relation is one where the extent of coproduction is a linear function of the (reciprocal of the) mean of the two stiffness parameters. This result establishes a link between lag measures widely used to characterize inter-segmental coarticulation and the dynamical properties of the gestures of the segments whose coproduction is at issue.



Box plots showing the values of the classic kinematic parameters in the filtering approach. From left to right: movement duration, movement amplitude, peak velocity, peak acceleration, and peak deceleration. The top row shows data of /ka/ and the bottom row shows data of /ta/ syllables. Within each panel, solid boxes correspond to data from closing movements and dashed boxes correspond to data from opening movements.
Box plots showing the values of the classic kinematic parameters in the splines approach. From left to right: movement duration, movement amplitude, peak velocity, peak acceleration, and peak deceleration. The top row shows data of /ka/ and the bottom row shows data of /ta/ syllables. Within each panel, solid boxes correspond to data from closing movements and dashed boxes correspond to data from opening movements.
Scatter plots showing the three kinematic relations in the filtering approach. From left to right: v ~ A/T, v ~ A, and a ~ A. The top row shows data of /ka/ and the bottom row shows data of /ta/ syllables; speech rate is color coded, with fainter (darker) shades corresponding to movements of slower (faster) rates. decel. = deceleration; accel. = acceleration.
Scatter plots showing the three kinematic relations in the spline smoothing approach. From left to right: v ~ A/T, v ~ A, and a ~ A. The top row shows data of /ka/ and the bottom row shows data of /ta/ syllables; speech rate is color coded, with fainter (darker) shades corresponding to movements of slower (faster) rates. decel. = deceleration; accel. = acceleration.
German subset: standard errors of the regressions for the three kinematic relations (v ~ A/T, v ~ A, and a ~ A) and the related relative percentage difference (RPD) values between the splines and the filtering approach.
Comparing Two Smoothing Approaches in Estimating Kinematic Parameters

April 2024

·

71 Reads

Purpose: We compare two signal smoothing and differentiation approaches: a frequently used approach in the speech community of digital filtering with approximation of derivatives by finite differences and a spline smoothing approach widely used in other fields of human movement science. Method: In particular, we compare the values of a classic set of kinematic parameters estimated by the two smoothing approaches and assess, via regressions, how well these reconstructed values conform to known laws about relations between the parameters. Results: Substantially smaller regression errors were observed for the spline smoothing than for the filtering approach. Conclusion: This result is in broad agreement with reports from other fields of movement science and underpins the superiority of splines also in the domain of speech.


Fig. 2. Performance gain in regressions of the linear model (right ordinates, in percentage point of the explained variance) and their statistical significance (left ordinates, in standard z-score) as a function of metronome rate (abscissae) and increments in segmentation thresholding (0%, 5%, 10%, 15%, 20%, and 25%, color-coded). The z-scores larger than 1.96 (dotted lines) indicate statistical significance at an alpha-level of 0.05 (95% confidence).
Fig. 3. A typical sequence of /ka/ syllables, produced by one of our speakers at normal speech rate (210 bpm metronome), portrayed in the phase plane (left) and in the Hooke plane (right). Arrows indicate the direction of time (closings: from /a/ to /k/, openings: from /k/ to /a/). Regions shaded in gray schematically indicate those parts of the data which are affected by velocity thresholding: the darker the shade (the higher the threshold), the larger the amount of data excluded from the trajectory.
How thresholding in segmentation affects the regression performance of the linear model

September 2023

·

86 Reads

·

4 Citations

JASA Express Letters

Evaluating any model underlying the control of speech requires segmenting the continuous flow of speech effectors into sequences of movements. A virtually universal practice in this segmentation is to use a velocity-based threshold which identifies a movement onset or offset as the time at which the velocity of the relevant effector breaches some threshold percentage of the maximal velocity. Depending on the threshold choice, more or less of the movement's trajectory is left in for model regression. This paper makes explicit how the choice of this threshold modulates the regression performance of a dynamical model hypothesized to govern speech movements.
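
The thresholding practice described in this abstract can be illustrated with a minimal sketch. It is not the paper's actual procedure: it assumes a uniformly sampled one-dimensional position signal containing a single movement, estimates speed by finite differences, and uses a hypothetical helper name (`segment_by_threshold`) and an illustrative default threshold of 20%.

```python
def segment_by_threshold(position, dt, fraction=0.2):
    """Return (onset, offset) sample indices of a single movement.

    Hypothetical helper: onset and offset are the first and last samples
    whose speed is at least `fraction` of the peak speed in the window.
    """
    n = len(position)
    if n < 3:
        raise ValueError("need at least three samples")
    # Central differences in the interior, one-sided differences at the ends.
    speed = [0.0] * n
    speed[0] = abs(position[1] - position[0]) / dt
    speed[-1] = abs(position[-1] - position[-2]) / dt
    for i in range(1, n - 1):
        speed[i] = abs(position[i + 1] - position[i - 1]) / (2 * dt)
    threshold = fraction * max(speed)
    above = [i for i, s in enumerate(speed) if s >= threshold]
    return above[0], above[-1]
```

Raising `fraction` moves the onset later and the offset earlier, so less of the trajectory is left in for any subsequent model regression, which is the dependency the abstract refers to.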


Figure 4: Percentage of exclusive coextensiveness (PEC) for each pair of overlap measure and C1 opening / C2 closing movement across languages and clusters.
Figure 5: Distributions of the four overlap measures by cluster across German and English. Note that IPI is the interval: 'C1 release to C2 target'.
How measures of gestural overlap relate to dynamics: Evidence from German and English word-initial stop-lateral clusters

August 2023

·

15 Reads

·

1 Citation

In data from English and German clusters (C1C2), we examine if and how the stiffness of C1 opening and C2 closing movements (the two relevant movements in the C1-to-C2 transition) modulate overlap, using four overlap measures. Results show a variegated picture where different overlap measures do or do not depend on the stiffness parameters. We seek explanations for this patterning that lead to a better understanding of the relation between the plethora of overlap measures used in the literature and the dynamics of the gestures whose overlap is at issue.



Fig. 3: Range of movement amplitudes and effective target widths. a Sequences of [ta]. b Sequences of [ka]. Data are drawn separately for each speaker (subpanels). Metronome rate is colour coded with fainter shades for slower rates and darker shades for faster rates.
Fig. 4: Relation between movement duration and index of difficulty. a Sequences of [ta]. b Sequences of [ka]. Data are drawn separately for each speaker (subpanels). Metronome rate is colour coded with fainter shades for slower rates and darker shades for faster rates. Linear regressions of contiguous Fitts-compliant rates are drawn as thick lines (corresponding r² values are given in the bottom right corner of each panel). Linear regression lines are not meant to indicate fits to the entire data set but only to a subset starting from a (speaker-specific) rate and including all higher rates (see text for details).
Fitts’ Law in Tongue Movements of Repetitive Speech

October 2019

·

151 Reads

·

9 Citations

Fitts’ law, perhaps the most celebrated law of human motor control, expresses a relation between the kinematic property of speed and the non-kinematic, task-specific property of accuracy. We aimed to assess whether speech movements obey this law using a metronome-driven speech elicitation paradigm with a systematic speech rate control. Specifically, using the paradigm of repetitive speech, we recorded via electromagnetic articulometry speech movement data in sequences of the form /CV…/ from 6 adult speakers. These sequences were spoken at 8 distinct rates ranging from extremely slow to extremely fast. Our results demonstrate, first, that the present paradigm of extensive metronome-driven manipulations satisfies the crucial prerequisites for evaluating Fitts’ law in a subset of our elicited rates. Second, we uncover for the first time in speech evidence for Fitts’ law at the faster rates and specifically beyond a participant-specific critical rate. We find no evidence for Fitts’ law at the slowest metronome rates. Finally, we discuss implications of these results for models of speech.
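
For orientation, Fitts' law is commonly stated as a linear relation MT = a + b · ID between movement time MT and an index of difficulty ID. The sketch below uses Fitts' original index, ID = log2(2A / W), with A the movement amplitude and W the target width. The abstract does not say which variant of the index the study uses, so that choice, like the function names, is an assumption made for illustration.

```python
import math


def index_of_difficulty(amplitude, width):
    """Fitts' original index of difficulty, log2(2A / W), in bits."""
    return math.log2(2.0 * amplitude / width)


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Squared correlation; defined as 1.0 when y has no variance.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r_squared
```

Given per-movement amplitudes, widths, and durations, one would compute the index for each movement and pass the indices and durations to `fit_line`; a high r² over a set of rates is the kind of linear relation the abstract calls evidence for the law.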


The speed-curvature power law in tongue movements of repetitive speech

March 2019

·

209 Reads

·

17 Citations

The speed-curvature power law is a celebrated law of motor control expressing a relation between the kinematic property of speed and the geometric property of curvature. We aimed to assess whether speech movements obey this law just as movements from other domains do. We describe a metronome-driven speech elicitation paradigm designed to cover a wide range of speeds. We recorded via electromagnetic articulometry speech movements in sequences of the form /CV…/ from nine speakers (five German, four English) speaking at eight distinct rates. First, we demonstrate that the paradigm of metronome-driven manipulations results in speech movement data consistent with earlier reports on the kinematics of speech production. Second, analysis of our data in their full three-dimensions and using advanced numerical differentiation methods offers stronger evidence for the law than that reported in previous studies devoted to its assessment. Finally, we demonstrate the presence of a clear rate dependency of the power law’s parameters. The robustness of the speed-curvature relation in our datasets lends further support to the hypothesis that the power law is a general feature of human movement. We place our results in the context of other work in movement control and consider implications for models of speech production.
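
The law is usually written v = k · κ^(−β), with speed v, curvature κ, and β close to one third, so β can be estimated as the negative slope of log v against log κ. The sketch below is a planar illustration only: the study analyses three-dimensional data with its own differentiation methods, and the function names here are hypothetical. It uses harmonic motion on an ellipse, which satisfies the law exactly with β = 1/3, as a sanity check.

```python
import math


def speed_and_curvature(dx, dy, ddx, ddy):
    """Planar speed and curvature from first and second time derivatives."""
    speed = math.hypot(dx, dy)
    curvature = abs(dx * ddy - dy * ddx) / speed ** 3
    return speed, curvature


def power_law_slope(speeds, curvatures):
    """Least-squares slope of log(speed) against log(curvature)."""
    xs = [math.log(c) for c in curvatures]
    ys = [math.log(v) for v in speeds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Harmonic motion on an ellipse: x = a cos t, y = b sin t (analytic derivatives).
a, b = 2.0, 1.0
times = [2.0 * math.pi * i / 200 for i in range(200)]
pairs = [
    speed_and_curvature(-a * math.sin(t), b * math.cos(t),
                        -a * math.cos(t), -b * math.sin(t))
    for t in times
]
slope = power_law_slope([v for v, _ in pairs], [c for _, c in pairs])
# The slope is -1/3 up to rounding, i.e. beta = 1/3 for this trajectory.
```

With measured data the derivatives would have to be estimated numerically, and the fitted exponent then depends on the smoothing and differentiation method used.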


Fundamental motor laws and dynamics of speech

January 2019

·

13 Reads

The present work is a compilation of three original research articles submitted (or already published) in international peer-reviewed venues of the field of speech science. These three articles address the topics of fundamental motor laws in speech and dynamics of corresponding speech movements:
1. Kuberski, Stephan R. and Adamantios I. Gafos (2019). "The speed-curvature power law in tongue movements of repetitive speech". PLOS ONE 14(3). Public Library of Science. doi: 10.1371/journal.pone.0213851.
2. Kuberski, Stephan R. and Adamantios I. Gafos (In press). "Fitts' law in tongue movements of repetitive speech". Phonetica: International Journal of Phonetic Science. Karger Publishers. doi: 10.1159/000501644.
3. Kuberski, Stephan R. and Adamantios I. Gafos (submitted). "Distinct phase space topologies of identical phonemic sequences". Language. Linguistic Society of America.
The present work introduces a metronome-driven speech elicitation paradigm in which participants were asked to utter repetitive sequences of elementary consonant-vowel syllables. This paradigm, explicitly designed to cover speech rates from a substantially wider range than has been explored in previous work, is demonstrated to satisfy the important prerequisites for assessing aspects of speech that have so far been difficult to access. Specifically, the paradigm's extensive speech rate manipulation enabled elicitation of a great range of movement speeds as well as movement durations and excursions of the relevant effectors. The presence of such variation is a prerequisite to assessing whether invariant relations between these and other parameters exist and thus provides the foundation for a rigorous evaluation of the two laws examined in the first two contributions of this work. In the data resulting from this paradigm, it is shown that speech movements obey the same fundamental laws as movements from other domains of motor control do.
In particular, it is demonstrated that speech strongly adheres to the power law relation between speed and curvature of movement with a clear speech rate dependency of the power law's exponent. The often-sought or reported exponent of one third in the statement of the law is unique to a subclass of movements which corresponds to the range of faster rates under which a particular utterance is produced. For slower rates, significantly larger values than one third are observed. Furthermore, for the first time in speech this work uncovers evidence for the presence of Fitts' law. It is shown that, beyond a speaker-specific speech rate, speech movements of the tongue clearly obey Fitts' law by emergence of its characteristic linear relation between movement time and index of difficulty. For slower speech rates (when temporal pressure is small), no such relation is observed. The methods and datasets obtained in the two assessments above provide a rigorous foundation both for addressing implications for theories and models of speech and for better understanding the status of speech movements in the context of human movements in general.
All modern theories of language rely on a fundamental segmental hypothesis according to which the phonological message of an utterance is represented by a sequence of segments or phonemes. It is commonly assumed that each of these phonemes can be mapped to some unit of speech motor action, a so-called speech gesture. For the first time here, it is demonstrated that the relation between the phonological description of simple utterances and the corresponding speech motor action is non-unique. Specifically, through the extensive speech rate manipulation in the experimental paradigm used here, it is demonstrated that speech exhibits clearly distinct dynamical organizations underlying the production of simple utterances.
At slower speech rates, the dynamical organization underlying the repetitive production of elementary /CV/ syllables can be described by successive concatenations of closing and opening gestures, each with its own equilibrium point. As speech rate increases, the equilibria of opening and closing gestures are not equally stable, yielding qualitatively different modes of organization with either a single equilibrium point of a combined opening-closing gesture or a periodic attractor unleashed by the disappearance of both equilibria. This observation, the non-uniqueness of the dynamical organization underlying what on the surface appear to be identical phonemic sequences, is an entirely new result in the domain of speech. Beyond that, the demonstration of periodic attractors in speech reveals that dynamical equilibrium point models do not account for all possible modes of speech motor behavior.


Citations (4)


... The cubic term was initially constrained to position-only, but the LA and TT models required both x³ and ẋ³ in order to improve on the linear model. This is not necessarily unusual for models of human movement (Beek & Beek, 1988; Schöner, 1990) and there may be an advantage to the inclusion of nonlinear velocity terms more generally, especially for modelling qualitatively distinct movement dynamics, such as limit cycles (Kuberski & Gafos, 2023). The nonlinear model clearly provides a better fit than the linear model for trajectories in Figure 14, with the exception of TD, where SINDy fails to find an optimal model. ...

Reference:

Discovering dynamical laws for speech gestures
How thresholding in segmentation affects the regression performance of the linear model

JASA Express Letters

... The ability to predict the movement time and evaluate the of arbitrarily curved paths would also be helpful in determining level difficulties in video games such as Osu! [9], Trombone Champ [27], or even physical ones such as the wire and loop game. Although they are typically evaluated on hand-based interactions, human performance models have valuable applications outside of user interface research in explaining the complexities of other motor control tasks such as vocal articular trajectories and tongue movements during speech production [13,16]. ...

Fitts’ Law in Tongue Movements of Repetitive Speech

... The power law also extends beyond drawing movements to encompass foot movements (Ivanenko et al. 2002) and walked trajectories (Vieilledent et al. 2001; Hicheur et al. 2005; Pham et al. 2007). The law is observed in smooth pursuit eye movements (de'Sperati and Viviani 1997; Kowler et al. 2019), and tongue movements (Tasko and Westbury 2004; Perrier and Fuchs 2008; Kuberski and Gafos 2019). The power law is also found in non-human primate drawings (Schwartz 1994; Abeles et al. 2013), in the crawling of larvae (Drosophila melanogaster; Zago et al. 2016) and of the buff-tailed bumblebee (Bombus terrestris; James et al. 2020). ...

The speed-curvature power law in tongue movements of repetitive speech

... The author also mentioned that this method might fail in bilabial fricative /f/, glottal fricative /h/ because of transient-like properties. The PI method also adopted in the study [166] for CBT detection. ...

A landmark-based approach to automatic voice onset time estimation in stop-vowel sequences
  • Citing Conference Paper
  • December 2016