
Pertti Palo
University of Alberta | UAlberta · Communication Sciences and Disorders
Doctor of Philosophy

About
Publications: 40
Reads: 11,071
Citations: 257
Additional affiliations
August 2006 - June 2011

Education
September 2013 - August 2016
June 2011
June 2006
Helsinki University of Technology, field of study: Language Technology
Publications (40)
Accounting for between- and within-subject variability is a relatively easy task when one uses mixed-effects regression [1, 2]. Modelling speakers (and items) as random effects allows the coefficients of the main effects to be adjusted to account for this type of variability [1, 2]. Using a maximal specification approach [1] allows for an accurate es...
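A minimal sketch of the modelling approach described above, using statsmodels to fit a mixed-effects regression with by-speaker random intercepts and slopes (a "maximal" specification in the Barr et al. sense). The data here are synthetic and purely illustrative; the abstract's actual variables and software are not stated.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: each speaker has their own baseline and effect size.
rng = np.random.default_rng(0)
rows = []
for s in range(8):
    intercept = 1.0 + rng.normal(0, 0.3)   # speaker-specific baseline
    slope = 0.5 + rng.normal(0, 0.1)       # speaker-specific effect of x
    for _ in range(20):
        x = rng.normal()
        rows.append({"speaker": f"s{s}", "x": x,
                     "y": intercept + slope * x + rng.normal(0, 0.2)})
df = pd.DataFrame(rows)

# Random intercept and random slope for x, grouped by speaker.
model = smf.mixedlm("y ~ x", df, groups=df["speaker"], re_formula="~x")
fit = model.fit()
print(fit.params["x"])  # fixed-effect slope, adjusted for speaker variability
```

The fixed-effect estimate for `x` recovers the population slope (0.5 here) while the random effects absorb the between-speaker variability.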
We report results from using Pixel Difference, a metric based on Euclidean distance, to analyze articulatory gestures. Our data consist of mono- and disyllabic English words recorded with tongue ultrasound from four speakers. We analyze the shapes of Pixel Difference curves based on acoustic segmentation. The motivation is to see when and how articul...
A great variety of tools are available for analysing speech data, but few permit simultaneous annotation of audio, articulatory and other modalities. Being able to view several data modalities is key when analysing speech production data. The Speech Articulation ToolKIT (SATKIT) is a free open source tool for analysing and annotating speech data wh...
This paper presents a dynamic account of the tongue contour imaged with Ultrasound Tongue Imaging (UTI) and quantified via Generalised Additive Mixed Models, during the articulation of guttural consonants (uvular, pharyngealised, pharyngeal) in Levantine Arabic. Gutturals are claimed to form a natural class and we aim to quantify the degree of (dis...
Tongue ultrasound data are commonly analyzed by using different metrics. In this study, we evaluate the local stability and long-distance reliability of Average Nearest Neighbour Distance (Zharkova and Hewlett, 2009), Median Point-by-Point Distance (Palo, 2020), Procrustes analysis and Modified Curvature Index (Dawson et al., 2015). A metric which h...
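One of the contour metrics named above can be sketched in a few lines. This is an assumed reading of a median point-by-point distance between two tongue contours sampled at corresponding points; the exact definition in Palo (2020) may differ.

```python
import numpy as np

def median_pbp_distance(c1: np.ndarray, c2: np.ndarray) -> float:
    """Median of the point-to-point distances between two contours.

    c1, c2: arrays of shape (n_points, 2) with matched point ordering.
    """
    return float(np.median(np.linalg.norm(c1 - c2, axis=1)))

# Toy example: a flat contour and the same contour shifted up by 1 unit.
a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
b = a + np.array([0.0, 1.0])
d = median_pbp_distance(a, b)
print(d)  # 1.0, since every point moved by exactly 1
```

Using the median rather than the mean makes the measure robust to a few badly tracked contour points.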
Pixel difference is a holistic change measure for ultrasound videos and other image sequences. It is based on Euclidean distance (l2 norm) which is calculated by interpreting each frame as a vector of pixels. Recently, we found that 3D/4D ultrasound data produce significantly worse signal-to-noise ratio in Pixel Difference curves, which appears to b...
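The measure described above reduces to a short computation: flatten each frame to a pixel vector and take the l2 norm of consecutive frame differences. A minimal sketch, with frame shape and values chosen for illustration:

```python
import numpy as np

def pixel_difference(frames: np.ndarray) -> np.ndarray:
    """Pixel Difference curve for an image sequence.

    frames: array of shape (n_frames, height, width).
    Returns an array of length n_frames - 1.
    """
    # Interpret each frame as a vector of pixels.
    flat = frames.reshape(frames.shape[0], -1).astype(float)
    # Euclidean (l2) distance between consecutive frames.
    return np.linalg.norm(np.diff(flat, axis=0), axis=1)

# Toy example: five 4x4 frames with one abrupt change before frame 3.
frames = np.zeros((5, 4, 4))
frames[3:] = 1.0
curve = pixel_difference(frames)
print(curve)  # [0. 0. 4. 0.] -- a single peak at the change
```

Peaks in the curve mark moments of rapid articulatory change; flat stretches mark a stationary tongue.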
It is frequently of interest to examine differences (e.g., the 1st or 2nd difference) in signals and images. Differencing can be employed for the detection of edges in images, or the detection of events in temporal sequences. One such event is the onset of articulatory movement in ultrasound image sequences of the tongue....
Acoustic analysis of typically developing elementary school-aged (prepubertal) children’s speech has been primarily performed on cross-sectional data in the past. Few studies have examined longitudinal data in this age group. For this presentation, we analyze the developmental changes in the acoustic properties of children...
The position of the tongue during rest and its relationship to subsequent pre-acoustic speech movements have shed light on speech motor planning. They may find applications in technologies such as silent speech interfaces. Palo (2019) found that the duration of pre-acoustic speech movements is strongly correlated with acoustic utterance duration,...
Our goal is to understand articulatory gestures at the end of an utterance and their timing relation to acoustic speech. The poster presents preliminary results from data collected for a different project.
A pre-print of the corresponding 2-page article is on ResearchGate: https://www.researchgate.net/publication/355049824_Computer_Assisted_Segmentation_of_Tongue_Ultrasound_and_Lip_Videos
The published version is here: https://jcaa.caa-aca.ca/index.php/jcaa/issue/view/293/218%C3%A9
The position of the tongue during rest and its relationship to subsequent pre-acoustic speech movements have shed light on speech motor planning. They may find applications in technologies such as silent speech interfaces. Palo [“Measuring pre-speech articulation,” Ph.D. thesis (Queen Margaret University, Edinburgh, 2019)]...
A tongue ultrasound study of Finnish coarticulatory direction in a cross-linguistic context.
In order to understand speech articulation, we need to understand not only what movements of the articulators are used to produce a given sound, but also how those articulator movements are produced by muscle actions. This paper approaches this problem by analysing ultrasound data with three methods. First, Pixel Difference accounts for all change...
Poster presented at the 12th International Seminar on Speech Production (ISSP)
What do speakers do when they start to talk? This thesis concentrates on the articulatory aspects of this problem, and offers methodological and experimental results on tongue movement, captured using Ultrasound Tongue Imaging (UTI).
Speech initiation occurs at the start of every utterance. An understanding of the timing relationship between arti...
Objective:
This study investigated whether adding an additional modality, namely ultrasound tongue imaging (UTI), to perception-based phonetic transcription impacted on the identification of compensatory articulations and on interrater reliability.
Patients and methods:
Thirty-nine English-speaking children aged 3-12 years with cleft lip and pal...
Abstract: Previous research by Gibbon (2004) shows that at least 8 distinct error types can be identified in the speech of people with cleft lip and palate (CLP) using electropalatography (EPG), a technique which measures tongue-palate contact. However, EPG is expensive and logistically difficult. In contrast, ultrasound is cheaper and arguably bet...
The aim of the current study was to investigate subtle characteristics of social perception and interpretation in high-functioning individuals with autism spectrum disorders (ASDs), and to study the relation between watching and interpreting. As a novelty, we used an approach that combined moment-by-moment eye tracking and verbal assessment. Sixtee...
Pitch analysis tools are used widely in order to measure and to visualize the melodic aspects of speech. The resulting pitch contours can serve various research interests linked with speech prosody, such as intonational phonology, interaction in conversation, emotion analysis, language learning and singing. Due to physiological differences and indi...
We study the effect that phonetic onset has on acoustic and articulatory reaction times. An acoustic study by Rastle et al. (2005) shows that the place and manner of the first consonant in a target affects acoustic RT. An articulatory study by Kawamoto et al. (2008) shows that the same effect is not present in articulatory reaction time of the lips...
Vocal emotions are expressed either by speech or singing. The difference is that in singing the pitch is predetermined while in speech it may vary freely. It was of interest to study whether there were voice quality differences between freely varying and mono-pitched vowels expressed by professional actors. Given their profession, actors have to be...
The tongue moves silently in preparation for speech. We analyse Ultrasound Tongue Imaging (UTI) data of these pre-speech to speech phases from five speakers, whose native languages (L1) are English (n = 3), German, and Finnish. Single words in the subjects' respective L1 were elicited by a standard picture naming task. Our focus is to automate the...
We present anatomic and acoustic data from a pilot study on the Finnish vowels [ɑ, e, i, o, u, y, æ, ø]. The data were acquired simultaneously with 3D magnetic resonance imaging (MRI) and a custom-built sound recording system. The data consist of a single static repetition of each vowel with constant F0. The imaging sequence was 7.6 s...
We compare numerically computed resonances of the human vocal tract with formants that have been extracted from speech during vowel pronunciation. The geometry of the vocal tract has been obtained by MRI from a male subject, and the corresponding speech has been recorded simultaneously. The resonances are computed by solving the Helmholtz partial d...
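The resonance computation referenced above amounts to an interior eigenvalue problem for the velocity potential on the MRI-derived vocal tract domain. A schematic form, under the common simplifying assumptions of rigid walls and an open mouth (the paper's actual "physically relevant boundary conditions" may be more refined):

```latex
\nabla^2 \Phi + \frac{\omega^2}{c^2}\,\Phi = 0 \quad \text{in } \Omega,
\qquad
\frac{\partial \Phi}{\partial n} = 0 \ \text{on the tract walls},
\qquad
\Phi = 0 \ \text{at the mouth opening},
```

where $\Omega$ is the vocal tract volume and $c$ the speed of sound. The eigenvalues $\omega_k$ give resonant frequencies $f_k = \omega_k / 2\pi$, which are the quantities compared against the measured formants.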
This study examines the relationship of voice quality and speech-based personality assessment of Finnish-speaking female speakers. Five Finnish-speaking female subjects recorded a text passage with eight different vocal qualities. Samples that passed the preselection test for the voice qualities were played to 50 Finnish-speaking listeners, who rep...
This article describes an arrangement for simultaneous recording of speech and the geometry of the vocal tract. The experimental design is considered from the phonetics point of view. The speech signal is recorded with an acoustic-electrical arrangement and the vocal tract with MRI. Finally, data from pilot measurements on vowels is presented, and its qual...
Spatial data of the human vocal tract (VT), larynx, and thorax can be obtained by magnetic resonance imaging (MRI) during steady, sustained phonation. Long acquisition time increases the resolution as well as the errors due to involuntary motion of VT. We discuss two experiments with a single test subject to find a suitable 3D MRI data acquisition...
We design and construct a recording arrangement for speech during an MRI scan of the speaker's vocal tract. We concentrate on the acoustic environment around the test subject inside the MRI machine. The data thus obtained are used for construction and validation of a numerical model of the vocal tract.
We discuss recording arrangements for speech during an MRI scan of the speaker's vocal tract. The image and sound data thus obtained will be used for construction and validation of a numerical model for the vocal tract.
This article describes modal analysis of acoustic waves in the human vocal tract while the subject is pronouncing [o]. The model used is the wave equation in three dimensions, together with physically relevant boundary conditions. The geometry is reconstructed from anatomical MRI data obtained by other researchers. The computations are carried out...
We study computationally the dynamics of sound production in the vocal tract (VT). Our mathematical formulation is based on the three-dimensional wave equation, together with physically relevant boundary conditions. We focus on formant and pressure information in the VT. For this purpose, we make use of anatomical data obtained by MRI by other re...
We compare numerically computed resonances of the human vocal tract with formants that have been extracted from speech during vowel pronunciation. The geometry of the vocal tract has been obtained by MRI, and the corresponding speech has been recorded simultaneously. The resonances are computed by solving the Helmholtz partial differential equation...