Book

Voice Attractiveness: Studies on Sexy, Likable, and Charismatic Speakers

Abstract

This book addresses various aspects of acoustic–phonetic analysis, including voice quality and fundamental frequency, and the effects of speech fluency and non-native accents, by examining read speech, public speech, and conversations. Voice is a sexually dimorphic trait that can convey important biological and social information about the speaker, and empirical findings suggest that voice characteristics and preferences play an important role in both intra- and intersexual selection, such as competition and mating, and social evaluation. Discussing evaluation criteria like physical attractiveness, pleasantness, likability, and even persuasiveness and charisma, the book bridges the gap between social and biological views on voice attractiveness. It presents conceptual, methodological and empirical work applying methods such as passive listening tests, psychoacoustic rating experiments, and crowd-sourced and interactive scenarios and highlights the diversity not only of the methods used when studying voice attractiveness, but also of the domains investigated, such as politicians’ speech, experimental speed dating, speech synthesis, vocal pathology, and voice preferences in human interactions as well as in human–computer and human–robot interactions. By doing so, it identifies widespread and complementary approaches and establishes common ground for further research.
... The field of vocal attractiveness (VA) has established the role of several vocal traits which contribute to the listener's percept of the speaker's physical, and in some cases romantic, attractiveness (Weiss, Trouvain, Barkat-Defradas, & Ohala, 2021). Because much of this research, either explicitly or implicitly, focuses on sexual attractiveness, it is perhaps not surprising that much of the research on VA has been conducted in the fields of evolutionary biology and psychology, and only considerably more recently also in linguistics. ...
Conference Paper
Sexual dimorphism has played a key role in research on vocal attractiveness. This paper looks at Darwin's The Descent of Man and Selection in Relation to Sex (1871[2021]) with the following questions in mind: 1) What role does the vocal have according to Darwin?; 2) What reasons are provided for any tendencies identified by Darwin?; and 3) Have these (in humans) been supported by research conducted since? Darwin consistently acknowledges the role of the voice in sexual selection. Specific details are discussed in the paper, which further compares these with more recent research on vocal attractiveness and suggests further avenues of research from a phonetician's perspective.
... In terms of vocal acoustics, it has been generally found that a lower fundamental frequency (F0), perceived as lower pitch, is judged as more trustworthy for men but not women in economic domains, yet less trustworthy in romantic situations (Schild et al., 2020). However, other research indicates that people are more trusting of higher-pitched male voices regardless of situational context (O'Connor and Barclay, 2017; Weiss et al., 2021). Further, lower pitch has been linked to higher levels of testosterone (Dabbs and Mallinger, 1999) and to self-reports of sexual infidelity (Schild et al., 2021). ...
Article
Trust is an aspect critical to human social interaction and research has identified many cues that help in the assimilation of this social trait. Two of these cues are the pitch of the voice and the width-to-height ratio of the face (fWHR). Additionally, research has indicated that the content of a spoken sentence itself has an effect on trustworthiness; a finding that has not yet been brought into multisensory research. The current research aims to investigate previously developed theories on trust in relation to vocal pitch, fWHR, and sentence content in a multimodal setting. Twenty-six female participants were asked to judge the trustworthiness of a voice speaking a neutral or romantic sentence while seeing a face. The average pitch of the voice and the fWHR were varied systematically. Results indicate that the content of the spoken message was an important predictor of trustworthiness extending into multimodality. Further, the mean pitch of the voice and fWHR of the face appeared to be useful indicators in a multimodal setting. These effects interacted with one another across modalities. The data demonstrate that trust in the voice is shaped by task-irrelevant visual stimuli. Future research is encouraged to clarify whether these findings remain consistent across genders, age groups, and languages.
... The prosodic elements comprise the syllable, stress, intonation, voice quality and voice dynamics, rhythm, pausing, and speech rate. In building the expressivity of speech, intonational variation, speech rate, pauses, and voice quality and dynamics are especially relevant, as shown in the work of Weiss et al. (2020), Barbosa and Madureira (2016, 2018), Barbosa, Madureira and Mareüil (2017), Fontes and Madureira (2019), and Madureira (2016, 2020). These elements have acoustic and perceptual correlates, as we show below. ...
Article
Declamation attracts research interest because of its expressive load. The purpose of this work is to investigate, by means of a phonetic experiment, the impressions that declamations of the Soneto da Fidelidade produce in listeners. In the perceptual analysis, 4 semantic descriptors (pleasantness, emotional impact, vocal projection, and vocal interpretation) were rated on a Likert scale. Recordings of 8 speakers served as stimuli for the listeners' evaluation. The test was administered to 38 judges. The acoustic analysis comprised automatically extracted parameters. The results are compared by means of multidimensional statistical analysis, and the impact of the declamations on listeners is discussed.
... All this translates into social meanings and systems of expectation and value: while naturalness would seem to be defined by the relationship that the artificial voice is able to establish with its human model, the analysis reveals that it is rather the voice itself that is defined according to the concept of naturalness conceived in the laboratory for operational purposes. As highlighted by Weiss et al. (2021), and as found in my field research with programmers (cf. sections ...
Book
La voce artificiale engages with the growing spread of talking technology, from voice assistants such as Siri and Alexa, to speech synthesis for people with disabilities, to voice cloning and deepfakes. The artificial voice is not only a technology but a cultural practice: it concerns the way we conceive of communication, the role of machines, and vocal expressivity itself, which is increasingly hybridized with computational and electro-acoustic processes that redefine the relations between voice, body, and subjectivity. La voce artificiale investigates in parallel the impact of voice technology on the imaginary of Artificial Intelligence, with the narratives that feed its myth, and the programming practices deployed to build those technologies, which promote forms of knowledge and epistemologies that translate into social structures and organizational models, to the point of affecting the anthropological condition of the contemporary world. Because the talking computer is the materialization of social meanings and ways of thinking about the voice that are expressed both at the level of sound and at the level of the measurement, modeling, and machine-learning operations that engineer the voice. Because investigating the voice media-archaeologically means going beyond the anthropomorphism of the talking machine and turning to the socio-material and techno-cultural ecosystem in which every voice, human or non-human, becomes audible. Because sound gives access to a knowledge of the world capable of grasping mediations, relations, traces, and affects. The book is recommended to anyone who wishes to undertake a wide-ranging journey through artifacts, forms of knowledge, stories, desires, and ancient and modern imaginaries, as well as the interests, laboratory practices, and artistic practices that together compose the artificial voice as a multiple and historically determined phenomenon.
It is not recommended to anyone who is satisfied with the deterministic and solutionist narratives that see Artificial Intelligence as a mysterious and magical force or as a mere synonym for progress.
Article
This pilot study reports on acoustic and perceptual profiles of two American female speakers’ productions of six American English social affective expressions: Authority, Declaration, Irritation, Sincerity, Uncertainty, and Walking on Eggs (WOEG), as spoken in the sentence frame “Mary was dancing.” The acoustic profile describes the prosodic characteristics of the utterances as a whole, as well as the voice quality characteristics of the nuclear stress syllable in the utterances. The perceptual profiles describe listeners’ three-dimensional VAD emotional ratings, i.e., Valence, Arousal, and Dominance, of the utterances and listeners’ auditory impressions of the nuclear stress syllable. Multifactorial Analyses (MFA) were applied to examine the relation between the prosodic characteristics and the VAD scales, and also the relationship between voice quality measurements on the nuclear stress vowel and auditory perceptions. The prosodic MFA results indicate that for these two American English speakers, a soft or noisy voice, with weak harmonics and irregular rhythm with pauses and hesitations, as in the expressions of Uncertainty and WOEG, is perceived by listeners as accommodating and not positive. Loud, tense voices with energy in the upper frequencies, as in the expression of Irritation, are perceived as Aroused. Expressions of Authority, Declaration, and Sincerity tend to have comparatively regular rhythm and relatively flat intonation. The MFA of voice quality measurements and auditory perceptions suggests that Normalized Amplitude Quotient may indeed be a good estimate of tense voice due to glottal closing behavior, Cepstral Peak Prominence a good estimate of strong non-noisy harmonics, Peak Slope a good estimate of spectrally related tense voice, and Hammarberg Index a good estimate of the distribution of spectral energy, i.e., strong or weak energy in the upper frequencies.
Article
The current study investigates the average effect: the tendency for humans to prefer an averaged stimulus (a face, bird, wristwatch, car, and so on) over an individual instance. The effect holds across cultures, despite varying conceptualizations of attractiveness. While much research has been conducted on the average effect in visual perception, much less is known about the extent to which this effect applies to language and speech. This study investigates the attractiveness of average speech rhythms in Dutch and Mandarin Chinese, two typologically different languages. This was tested in a series of perception experiments in each language, in which native listeners chose the more attractive of a pair of acoustically manipulated rhythms. For each language, two experiments were carried out to control for the potential influence of the acoustic manipulation on the average effect. The results confirm the average effect in both languages, and they do not exclude individual variation in the listeners’ perception of attractiveness. The outcomes provide a new crosslinguistic perspective and give rise to alternative explanations for the average effect.
Article
Although Voice Assistants have been ubiquitously available for some years now, the interaction is still monotonous and utilitarian. Sound design offers conceptual and methodological research for designing auditory interfaces. Our work aims to complement and supplement voice interaction with sonic overlays to enrich the user experience. To that end, we followed a user-centered design process to develop a sound library for weather forecasts based on empirical results from a user survey of associative mapping. After analyzing the data, we created audio clips for seven weather conditions and evaluated the perceived combination of sound and speech with 15 participants in an interview study. Our findings show that supplementing speech with soundscapes is a promising concept that communicates information and induces emotions with a positive affect for the user experience of Voice Assistants. Besides a novel design approach and a collection of sound overlays, we provide four design implications to support voice interaction designers.
Article
Introduction: Voice has been used to project identity in dubbing, in order to portray appropriate role images auditorily in TV dramas. This study investigates the character voices of leading male characters in Empresses in the Palace. Methods: Different acoustic characteristics of character voices and the match between acoustics and role images are explored by comparing F0, CPP, and harmonic amplitude differences of the speech spectrum. Results: The voice quality of characters is related to their relative social status. The subordinates usually adopt a higher pitch or breathy voice, while the dominators use a lower pitch or modal/creaky voice. In addition, CPP, F0, and H1-A3 are the key acoustic indicators for distinguishing character voices. Discussion: These results reveal the acoustic characteristics of character voices of certain types and provide guidance for vivid dubbing.
Article
In this perspective paper we explore the question of how audible smiling can be integrated into speech synthesis applications. In human-human communication, smiling can serve various functions, such as signaling politeness or marking trustworthiness and other aspects that raise and maintain the social likeability of a speaker. In human-machine communication, however, audible smiling is nearly unexplored, although it could be an advantage in applications such as dialog systems. The rather limited knowledge of the details of audible smiling and of how to exploit them in speech synthesis applications is a great challenge. This is also true for modeling smiling in spoken dialogs and testing it with users. Thus, this paper argues for filling the research gaps in identifying the factors that constitute and affect audible smiling in order to incorporate it into speech synthesis applications. The major claim is that research should focus on the dynamics of audible smiling on various levels.
Chapter
Intensity is a crucial parameter characterizing the manifestation of phonological units in any language and an important cue for contrasting informative and uninformative parts of an utterance. This paper presents the results of an acoustic study of intensity patterns for syllables in Chinese commercial and social radio advertisements depending on the syllable’s information load. The material was recorded from 6 Mandarin Chinese speakers reading the ads from 5 Chinese radio stations. Measurements were performed in Praat. During the study, 3 factors were considered: advertisement type, information load, and gender. Five intensity manifestation patterns were discovered. The most frequent (51%) was intensity range vs intensity level, in which one parameter was higher on the informative syllable while the other was higher on the uninformative syllable. The pattern with both intensity range and intensity level higher on either informative or uninformative syllables accounted for 20% of all tokens and was more frequent in commercial advertisements and in female speech. Other patterns were less frequent. In only 4% of tokens was there no intensity difference. Higher intensity values were accompanied by increased duration in 52% of the syllables. These findings not only enhance our knowledge of speech mechanisms but also have implications for teachers and learners at Engineering Departments involved in CLIL. For lectures, it is crucial to attract and keep audience attention and to boost the persuasiveness of orally reported engineering results. The obtained data give them a tool to skillfully manipulate prosodic parameters for achieving this purpose. Applying this tool is important for students acting as both recipients and speakers of Chinese for engineering purposes. Keywords: Advertising discourse; Information load; Intensity; Syllable; Acoustic cue
Chapter
In this paper we outline research that highlights the issues of curriculum integrative strata. We show how the integrative design of a curriculum provides its special organization in terms of both statics and dynamics. We also suggest that integration of elements within the curriculum of a foreign-language course for future engineers can be implemented in several directions: a) multidimensional mastering of the content of education, that is, the possibility of forming communicative competence within a certain system of academic disciplines; b) the presence of interrelated disciplinary cycles, repeated further at a more complex level; c) reliance in teaching on interdisciplinary connections and several integrated syllabi and special courses; and d) solving general educational problems within the disciplines of the whole complex of academic subjects taught. In the paper we argue that in order to achieve integrity, continuity, and consistency of the curriculum, the integration of its elements can be carried out using four main strategies: fusion, insertion, correlation, and harmonization. We then present an illustrative study from an IT engineering department. The study was carried out in January-May 2019. Students of both control and experimental groups dealt with the same subject area, but they were involved in different curriculum modes, which entailed different target skills orientation, learning material, and techniques. The conclusions on the effectiveness of the integrative educational curriculum were drawn from an analysis of the results, performed after the initial, intermediate, and final stages of the experiment. Keywords: Integrative design of curriculum; Study of foreign languages; Communicative competence formation; Interdisciplinary course connections