Hugues Salamin’s research while affiliated with University of Glasgow and other places


Publications (28)


Negotiating over Mobile Phones: Calling or Being Called Can Make the Difference
  • Article
  • Full-text available

December 2014 · 558 Reads · 8 Citations · Cognitive Computation

Hugues Salamin

Mobile phones pervade our everyday life like no other technology, but the effects they have on one-to-one conversations are still relatively unknown. This paper focuses on how mobile phones influence negotiations, i.e., discussions where two parties try to reach an agreement starting from opposing preferences. The experiments involve 60 pairs of unacquainted individuals (120 subjects). They must make a "yes" or "no" decision on whether several objects increase the chances of survival in a polar environment or not. When the participants disagree about a given object (one says "yes" and the other says "no"), they must try to convince one another and reach a common decision. Since the subjects discuss via phone, one of them (selected randomly) calls while the other is called. The results show that the caller convinces the receiver in 70% of the cases (p value = 0.005 according to a two-tailed binomial test). Gender, age, personality and conflict handling style, measured during the experiment, fail to explain such a persuasiveness difference. Calling or being called appears to be the most important factor behind the observed result.
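The headline statistic can be reproduced in miniature with an exact two-tailed binomial test. The counts below (35 caller wins out of 50 disagreement episodes, i.e., 70%) are hypothetical, chosen only to illustrate the computation; the actual number of disagreement episodes is not reported in this summary.

```python
from math import comb

def binom_two_tailed(k, n):
    """Exact two-tailed binomial test against p = 0.5.

    Because Binomial(n, 0.5) is symmetric, the two-tailed p-value is
    twice the probability of an outcome at least as extreme as k.
    """
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts: caller wins 35 of 50 disagreements (70%)
p_value = binom_two_tailed(35, 50)
```

With these illustrative counts the p-value lands below the conventional 0.05 threshold, matching the qualitative conclusion of the abstract; the exact value depends on the true number of disagreement episodes.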


Fig. 3 Amplitude of elbow gesticulation in low and high rated lectures. The former present higher medians, suggesting a correlation between low ratings and frequent/wide gestures.
Predicting online lecture ratings based on gesturing and vocal behavior

June 2014 · 90 Reads · 20 Citations · Journal on Multimodal User Interfaces

Hugues Salamin · [...]

Nonverbal behavior plays an important role in any human-human interaction, and teaching, an inherently social activity, is no exception. So far, the effect of the nonverbal behavioral cues accompanying lecture delivery has been investigated only for traditional ex-cathedra lectures, where students and teachers are co-located. However, watching lectures online is becoming increasingly common, and in this new type of setting it is still unclear what the effect of nonverbal communication is. This article addresses the problem and proposes experiments performed over the lectures of a popular web repository ("Videolectures"). The results show that automatically extracted nonverbal behavioral cues (prosody, voice quality and gesturing activity) predict the ratings that "Videolectures" users assign to the presentations.


Fig. 1. The plots show how F 1 Score, Precision and Recall change as a function of the parameter λ, the weight adopted for the Language Model. The plots have been obtained for the five-fold protocol, but the Figures for the other setup show similar behaviors. 
Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

October 2013 · 261 Reads · 35 Citations

This article presents experiments on automatic detection of laughter and fillers, two of the most important nonverbal behavioral cues observed in spoken conversations. The proposed approach is fully automatic and segments audio recordings captured with mobile phones into four types of interval: laughter, filler, speech and silence. The segmentation methods rely not only on probabilistic sequential models (in particular Hidden Markov Models), but also on Statistical Language Models aimed at estimating the a priori probability of observing a given sequence of the four classes above. The experiments are speaker independent and performed over a total of 8 hours and 25 minutes of data (120 people in total). The results show that F1 scores up to 0.64 for laughter and 0.58 for fillers can be achieved.
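The figure caption above mentions a weight λ for the Language Model. A minimal sketch of how such a λ-weighted class-sequence prior can be combined with per-frame likelihoods in Viterbi decoding follows; the bigram prior, emission scores and λ value are illustrative assumptions, not the paper's actual trained models.

```python
import math

# The four interval classes from the paper
STATES = ["laughter", "filler", "speech", "silence"]

def viterbi(emissions, bigram_logp, lam=1.0):
    """Decode the most likely class sequence.

    emissions[t][s]      -- log-likelihood of class s at frame t
    bigram_logp[(p, s)]  -- log-probability of class s following p,
                            acting as the Language Model, weighted by lam
    """
    best = [{s: emissions[0][s] for s in STATES}]
    back = []
    for t in range(1, len(emissions)):
        cur, ptr = {}, {}
        for s in STATES:
            prev, score = max(
                ((p, best[-1][p] + lam * bigram_logp[(p, s)]) for p in STATES),
                key=lambda x: x[1])
            cur[s] = score + emissions[t][s]
            ptr[s] = prev
        best.append(cur)
        back.append(ptr)
    # Backtrack from the best final state
    path = [max(best[-1], key=best[-1].get)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]
```

Sweeping `lam` reproduces the kind of trade-off shown in the figure: a larger weight makes the decoder trust the class-sequence prior more than the acoustic evidence.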



The INTERSPEECH 2013 computational paralinguistics challenge: Social signals, conflict, emotion, autism

August 2013 · 3,029 Reads · 513 Citations

The INTERSPEECH 2013 Computational Paralinguistics Challenge provides for the first time a unified test-bed for social signals such as laughter in speech. It further introduces conflict in group discussions as a new task and deals with autism and its manifestations in speech. Finally, emotion is revisited as a task, albeit with a broader range of twelve enacted emotional states. In this paper, we describe these four Sub-Challenges, their conditions, baselines, and a new feature set produced by the openSMILE toolkit and provided to the participants.


Automatic recognition of personality and conflict handling style in mobile phone conversations

July 2013 · 34 Reads · 7 Citations

This article proposes experiments on the automatic recognition of personality traits and conflict handling style based on nonverbal communication. The tests are performed over the SSPNet-Nokia Corpus, a collection of 60 mobile phone calls (120 subjects in total) based on the Winter Survival Task. Nonverbal behavioral cues are extracted from speech (captured with the phone microphones) and motor activation (captured indirectly via the gyroscopes mounted on the phones). Support Vector Machines are then adopted to map the cues into two psychological constructs, namely the “Big Five” personality traits and the dimensions of the “Rahim Organizational Conflict Inventory”, the former capturing individual characteristics and the latter accounting for the subjects’ attitude towards conflict and disagreement. The results show that performances higher than chance to a statistically significant extent can be achieved for one personality trait (Neuroticism) and two conflict handling styles (Dominating and Obliging).
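The mapping from cues to constructs can be sketched as a standard SVM classification, assuming scikit-learn is available. The two-dimensional feature vectors below (standing in for, e.g., pitch variability and gyroscope motion energy) and the binary labels (e.g., a median split on a Neuroticism score) are entirely synthetic; the corpus features and annotations are not reproduced here.

```python
from sklearn.svm import SVC

# Hypothetical per-subject cue vectors and binary trait labels
X_train = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
y_train = [0, 0, 1, 1]

# Fit a linear SVM and classify two unseen subjects
clf = SVC(kernel="linear").fit(X_train, y_train)
pred = clf.predict([[0.15, 0.15], [0.95, 0.95]])
```

In practice one would cross-validate per speaker (the paper's experiments are speaker independent) and compare accuracy against the chance level of the label distribution.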


Towards Causal Modeling of Human Behavior

January 2013 · 9 Reads · Smart Innovation

This article proposes experiments on decision making based on the "Winter Survival Task", one of the scenarios most commonly applied in behavioral and psychological studies. The goal of the task is to identify, out of a predefined list of 12 items, those that are most likely to increase the chances of survival after the crash of a plane in a polar area. In our experiments, 60 pairs of unacquainted individuals (120 subjects in total) negotiate a common choice of the items to be retained, after each subject has performed the task individually. The results of the negotiations are analyzed in causal terms and show that the choices made by the subjects individually act as a causal factor with respect to the outcome of the negotiation.


Figure 1: Illustrating comparison of the three analysis methods, i.e. time aligned moving average (TAMA) based, utterance-based, and utterance-sensitive window based or HYBRID.  
Figure 2: Prosodic accommodation levels obtained for each prosodic parameter, at several anchor points (every 50 sec, computed for a window of 110sec) for Conversation 12.
Figure 3: Lexical accommodation levels obtained at several anchor points (every 50 sec, window of 110sec) for Conv. 12.
Investigating fine temporal dynamics of prosodic and lexical accommodation

January 2013 · 303 Reads · 19 Citations

Conversational interaction is a dynamic activity in which participants engage in the construction of meaning and in establishing and maintaining social relationships. Many studies have observed lexical and prosodic accommodation as important contributors to these dimensions of social interaction. However, while previous work has considered accommodation mechanisms at global levels (whole conversations, halves and thirds of conversations), this work investigates their evolution through repeated analysis at time intervals of increasing granularity, in order to analyze the dynamics of alignment in a spoken language corpus. Results show that the levels of both prosodic and lexical accommodation fluctuate several times over the course of a conversation.
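One simple way to score lexical accommodation at successive anchor points, as in the figures above, is vocabulary overlap within a sliding window of turns. The Jaccard measure and the window size below are illustrative assumptions, not the authors' exact method.

```python
def lexical_accommodation(turns_a, turns_b, window=3):
    """Jaccard overlap between the two speakers' vocabularies,
    computed over a sliding window of turns (one score per anchor)."""
    scores = []
    n = min(len(turns_a), len(turns_b))
    for i in range(n - window + 1):
        vocab_a = {w for t in turns_a[i:i + window] for w in t.split()}
        vocab_b = {w for t in turns_b[i:i + window] for w in t.split()}
        union = vocab_a | vocab_b
        scores.append(len(vocab_a & vocab_b) / len(union) if union else 0.0)
    return scores
```

Plotting the returned scores against the anchor index gives a curve of the kind shown in Fig. 3: fluctuating accommodation levels over the course of a conversation rather than a single global value.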



Citations (24)


... Taken together, these data suggest that the fundamental frequency is one of the acoustic-prosodic features most likely to be synchronized during conversational interactions. In this respect, and considering also speech rate convergence, Bonin et al. (2013) suggest the following (using the term accommodation to refer to the phenomenon we call convergence): ...

Reference:

Investigating fine temporal dynamics of prosodic and lexical accommodation
  • Citing Conference Paper
  • August 2013

... Other acoustic characteristics such as energy, intensity and speaking rate were subsequently established as markers of emotion [6]. These features, proposed as part of the Interspeech paralinguistic challenges, were high dimensional [7], [8]. The dimensionality issue was partially addressed by Eyben et al. [9], where a minimalist set ...

The INTERSPEECH 2013 computational paralinguistics challenge: social signals, conflict, emotion, autism

... The landscape of audio classification has undergone significant evolution, primarily driven by the adoption of deep learning models that transform raw audio signals into actionable insights [1]-[3]. Traditionally, convolutional neural networks (CNNs) have been at the forefront of this transformation, as their inherent spatial locality and translation equivariance have been effective on audio data represented as spectrogram images [4], [5]. In parallel, the success of architectures in the domain of natural language processing has paved the way for their integration into audio classification tasks, particularly through hybrid models that combine CNNs with self-attention mechanisms to enhance the models' ability to capture long-range dependencies [6]-[10]. ...

The INTERSPEECH 2013 computational paralinguistics challenge: Social signals, conflict, emotion, autism

... To improve the integration of the relationship network, a method that incorporates subtitle text information is advisable. Recently, some methods have employed conversations between roles to analyze their interactions [10], [11]. Conversations between people reflect the behavior of roles well, so their roles in the group can be properly recognized. ...

Role Recognition for Meeting Participants: an Approach Based on Lexical Information and Social Network Analysis

... The Vera am Mittag dataset [11] contains recordings from a reality show. The Canal 9 corpus [31] focuses on political debates. While these datasets are interesting from the content point of view, they do not provide the whole sensor range we are interested in. ...

Canal9: A database of political debates for analysis of social interactions
  • Citing Article
  • January 2009

... When it comes to computers, however, they are socially ignorant. This gap has led to the emergence of social signal processing (SSP) [71] that aims at providing computers with the ability to sense and understand human social signals. Analyzing facial expressions has ever been a core focus in this area [46,80,91], which has achieved tremendous progress. ...

Social Signal Processing: Understanding social interactions through nonverbal behavior analysis
  • Citing Conference Paper
  • June 2009

... Regarding persuasion or the ability to influence the other party, Yuan et al. (2003) report perceived advantages to influence other negotiators in video or audio plus text compared to text only negotiators. Vinciarelli et al. (2014) find that in phone negotiations, the caller is more persuasive than the receiver of the call. While there is no unambiguous picture about the relative occurrence of cooperative and competitive behavior in different communication media, van Es et al. (2004) report that a behavioral strategy change is easier accomplished in asynchronous (email) than synchronous (FTF) media. ...

Negotiating over Mobile Phones: Calling or Being Called Can Make the Difference

Cognitive Computation

... For instance, Stewart et al. [31] shared findings from a preliminary investigation into accidental changes in grip pressure on mobile devices, observed under both stationary laboratory conditions and while walking, with significant pressure fluctuations noted in each scenario. The study utilized the FSR-402 sensor in experimental setups, observing pressure variations from 0.1 N to 3-10 N, a range supported by the sensor and deemed ergonomically suitable for fingertip pressure examinations. ...

An exploration of inadvertent variations in mobile pressure input

... This evidence underscores that a teacher's gestures significantly contribute to the comprehension of verbal information and play an integral role in concept understanding. Literature on PT training has also found that self-reflective practices using video recording can significantly boost confidence and enhance teaching skills (Cheng et al., 2014; Pi et al., 2019; Xiao & Tobin, 2018; Yang et al., 2020). PTs can improve often-overlooked aspects of instruction, such as gesture, posture, and gaze, by reviewing their own teaching recordings (McCoy & Lynam, 2021; Xiao & Tobin, 2018). ...

Predicting online lecture ratings based on gesturing and vocal behavior

Journal on Multimodal User Interfaces

... Recently, methods using deep learning have been proposed to automatically detect laughter in audio [7,8,9]. These methods detect laughter by learning from datasets annotated with the locations of laughter [10,11] or their own datasets. ...

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations