Ivan Gonzalez Torre

Vicomtech · Human Speech and Language Technology Department

Doctor
AI researcher at Vicomtech. My current research focuses on the characterization of complex communication

About

30
Publications
4,933
Reads
172
Citations
Introduction
My core interest lies in the study of adaptive complex systems and the emergence of complex behaviour. My current research focuses on the characterization of human voice, natural language and speech processing, statistical patterns in language, and time series analysis. My topics of interest include the interaction between linguistics, social sciences, cognitive sciences and statistical physics; computational and quantitative linguistics; speech processing and ASR; and scaling phenomena.
Additional affiliations
June 2020 - present
Vicomtech
Position
  • Researcher
October 2019 - June 2020
Universitat Politècnica de Catalunya
Position
  • PostDoc Position
November 2018 - April 2019
University of California, Merced
Position
  • Fulbright grantee
Education
September 2015 - September 2019
Universidad Politécnica de Madrid
Field of study
  • Physics of Complex Systems

Publications (30)
Preprint
Full-text available
This paper describes our proposed integration system for the spoofing-aware speaker verification challenge. It consists of a robust spoofing-aware verification system that uses the speaker verification and antispoofing embeddings extracted from specialized neural networks. First, an integration network, fed with the test utterance's speaker verifica...
Article
Full-text available
This work presents three novel speech recognition architectures evaluated on the Spanish RTVE2020 dataset, employed as the main evaluation set in the Albayzín S2T Transcription Challenge 2020. The main objective was to improve the performance of the systems previously submitted by the authors to the challenge, in which the primary system scored the...
Article
Full-text available
Speech translation has traditionally been tackled with a cascade approach, chaining speech recognition and machine translation components to translate from an audio source in a given language into text or speech in a target language. Leveraging deep learning approaches to natural language processing, recent studies have explored the potential o...
Article
Background: Pause duration analysis is a common feature in the study of discourse in Alzheimer’s disease (AD) since this patient group has shown a consistent trend for longer pauses in comparison to healthy controls. This speech feature may also be helpful for early detection; however, studies involving patients at the pre-clinical, high-risk phase...
Article
Full-text available
Automatic speech recognition in patients with aphasia is a challenging task for which studies have been published in only a few languages. As might be expected, the systems reported in the literature within this field show significantly lower performance than those focused on transcribing non-pathological clean speech. This is mainly due to the difficulty of recogn...
Article
Full-text available
Menzerath’s law is a quantitative linguistic law which states that, on average, the longer a linguistic construct is, the shorter its constituents are. In contrast, Menzerath-Altmann’s law (MAL) is a precise mathematical power-law-exponential formula which expresses the expected length of the linguistic construct conditioned on the number of its co...
Preprint
Full-text available
Background: Pause duration analysis is a common feature in the study of discourse in Alzheimer's disease (AD) and may also be helpful for its early detection. However, studies involving patients with amnestic mild cognitive impairment (aMCI) have yielded varying results. Objectives: To characterize the probability density distribution of speech pau...
Article
Full-text available
MULTIFRAC is an ImageJ plugin that addresses, through a user-friendly interface, the characterization and multiscaling analysis of 2D and 3D binary and gray images. It is notably recommended for the study of complex void structure and scaling behavior in soil science as well as for the analysis of self-similar patterns in any segmented phase. The m...
Article
Characterization of the complex soil structure is one of the cornerstones of soil science, and pore space detection is a crucial step in this process. Synthetic soil image construction has proved to be an efficient resource for validating different binarization methods given that, unlike in the real world, ground truth information is known. In this wo...
Article
Full-text available
Throughout the twentieth century, studies in quantitative linguistics have shown the emergence of potential laws in languages, first in written texts and later in speech. These laws seem ubiquitous and robust, but why do they appear in language? Are they spurious results due to the arbitrariness of the segmentation of words or are they reall...
Article
Full-text available
In this work we consider Glissando Corpus—an oral corpus of Catalan and Spanish—and empirically analyze the presence of the four classical linguistic laws (Zipf’s law, Herdan’s law, Brevity law, and Menzerath–Altmann’s law) in oral communication, and further complement this with the analysis of two recently formulated laws: lognormality law and siz...
Thesis
Full-text available
Linguistic laws constitute one of the quantitatively measurable cornerstones of modern cognitive sciences and linguistics, and have been intensively researched during the last century, mainly in written corpora. The conclusions reached from the study of statistical patterns of language are therefore biased by the segmentation used, and characteristi...
Data
I. Additional details on the Buckeye corpus II. Additional results on the stochastic model of time duration II.A. Lognormality law for individual speakers II.B. Additional representations on the duration distribution of linguistic units II.C. Limit distributions of sums of independent Lognormals II.D. The error terms at word and BG levels III. Zipf...
Article
Full-text available
Physical manifestations of linguistic units include sources of variability due to factors of speech production which are by definition excluded from counts of linguistic symbols. In this work, we examine whether linguistic laws hold with respect to the physical manifestations of linguistic units in spoken English. The data we analyse come from a ph...
Poster
Full-text available
After a previous study on linguistic laws at the pre-phonemic level, in this work we verify that acoustically transcribed durations of linguistic units at several scales (phonemes, words and Breath Groups) comply with a log-normal distribution. To do this we have used a well-known corpus (the Buckeye Corpus) which contains conversational sp...
Poster
Full-text available
Brevity and frequency are two crucial factors in the processes of statistical learning in language. The compression principle has previously been used to explain the origin of Zipf’s law for the frequency of words. Here we use a model from information theory to also explain Zipf’s law of abbreviation, or the statistical tendency of more...
Chapter
Full-text available
As previously discussed in Chap. 5, soil structure is defined by the spatial arrangement of soil primary particles and aggregates. There is increasing evidence that quantitative characterization of the soil structure and of its heterogeneity and complexity holds the key to a deeper understanding of physical, chemical, and biological processes that...
Poster
Full-text available
In this work we compare the scaling behaviour of 2D Binary and Grey Values (GV) synthetic images with real soil GV images and their corresponding binarized images. Synthetic grey images were created using the Truncated Multifractal method (TMM) shown by Martín Sotoca et al. (2016, 2017). Real 2D grey images with different porosities are used for...
Poster
Full-text available
Linguistic laws constitute one of the quantitative cornerstones of modern cognitive sciences and have been routinely investigated in written corpora, or in the equivalent transcription of oral corpora. This means that inferences of statistical patterns of language in acoustics are biased by the arbitrary, language-dependent segmentation of the sign...
Article
Full-text available
Linguistic laws constitute one of the quantitative cornerstones of modern cognitive sciences and have been routinely investigated in written corpora, or in the equivalent transcription of oral corpora. This means that inferences of statistical patterns of language in acoustics are biased by the arbitrary, language-dependent segmentation of the sign...
Article
Recently, X-ray microtomography (μCT) has opened a new way to study soil pore structures. However, μCT data originally come as gray-scale images, and the choice of segmentation method used to binarize them has an important influence on the structure characterization. Three soil μCT 3D images, corresponding to ploughed soil with different tillage tools,...
Article
Full-text available
Soil structure may be defined as the spatial arrangement of soil particles, aggregates and pores. The geometry of each of these elements and their spatial arrangement have a great influence on the transport of fluids and solutes through the soil. Soil thin sections (STS) have been widely used to characterise them and more recently computed tomog...
Article
Full-text available
In this work, we study properties of texts from the perspective of complex network theory. Words in given texts are linked by co-occurrence and transformed into networks, and we observe that these display topological properties common to other complex systems. However, there are some properties that seem to be exclusive to texts; many of these prop...
Article
Full-text available
This paper aims to demonstrate the applicability of the Visual Graph algorithm to the characterization of surface texture in manufactured parts. An experimental methodology has been designed for this purpose. Several surfaces have been manufactured with two different end-mills and using a set of different feed values. On the one hand, typical va...
Thesis
Full-text available
This work focuses on studying soil structure as a complex system, characterizing it through its multiscaling behaviour. The soil performs important functions as a medium for plant growth, water storage and modifier of the atmosphere, and it is a habitat for organisms. Soil structure can be modelled as the spatial arrangement of soil particles, aggregates...
Article
Full-text available
In this work, we show how number theoretical problems can be fruitfully approached with the tools of statistical physics. We focus on g-Sidon sets, which describe sequences of integers whose pairwise sums are different, and propose a random decision problem which addresses the probability of a random set of k integers to be g-Sidon. First, we provi...
Article
Full-text available
Since the beginning of this century, many complex systems have been studied using Complex Network Theory. Social, biological and technological systems have been viewed as networks where nodes represent system elements and edges represent the relationships between them. In this work we study languages from this perspective. Transforming texts in networks...

Projects (3)
Project
Creation of a (Castilian Spanish) corpus of oral narratives by subjects with multi-domain amnestic MCI with memory encoding deficits and memory retrieval only, as well as mild-to-moderate Alzheimer's dementia and age-matched healthy controls. Automatic extraction of linguistic features with the purpose of developing ML models for the classification of the different clinical groups, in addition to the exploration of the links between their speech and neuropsychological profiles.
Project
Characterise the complex structure of pore distribution in soils. Apply measures such as multifractal analysis, lacunarity or configuration entropy to the study of 2D and 3D soil images.
Project
Study of linguistic laws at different scales. Origin of linguistic laws. Characterisation of human voice.