Fabio Tamburini

University of Bologna | UNIBO · Department of Classical Philology and Italian Studies FICLIT

MS & PhD Computer Science




Publications (111)
Download here: http://amsacta.unibo.it/7115/1/CLUB_WPL_volume6_2022.pdf This sixth volume of the series “CLUB Working Papers in Linguistics” collects some of the contributions presented during the initiatives organised by the Circolo Linguistico dell’Università di Bologna in the academic year 2020-2021. Dating back to the official programme are the first...
Full-text available
Ancient undeciphered scripts present problems of different nature, not just tied to linguistic identification. The undeciphered Cypro-Minoan script from second millennium BCE Cyprus, for instance, currently does not have a standardized, definitive inventory of signs, and, in addition, stands divided into three separate subgroups (CM1, CM2, CM3), wh...
Conference Paper
Full-text available
The application of machine learning techniques to ancient writing systems is a relatively new idea, and it poses interesting challenges for researchers. One particularly challenging aspect is the scarcity of data for these scripts, which contrasts with the large amounts of data usually available when applying neural models to computational linguist...
Full-text available
Purpose: Attention has recently been paid to Clinical Linguistics for the detection and support of clinical conditions. Many works have been published on the “linguistic profile” of various clinical populations, but very few papers have been devoted to linguistic changes in patients with eating disorders. Patients with Anorexia Nervosa (AN) share si...
Full-text available
This paper presents work in progress for the creation of a Large Vocabulary Automatic Speech Recogniser for Italian using NVIDIA NeMo. Thanks to this package, we were able to build a reliable recogniser for adults’ speech by fine tuning the English model provided by NVIDIA and rescoring it with powerful neural language models, obtaining very good p...
Full-text available
Digital Linguistic Biomarkers extracted from spontaneous language productions proved to be very useful for the early detection of various mental disorders. This paper presents a computational pipeline for the automatic processing of oral and written texts: the tool enables the computation of a rich set of linguistic features at the acoustic, rhythm...
Full-text available
The increasing interest in various types of conversational interfaces has been supported by a progressive standardization of the technological frameworks used to build them. However, the landscape of available methodological frameworks for designing conversations is much more fragmented. We propose a highly generalizable methodology for designing c...
Full-text available
Purpose: attention has recently been paid to Clinical Linguistics for the detection and support of clinical conditions. Many works have been published on the “linguistic profile” of various clinical populations, but very few papers have been devoted to linguistic changes in patients with eating disorders. Patients with Anorexia Nervosa (AN) share s...
Almost 50 million people worldwide were living with dementia in 2018, and the number will double every 20 years. The effectiveness of existing pharmacologic treatments for the disease is limited to symptom control, and none of them are able to prevent, reverse or turn off the neurodegenerative process that leads to dementia; therefore, a prompt det...
Lingue e linguaggio (ISSN 1720-9331), Issue 1, January-June 2021. Affiliated institution: Università di Bologna (unibo). Copyright © by Società editrice il Mulino, Bologna. All rights reserved. For further information see https://www.rivisteweb.it. Licence: the article is made available to the user under licence for use exclusively...
Full-text available
Minoan Linear A is still an undeciphered script mainly used for administrative purposes on Bronze Age Crete. One of its most enigmatic features is the precise mathematical values of its system of numerical fractions. The aim of this article is to address this issue through a multi-stranded methodology that comprises palaeographical examination and...
Conference Paper
Full-text available
Recent works indicated the potential relevance of Natural Language Processing techniques for the detection of clinical conditions. This paper tries to address the issue in the Eating Disorder domain, by exploiting “linguistic biomarkers” for Anorexia Nervosa (AN) detection in female teenagers. We hypothesize that (i) disturbances in self-perceived...
Full-text available
This paper presents a new pitch tracking smoother based on deep neural networks (DNN). It leverages Long Short-Term Memories, a particular kind of recurrent neural network, for correcting pitch detection errors produced by state-of-the-art Pitch Detection Algorithms. The proposed system has been extensively tested using two reference benchmarks for...
Conference Paper
Full-text available
Text summarization has gained a considerable amount of research interest due to deep learning based techniques. We leverage recent results in transfer learning for Natural Language Processing (NLP) using pre-trained deep contextualized word embeddings in a sequence-to-sequence architecture based on pointer-generator networks. We evaluate our appro...
Full-text available
In this paper we present work that aims to test the most advanced, state-of-the-art syntactic dependency parsers based on deep neural networks (DNN) on Italian. We ran a large set of experiments using two Italian treebanks containing different text types downloaded from the Universal Dependencies project and propose a new solution based on e...
Full-text available
Background: The discovery of early, non-invasive biomarkers for the identification of “preclinical” or “pre-symptomatic” Alzheimer's disease and other dementias is a key issue in the field, especially for research purposes, the design of preventive clinical trials, and drafting population-based health care policies. Complex behaviors are natural ca...
On behalf of the Program Committee, a very warm welcome to the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018). This edition of the conference is held in Torino. The conference is locally organised by the University of Torino and hosted in its prestigious main lecture hall “Cavallerizza Reale”. The CLiC-it conference seri...
Full-text available
A better use of the increasing functional capabilities of home automation systems and Internet of Things (IoT) devices to support the needs of users with disability, is the subject of a research project currently conducted by Area Ausili (Assistive Technology Area), a department of Polo Tecnologico Regionale Corte Roncati of the Local Health Trust...
Conference Paper
Full-text available
The main aim of the project described in this paper is to develop an experimental low-cost system for environmental control through simplified user interfaces and voice control, to better respond to the needs of users with motor speech impairments (dysarthria). The project is currently being conducted by Area Ausili, a department of Polo Tecnologico...
Conference Paper
Full-text available
The series publishes the proceedings of the annual conference on Computational Linguistics (CLiC-it), which aims to provide a reference forum for discussion in the field of computational linguistics research. The proceedings include contributions on the automatic processing of language, comprising theoretical and methodological reflections on the topic...
Conference Paper
Full-text available
This paper presents some preliminary results of the OPLON project, which aimed at identifying early linguistic symptoms of cognitive decline in the elderly. This pilot study was conducted on a corpus composed of spontaneous speech samples collected from 39 subjects, who underwent a neuropsychological screening for visuo-spatial abilities, memory, langu...
The annual conference CLiC-it (“Italian Conference on Computational Linguistics”) is an initiative of the “Italian Association of Computational Linguistics” (AILC – www.ai-lc.it) which is intended to meet the need for a national and international forum for the promotion and dissemination of high-level original research in the field of Computati...
Full-text available
EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for the Italian language: since 2007 shared tasks have been proposed covering the analysis of both written and spoken language with the aim of enhancing the development and dissemination of resources and technologies for Italian. EVALITA is an initiative of the Itali...
Due to increased life expectancy, the prevalence of cognitive decline related to neurodegenerative diseases and to non-neurological conditions is increasing in western countries. As with other diseases, the burden might be reduced through personalized interventions delivered at early stages of the disease. Thus, there is an increasing demand, from...
Full-text available
In this paper we investigate whether it is possible to create a computational approach that allows us to distinguish topical tags (i.e. talking about the topic of a resource) and non-topical tags (i.e. describing aspects of a resource that are not related to its topic) in folksonomies, in a way that correlates with humans. Towards this goal, we col...
Prosodic prominence is commonly regarded as the perceptual salience of a linguistic unit relative to its environment. However, we are far from having a consensus on how it is measured subjectively and how it relates to objectively measurable acoustic events or linguistic structures such as lexical stress, prosodic focus, etc. Here we will concentra...
Conference Paper
Full-text available
This paper presents a work in progress on the design of a sentiment polarity classification system that participates in the EVALITA 2014 SENTIPOLC task. Although we have been working on the system implementation for only three months, the results are promising, as the system ranked 5th (out of 9) in the subjectivity detection task and 7th...
Full-text available
In this paper we investigate whether it is possible to create a computational approach that allows us to distinguish topical tags (i.e., talking about the topic of a resource) and non-topical tags (i.e., describing aspects of a resource that are not related to its topic) in folksonomies, in a way that correlates with humans. Towards this goal...
Full-text available
Prosodic prominence, a speech phenomenon by which some linguistic units are perceived as standing out from their environment, plays a very important role in human communication. In this paper we present a study on automatic prominence identification using Probabilistic Graphical Models, a family of Machine Learning Systems able to properly handle s...
This paper reports on the EVALITA 2011 Lemmatisation task, an initiative for the evaluation of automatic lemmatisation tools specifically developed for the Italian language. Although lemmatisation is often considered a by-product of a PoS-tagging procedure that does not cause any particular problem, there are a lot of specific cases, certainly in It...
Full-text available
This paper presents the AnIta-Lemmatiser, an automatic tool to lemmatise Italian texts. It is based on a powerful morphological analyser enriched with a large lexicon and some heuristic techniques to select the most appropriate lemma among those that can be morphologically associated to an ambiguous wordform. The heuristics are essentially based on...
Conference Paper
Full-text available
Regularities in the position and level of prosodic prominences associated with patterns of Information Structure are identified for some Italian varieties. The experiments' results suggest a possibly new structural hypothesis on the role and function of the main prominence in marking information patterns. (1) An abstract and merely structural, “topologic...
Conference Paper
Full-text available
This paper describes the development and evaluation of a prosody prediction module for unit selection speech synthesis that is based on the notion of perceptual prominence. We outline the design principles of the module and describe its implementation in the Bonn Open Synthesis System (BOSS). Moreover, we report results of perception experiments th...
Full-text available
In Italian appositive compounds like parola chiave 'keyword', the non-head constituent (N2) often undergoes a metaphorical interpretation and behaves like an adjective, emphasising a property of the head (N1). The main research question of the work described in this paper is: Are these really noun-clad adjectives? Specifically: (i) Do N2s "formall...
Full-text available
This paper presents ongoing research concerning the annotation of large corpora with morphological information. It aims at providing a general schema for inserting rich morphological information to enable complex corpus queries of word internal structure. Annotating real corpus data presents challenges that can hardly be managed with traditional...
Conference Paper
Full-text available
Developed using the principles of the Model-View-Controller architectural pattern, FolksEngine is a parametric search engine for folksonomies that allows us to test arbitrary search improvement algorithms by specifying them in three phases: expansion, where the original query is converted into multiple ones according to semantic rules associated to t...
Conference Paper
Full-text available
Semantic search engines rely on the existence of a rich set of semantic connections between the concepts associated to documents and those used for the queries. With folksonomies, this is not always guaranteed. Creating clusters of folksonomic tags around terms of controlled ontological vocabularies is a potentially sophisticated approach, but algo...
Full-text available
This paper presents an evolution of CORISTagger [1], a high-performance PoS-tagger for Italian developed at the University of Bologna. The system is composed of a second-order Hidden Markov Model tagger followed by a Transformation Based tagger. The use of such a stacked structure, paired with a powerful morphological analyser based on a large l...
Full-text available
We investigate the use of polymorphic categorial grammars as a model for parsing natural language. We will show that, despite the undecidability of the general model, a subclass of polymorphic categorial grammars, which we call linear, is mildly context-sensitive and we propose a polynomial parsing algorithm for them. An interesting aspect of the r...
Conference Paper
Full-text available
EVALITA 2007, the first edition of the initiative devoted to the evaluation of Natural Language Processing tools for Italian, provided a shared framework where participants' systems had the possibility to be evaluated on five different tasks, namely Part of Speech Tagging (organised by the University of Bologna), Parsing (organised by the Univ...
Full-text available
Perceptual prominence is an important indicator of a word's and syllable's lexical, syntactic, semantic and pragmatic status in a discourse. Its automatic annotation would be a valuable enrichment of large databases used in unit selection speech synthesis and speech recognition. While much research has been carried out on the interaction between pr...
Full-text available
We aim to automatically induce a PoS tagset for Italian by analysing the distributional behaviour of Italian words. To this end, we propose an algorithm that (a) extracts information from loosely labelled dependency structures that encode only basic and broadly accepted syntactic relations, namely Head/Dependent and the distinction of dependents in...
Full-text available
In this paper we present work in progress on the PoS annotation of an Italian Corpus (CORIS) developed at CILTA (University of Bologna). We aim to automatically induce the PoS tagset by analysing the distributional behaviour of Italian words by relying only on theory-neutral linguistic knowledge. To this end, we propose an algorithm that derives a...
Conference Paper
Full-text available
This paper presents a follow up of a study on the automatic detection of prosodic prominence in continuous speech. Prosodic prominence involves two different prosodic features, pitch accent and stress, that are typically based on four acoustic parameters: fundamental frequency (F0) movements, overall syllable energy, syllable nuclei duration and mi...
Full-text available
A precise identification of prosodic phenomena and the construction of tools able to properly manage such phenomena are essential steps to disambiguate the meaning of certain utterances. In particular they are useful for a wide variety of tasks: automatic recognition of spontaneous speech, automatic enhancement of speech-generation systems, solving...
Full-text available
The increasing demand for linguistic resources consisting of substantial amounts of data, such as large corpora, presents the challenge of building computational infrastructures capable of handling unprecedented amounts of information. One possible solution is the sharing of high-level, linguistically motivated and carefully balanced corpora for bu...
Full-text available
This paper presents a study on the automatic detection of prosodic prominence in continuous speech, with particular reference to American English, but with good prospects of application to other languages. Perceptual prosodic prominence is supported by two different prosodic features: pitch accent and stress. Pitch accent is acoustically connected...
Full-text available
In this abstract we will present work in progress on the annotation of Italian corpora carried out at the Interfaculty Center for Theoretical and Applied Linguistics (CILTA), University of Bologna. The project aims at tagging the 100-million-word synchronic corpus of contemporary Italian, CORIS/CODIS, with syntactic information. In particular, w...
Conference Paper
Full-text available
This paper presents work in progress on the automatic detection of prosodic prominence in continuous speech. Prosodic prominence involves two different phonetic features: pitch accents, connected with fundamental frequency (F0) movements and syllable overall energy, and stress, which exhibits a strong correlation with syllable nuclei duration and m...
Conference Paper
Full-text available
This paper presents work in progress on the automatic detection of prosodic prominence in continuous speech. Prosodic prominence involves two different phonetic features: pitch accents, connected with fundamental frequency (F0) movements and syllable overall energy, and stress, which exhibits a strong correlation with syllable nuclei duration and h...
Full-text available
A corpus of written Italian -- CORIS -- has been under construction at the Centre for Theoretical and Applied Linguistics of Bologna University (CILTA) since 1998 and will soon be completed and made available on-line. The project aims at creating a representative and sizeable general reference corpus of contemporary Italian designed to be easily ac...
Full-text available
this article together. As far as academic requirements are concerned, F. Tamburini takes official responsibility for sections 1, 2 and 4 and S. Paci for sections 3
Full-text available
This paper intends to present the main lines of work in progress based on this empirical approach to linguistic analysis. In particular, we focus our attention on some problems relating to the morpho-syntactic annotation of corpora
Full-text available
This paper presents work in progress on the automatic detection of prosodic prominence in continuous speech. Prosodic prominence involves two different phonetic features: pitch accents, connected with fundamental frequency (F0) movements and syllable overall energy, and stress, which exhibits a strong correlation with syllable duration and high-fre...
Full-text available
A representative corpus of written Italian -- CORIS -- constructed at the Centre for Theoretical and Applied Linguistics of Bologna University (CILTA) is available on-line. Considering the importance of the comparability of reference corpora in interlinguistic studies, a further corpus -- CODIS -- was designed. Aimed at specialist needs, CODIS pres...
the XML specifications of a certain exercise type into a web form (presently we have 15 on-line templates for different teaching needs), and an XML parser builds the final DHTML page, inserting all the necessary scripts, interface objects (buttons, boxes, menus, layers, etc.), images and links for the selected exercise type. The final page is ready...
this paper, sections 2 and 3 describe the corpus design and formatting as well as the tools used to access corpus data. Sections 4 and 5 discuss two case studies on the basis of the analysis carried out in the pilot corpus now available -- about 18 m.w. -- . We consider two semantic areas which can be seen as two ends of the same variational contin...
The analysis of special multilingual corpora is still in its infancy, but it may serve a particularly important role for the directions it offers both in cross-linguistic investigation and in the selection of the most typical features of text types and genres. To exemplify the information which can be obtained from corpus evidence, the paper report...
Full-text available
This paper presents a numerical algorithm which realises a translation, rotation and scale invariant transform for single object grey scale images. This kind of algorithm is valid for all those applications which need to deal with shape information, independent of the orientation, distance and position of an object. The algorithm we present takes i...
Full-text available
Introduction: Under the terms of the agreement with the Italian Ministry of Defence, the aim of the MISSILE Project was to produce a software package that would enable learners to acquire a basic knowledge of the English language and/or improve their existing knowledge and become familiar with military terminology for communicating with military pers...
Full-text available
The language centre of the University of Bologna has been involved for many years in teaching languages using multimedia software in self-access environments. What we needed was authoring tools to create web-based exercises, modules and courses for language learning. Such tools had to be flexible, and able to accommodate the necessary procedures fo...