Francesca Frontini

  • PhD
  • Professor (Associate) at Italian National Research Council

About

80
Publications
9,777
Reads
410
Citations
Current institution
Italian National Research Council
Current position
  • Professor (Associate)
Additional affiliations
June 2015 - August 2016
Italian National Research Council
Position
  • Researcher
September 2014 - May 2015
Sorbonne University
Position
  • PostDoc Position
October 2006 - April 2010
University of Pavia
Position
  • PhD Student
Education
October 2006 - May 2010
University of Pavia
Field of study
  • Linguistics

Publications

Article
Full-text available
In this article, we introduce a new framework, the Intensional–Ontological Model (IOM), for representing meaning, and especially for representing semantic change, in linguistic linked data resources. This framework, which makes use of previous work in the literature on lexical semantics and ontologies, is intended to help clarify what we mean when...
Conference Paper
Full-text available
Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has either focused on analysing lexical semantic relations in word embeddings or probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent...
Article
Full-text available
Audiobooks are a common part of everyday life, and they are also used in foreign language learning. The article examines audiobooks as a variant of listening comprehension activities, using a sample of secondary school and university students. In addition to their experience with audiobooks, their attitude t...
Article
Full-text available
CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the field of the Social Sciences and Humanities in carrying out language-related research. This contribution provides an overview of the entire infrastructure with a particular focus on tool interoperabili...
Conference Paper
This poster describes the objectives of the H2IOSC project (Humanities and Heritage Open Science Cloud), which aims to build a federated and inclusive cluster of research infrastructures (RIs) in the ESFRI domain of social and cultural innovation, intended to support researchers across the various disciplines in the fields of the humanities, language technologies and cultural...
Conference Paper
We present AcTo, a network of integrated projects for the development of language resources and tools for Medieval Occitan. This abstract illustrates the resources in the network, as well as the first steps towards their integration, aiming towards the harmonisation and interoperability of NLP and lexical resources for the annotation of digital edi...
Chapter
Full-text available
CLARIN stands for “Common Language Resources and Technology Infrastructure”. In 2012 CLARIN ERIC was established as a legal entity with the mission to create and maintain a digital infrastructure to support the sharing, use, and sustainability of language data (in written, spoken, or multimodal form) available through repositories from all over Eur...
Conference Paper
Full-text available
This paper describes the process of acquisition, cleaning, interpretation, coding and linguistic annotation of a collection of parliamentary debates from the Senate of the Italian Republic covering the COVID-19 pandemic emergency period and a former period for reference and comparison according to the CLARIN ParlaMint prescriptions. The corpus cont...
Preprint
Tool criticism has made us aware that the digital tools widely distributed in Digital Humanities have the power to reify. Therefore, the community needs a handle for gauging their validity, and a capacity for producing plausibility. But tools are not ‘just tools.’ The current panorama in Digital Literary Studies (DLS) presents a plethora of instrum...
Article
Full-text available
This study investigates listeners' preferences regarding audiobook voices. Samples of 8 male and 7 female voices were extracted from different audiobooks and analysed. A survey was carried out to collect the views of 69 listeners, who answered questions about vocal characteristics. ...
Conference Paper
Full-text available
Over the course of the last few years, lexicography has witnessed the burgeoning of increasingly reliable automatic approaches supporting the creation of lexicographic resources such as dictionaries, lexical knowledge bases and annotated datasets. In fact, recent achievements in the field of Natural Language Processing and particularly in Word Sens...
Conference Paper
Full-text available
In this paper, the authors present a French Mediated Digital Discourse corpus, 88milSMS (http://88milsms.huma-num.fr, https://hdl.handle.net/11403/comere/cmr-88milsms). Efforts were undertaken over the years to ensure its publication according to the best practices and standards of the community, thus guaranteeing compliance with FAIR principles and...
Chapter
We present the MeDO project, aimed at developing resources for text mining and information extraction in the wastewater domain. We developed a specific Natural Language Processing (NLP) pipeline named WEIR-P (WastewatEr InfoRmation extraction Platform) which identifies the entities and relations to be extracted from texts, pertaining to information...
Article
Full-text available
This paper provides an introduction to the contributions presented in this thematic issue dedicated to the spatial humanities. Three main themes are addressed: (1) the processing of spatial information in textual corpora resulting from work in the human and social sciences, mainly in literary studies; (2) problems of acquisition, spatialisation and...
Article
Full-text available
The publication of Nan Z. Da's study in Critical Inquiry has triggered a debate about the methodological and conceptual dimensions of digitally assisted inquiry in literary studies. Nan Z. Da's fundamental critique of what she calls "Computational Literary Studies" addresses the work of the international Special Interest Group "Digital Literary Styli...
Conference Paper
Abstract. The project "Mégadonnées, données liées et fouille de données pour les réseaux d’assainissement" (MeDo; in English, "Big data, linked data and data mining for wastewater networks") aims to exploit the big data available on the web to document the geometry and history of a wastewater network, by combining different data mining techniques and increasing the number of sources analys...
Conference Paper
Abstract. In this article, we propose a processing pipeline based on two existing tools, one for named entity recognition and the other for named entity resolution. We then present the evaluation and adaptation of these systems to the analysis of texts from 19th-century French literature....
Article
This paper discusses current open data science and archive principles and issues and their applicability to speech and oral archives. Firstly, a definition of speech and oral archives is provided: they represent a rather tricky object of study not only in language related studies but also in the social sciences and humanities. Secondly, we introduc...
Chapter
This work uses computational approaches to study literature. It addresses the question of characterisation in theatrical plays, concentrating on the work of Molière, and trying to identify distinctive traits in the “voices” of some of the famous characters created by the French playwright. The technique used adopts a syntagmatic approach, targeting...
Conference Paper
In this article we describe our ongoing attempts to use the Semantic Web Rule Language (SWRL) to model the morphological layer of a wide-coverage Italian lexical resource, Parole-Simple-Clips (PSC), specifically the subset of PSC dealing with Italian noun morphology. After giving a brief introduction to SWRL and to Italian noun morphology, we go on...
Article
A growing number of applications in the Digital Humanities rely on Linked Data for the semantic enrichment of digital collections by means of URIs, typically to provide background information about the authors, works of art and historical places mentioned in these collections. In this sense, Named Entity Linking (NEL) is the task of automatically assigning the app...
Article
Full-text available
On 1 October 2015, the MIUR signed Italy's accession to CLARIN-ERIC, the research infrastructure that offers language resources and technologies for the language sciences and the social sciences and humanities. This article aims to give the Italian community a broad overview of CLARIN: its mission, its pillars, its se...
Chapter
This chapter presents a computer platform supporting a Marine Information and Knowledge System based on a repository that gathers, classifies and structures marine scientific literature and data, guaranteeing their accessibility by means of standard protocols. This requires access to quality-controlled data and to information that is provided in...
Conference Paper
This paper illustrates the transformation of GeoNames’ ontology concepts, with their English labels and glosses, into a GeoDomain WordNet-like resource in English, its translation into Italian, and its linking to the existing generic WordNets of both languages. The paper describes the criteria used for the linking of domain synsets to each other an...
Article
Full-text available
This paper proposes a graph-based Named Entity Linking (NEL) algorithm named REDEN for the disambiguation of authors’ names in French literary criticism texts and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL as reported in the current state of the art, namely, cand...
Article
Full-text available
In this paper we describe ongoing work in the restructuring of a tagset originally organised as a taxonomy and used to annotate literary themes and motifs in a corpus of classical works of poetry from a number of different traditions. We show how such a tagset can be rendered more efficient and useful through the appropriation of ideas and techniqu...
Article
Full-text available
This paper aims to discuss the challenges and benefits of the annotation of place names in literary texts and literary criticism. We shall first highlight the problems of encoding spatial information in digital editions using the TEI format by means of two manual annotation experiments and the discussion of various cases. This will lead to the ques...
Conference Paper
We present ongoing research whose goal is to assess the quality of Linked Data for its usage in domain-specific Named Entity Linking (NEL). NEL is the task of assigning appropriate referents, typically a Uniform Resource Identifier (URI), to mentions of entities (e.g. persons or places) identified in textual documents. Nowadays, many of these approac...
Conference Paper
The present article discusses first experiments in toponym linking of Modern French digital editions aiming to provide an external referent to Linked Data sources. We have so far focused on testing two knowledge bases - French DBpedia and Geonames - for recall. Results highlight quality issues in these data sets for usage in NLP tasks in domain-spe...
Conference Paper
This paper proposes a graph-based algorithm baptized REDEN for the disambiguation of authors’ names in French literary criticism texts and scientific essays from the 19th century. It leverages knowledge from different Linked Data sources in order to select candidates for each author mention, then performs fusion of DBpedia and BnF individuals into...
Article
Full-text available
Published in the abstract book of the 8th International Corpus Linguistics Conference (CL2015), Lancaster University 21-24 July 2015
Conference Paper
We present REDEN, a tool for graph-based Named Entity Linking that allows for the disambiguation of entities using domain-specific Linked Data sources and different configurations (e.g. context size). It takes TEI-annotated texts as input and outputs them enriched with external references (URIs). The possibility of customizing indexes built from va...
Conference Paper
This paper proposes a graph-based methodology for automatically disambiguating authors’ mentions in a corpus of French literary criticism. Candidate referents are identified and evaluated using a graph-based named entity linking algorithm, which exploits a knowledge-base built out of two different resources (DBpedia and the BnF linked data). The al...
Article
This paper describes the publication and linking of (parts of) PAROLE SIMPLE CLIPS (PSC), a large scale Italian lexicon, to the Semantic Web and the Linked Data cloud using the lemon model. The main challenge of the conversion is discussed, namely the reconciliation between the PSC semantic structure which contains richly encoded semantic informati...
Article
Full-text available
In this contribution, we present a computational stylistic study and comparison of classic French literary texts based on a data-driven approach where discovering interesting linguistic patterns is done without any prior knowledge. We propose an objective measure capable of capturing and extracting meaningful stylistic syntactic patterns from a giv...
Article
The MAPS (Marine Planning and Service Platform) project is a development of the Marine project (Ricerca Industriale e Sviluppo Sperimentale Regione Liguria 2007-2013) aiming at building a computer platform for supporting a Marine Information and Knowledge System, as part of the data management activities. One of the main objectives of the project is...
Chapter
CLiC-it 2015 is held in Trento on December 3-4 2015, hosted and locally organized by Fondazione Bruno Kessler (FBK), one of the most important Italian research centers in CL. The organization of the conference is the result of a fruitful joint effort of different research groups (Università di Torino, Università di Roma Tor Vergata a...
Conference Paper
Full-text available
In this paper we propose a model, called lemonDIA, for representing lexical semantic change using the lemon framework and based on the ontological notion of the perdurant. Namely we extend the notion of sense in lemon by adding a temporal dimension and then define a class of perdurant entities that represents a shift in meaning of a word and which...
Conference Paper
Full-text available
A complete picture of currently available language resources and technologies for the under-resourced languages of Europe is still lacking. Yet this would help policy makers, researchers and developers enormously in planning a roadmap for providing all languages with the necessary instruments to act as fully equipped languages in the digital era. I...
Conference Paper
Full-text available
In this paper we focus on the creation of general-purpose (as opposed to domain-specific) polarity lexicons in five languages: French, Italian, Dutch, English and Spanish using WordNet propagation. WordNet propagation is a commonly used method to generate these lexicons as it gives high coverage of general purpose language and the semantically rich...
Conference Paper
Full-text available
Action verbs have many meanings, covering actions in different ontological types. Moreover, each language categorizes action in its own way. One verb can refer to many different actions and one action can be identified by more than one verb. The range of variations within and across languages is largely unknown, causing trouble for natural language...
Conference Paper
Full-text available
Action verbs express important information in a sentence and they are the most frequent elements in speech, but they are also one of the most difficult parts of the lexicon to learn for L2 language learners, because languages segment these concepts in very different ways. The two sentences "Mary folds her shirt" and "Mary folds her arms" refer to tw...
Conference Paper
Full-text available
In the last 20 years dictionaries and lexicographic resources such as WordNet have started to be enriched with multimodal content. Short videos depicting basic actions support the user’s need (especially in second language acquisition) to fully understand the range of applicability of verbs. The IMAGACT project has among its results a repository of...
Conference Paper
Full-text available
This paper describes the development of a web-service tool for the automatic extraction of multi-word expression lexicons, which has been integrated in a distributed platform for the automatic creation of linguistic resources. The main purpose of the work described is thus to provide a (computationally "light") tool that produces a full lexical re...
Article
Full-text available
Efficient access to information is crucial in the work of organizations that require decision making in emergency situations. This paper gives an outline of GLOSS, an integrated system for the analysis and retrieval of data in the environmental and public security domain. We shall briefly present the GLOSS infrastructure and its use, and how semant...
Conference Paper
Full-text available
Action verbs, which are highly frequent in speech, cause disambiguation problems that are relevant to Language Technologies. This is a consequence of the peculiar way each natural language categorizes action, i.e. it is a consequence of semantic factors. Action verbs are frequently “general”, since they extend productively to actions belonging to di...
Conference Paper
Full-text available
Action verbs are the least predictable linguistic type for bilingual dictionaries and they cause major problems for NLP technologies. This is not only because of language-specific phraseology, but is rather a consequence of the peculiar way each language categorizes events. In ordinary languages the most frequent action verbs are “general”, sinc...
Conference Paper
Full-text available
This paper presents the IMAGACT annotation infrastructure which uses both corpus-based and competence-based methods for the simultaneous extraction of a language independent Action ontology from English and Italian spontaneous speech corpora. The infrastructure relies on an innovative methodology based on images of prototypical scenes and will iden...
Conference Paper
Full-text available
This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborat...
Conference Paper
Full-text available
This paper presents the metadata schema for describing language resources (LRs) currently under development for the needs of META-SHARE, an open distributed facility for the exchange and sharing of LRs. An essential ingredient in its setup is the existence of formal and standardized LR descriptions, cornerstone of the interoperability layer of an...
Article
Full-text available
Techniques developed for synchronic text classification problems are applied to a significantly diachronic dataset. The scale of the temporal categories appears to matter. The problem addressed is that of using automated text classification methods to temporally locate The Donation of Constantine. The results reported do not contradict the analysis...
Article
Full-text available
Our work explores the advantages of adopting a strict form-to-function perspective when annotating learner corpora. Hopefully, such a perspective provides both Foreign Language Teaching (FLT) and Second Language Acquisition (SLA) researchers with insights not relating to learners' errors, but to some systematic features of interlanguage (IL). A s...
