
Rodolfo Delmonte
- Ph.D. Melbourne Australia
- Retired at Ca' Foscari University of Venice
I am currently working on Deep Learning for Meaning Understanding and on why using word embeddings is wrong.
About
157
Publications
37,759
Reads
871
Citations
Introduction
I have always been fascinated by highly elaborate linguistic products such as narratives, novels and poetry. These texts have a multilayered structure that makes them an attractive field for computational research. Text and language understanding, including inference and reasoning, which relies mostly on syntactic and semantic information processing, is at the heart of my research. I have always pursued it by simulating cognitive processing with a pipeline of interacting modules, GETARUNS.
Publications (157)
In this paper we explore ChatGPT's ability to produce a summary, a precis and/or an essay on the basis of excerpts from a novel, The Solid Mandala, by Nobel Prize-winning Australian writer Patrick White. We use a number of prompts to test a number of functions related to narrative analysis from the point of view of the “sujet”, the “fable”, and the style...
In this article, we focus on the association of sound and sense harmony in the collection of sonnets written by Shakespeare between the end of the 16th and the beginning of the 17th century and propose a new four-dimensional representation to visualize them by means of the system called SPARSAR. To compute the degree of harmony and disharmony, we automatically extracted...
In this paper we focus on the association of sound and sense harmony in the collection of sonnets written by Shakespeare between the end of the 16th and the beginning of the 17th century and propose a new four-dimensional representation to visualize them by means of the system called SPARSAR. To compute the degree of harmony and disharmony, we have automatically extracted th...
In this paper we present ongoing work on the creation of the Italian version of SPARSAR, a system created for the analysis and visualization of the poetic content (that is, prosodic, rhetorical and semantic) of English poems, starting from Elizabethan poetry. The conversion work has so far reached and completed a first milestone...
Poetic devices implicitly work towards inducing the reader to associate intended and expressed meaning to the sounds of the poem. In turn, sounds may be organized a priori into categories and assigned presumed meaning as suggested by traditional literary studies. To compute the degree of harmony and disharmony, I have automatically extracted the so...
This book explores the poetry of Francis Webb, who was both one of the greatest poets of the last century and a man who in the second half of his life went in and out of mental hospitals, where he also died. The first half of the book reproduces the contents of the first edition, which analyzes in detail the poetic language, its metaphors, symbols and...
Poetry is regarded as the highest linguistic form of expression of a language and its culture; for this reason it is, and will always be, a fundamental target of linguistic studies as well as of literary criticism. As far as computational studies are concerned, poetry has always been a difficult task to cope with, the more so today with the advancemen...
We assume that poetic devices have an implicit goal: producing an overall sound scheme that will induce the reader to associate intended and expressed meaning to the sound of the poem. Sounds may be organized into categories and assigned presumed meaning as suggested by traditional literary studies. In my work, I have extracted automatically the so...
In this paper we present a set of experiments carried out with BERT on a number of Italian sentences taken from the poetry domain. The experiments are organized on the hypothesis of a very high level of difficulty in predictability at the three levels of linguistic complexity that we intend to monitor: the lexical, syntactic and semantic levels. To test thi...
In this chapter we are concerned with cognitive models that may motivate the emotive and affective reactions in poetry reading which are responsible for aesthetic pleasure. To experiment with and verify our approach, we chose the collection of sonnets Shakespeare wrote toward the end of his life. We look into current cognitive theories related to work of art...
In this chapter I will be concerned with Language Understanding and the way in which implicit information is used to allow for efficient and effective communication. I will contrast this with the current hype in AI raised by supporters of the so-called self-learning approach and tackle the debated question of whether Deep Neural Networks "unders...
In this paper we present an experiment carried out with BERT on a small number of Italian sentences taken from two domains: newspapers and poetry. They represent two levels of increasing difficulty in predicting the masked word, which is what we intended to test. The experiment is organized on the hypothesis of increasing difficulty in...
In the use and creation of current Deep Learning Models the only number that is used for the overall computation is the frequency value associated with the current word form in the corpus, which is used to substitute it. Frequency values come in two forms: absolute and relative. Absolute frequency is used indirectly when selecting the vocabulary ag...
In this paper I will tackle the debated question of whether Deep Neural Networks "understand" Natural Language when they process it either for classification or for generation tasks. I will start by quoting work currently carried out in the field of explainable AI; then I will look into the internal procedures adopted by current DNNs to cope with th...
Understanding Natural Language, i.e. understanding words, sentences and paragraphs, summarizing texts and answering questions appropriately, is the most sought-after goal at the heart of the AI revolution, which aims at creating virtual assistants that can conduct a sensible and fruitful dialogue with their human interlocutors. No such goal can be accompl...
In this paper we present work carried out for the Ac-ComplIt task. ItVENSES is a system for syntactic and semantic processing that relies on the parser for Italian called ItGetaruns to analyse each sentence. In previous EVALITA tasks we only used semantics to produce the results. In this year's EVALITA, we used both a statistically based approach a...
In this paper we present the results obtained with ItVENSES, a system for syntactic and semantic processing that relies on the parser for Italian called ItGetaruns to analyse each sentence. In previous EVALITA tasks we only used semantics to produce the results. In this year's EVALITA, we used both a fully and a mixed statistically based approach and...
Almost eight years after his untimely death, the scientific contribution of Emanuele Pianta still appears significant to us, in particular for the variety of the topics he dealt with and for his capacity to move cross-disciplinarily between different areas of computational linguistics. Today, retracing the steps of Emanuele’s scientific career has...
This file contains the output of our system, GETARUN (General Text and Reference Understanding), for the sentences created from Italian poems discussed in the accompanying paper. The output is sequential and starts from tokenization, sentence splitting and tagging. Then it shows annotated c-structure representation and the mapping into f-structure,...
In this paper we present ongoing work to produce an expressive TTS reader that can be used both in text and dialogue applications. The system called SPARSAR has been used to read (English) poetry so far but it can now be applied to any text. The text is fully analyzed both at phonetic and phonological level, and at syntactic and semantic level. In...
In this paper we study, analyse and comment on rhetorical figures present in some of the most interesting poetry of the first half of the twentieth century. These figures are at first traced back to some famous poets of the past and then compared to classical Latin prose. Linguistic theory is then called in to show how they can be represented in syntactic...
EVALITA is a periodic evaluation campaign of Natural Language Processing (NLP) and speech tools for the Italian language. The general objective of EVALITA is to promote the development of language and speech technologies for the Italian language, providing a shared framework where different systems and approaches can be evaluated in a consistent ma...
This paper presents computational work to detect satire/sarcasm in long commentaries on Italian politics. It uses the lexica extracted from the manual annotation, based on Appraisal Theory, of some 30K words of text. The underlying hypothesis is that using this framework it is possible to precisely pinpoint ironic content through the deep semantic ana...
In this paper we present four experiments on the analysis of Italian social media texts using a linguistically-based semantic approach. The experiments are, respectively: two on newspaper articles about two political crises, one on a Twitter corpus centered on political themes, and one on a case study of strategic plan programs of candidates to the pre...
Shakespeare's Sonnets have been studied by literary critics for centuries after their publication. However, studies based on computational analyses and quantitative evaluations have started to appear only recently, and they are few. In our exploration of the Sonnets we have used the output of SPARSAR which allows a full-fledged ling...
In this chapter I will be concerned with what characterizes human language and the parser that computes it in real communicative situations. I will start by discussing and dismissing the Hauser et al. (2002) (HC&F) disputed claim that the “only uniquely human component of the faculty of language” be “recursion”. I will substantiate my rejection of...
In this paper we present ongoing work for the correction of Extended WordNet (XWN), the most extensive freely downloadable resource of Logical Forms (LFs), created by the Human Language Technology Research Institute (HLTRI) of the University of Texas at Dallas (UTD). In a previous paper we reported on the type and number of errors detected in the 140,000 entries of...
In this paper we present a system for modality detection which is then used for Subjectivity and Factuality evaluation. The system has been tested lately on a task for Subjectivity and Irony detection in Italian tweets (http://www.di.unito.it/~tutreeb/sentipolc-evalita14/index.html), where the performance was 10th and 4th, respectively, ove...
State-of-the-art parsers are currently trained on versions of the Penn Treebank converted into dependency representations, which however do not include null elements. This is done to facilitate structural learning and to prevent the probabilistic engine from postulating the existence of deprecated null elements everywhere (see [19]). However it is a fact that...
CLiC-it 2015 is held in Trento on December 3-4 2015, hosted and locally organized by Fondazione Bruno Kessler (FBK), one of the most important Italian research centers for CL. The organization of the conference is the result of a fruitful joint effort of different research groups (Università di Torino, Università di Roma Tor Vergata a...
State-of-the-art parsers are currently trained on versions of the Penn Treebank converted into dependency representations, which however do not include null elements. This is done to facilitate structural learning and to prevent the probabilistic engine from postulating the existence of deprecated null elements everywhere (see [15]). However it is a fact that i...
GETARUN, the system for text understanding developed at the University of Venice, is equipped with three main modules: a lower module for parsing where sentence strategies are implemented; a middle module for semantic interpretation and discourse model construction which is cast into Situation Semantics; and a higher module where reasoning and gene...
In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the...
The success of a newspaper article with public opinion can be measured by the degree to which the journalist is able to report and modify (if needed) attitudes, opinions, feelings and political beliefs. We present a symbolic system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools (also available i...
This paper presents a multilingual system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools with the intent to characterize journalistic discourse. The method could help journalists by highlighting pragmatic aspects of the speaker's discursive abilities.
As is known, the success of a newspaper article with public opinion can be measured by the degree to which the journalist is able to report and modify (if needed) attitudes, opinions, feelings and political beliefs. We present a symbolic system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools w...
The success of a newspaper article with public opinion can be measured by the degree to which the journalist is able to report and modify (if needed) attitudes, opinions, feelings and political beliefs. We present a symbolic system for Italian, derived from GETARUNS, which integrates a range of natural language processing tools with the intent t...
We present a system for Question Answering which computes a prospective answer from Logical Forms produced by a full-fledged NLP for text understanding, and then maps the result onto schemata in SPARQL to be used for accessing the Semantic Web. As an intermediate step, and whenever there are complex concepts to be mapped, the system looks for a cor...
In this paper, we present our solution for argumentative analysis of call center conversations in order to provide useful insights for enhancing Customer Interaction Analytics to a level that will enable more qualitative metrics and key performance indicators (KPIs) beyond the standard approach used in Customer Interaction Analytics. These metrics...
We present an approach to lemmatization based on exhaustive morphological analysis and on the use of external knowledge sources to help disambiguation, which is the most relevant issue to cope with. Our system GETARUNS was not concerned with lemmatization directly and used morphological analysis only as a backoff solution in case the word was not retrieved i...
In this chapter we present the major challenges of a new trend in business analytics, namely Interaction Mining. With the proliferation of unstructured data as the result of people interacting with each other using digital networked devices, classical methods in text business analytics are no longer effective. We identified the causes of their fail...
Logical Forms are an exceptionally important linguistic representation for highly demanding semantically related tasks like Question Answering and Text Understanding, but their automatic production at runtime is highly error-prone. The use of a tool like XWNet and other similar resources would be beneficial for the whole NLP community and beyond....
We present work carried out to simulate an interlanguage for English speakers learning Italian with a speech synthesizer made available by Apple. The lexicon is made of the most frequent 30,000 word forms of Italian as extracted from corpora and other similar Frequency Lists. The synthesizer has been filtered by a program in C-language that address...
We present an experiment evaluating the contribution of a system called GReG for reranking the snippets returned by Google’s search engine in the 10 hits presented to the user and captured by the use of Google’s API. The evaluation aims at establishing whether or not the introduction of deep linguistic information may improve the accuracy of Google...
In this paper, we address the issue of automatically identifying null instantiated arguments in text. We refer to Fillmore's theory of pragmatically controlled zero anaphora (Fillmore, 1986), which accounts for the phenomenon of omissible arguments using a lexically-based approach, and we propose a strategy for identifying implicit arguments in a t...
Interaction mining is about discovering and extracting insightful information from digital conversations, namely those human–human information exchanges mediated by digital network technology. We present in this article a computational model of natural arguments and its implementation for the automatic argumentative analysis of digital conversation...
Abstract summarization of conversations is a very challenging task that requires full understanding of the dialog turns, their roles and relationships in the conversations. We present an efficient system, derived from a fully-fledged text analysis system that performs the necessary linguistic analysis of turns in conversations and provides useful a...
In this paper, we present our solution for pragmatic analysis of call center conversations in order to provide useful insights for enhancing Call Center Analytics to a level that will enable new metrics and key performance indicators (KPIs) beyond the standard approach. These metrics rely on understanding the dynamics of conversations by highlight...
We present a system for Question Answering which computes a prospective answer from Logical Forms produced by a full-fledged NLP for text understanding, and then maps the result onto schemata in SPARQL to be used for accessing the Semantic Web. It is just by the internal structure of the Logical Form that we are able to produce a suitable and meani...
This document reports the process of extending MorphoPro for Venetan, a lesser-used language spoken in the North-Eastern part of Italy. MorphoPro is the morphological component of TextPro, a suite of tools oriented towards a number of NLP tasks. In order to extend this component to Venetan, we developed a declarative representation of the morphologi...
The system to spot INIs, DNIs and their antecedents is an adaptation of VENSES, a system for semantic evaluation that has been used for RTE challenges in the last 6 years. In the following we will briefly describe the system and then the additions we made to cope with the new task. In particular, we will discuss how we mapped the VENSES analysis to...
In this paper we will present work carried out to scale up the system for text understanding called GETARUNS, and port it to be used in dialogue understanding. The current goal is that of extracting automatically argumentative information in order to build argumentative structure. The long term goal is using argumentative structure to produce autom...
In this paper we will be concerned with the role played by prosody in language learning and with the speech technology already available, as commercial product or as prototype, that is capable of helping language learners improve their knowledge of a second language from the prosodic point of view. The paper has been divided into two...
Semantic processing represents the new challenge for all applications that require text understanding, such as Q/A. In this paper we will highlight the need to couple statistical approaches with deep linguistic processing and will focus on “implicit” or lexically unexpressed linguistic elements that are nonetheless necessary for a c...
In order to show that a system for text understanding has produced a sound representation of the semantic and pragmatic contents of a story, it should be able to answer questions about the participants and the events occurring in the story. This requires processing linguistic descriptions which are lexically expressed but also unexpressed ones, a t...
In this paper we will focus on the notion of "implicit" or lexically unexpressed linguistic elements that are nonetheless necessary for a complete semantic interpretation of a text. We referred to "entities" and "events" because the recovery of the implicit material may affect all the modules of a system for semantic processing, from the grammatica...
In this paper we will present work carried out to scale up the system for text understanding called GETARUNS, and port it to be used in dialogue understanding. We will present the adjustments we made in order to cope with transcribed spoken dialogues like those produced in the ICSI Berkeley project. In a final section we present preliminary evaluati...
In this paper we will focus on the notion of “implicit” or lexically unexpressed linguistic elements that are nonetheless necessary for a complete semantic interpretation of a text. We refer to “entities” and “events” because the recovery of the implicit material may affect all the modules of a system for semantic processing, from the grammatically...
In this paper we propose a new approach to the description of Arabic morphology using 2-tape finite state transducers, based on a particular and systematic use of the operation of composition in a way that allows for incremental substitutions of concatenated lexical morpheme specifications with their surface realization for non-concatenative proces...
In this paper we will present a system for Question Answering called GETARUNS, in its deep version applicable to closed domains, that is to say domains for which the lexical semantics is fully specified and does not have to be induced. In addition, no ontology is needed: semantic relations are derived from linguistic relations encoded in the syntax...
We present a system for text understanding called GETARUNS, in its deep version applicable only to Closed Domains. We will present the low level component organized according to LFG theory. The system also does pronominal binding, quantifier raising and temporal interpretation. Then we will introduce the high level component where the Discourse Mod...
We present an unsupervised linguistically-based approach to discourse relations recognition, which uses publicly available resources like manually annotated corpora (Discourse Graph Bank, Penn Discourse TreeBank, RST-DT), as well as empirically derived data from “causally” annotated lexica like LCS, to produce a rule-based algorithm. In our approac...
We present VENSES, a linguistically-based approach for semantic inference which is built around a neat division of labour between two main components: a grammatically-driven subsystem which is responsible for the level of predicate-arguments well-formedness and works on the output of a deep parser that produces augmented head-dependency structures....
Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler. NEALT Proceedings Series, Vol. 1 (2007), 43-54. © 2007 The editors and contributors. Published by Northern European Association for Language Technology (NEALT), http://omilia.uio.no/nealt. Electronically pub...
Information Extraction, Summarization and Question Answering all manipulate natural language texts and should benefit from the use of NLP techniques. Statistical techniques have until now outperformed symbolic processing of unrestricted text. However, Information Extraction and Question Answering require far more accurate results than what is curre...
The system for semantic evaluation VENSES (Venice Semantic Evaluation System) is organized as a pipeline of two subsystems: the first is a reduced version of GETARUN, our system for Text Understanding. The output of the system is a flat list of augmented head-dependent structures with Grammatical Relations and Semantic Roles labels. The evaluation...
GREVAL, the test suite of 500 English sentences taken from SUSANNE Corpus and made available by John Carroll and Ted Briscoe at their website, has been used to test the performance of a symbolic linguistically-based parser called GETARUNS presented in (Delmonte, 2002). GETARUNS is a symbolic linguistically-based parser written in Prolog Horn clause...
Summarization and Question Answering need precise linguistic information with a much higher coverage than what is being offered by currently available statistically based systems. We assume that the starting point of any interesting application in these fields must necessarily be a good syntactic-semantic parser. In this paper we present the system...
The Italian Literacy Tutor (ILT) is a project designed to implement a program of individualized, computer-aided reading instruction with the potential to dramatically improve reading achievement and learning from text in Italian, especially considering L2 teaching/learning frameworks or working with children with reading disabilities. The Italian L...
Evaluating summaries is currently performed by the use of statistically-based tools which lack any linguistic knowledge and are unable to produce grammatical and semantic judgements (Landauer et al., 1997). However, summary evaluation needs precise linguistic information with a much finer-grained coverage than what is being offered by currently ava...
We present three applications which share part of their linguistic processor. The first application, "FILES" (Fully Integrated Linguistic Environment for Syntactic and Functional Annotation), is a fully integrated linguistic environment for syntactic and functional annotation of corpora currently being used for the Italian Treebank. The second app...