Simone Marchi

Italian National Research Council | CNR · Institute of Computational Linguistics "Antonio Zampolli" ILC

MD

About

27 Publications
3,615 Reads
179 Citations
Education
September 1993 - May 2000
Università di Pisa
Field of study
  • Computer Science

Publications (27)
Chapter
This work describes the first experiments conducted with a computational lexicon of Italian in a context of query expansion for full-text search. An application, composed of a graphical user interface and backend services to access the lexicon and the database containing the corpus to be queried, was developed. The text was morphologically analysed...
Article
The purpose of this research is to build an ontology of the masters appearing in the Babylonian Talmud (BT). The ontology built so far has been shared as Linked Open Data and it will be linked to existing vocabularies. This work has been developed in the context of the Babylonian Talmud Translation Project, where more than eighty Talmudists are w...
Chapter
This paper introduces the research in Part-Of-Speech tagging of Mishnaic Hebrew carried out within the Babylonian Talmud Translation Project. Since no tagged resource was available to train a stochastic POS tagger, a portion of the Mishna of the Babylonian Talmud has been morphologically annotated using an ad hoc developed tool connected with the D...
Conference Paper
Full-text available
The Digital Humanities community is becoming increasingly inclusive, not only towards research groups in the computational field, but also towards communities that practise the humanities with non-digital methods, with which communication was lacking in the past. Thanks to this renewed dialogue, it is necess...
Poster
Full-text available
One of the main challenges of the DH community is to provide suitable software models and tools. To model the literary domain and the relative user requirements, we chose to follow the engineering principles of object-oriented analysis and design. The digital representation of a textual resource is a challenge as it involves several theoretical and...
Conference Paper
The paper describes a philological-computational tool developed by the Istituto di Linguistica Computazionale of Pisa, aimed at creating a digital edition of Ferdinand de Saussure’s unpublished manuscripts. Since the use of a digital edition and of the most modern computer technology allows more in-depth research, the ILC is developing a set of d...
Chapter
The domain adaptation task was aimed at investigating techniques for adapting state-of-the-art dependency parsing systems to new domains. Both the language dealt with, i.e. Italian, and the target domain, namely the legal domain, represent two main novelties of the task organised at Evalita 2011 with respect to previous domain adaptation initiative...
Article
Full-text available
The study analyzed the writing products of subjects with high (highs) and low (lows) hypnotizability. The participants were asked to write short texts in response to highly imaginative scenarios in standard conditions. The texts were processed through computerized and manual methods. The results showed that the highs' texts were more sophisticated...
Article
Full-text available
Due to the rapidly expanding body of biomedical literature, biologists require increasingly sophisticated and efficient systems to help them to search for relevant information. Such systems should account for the multiple written variants used to represent biomedical concepts, and allow the user to search for specific pieces of knowledge (or events...
Conference Paper
Full-text available
The extraction of information from texts requires resources that contain both syntactic and semantic properties of lexical units. As the use of language in specialized domains, such as biology, can be very different to the general domain, there is a need for domain-specific resources to ensure that the information extracted is as accurate as possib...
Conference Paper
Full-text available
The paper describes a system for the automatic consolidation of Italian legislative texts to be used as a support of an editorial consolidating activity and dealing with the following typology of textual amendments: repeal, substitution and integration. The focus of the paper is on the semantic analysis of the textual amendment provisions and the f...
Conference Paper
Full-text available
Semantic annotation of text requires the dynamic merging of linguistically structured information and a "world model", usually represented as a domain-specific ontology. On the other hand, the process of engineering a domain ontology through semi-automatic ontology learning system requires the availability of a considerable amount of semantically...
Conference Paper
Full-text available
We describe here a methodology to combine two different techniques for Semantic Relation Extraction from texts. On the one hand, generic lexico-syntactic patterns are applied to the linguistically analyzed corpus to detect a first set of pairs of co-occurring words, possibly involved in "syntagmatic" relations. On the other hand, a statistical uns...
Conference Paper
Full-text available
The demand for efficient methods for extracting knowledge from multimedia content has led to a growing research community investigating the convergence of multimedia and knowledge technologies. In this paper we describe a methodology for extracting multimedia information from product catalogues empowered by the synergetic use and extension of a dom...
Article
Full-text available
In this paper we present an original approach to natural language query interpretation which has been implemented within the FuLL (Fuzzy Logic and Language) Italian project of BC S.r.l. In particular, we discuss here the creation of linguistic and ontological resources, together with the exploitation of existing ones, for natural language-driven da...
Article
Full-text available
The paper reports on methodology and preliminary results of a case study in automatically extracting ontological knowledge from Italian legislative texts in the environmental domain. We use a fully-implemented ontology learning system (T2K) that includes a battery of tools for Natural Language Processing (NLP), statistical text analysis and machine...
Article
The paper focuses on the automatic extraction of domain knowledge from Italian legal texts and presents a fully-implemented ontology learning system (T2K, Text-2-Knowledge) that includes a battery of tools for Natural Language Processing, statistical text analysis and machine learning. Evaluated results show the considerable potential of systems li...
