Maurizio Lenzerini · Sapienza University of Rome | la sapienza · Department of Computer, Automatic and Management Engineering "Antonio Ruberti"
Laurea in Electronic Engineering
Publications: 424
Reads: 72,529
Citations: 22,607
Publications (424)
In the context of healthcare, an AI solution is generally developed for a specific analysis task, based on a relevant dataset, with little attention to reusability and generalizability of its data preparation step. This paper focuses on a different scenario, which can be called context-oriented, where a set of clinical data sources, relevant for a...
Query answering for Knowledge Bases (KBs) amounts to extracting information from the various models of a KB, and presenting the user with an object that represents such information. In the vast majority of cases, this object consists of those tuples of constants that satisfy the query expression either in every model (certain answers) or in some mo...
The Italian Arthroplasty Registry (Registro Italiano ArtroProtesi, RIAP) is organized as a federation of regional registries, participating on a voluntary basis, with the purpose of collecting data to monitor joint prostheses safety and quickly recall patients in case of adverse events. Data collection flows may differ among the participating regions, therefo...
The Datalog query language can express several powerful recursive properties, often crucial in real-world scenarios. While answering such queries is feasible over relational databases, the picture changes dramatically when data is enriched with intensional knowledge. It is indeed well-known that answering Datalog queries is undecidable already over...
It is well-known that Artificial Intelligence (AI), and in particular Machine Learning (ML), is not effective without good data preparation, as also pointed out by the recent wave of data-centric AI. Data preparation is the process of gathering, transforming and cleaning raw data prior to processing and analysis. Since nowadays data often reside in...
Given two datasets, i.e., two sets of tuples of constants, representing positive and negative examples, logical separability is the reasoning task of finding a formula in a certain target query language that separates them. As already pointed out in previous works, this task turns out to be relevant in several application scenarios such as concept...
In Ontology-Based Data Management (OBDM), an abstraction of a source query q is a query over the ontology capturing the semantics of q in terms of the concepts and the relations available in the ontology. Since a perfect characterization of a source query may not exist, the notions of best sound and complete approximations of an abstraction have be...
The Associazione Medici Diabetologi (AMD) collects and manages one of the largest worldwide-available collections of diabetic patient records, also known as the AMD database. This paper presents the initial results of an ongoing project whose focus is the application of Artificial Intelligence and Machine Learning techniques for conceptualizing, cl...
A data ecosystem (DE) offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data governance and management, trustability is still affected by the absence of transparent and traceable data...
The Italian Arthroplasty Registry (Registro Italiano ArtroProtesi, RIAP) faces important data integration challenges in collecting a huge quantity of data from different sources. This paper discusses the introduction of both an ontology and a central relational database supporting the described process, focusing on the main advantages and benefits comi...
In this paper we introduce the notion of mapping-based knowledge base (MKB) to formalize the situation where both the extensional and the intensional level of the ontology are determined by suitable mappings to a set of (relational) data sources. This allows for making the intensional level of the ontology as dynamic as traditionally the extensiona...
Given an input dataset (i.e., a set of tuples), query definability in Ontology-based Data Management (OBDM) amounts to finding a query over the ontology whose certain answers coincide with the tuples in the given dataset. We refer to such a query as a characterization of the dataset with respect to the OBDM system. Our first contribution is to propose...
The use of virtual collections of data is often essential in several data and knowledge management tasks. In the literature, the standard way to define virtual data collections is via views, i.e., virtual relations defined using queries. In data and knowledge bases, the notion of views is a staple of data access, data integration and exchange, quer...
A Data Ecosystem offers a keystone-player or alliance-driven infrastructure that enables the interaction of different stakeholders and the resolution of interoperability issues among shared data. However, despite years of research in data governance and management, trustability is still affected by the absence of transparent and traceable data-driv...
Distributed information systems and applications are generally described in terms of components and interfaces among them. How these component-based architectures have been designed and implemented evolved over the years, giving rise to the so-called paradigm of Service-Oriented Computing (SOC). In this chapter, we will follow a 25-year-long journ...
OWL 2 QL is a standard profile of the OWL 2 ontology language, specifically tailored to Ontology-Based Data Management. Inspired by recent work on higher-order Description Logics, in this paper we present a new semantics for OWL 2 QL ontologies, called Metamodeling Semantics (MS), and show that, in contrast to the official Direct Semantics (DS) for...
The quantitative evaluation of research is currently carried out by means of indicators calculated on data extracted and integrated by analysts who elaborate them by creating illustrative tables and plots of results. In this approach, the robustness of the metrics used and the possibility for users of the metrics to intervene in the evaluation proc...
In Ontology-Based Data Access (OBDA), a domain ontology is linked to the data sources of an organization in order to query, integrate and manage data through the concepts and relations of the domain of interest, thus abstracting from the technical details of the data layer implementation. While the great majority of contributions in OBDA in the las...
Ontology-based data management (OBDM) is a powerful knowledge-oriented paradigm for managing data spread over multiple heterogeneous sources. In OBDM, the data sources of an information system are handled through the reconciled view provided by an ontology, i.e., the conceptualization of the underlying domain of interest expressed in some formal la...
In the context of the Description Logic DL-Liteℛ≠, i.e., DL-Liteℛ without UNA and with inequality axioms, we address the problem of adding to unions of conjunctive queries (UCQs) one of the simplest forms of negation, namely, inequality. It is well known that answering conjunctive queries with unrestricted inequalities over DL-Liteℛ ontologies is i...
Although current languages used in ontology-based data access (OBDA) systems allow for mapping source data to instances of concepts and relations in the ontology, several application domains need more flexible tools for inferring knowledge from data, which are able to dynamically acquire axioms about new concepts and relations directly from the dat...
We study the problem of associating formal semantic descriptions to data services. We base our proposal on the Ontology-based Data Access paradigm, where a domain ontology is used to provide a semantic layer mapped to the data sources of an organization. The basic idea is to explain the semantics of a data service in terms of a query over the ontol...
The issue of cooperation, integration, and coordination between information peers has been addressed over the years both in the context of the Semantic Web and in several other networked environments, including data integration, Peer-to-Peer and Grid computing, service-oriented computing, distributed agent systems, and collaborative data sharing. O...
Data interoperability refers to the issue of accessing and processing data from multiple sources in order to create more holistic and contextual information for improving data analysis, for better decision-making, and for accountability purposes. In the era towards a data-driven society, the notion of data interoperability is of paramount importanc...
An Ontology-based Data Access system is constituted by an ontology, namely a description of the concepts and the relations in a domain of interest, a database storing facts about the domain, and a mapping between the data and the ontology. In this paper, we consider ontologies expressed in the popular DL-Lite family of Description Logic, and we add...
While the amount of data stored in current information systems continuously grows, and the processes making use of such data become more and more complex, extracting knowledge and getting insights from these data, as well as governing both data and the associated processes, are still challenging tasks. The problem is complicated by the proliferatio...
Ontology-based data management (OBDM) is a recent paradigm for addressing data management based on a conceptualization of the domain of interest, called ontology. A system realizing the vision of OBDM is constituted by three layers: the ontology, that provides a high level, formal, logic-based representation of the above mentioned conceptualization...
While big data analytics is considered as one of the most important paths to competitive advantage of today’s enterprises, data scientists spend a comparatively large amount of time in the data preparation and data integration phase of a big data project. This shows that data integration is still a major challenge in IT applications. Over the past...
Metamodeling and metaquerying are gaining momentum in the context of both conceptual modeling and semantic web. Indeed it has been largely recognized that metamodeling represents a very useful tool to formalize complex patterns involving elements of the domain of interest, that otherwise are forced to be excluded from the modeling process, and a nu...
After years of focus on technologies for big data storing and processing, many observers are pointing out that making sense of big data cannot be done without suitable tools for conceptualizing, preparing, and integrating data (see http://www.dbta.com/). Research in the last years has shown that taking into account the semantics of data is crucial...
OWL 2 QL is the profile of OWL 2 targeted to Ontology-Based Data Access (OBDA) scenarios, where large amounts of data are to be accessed, and thus answering conjunctive queries over data is the main task. However, this task is quite restrained with respect to the classical KR Ask-and-Tell framework based on querying the whole theory, not only facts (data). If w...
We illustrate the usefulness of an Ontology-Based Data Management (OBDM) approach to develop an open information system, allowing for a deep level of interoperability among different databases, and accounting for additional dimensions of data quality compared to the standard dimensions of the OECD (Quality framework and guidelines for OECD statisti...
This paper proposes an Ontology-Based Data Management (OBDM) approach to coordinate, integrate and maintain the data needed for Science, Technology and Innovation (STI) policy development. The OBDM approach is a form of integration of information in which the global schema of data is substituted by the conceptual model of the domain, formally speci...
The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic and epigen...
Ontology-based data access (OBDA) is receiving great attention as a new paradigm for managing information systems through semantic technologies. According to this paradigm, a Description Logic ontology provides an abstract and formal representation of the domain of interest to the information system, and is used as a sophisticated schema for access...
Hi(DL-Lite_R) is a higher-order Description Logic obtained from DL-Lite_R by adding metamodeling features, and is equipped with a query language that is able to express higher-order queries. We investigate the problem of answering a particular class of such queries, called instance higher-order queries, posed over Hi(DL-Lite_R) knowledge ba...
Ontology-based data access (OBDA) is a new paradigm aiming at accessing and managing data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. In the last years, this new paradigm has been used for providing users with abstract (independent from technological and system-oriented...
Ontology-based data access (OBDA) is a new paradigm aiming at accessing and managing data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. In the last years, this new paradigm has been used for providing users with suitable mechanisms for querying the data residing at the inf...
In this paper we present an ontology-based data management (OBDM) project concerning the Italian public debt domain, carried out within a joint collaboration between Sapienza University of Rome and the Department of Treasury of the Italian Ministry of Economy and Finance. We discuss the motivations at the basis of this project and present the main...
This chapter proposes steps towards the solution to the data access problem that end-users typically face when dealing with Big Data.
A schema mapping is a formal specification of the relationship holding between the databases conforming to two given schemas, called source and target, respectively. While in the general case a schema mapping is specified in terms of assertions relating two queries in some given language, various simplified forms of mappings, in particular LAV and...
Ontology-based data access (OBDA) is a novel paradigm for accessing large data repositories through an ontology, that is a formal description of a domain of interest. Supporting the management of OBDA applications poses new challenges, as it requires to provide effective tools for (i) allowing both expert and non-expert users to analyze the OBDA sp...
Ontology-based data access (OBDA) is a novel paradigm for accessing large data repositories through an ontology, that is a formal description of a domain of interest. Supporting the management of OBDA applications poses new challenges, as it requires to provide effective tools for (i) allowing both expert and non-expert users to analyze the OBDA s...
The work "Data Integration under Integrity Constraints", published at the CAiSE 2002 Conference, proposes a rewriting technique for answering queries in data integration systems, when the global schema contains the classical key and foreign key constraints, and the mapping between the data sources and the global schema is of the global-as-view type...
In ontology-based data access (OBDA), an ontology is connected to autonomous, and generally pre-existing, data repositories through mappings, so as to provide a high-level, conceptual view over such data. User queries are posed over the ontology, and answers are computed by reasoning both on the ontology and the mappings. Query answering in OBDA sy...
Ontology-based data access (OBDA) is a new paradigm for managing complex information systems through semantic technologies. In this paper we present what we believe is the first industrial experience of a comprehensive OBDA project, developed jointly by Sapienza University of Rome and the Department of Treasury of the Italian Ministry of Econom...
Artificial Intelligence technologies are increasingly used within software systems ranging from Web services to mobile applications. It is undoubtedly true that the more AI algorithms and methods are used, the more they tend to depart from a pure "AI" spirit and end up belonging to the sphere of standard software. In a sense, AI seems strongly conn...
Several recent techniques and tools for Ontology-based Data Access (OBDA) make use of the so-called extensional constraints (a.k.a. ABox dependencies). So far, extensional constraints have been mainly considered in a setting where data are represented in an ABox, instead of external data sources connected to the ontology through declarative mapping...
Schema mappings establish a correspondence between data stored in two databases, called source and target respectively. Query processing under schema mappings has been investigated extensively in the two cases where each target atom is mapped to a query over the source (called GAV, global-as-view), and where each source atom is mapped to a query ov...
In this paper we present the current version of Mastro, a system for ontology-based data access (OBDA) developed at Sapienza Università di Roma. Mastro allows users to access external data sources by querying an ontology expressed in a fragment of the W3C Web Ontology Language (OWL). As in data integration [5], mappings are used in OBDA to speci...
Finding an appropriate semantics for the task of updating an inconsistent knowledge base is a challenging problem. In this paper, we consider knowledge bases expressed in Description Logics, and focus on ABox inconsistencies, i.e., the case where the TBox is consistent, but the whole knowledge base is not. Our first contribution is the definition of...
View-based query answering is the problem of answering a query based only on the precomputed answers to a set of views. While this problem has been widely investigated in databases, it is largely unexplored in the context of Description Logic ontologies. Differently from traditional databases, Description Logics may express several forms of incompl...
In this paper we present Mastro, a Java tool for ontology-based data access (OBDA) developed at Sapienza Università di Roma. Mastro manages OBDA systems in which the ontology is specified in a logic of the DL-Lite family of Description Logics specifically tailored to ontology-based data access, and is connected to external data management systems t...
In this paper we introduce the notion of mapping-based knowledge base (MKB) to formalize the situation where both the extensional and the intensional level of the ontology are determined by suitable mappings to a set of (relational) data sources. This allows for making the intensional level of the ontology as dynamic as traditionally the extensiona...
This paper is motivated by two requirements arising in practical applications of ontology-based data access (OBDA): the need of inconsistency-tolerant semantics, which allow for dealing with classically inconsistent specifications; and the need of expressing assertions which go beyond the expressive abilities of traditional Description Logics, na...
Ontology-based data management aims at accessing and using data by means of an ontology, i.e., a conceptual representation of the domain of interest in the underlying information system. This new paradigm provides several interesting features, many of which have been already proved effective in managing complex information systems. On the other han...
In this paper we study the problem of obtaining meaningful answers to queries posed over inconsistent DL-Lite ontologies. We consider different variants of inconsistency-tolerant semantics and show that for some of such variants answering unions of conjunctive queries (UCQs) is first-order (FOL) rewritable, i.e., it can be reduced to standard eva...
We investigate an extension of Description Logics (DL) with higher-order capabilities, based on Henkin-style semantics. Our study starts from the observation that the various possibilities of adding higher-order constructs to a DL form a spectrum of increasing expressive power, including domain metamodeling, i.e., using concepts and roles as pr...
The article aims at establishing a logical approach to class-based data modeling. After a discussion on class-based formalisms for data modeling, we introduce a family of logics, called Description Logics, which stem from research on Knowledge Representation in Artificial Intelligence. The logics of this family are particularly well suited for spec...
While classic data management focuses on the data itself, research on Business Processes considers also the context in which this data is generated and manipulated, namely the processes, the users, and the goals that this data serves. This allows ...
Recent papers address the issue of updating the instance level of knowledge bases expressed in Description Logic following a model-based approach. One of the outcomes of these papers is that the result of updating a knowledge base K is generally not expressible in the Description Logic used to express K. In this paper we introduce a formula-based...
A schema mapping is a formal specification of the relationship holding between the databases conforming to two given schemas, called source and target, respectively. While in the general case a schema mapping is specified in terms of assertions relating two queries in some given language, various simplified forms of mappings, in particular LAV and...
The notion of class is ubiquitous in computer science and is central in many formalisms for the representation of structured knowledge used both in knowledge representation and in databases. In this paper we study the basic issues underlying such representation formalisms and single out both their common characteristics and their distinguishing fea...
Recent papers address the issue of updating the instance level of knowledge bases expressed in Description Logic following a model-based approach. One of the outcomes of these papers is that the result of updating a knowledge base K is generally not expressible in the Description Logic used to express K. In this paper we introduce a formula-based a...
Keyword search is a friendly mechanism for users to identify desired information in XML databases, and LCA is a popular concept for locating the meaningful subtrees corresponding to query keywords. Among all the LCA-based approaches, MaxMatch ...
In this paper we present MASTRO, a Java tool for ontology-based data access (OBDA) developed at Sapienza Università di Roma and at the Free University of Bozen-Bolzano. MASTRO manages OBDA systems in which the ontology is specified in DL-Lite_{A,id}, a logic of the DL-Lite family of tractable Description Logics specifically tailored to ontology-based...
In this paper we introduce the notion of mapping-based knowledge base (MKB), to formalize those ontology-based data access (OBDA) scenarios where both the extensional and the intensional level of the ontology are determined by suitable mapping assertions involving the data sources. We study reasoning over MKBs in the context of Hi(DL-LiteR), a high...
We investigate an extension of Description Logics (DL) with higher-order capabilities, based on Henkin-style semantics. Our study starts from the observation that the various possibilities of adding higher-order constructs to a DL form a spectrum of increasing expressive power, including domain metamodeling, i.e., using concepts and roles as predic...
We aim at reasoning about actions and about high-level programs over knowledge bases (KBs) expressed in Description Logics (DLs). This is a critical issue that has resisted good, robust solutions for a long time. In particular, while well-developed theories of actions and high-level programs exist in AI, e.g., the ones based on the Situation Calculu...
The study of node-selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or monadic second-order lo...
One of the outcomes of the research work carried out on data integration in the last years is a clear architecture, comprising a global schema, the source schema and the mapping between the source and the global schema. In this chapter, we study data integration under this framework when the global schema is specified in OWL, the standard language...
We address the problem of dealing with inconsistencies in Description Logic (DL) knowledge bases. Our general goal is both to study DL semantical frameworks which are inconsistency-tolerant, and to devise techniques for answering unions of conjunctive queries posed to DL knowledge bases under such inconsistency-tolerant semantics. Our work is ins...
The study of node-selection query languages for (finite) trees has been a major topic in the recent research on query languages for Web documents. On one hand, there has been an extensive study of XPath and its various extensions. On the other hand, query languages based on classical logics, such as first-order logic (FO) or monadic second-order...
In this paper we consider a powerful mechanism, called Regular XPath, for expressing queries and constraints over XML data, including DTDs and existential path constraints and their negation. Regular XPath extends XPath with binary relations over XML nodes specified by means of two-way regular path queries. Our first contribution deals with checking s...
In data management, and in particular in data integration, data exchange, query optimization, and data privacy, the notion of view plays a central role. In several contexts, such as data integration, data mashups, and data warehousing, the need arises of designing views starting from a set of known correspondences between queries over different sch...
This paper introduces the general features of Senso Comune, an open knowledge base for the Italian language, focusing on the interplay of lexical and ontological knowledge, and outlining our approach to conceptual knowledge elicitation. Senso Comune consists of a machine-readable lexicon constrained by an ontological infrastructure. The idea at the...
We report on an experimentation of Ontology-based Data Access (OBDA) carried out in a joint project with SAPIENZA University of Rome, Free University of Bolzano, and Monte dei Paschi di Siena (MPS), where we used MASTRO for accessing, by means of an ontology, a set of data sources of the actual MPS data repository. By both looking at these sources,...
Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that ...
Research about ontology access, processing, and usage paves the way for realizing important tasks in future applications requiring well-understood formal representation formalisms as well as efficient and industrial-strength implementations. In this report, we summarize the state of the art for most important application tasks of this kind that use...
Several areas of research and various application domains have been concerned in the last years with the problem of dealing with incomplete databases. Data integration as well as the Semantic Web are notable examples. Surprisingly, while many research efforts have been focusing on several interesting issues related to incomplete databases, as query...
A Description Logic (DL) ontology is constituted by two components, a TBox that expresses general knowledge about the concepts and their relationships, and an ABox that describes the properties of individuals that are instances of concepts. We address the problem of how to deal with changes to a DL ontology, when these changes affect only the ABox,...
In this paper we present Regular XPath (RXPath), which is a natural extension of XPath with regular expressions over paths that has the same computational properties as XPath: linear-time query evaluation and exponential-time reasoning. To establish these results, we devise a unifying automata-theoretic framework based on two-way weak alternating t...
The goal of data integration is to provide a uniform access to a set of heterogeneous data sources, freeing the user from the knowledge about where the data are, how they are stored, and how they can be accessed. One of the outcomes of the research work carried out on data integration in the last years is a clear architecture, comprising a global s...