Carlo Batini

Università degli Studi di Milano-Bicocca | UNIMIB · Department of Computer Science, Systems and Communications

About

253
Publications
87,552
Reads
8,065
Citations
Citations since 2016
35 Research Items
2213 Citations
[Chart: yearly figures for 2016 to 2022; axis scale 0 to 350]

Publications (253)
Article
Full-text available
In this paper, we address the problem of assessing the social value of open data. While the number of open data initiatives increases and many data sets are currently available to lay people, common citizens and end users, a still limited number of studies specifically address how to improve open data understandability, their usability by common us...
Article
Full-text available
The importance of data quality assessment has significantly increased with the boom of information technology and the growing demand for remote sensing (RS) data. The Remote Sensing Data Quality Working Group of the International Society for Photogrammetry and Remote Sensing aimed to conduct an investigation on the principles of data quality. Liter...
Chapter
This book collects some of the best contributions to the 14th conference of the Italian Chapter of AIS (ItAIS), which was held at the University of Milano-Bicocca, in Milano, on October the 6th and the 7th. ItAIS is an established forum for scholars, researchers and practitioners involved in the Information Systems (IS) field and akin scholarly dis...
Book
This book argues that “organizing” is a broader term than managing, as it entails understanding how people and machines interact with each other; how resources, data, goods are exchanged in complex and intertwined value chains; and how lines of action and activities can be articulated using flexible protocols and often ad-hoc processes in situated...
Article
Full-text available
Our rapidly changing world requires new sources of image-based information. The maintenance and management of quickly changing urban areas and smart cities cannot rely only on traditional techniques based on remotely sensed data; new and progressive techniques must also be involved. Among these technologies the volunteer based solutions are ge...
Article
The article discusses a model for information value assessment based on the concepts of information capacity, information utility, and information management costs. Notwithstanding that both state-of-the-art researchers and practitioners consider information as a fundamental asset, there is actually no consensus on what are the determinants of info...
Chapter
In the last 25 years my life has changed a lot. In the paper I describe the evolution of my research activities, and to some extent the change in my favorite hobby, that is climbing mountains, and in my life, from Rome to Milan. Moving from Rome to Milan was initially a bit traumatic, but in the end I made it. In the story, I also mention the influence in...
Article
Full-text available
The availability and accessibility of remote sensing (RS) data, cloud processing platforms and provided information products and services has increased the size and diversity of the RS user community. This development also generates a need for validation approaches to assess data quality. Validation approaches employ quality criteria in their asses...
Chapter
This book, dedicated to Antoni Olivé, will be presented to him at the 36th International Conference on Conceptual Modeling; this tells us that conceptual modeling was established as a research field about 40 years ago, when seminal works on conceptual modeling were published. Our research career also started about 40 years ago, and modeling has been...
Article
Full-text available
The issue of data quality (DQ) is of growing importance in Remote Sensing (RS), due to the widespread use of digital services (incl. apps) that exploit remote sensing data. In this position paper a body of experts from the ISPRS Intercommission working group III/IVb “DQ” identifies, categorises and reasons about issues that are considered as crucia...
Article
Infographics are a common visual means to inform users. This paper investigates how lay people of different age, gender and educational background perceive the use of infographics for information visualization in daily tasks. We chose three topics of general interest: weather, study and work, and three infographics, one for each topic. We administe...
Article
Nowadays, repositories of services are becoming increasingly useful in the management of many public and private service provider organizations. In order to make a repository an integrated representation of all services delivered in an organization, a unified representation is desirable. Since several repositories of services, each potentially char...
Article
The article investigates the potential role of conceptual modeling for policymaking. It argues that the use of conceptual schemas may provide an effective understanding of public sector information assets, and how they might be used to satisfy the needs of constituencies, thus having a public as well as social value. The article first defines the i...
Chapter
The Search query “data quality” entered into Google returns about three million pages, and searching similarly for the term “information quality” (IQ) returns about one and a half million pages, both frequencies showing the increasing importance of data and information quality. The goal of this chapter is to show and discuss the perspectives that m...
Chapter
In Chap. 2, we have considered quality dimensions for structured data. In this chapter, we move from data quality dimensions to information quality dimensions. We will consider two coordinates for the types of information, respectively, the perceptual coordinate and the linguistic coordinate. From one side, we will explore how dimensions change acc...
Chapter
In this chapter we address object identification (IQ), the most important and the most extensively investigated information quality activity. Given its importance, we decided to dedicate two chapters of the book to object identification, this chapter focusing on consolidated techniques and the next one on recent advancements.
Chapter
In the previous chapters, we introduced several dimensions that are useful to describe and measure information quality in its different aspects and meanings. Focusing on structured data, database management systems (DBMSs) represent data and relative operations on it in terms of a data model and a data definition and manipulation language, i.e., a...
Chapter
In Chap. 1 we noticed that information quality is a multifaceted concept, and the cleaning of poor quality information can be performed by measuring different dimensions and setting out several different activities, with various goals. An information quality activity is any process we perform directly on information to improve their quality. An exa...
Chapter
In distributed environments, data sources are typically characterized by various kinds of heterogeneities that can be generally classified into (1) technological heterogeneities, (2) schema heterogeneities, and (3) instance-level heterogeneities. Technological heterogeneities are due to the use of products by different vendors, employed at various...
Chapter
We have seen in the Preface that the amount of information exchanged on the Web doubles every year and a half. Besides the Web, to get a complete picture of the multitude of information used every day, we have to consider the information managed in information systems of organizations, the information exchanged by organizations, and the informati...
Chapter
In Chap. 1, we provided an intuitive concept of information quality and we informally introduced several data quality dimensions, such as accuracy, completeness, currency, and consistency.
Chapter
An image is the result of the optical imaging process which maps physical scene properties onto a two-dimensional luminance distribution; it encodes important and useful information about the geometry of the scene and the properties of the objects located within this scene [339, 611, 687].
Chapter
Measuring and improving information quality in a single organization or in a set of cooperating organizations is a complex task. In previous chapters, we discussed relevant activities for improving information quality (Chap. 7) and corresponding techniques (Chaps. 7–10). Several methodologies have been developed in the last few years that provide a...
Chapter
In this chapter, we will shortly frame information quality in healthcare as a matter of study or concern. Being aware that such a vast topic cannot be covered in one single book chapter, here we will at least orient interested readers to resources that could be consulted to get further information on this broad field of study and practice. To this...
Chapter
Full-text available
The increasing diffusion of linked data as a standard way to share knowledge on the Web allows users and public and private organizations to fully exploit structured data from very large datasets that were not available in the past. Over the last few years, linked data developed into a large number of datasets with an open access from several domai...
Chapter
Research on object identification has produced several significant results in recent years, in different areas of computer science. As observed in [140], it is well known that in data mining projects, a large proportion of effort (20–30 % reported in [566]) is spent on understanding data and 50–70 % on data preparation. Governmental organ...
Article
Full-text available
In this paper, we discuss the application of the concept of data quality to big data, highlighting how complex it is to define in a general way. Data quality is already a multidimensional concept, difficult to characterize in precise definitions even in the case of well-structured data. Big data add two further dimensions of complexity: (i) bein...
Chapter
This chapter investigates the evolution of data quality issues from traditional structured data managed in relational databases to Big Data. In particular, the paper examines the nature of the relationship between Data Quality and several research coordinates that are relevant in Big Data, such as the variety of data types, data sources and applica...
Article
In this paper, we propose an original methodology by which to assess the construct of the situated social value of open data and we apply it to the healthcare domain in regard to information by which hospitals can be ranked to compare service providers. Our methodology encompasses a questionnaire-based user study and a method by which to rank infor...
Presentation
Linked data are becoming one of the most widely adopted models for publishing data on the Web. Thanks to the possibility of connecting different datasets by means of RDF features, linked data are well suited to fully exploiting the nature of the Web. Even if a lot of tools are available supporting the publication of linked data, in the literature there is a la...
Conference Paper
In a previous paper, we have investigated the different dimensions of a classificatory framework suitable to support the assessment and benchmarking of the social value of open data initiatives. In this paper, we propose a methodology that compares and evaluates open data social value, and we apply it to the specific domain of hospital care. Throug...
Conference Paper
Nowadays services are rapidly evolving into coarse-grained composite services through mashup technology at both the business and IT levels. It is quite important for both customers and composite service providers (called "broker") to measure whether the offered services would satisfy customers' demands. However, recent researchers use only functional aspe...
Article
The paper discusses a framework for managing and evaluating ICT-enabled service portfolios along the service design phase. The framework adopts a service reuse perspective and it is made up of i) a model for the representation of a repository of services, ii) a model for the definition of a service portfolio representing current production lines of...
Article
When tens and even hundreds of schemas are involved in the integration process, criteria are needed for choosing clusters of schemas to be integrated, so as to deal with the integration problem through an efficient iterative process. Schemas in clusters should be chosen according to cohesion and coupling criteria that are based on similarities and...
Article
Full-text available
This article investigates the evolution of data quality issues from traditional structured data managed in relational databases to Big Data. In particular, the paper examines the nature of the relationship between Data Quality and several research coordinates that are relevant in Big Data, such as the variety of data types, data sources and applica...
Conference Paper
The paper discusses the concept of value of integration in two apparently distant research domains, the database and service domains. In the area of virtual database integration, we address the problem of increased utility in an organization resulting from adopting data integration architectures, so as to be able to query an integrated database schema...
Article
Full-text available
Open data initiatives are characterized, in several countries, by a great extension of the number of data sets made available for access by public administrations, constituencies, businesses and other actors, such as journalists, international institutions and academics, to mention a few. However, most of the open data sets rely on selection criter...
Chapter
Full-text available
In this paper we discuss the main issues considered in the literature on information quality and several factors influencing these issues. The main goal of the paper is exploratory, aiming to identify key topics characterizing recent information quality research and their impact on future research perspectives in a context where information is incr...
Conference Paper
Full-text available
Categorization of instances in dataspaces is a difficult and time consuming task, usually performed by domain experts. In this paper we propose a semi-automatic approach to the extraction of facets for the fine-grained categorization of instances in dataspaces. We focus on the case where instances are categorized under heterogeneous taxonomies in s...
Chapter
This chapter discusses the case of ongoing experiences in the field of Service Science. The experiences refer to the Italian SMART project producing a homonymous methodology, briefly described together with a discussion of its application as an instrument for developing and structuring initiatives in Service Science education and in the design of s...
Chapter
Full-text available
Information growth makes the understanding of the value of digital information assets a key issue to information systems management. To this end, the paper discusses the results of the reconstruction of a multidisciplinary literature, addressing information value or some of the related concepts or drivers. The analysis allows identifying characteri...
Article
Open government is emerging as a core issue for increasing, on the one hand, participation of citizens, and, on the other hand, accountability, transparency, and the capability of delivering digital services by Public Administrations, with a consequent interest in public and social value as final outcomes. However, most of the open government ini...
Chapter
In this chapter, we discuss the case of the instantiation and development of a methodology for e-Government initiatives design and planning in the specific context of Mediterranean Countries. The methodology aims to support the definition of strategy implementation roadmaps that consider the fitting of e-Government vision principles, policies and t...
Article
Full-text available
In the service development life cycle it is worthwhile to distinguish between a conceptual phase, that leads to model abstract services, and a production phase that produces concrete services. Both abstract and concrete services produced by a provider organization can be organized, for reuse purposes, in structured repositories of services. As occu...
Conference Paper
PoliMaR-Web provides experts and ordinary Web users with a tool to discover suitable Web APIs among the ones published in repositories. Given a set of constraints, either soft or hard, semantic descriptions are extracted from repositories and heterogeneous sources available on the Web, and then matchmade to deliver a personalized ranked list of API...
Conference Paper
Information growth requires Public Administrations to exert effective control over their information assets. Furthermore, having a global representation of the core concepts of such assets implies managing large sets of conceptual schemas. At the state of the art, the use of repositories of conceptual schemas aims to provide a structured, global and sca...
Article
The composition of Web APIs provides a great opportunity to Web engineers that can reuse existing software components available on the Web. Finding the best API, fulfilling a set of user requirements, among the many described on the Web is a key step in order to develop an effective Web application; however, Web engineers have little support in sol...
Conference Paper
The tutorial aims to introduce the complexity of strategic planning of information systems (IS) for eGovernment services in countries characterized by different cultures, levels of technological readiness, and legal frameworks. Another goal is to raise awareness of the role of IS modelling in strategic planning in its interaction with socio-economic, organ...
Article
Full-text available
We present a Heterogenous Data Quality Methodology (HDQM) for Data Quality (DQ) assessment and improvement that considers all types of data managed in an organization, namely structured data represented in databases, semistructured data usually represented in XML, and unstructured data represented in documents. We also define a meta-model in order to...
Conference Paper
Full-text available
Traditional data quality engineering techniques, often used and deployed within a single enterprise environment, are inadequate to cope with the rapid change of data, with a multitude of quality degrees, to be used in contemporary business models. The emerging cloud computing paradigm could potentially offer high-quality, composable data and techni...
Article
Repositories of conceptual schemas have been proposed in literature to represent data managed in complex large scale information systems. In this paper we discuss concepts, constructs, methods and quality dimensions for repositories of conceptual schemas. In particular, the paper considers the two main state of the art paradigms to organize huge am...
Article
In recent years research on eGovernment has grown rapidly at both the quantitative and qualitative levels [104]. Besides this rapid development both practitioners and scholars have considered the results of solutions and initiatives deployed in the last 10 years in the different countries involved in eGovernment programs, showing that these latter...
Article
Conceiving ICT projects is a highly creative activity, for which experience of previous projects is needed, together with an in-depth knowledge of available ICTs. However, at a macro-level the most relevant choices can also be conducted and/or understood by PA managers lacking specific ICT skills. This characteristic of operational planning is o...
Article
Strategic planning is the most relevant phase of the eGovernment information system life cycle for achieving a clear understanding of the alignment between the political vision, the context of intervention, and the actual ICT goals, architectures, and infrastructures.
Article
This chapter focuses on several initiatives carried out in Europe and in Italy in recent years; such initiatives are analyzed from a socio-economic point of view, providing the quantitative measures that lie behind the decisions previously described in the book.
Article
This chapter presents some guidelines on how to specify a new administrative process (composite external service in the terminology of Chap. 10) in the eG4M methodology on the assumption of using a service-oriented architecture (SOA) and related engineering approaches, techniques, and tools. In particular, the guidelines address the case in which a...
Article
This chapter presents a possible reference technological architecture for eGovernment projects, based on the service-oriented architecture (SOA) paradigm and related technologies and approaches. Such a reference architecture leverages the Italian experience called SPCoop (in Italian Sistema Pubblico di Cooperazione, possible English translation as...
Article
In this chapter we discuss the step of the methodology which supports the definition of priority services and their quality value targets. As for the overall methodology, the general idea is that the choice should be driven by a clear understanding of
Article
As we said in the preface, the eG4M methodology has been applied to eGovernment projects in the Mediterranean area. In this chapter we focus on the data governance part of the methodology that has been applied in the Tunisian Ministry of Agriculture, in parallel with running initiatives on the reorganization of databases managed in the different ad...
Article
The eG4M framework differs from traditional technology-driven approaches to eGovernment, considering both how ICTs affect organizations and how the social context and the organizations influence the use of technologies. Indeed, the conceptual framework underlying the methodology is based mainly on neo-institutionalism [80, 231] and social construc...
Chapter
Data and information are the fundamental resources managed by public administrations to provide services to users. So, the reader should not be surprised that at the beginning of the book we focus on the part of the eG4M methodology that deals with data. To give an example, in the Italian Central Public Administration more than 500 large-sized data...
Chapter
Quality is a first-class citizen in eG4M. Users of services do not want to lose time in their interactions with administrations, they do not want to be forced to provide information which is already present in public administration databases and they do not want to be bothered by inefficiencies and errors in administrative processes. Figures say that...
Chapter
Strategic planning of initiatives in eGovernment may effectively be performed only when adequate knowledge is available on all the facets introduced in previous chapters. Such knowledge is usually fragmented and dispersed among PA offices and sometimes among PA officers, who seldom document the information related to laws, organization, and other i...
Conference Paper
Full-text available
A conceptual framework for the automatic discovery of dependencies between data quality dimensions is described. Dependency discovery consists in recovering the dependency structure for a set of data quality dimensions measured on attributes of a database. This task is accomplished through the data mining methodology, by learning a Bayesian Network...
Conference Paper
In medium-sized and large enterprises it is quite typical that the database architecture is defined through a sequence of projects and realizations that result in a number of different and sometimes overlapping data sources. This trend is worsened by merger and acquisition activities that add to the existing data architecture new data sources from external organization...
Conference Paper
Integrated repositories of conceptual schemas provide organizations dealing with a large amount of data sources with an integrated view on the information they manage. Making schema repositories compliant with the Web and with the Web knowledge technologies has become crucial today. In this paper we present an approach to conceptual metadata manage...
Book
The success of public sector investment in eGovernment initiatives strongly depends on effectively exploiting all aspects of ICT systems and infrastructures. The related objectives are hardly reachable without methodological frameworks that provide a holistic perspective and knowledge on the contexts of eGovernment initiatives. Yet public administr...
Article
In large organizations the database architecture is typically built through a series of projects and realizations that result in a number of heterogeneous and overlapping data sources. This trend is worsened by merger and acquisition activities that add new data sources from external organizations to the existing data architecture. Data fragmentati...
Conference Paper
Planning activities are a relevant instrument to carry out sustainable and valuable eGovernment initiatives. The set of expertise needed for the design of eGovernment systems ranges from social to legal, economic, organizational, and technological perspectives, which have to be faced in a unique framework. The aim of the eG4M framework is to bring...
Article
In this paper we discuss the GovQual methodology for planning eGovernment initiatives in public administrations. In particular, the paper describes an application of the GovQual methodology for information systems integration at the Tunisian Ministry of agriculture and hydraulic resources. The key elements of the methodology are the multidisciplina...
Chapter