Alessandro Bozzon

Delft University of Technology | TU · Faculty of Electrical Engineering, Mathematics and Computer Sciences (EEMCS)

Assistant Professor

About

223
Publications
62,485
Reads
3,356
Citations
Additional affiliations
January 2009 - January 2013
Politecnico di Milano
Education
January 2006 - December 2008
Politecnico di Milano
October 2003 - October 2005
Politecnico di Milano

Publications (223)
Article
Full-text available
Machine learning (ML) training data is often scattered across disparate collections of datasets, called data silos. This fragmentation poses a major challenge for data-intensive ML applications: integrating and transforming data residing in different sources demands a lot of manual work and computational resources. With data privacy constraints,...
Article
Work on value alignment aims to ensure that human values are respected by AI systems. However, existing approaches tend to rely on universal framings of human values that obscure the question of which values the systems should capture and align with, given the variety of operational situations. This often results in AI systems that privilege only a...
Preprint
Full-text available
In the shift towards human-centered manufacturing, our two-year longitudinal study investigates the real-world impact of deploying Cognitive Assistants (CAs) in factories. The CAs were designed to facilitate knowledge sharing among factory operators. Our investigation focused on smartphone-based voice assistants and LLM-powered chatbots, examining...
Article
Full-text available
Digitally-supported participatory methods are often used in policy-making to develop inclusive policies by collecting and integrating citizens' opinions. However, these methods fail to capture the complexity and nuances in citizens' needs, i.e., citizens are generally unaware of others' needs, perspectives, and experiences. Consequently, policies d...
Article
Full-text available
The configuration of public open spaces plays a crucial role in shaping how different people use them. Nevertheless, our understanding of how the physical features of public open spaces influence the activities conducted within them, and the extent to which this impact differs across various individuals and population groups, is currently limited....
Conference Paper
Full-text available
Sustained adoption of automation is a problem for organizations, despite the promised benefits of automation and the propensity for organizations to expect it to transform their workplaces. To address this problem, previous work in HCI has mostly considered the perspectives and experiences of users interacting with automation technologies and has n...
Chapter
Web search has evolved into a platform people rely on for opinion formation on debated topics. Yet, pursuing this search intent can carry serious consequences for individuals and society and involves a high risk of biases. We argue that web search can and should empower users to form opinions responsibly and that the information retrieval community...
Article
Full-text available
The study of urban greenspaces typically relies on three types of data: people's subjective perceptions collected via questionnaires, vegetation indices derived from satellite imagery, such as the Normalized Difference Vegetation Index (NDVI), and Land Use or Land Cover maps, such as OpenStreetMap (OSM). Data on people's perceptions are essential w...
Chapter
The proliferation of pre-trained ML models in public Web-based model zoos facilitates the engineering of ML pipelines to address complex inference queries over datasets and streams of unstructured content. Constructing an optimal plan for a query is hard, especially when constraints (e.g., accuracy or execution time) must be taken into consideration, a...
Chapter
Machine learning (ML) researchers and practitioners are building repositories of pre-trained models, called model zoos. These model zoos contain metadata that detail various properties of the ML models and datasets, which are useful for reporting, auditing, reproducibility, and interpretability. Unfortunately, the existing metadata representations...
Chapter
Traditionally, the popularity of classical music composers is approximated through commercial figures like album releases, record sales, or live performances. However, commercial factors only provide one piece of the overall picture. The success of community-driven platforms has profoundly changed how people consume and interact with music, and, co...
Article
Full-text available
City streets that feel safe and attractive motivate active travel behaviour and promote people’s well-being. However, determining what makes a street safe and attractive is a challenging task because subjective qualities of the streetscape are difficult to quantify. Existing evidence typically focuses on how different street features influence perc...
Chapter
Full-text available
Neighborhood safety and its perception are important determinants of citizens’ health and well-being. Contemporary urban design guidelines often advocate urban forms that encourage natural surveillance or “eyes on the street” to promote community safety. However, assessing a neighborhood’s level of natural surveillance is challenging due to its sub...
Article
Full-text available
Recent evidence underscores the importance of greenspace exposure in promoting physical activity, and in having a positive impact on mental health and cognitive development. Accessibility has been identified to be the primary motivating factor when it comes to encouraging greenspace use and, correspondingly, exposure. Existing quantitative approach...
Article
Full-text available
Machine learning (ML) practitioners and organizations are building model repositories of pre-trained models, referred to as model zoos. These model zoos contain metadata describing the properties of the ML models and datasets. The metadata serves crucial roles for reporting, auditing, ensuring reproducibility, and enhancing interpretability. Des...
Preprint
Full-text available
Machine learning (ML) practitioners and organizations are building model zoos of pre-trained models, containing metadata describing properties of the ML models and datasets that are useful for reporting, auditing, reproducibility, and interpretability purposes. The metadata is currently not standardised; its expressivity is limited; and there is no...
Article
Full-text available
Music content annotation campaigns are common on paid crowdsourcing platforms. Crowd workers are expected to annotate complex music artifacts, a task often demanding specialized skills and expertise, thus selecting the right participants is crucial for campaign success. However, there is a general lack of deeper understanding of the distribution of...
Preprint
Full-text available
In an effort to regulate Machine Learning-driven (ML) systems, current auditing processes mostly focus on detecting harmful algorithmic biases. While these strategies have proven to be impactful, some values outlined in documents dealing with ethics in ML-driven systems are still underrepresented in auditing processes. Such unaddressed values mainl...
Article
Full-text available
Artificial intelligence (AI) applications can profoundly affect society. Recently, there has been extensive interest in studying how scientists design AI systems for general tasks. However, it remains an open question as to whether the AI systems developed in this way can work as expected in different regional contexts while simultaneously empoweri...
Article
Full-text available
Background: There is increasing evidence that a complex interplay of factors within the environments in which children grow up contributes to children's suboptimal mental health and cognitive development. The concept of the life-course exposome helps to study the impact of the physical and social environment, including social inequities, on cognitiv...
Article
Full-text available
City events are becoming more popular and are attracting large numbers of people. This increases the need for methods and tools to provide stakeholders with crowd size information for crowd management purposes. Previous works proposed a large number of methods to count the crowd using different data in various contexts, but no methods proposed using social...
Article
The future of crowd work has been identified to depend on worker satisfaction, but we lack a thorough understanding of how worker satisfaction can be increased in microtask crowdsourcing. Prior work has shown that one solution is to build tasks that are engaging. To facilitate engagement, two methods that have received attention in recent HCI liter...
Preprint
Full-text available
Many powerful Artificial Intelligence (AI) techniques have been engineered with the goals of high performance and accuracy. Recently, AI algorithms have been integrated into diverse and real-world applications. It has become an important topic to explore the impact of AI on society from a people-centered perspective. Previous works in citizen scien...
Article
Music content annotation campaigns are common on paid crowdsourcing platforms. Crowd workers are expected to annotate complicated music artefacts, which can demand certain skills and expertise. Traditional methods of participant selection are not designed to capture these kinds of domain-specific skills and expertise, and often domain-specific quest...
Article
Full-text available
The automatic detection of conflictual languages (harmful, aggressive, abusive, and offensive languages) is essential to provide a healthy conversation environment on the Web. To design and develop detection systems that are capable of achieving satisfactory performance, a thorough understanding of the nature and properties of the targeted type of...
Article
Full-text available
As cities resume life in public space, they face the difficult task of retaining outdoor activity while decreasing exposure to airborne viruses, such as the novel coronavirus. Even though the transmission risk is higher in indoor spaces, recent evidence suggests that physical contact outdoors also contributes to an increased virus exposure. Given t...
Preprint
Full-text available
Hybrid crowd-machine classifiers can achieve superior performance by combining the cost-effectiveness of automatic classification with the accuracy of human judgment. This paper shows how crowd and machines can support each other in tackling classification problems. Specifically, we propose an architecture that orchestrates active learning and crow...
Preprint
Full-text available
In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for the problem of document screening, where we need to screen documents with a set of machine-learning filters. Specifically, we focus on building a set of machine learning classifiers that evaluate documents, and then screen them efficiently. It is a chall...
Conference Paper
Full-text available
In online crowd mapping, crowd workers recruited through crowdsourcing marketplaces collect geographic data. Compared to traditional mapping methods, where workers physically explore the area, the benefit of using online crowd mapping is the potential to be cost-effective and time-efficient. Previous studies have focused on mapping urban objects us...
Conference Paper
Full-text available
Human annotation is still an essential part of modern transcription workflows for digitizing music scores, either as a standalone approach where a single expert annotator transcribes a complete score, or for supporting an automated Optical Music Recognition (OMR) system. Research on human computation has shown the effectiveness of crowdsourcing for...
Article
In online crowd mapping, crowd workers recruited through crowdsourcing marketplaces collect geographic data. Compared to traditional mapping methods, where workers physically explore the area, the benefit of using online crowd mapping is the potential to be cost-effective and time-efficient. Previous studies have focused on mapping urban objects us...
Preprint
The way pages are ranked in search results influences whether the users of search engines are exposed to more homogeneous, or rather to more diverse viewpoints. However, this viewpoint diversity is not trivial to assess. In this paper we use existing and novel ranking fairness metrics to evaluate viewpoint diversity in search result rankings. We co...
Conference Paper
Full-text available
Due to the coronavirus pandemic, remote work from home has rapidly become a necessity around the world, drastically changing the potential landscape for the future of work. Over the last couple of decades, microtask crowdsourcing has emerged as a viable means of carrying out remote online work to earn one's living, an alternative to traditional work...
Article
Full-text available
Large-scale events are becoming more frequent in contemporary cities, increasing the need for novel methods and tools that can provide relevant stakeholders with quantitative and qualitative insights about attendees’ characteristics. In this work, we investigate how social media can be used to provide such insights. First, we screen a set of factor...
Chapter
Full-text available
Conversational agents are playing an increasingly important role in providing users with natural communication environments, improving outcomes in a variety of domains in human-computer interaction. Crowdsourcing marketplaces are simultaneously flourishing, and it has never been easier to acquire large-scale human input from online workers. Recent...
Chapter
Credit scoring is an important tool to assess the solidity of small and medium-sized enterprises (SMEs), and to unlock for them new options for credit and improvement of cash flow. Credit scoring is, in its most common form, used by (potential) creditors to predict the probability of SMEs to default in the future, as an inverse measure of creditwor...
Conference Paper
Full-text available
Up-to-date listings of retail stores and related building functions are challenging and costly to maintain. We introduce a novel method for automatically detecting, geo-locating, and classifying retail stores and related commercial functions, on the basis of storefronts extracted from street-level imagery. Specifically, we present a deep learning a...
Conference Paper
Full-text available
Crowdsourcing marketplaces have provided a large number of opportunities for online workers to earn a living. To improve satisfaction and engagement of such workers, who are vital for the sustainability of the marketplaces, recent works have used conversational interfaces to support the execution of a variety of crowdsourcing tasks. The rationale b...
Conference Paper
Full-text available
The rise in popularity of conversational agents has enabled humans to interact with machines more naturally. Recent work has shown that crowd workers in microtask marketplaces can complete a variety of human intelligence tasks (HITs) using conversational interfaces with similar output quality compared to the traditional Web interfaces. In this pape...
Conference Paper
Full-text available
This demo presents VirtualCrowd, a simulation platform for crowdsourcing campaigns. The platform allows the design, configuration, step-by-step execution, and analysis of customized tasks, worker profiles, and crowdsourcing strategies. The platform will be demonstrated through a crowd-mapping example in two cities, which will highlight the utility...
Preprint
Full-text available
Despite the high interest in Machine Learning (ML) in academia and industry, many issues related to the application of ML to real-life problems are yet to be addressed. Here we put forward one limitation which arises from a lack of adaptation of ML models and datasets to specific applications. We formalise a new notion of unfairness as exclusion o...
Preprint
Full-text available
Machine Learning (ML) is increasingly applied in real-life scenarios, raising concerns about bias in automatic decision making. We focus on bias as a notion of opinion exclusion, that stems from the direct application of traditional ML pipelines to infer subjective properties. We argue that such ML systems should be evaluated with subjectivity and...
Chapter
Full-text available
Named Entity Recognition (NER) for rare long-tail entities, such as those often found in domain-specific scientific publications, is a challenging task, as the extensive training and test data needed for fine-tuning NER algorithms is typically lacking. Recent approaches presented promising solutions relying on training NER algorithms in an iterative weakly-...
Article
Full-text available
City events are being organized more frequently, and with larger crowds, in urban areas. There is an increased need for novel methods and tools that can provide information on the sentiments of crowds as an input for crowd management. Previous work has explored sentiment analysis and a large number of methods have been proposed relating to various...
Conference Paper
Full-text available
Conversational interfaces can facilitate human-computer interactions. Whether or not conversational interfaces can improve worker experience and work quality in crowdsourcing marketplaces has remained unanswered. We investigate the suitability of text-based conversational interfaces for microtask crowdsourcing. We designed a rigorous experimental c...
Conference Paper
Full-text available
Knowledge about the organization of the main physical elements (e.g. streets) and objects (e.g. trees) that structure cities is important in the maintenance of city infrastructure and the planning of future urban interventions. In this paper, a novel approach to crowd-mapping urban objects is proposed. Our method capitalizes on strategies for gener...
Chapter
Street-level imagery contains a variety of visual information about the facades of Points of Interest (POIs). In addition to general morphological features, signs on the facades of, primarily, business-related POIs could be a valuable source of information about the type and identity of a POI. Recent advancements in computer vision could leverage v...
Conference Paper
This paper describes the system that team MYTOMORROWS-TU DELFT developed for the 2019 Social Media Mining for Health Applications (SMM4H) Shared Task 3, for the end-to-end normalization of ADR tweet mentions to their corresponding MEDDRA codes. For the first two steps, we reuse a state-of-the-art approach, focusing our contribution on the final ent...
Article
Full-text available
Understanding and improving the energy consumption behavior of individuals is considered a powerful approach to improve energy conservation and stimulate energy efficiency. To motivate people to change their energy consumption behavior, we need to have a thorough understanding of which energy-consuming activities they perform and how these are perf...
Article
Dialog agents like digital assistants and automated chat interfaces (e.g., chatbots) are becoming more and more popular as users adapt to conversing with their devices as they would with humans. In this article we present approaches and available tools for dialog management, a component of dialog agents that handles dialog context and decides the next action...
Conference Paper
Full-text available
Knowledge graphs (KGs) have proven to be effective to improve recommendation. Existing methods mainly rely on hand-engineered features from KGs (e.g., meta paths), which requires domain knowledge. This paper presents RKGE, a KG embedding approach that automatically learns semantic representations of both entities and paths between entities for char...
Chapter
Named Entity Recognition and Typing (NER/NET) is a challenging task, especially with long-tail entities such as the ones found in scientific publications. These entities (e.g., “WebKB”, “StatSnowball”) are rare, often relevant only in specific knowledge domains, yet important for retrieval and exploration purposes. State-of-the-art NER approaches emp...