Abstract

Recommendation systems and assistants (in short, recommenders) are ubiquitous in online platforms and influence most actions of our day-to-day lives, suggesting items or providing solutions based on users' preferences or requests. This survey analyses the impact of recommenders in four human-AI ecosystems: social media, online retail, urban mapping and generative AI ecosystems. Its scope is to systematise a fast-growing field in which terminologies employed to classify methodologies and outcomes are fragmented and unsystematic. We follow the customary steps of qualitative systematic review, gathering 144 articles from different disciplines to develop a parsimonious taxonomy of: methodologies employed (empirical, simulation, observational, controlled), outcomes observed (concentration, model collapse, diversity, echo chamber, filter bubble, inequality, polarisation, radicalisation, volume), and their level of analysis (individual, item, model, and systemic). We systematically discuss all findings of our survey substantively and methodologically, highlighting also potential avenues for future research. This survey is addressed to scholars and practitioners interested in different human-AI ecosystems, policymakers and institutional stakeholders who want to understand better the measurable outcomes of recommenders, and tech companies who wish to obtain a systematic view of the impact of their recommenders.
A survey on the impact of AI-based recommenders on human
behaviours: methodologies, outcomes and future directions
LUCA PAPPALARDO, Institute of Information Science and Technologies at National Research Council
(ISTI-CNR), Italy and Scuola Normale Superiore of Pisa, Italy
EMANUELE FERRAGINA, Sciences Po, France
SALVATORE CITRARO, GIULIANO CORNACCHIA, MIRCO NANNI, GIULIO ROSSETTI, Institute of
Information Science and Technologies at National Research Council (ISTI-CNR), Italy
GIZEM GEZICI, FOSCA GIANNOTTI, MARGHERITA LALLI, Scuola Normale Superiore of Pisa, Italy
DANIELE GAMBETTA, GIOVANNI MAURO, VIRGINIA MORINI, VALENTINA PANSANELLA,
DINO PEDRESCHI, Department of Computer Science at University of Pisa, Italy
CCS Concepts: • Information systems → Collaborative filtering; Recommender systems.
Additional Key Words and Phrases: recommendation systems, human-AI coevolution, human-centered AI, social impact,
collaborative ltering, personalised recommendations
ACM Reference Format:
Luca Pappalardo, Emanuele Ferragina, Salvatore Citraro, Giuliano Cornacchia, Mirco Nanni, Giulio Rossetti, Gizem Gezici,
Fosca Giannotti, Margherita Lalli, Daniele Gambetta, Giovanni Mauro, Virginia Morini, Valentina Pansanella, and Dino
Pedreschi. 2024. A survey on the impact of AI-based recommenders on human behaviours: methodologies, outcomes and
future directions. 1, 1 (July 2024), 41 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn
Authors’ addresses: Luca Pappalardo, luca.pappalardo@isti.cnr.it, Institute of Information Science and Technologies at National Research
Council (ISTI-CNR), Via G. Moruzzi 1, Pisa, Italy, 56124 and Scuola Normale Superiore of Pisa, Piazza dei Cavalieri, 7, Pisa, Italy, 56126;
Emanuele Ferragina, Sciences Po, Paris, France, emanuele.ferragina@sciencespo.fr; Salvatore Citraro, Giuliano Cornacchia, Mirco Nanni,
Giulio Rossetti, Institute of Information Science and Technologies at National Research Council (ISTI-CNR), Via G. Moruzzi 1, Pisa, Italy;
Gizem Gezici, Fosca Giannotti, Margherita Lalli, Scuola Normale Superiore of Pisa, Piazza dei Cavalieri, 7, Pisa, Italy, 56126; Daniele Gambetta,
Giovanni Mauro, Virginia Morini, Valentina Pansanella, Dino Pedreschi, Department of Computer Science at University of Pisa, Largo Bruno
Pontecorvo 3, Pisa, Italy.
, Vol. 1, No. 1, Article . Publication date: July 2024.
arXiv:2407.01630v1 [cs.IR] 29 Jun 2024
1 INTRODUCTION
Recommendation systems and assistants (from now on, recommenders), i.e., algorithms suggesting items or providing
solutions based on users’ preferences or requests [99, 105, 141, 166], influence most actions of our day-to-day
life through online platforms. For example, recommendations on social media suggest new social connections, those
on online retail platforms guide users’ product choices, navigation services offer routes to desired destinations, and
generative AI platforms produce content based on users’ requests. Unlike other AI tools, such as medical diagnostic
support systems, robotic vision systems, or autonomous driving, which assist in specific tasks or functions,
recommenders are ubiquitous in online platforms, shaping our decisions and interactions instantly and profoundly.
The influence recommenders exert on users’ behaviour may generate long-lasting and often unintended effects on
human-AI ecosystems [131], such as amplifying political radicalisation processes [82], increasing CO2 emissions
in the environment [36] and amplifying inequality, biases and discriminations [120]. The interaction between
humans and recommenders has been examined in various fields using different nomenclatures, research methods
and datasets, often producing incongruent findings. Consequently, the current understanding of the impact of
this interaction remains fragmentary and unsystematic.
In this survey, we analyse the impact of recommenders in four widely studied human-AI ecosystems, i.e.,
social media, online retail, urban mapping and generative AI ecosystems. Online platforms within these four
ecosystems recommend users to follow or items to consume (social media and online retail recommenders)
and provide a range of solutions to users’ requests (urban mapping and generative AI recommenders). These
ecosystems are characterised by a pervasive influence of AI and are prototypical instances that help investigate how
recommenders influence human behaviour. Therefore, their study is a vantage point for broadly understanding
user-recommender interactions.
Although the literature on this topic is growing fast [27, 131, 133], the terminologies employed
to define the outcomes and the methods deployed to measure them are highly fragmented. To bridge this gap,
our survey provides a holistic overview of recent advances in the literature:
(1) It categorises the methodologies employed to assess the influence of recommenders on users’ behaviour
(empirical, simulation, observational and controlled studies) in four prominent human-AI ecosystems (social
media, online retail, urban mapping, and generative AI ecosystems);
(2) It gathers the outcomes observed in the literature (collapse, concentration, diversity, echo chamber, filter
bubble, inequality, polarisation, radicalisation, volume) and standardises the terminologies in a new
parsimonious taxonomy;
(3) It disentangles the level at which the outcomes are measured (individual, item, model, and systemic levels);
(4) It suggests new avenues for future research and unveils some technical and methodological gaps in the
literature from a holistic point of view.
Several surveys on recommenders have been published recently, systematising domains like explainable
recommendations, knowledge-based recommendations, and deep learning-based recommendations [155, 166, 167],
applications of recommenders [105] and how to evaluate them [149], the impact of diversity in recommenders
[91], and bias/debias in recommenders [27]. To the best of our knowledge, this is the first work that reviews
recommenders’ outcomes at various levels in different human-AI ecosystems, as well as the methodologies
employed to assess these impacts.
Our survey can be helpful to several public and private stakeholders. First, scholars and practitioners may obtain
guidance on recent advancements in different ecosystems. Second, policymakers and institutional stakeholders
may better understand measurable outcomes of actual or potential recommenders and their societal consequences,
such as polarisation, congestion, segregation, etc. Third, tech companies employing recommenders may obtain a
systematic view of the impact of their services to increase revenues and contribute to societal development.
The remainder of the paper is organised as follows. Section 2 details how we collected and classified the articles
and built our taxonomy. In Sections 3-6, we discuss the methodologies employed and the outcomes in the four
human-AI ecosystems under investigation. In Section 7, we summarise our survey findings and suggest new
avenues for future research.
2 CONSTRUCTION OF THE SURVEY
2.1 Articles collection
We gathered 144 articles from different disciplines (e.g., complexity science, computational social science, computer
science, marketing, management, network science, urban studies) in leading journals and conferences, as well as
recent unpublished material, following the customary steps of qualitative systematic reviews [66]. Studies
were collected from Google Scholar, Web of Science, EBSCO, and JSTOR by scanning titles and abstracts for
keywords related to recommenders’ outcomes in social media, online retail, urban mapping and generative AI
ecosystems.
Our original results were refined through four additional steps. First, we browsed all issues of the journals in which
the articles gathered in the initial search appeared. Second, we cross-checked the bibliography of each selected
article. Third, we called upon the expertise of two senior scholars and presented the article selection in a group
meeting with all authors. Fourth, we eliminated the articles that did not fit our search definition after the initial
classification process (see Section 2.2). Note that we only consider articles that measure the effect of recommenders
on human-AI ecosystems; therefore, we exclude those only aiming to improve recommenders’ performance.
2.2 Classification process
We classied each article through the following process. We split the pool of authors into four teams, one for each
ecosystem. Each article was assigned to two coders, who independently read the paper, evaluated its relevance
for the survey, and classied it based on a preliminary taxonomy. Ecosystem teams discussed each article, solving
disagreements on the coders’ classication. This step allowed each team to present a preliminary classication
of the articles to the entire research group. During this presentation, the ecosystem teams illustrated doubts
concerning the keywords employed and the articles that were dicult to classify under the preliminary taxonomy.
These doubts were progressively solved through a series of meetings to build the nal taxonomy. Our outcomes
and their denitions are summarised in Table 1.
2.3 Taxonomy
We designed a taxonomy that classifies articles based on the methodologies employed (Figure 2) and the outcomes
measured with their level of analysis (Table 1). We built the taxonomy through a consensus exercise among the
authors. Initially, the taxonomy was built through a deductive process based on the characteristics of a sample of
articles already known by the authors. All these articles were then reclassified by the authors to validate
or question each category in the taxonomy. The iterative nature of the process allowed us to progressively
improve the initial taxonomy, proposing, in the end, a robust and comprehensive framework for the analysis of
recommenders’ outcomes.
Human-AI Ecosystems. We gather articles from four human-AI ecosystems: social media, online retail,
urban mapping, and generative AI ecosystems (see Figure 1). Articles in the social media ecosystem examine
recommenders that filter and suggest content or users to follow. Platforms in this ecosystem include Facebook,
Google News, Apple News, Instagram, X, Reddit, Gab, YouTube, and TikTok. Research in the online retail
ecosystem primarily focuses on recommenders suggesting products and services for consumption, encompassing
consumer goods, songs and movies. Platforms in this ecosystem include e-commerce and streaming giants like
Amazon, Alibaba, eBay, Netflix, and Spotify. Studies within the urban mapping ecosystem focus on recommenders
offering a variety of solutions to users’ requests. Examples include ride-hailing platforms like Uber and Lyft,
navigation services like Google Maps and Waze, accommodation rental platforms like Airbnb and Booking.com,
and point-of-interest search services like Tripadvisor and Yelp. The generative AI ecosystem encompasses research
on tools that generate content (e.g., text, image, audio, video) based on users’ prompts. Noteworthy examples
include ChatGPT, Llama, Mistral, and DALL-E.
Fig. 1. Surveyed human-AI ecosystems: articles are categorized into social media, online retail, urban mapping, and generative
AI. The schema illustrates real-world examples within each ecosystem:
Social Media: social networking, microblogging, collaborative platforms, content communities.
Online Retail: e-commerce, recommerce, movie streaming, audio streaming.
Urban Mapping: ride-hailing, car sharing, navigation services, house booking.
Generative AI: image generators, text generators, audio generators, video generators.
The interdisciplinary team of authors, including computer scientists, complexity scientists, mathematicians,
and sociologists, was assembled to cover expertise on these four human-AI ecosystems and the different
methodologies employed. This classification is mirrored in the paper’s organisation. Each section corresponds
to a human-AI ecosystem, enhancing readability and accessibility for readers interested in specific application
contexts. This structure allows readers to focus on their areas of interest without the necessity of delving into
ecosystems less relevant to their concerns.
Methodologies. We systematically categorise articles within each human-AI ecosystem into empirical and
simulation studies. Within each category, we distinguish between controlled and observational studies. Empirical
studies derive insights from data produced by interactions between users and recommenders. When datasets are
large and diverse, these studies allow for broad generalisations. However, the ability to draw universal conclusions is
constrained by specific geographic, temporal and contextual circumstances. Moreover, reproducing these studies
is challenging because data are often owned by big tech companies that are generally reluctant to share them.
Simulation studies are anchored to model-generated data, whether mechanistic, AI-driven, or based on digital
twins. They offer an alternative methodological pathway for dealing with large-scale ecosystems or for cases where
data are not readily available. These studies allow reproducibility under the same initial conditions, facilitating result
validation and verification. By manipulating parameters, scholars can scrutinise recommenders’ impacts on the
human-AI ecosystem, improving the understanding of intricate human-recommender interactions. However,
as they rely on strong assumptions, simulations do not necessarily reflect real-world dynamics and are
limited in unveiling unexpected or unintended outcomes. Simulation studies can serve as prototypes for a
preliminary feasibility evaluation of subsequent empirical and controlled studies.
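The two properties of simulation studies mentioned above, reproducibility under the same initial conditions and the ability to manipulate parameters, can be illustrated with a minimal sketch. The model below is a hypothetical toy (a popularity-based recommender that users follow with probability `p_follow`); it is not taken from any surveyed article, and all names and parameter values are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate(n_users=200, n_items=50, n_steps=20, p_follow=0.8, seed=0):
    """Toy popularity-based recommender (hypothetical model, for illustration only).

    At each step every user consumes one item: with probability p_follow the
    currently most-consumed item (the "recommendation"), otherwise a uniformly
    random item. Returns the share of all consumption captured by the top item.
    """
    rng = random.Random(seed)  # fixed seed: same initial conditions, same run
    counts = Counter({item: 0 for item in range(n_items)})
    for _ in range(n_steps):
        for _ in range(n_users):
            if rng.random() < p_follow:
                top_count = max(counts.values())
                # Recommend the most consumed item; ties go to the lowest item id.
                choice = min(i for i, c in counts.items() if c == top_count)
            else:
                choice = rng.randrange(n_items)
            counts[choice] += 1
    total = sum(counts.values())
    return max(counts.values()) / total


# Same seed and parameters give the same result (reproducibility).
print(simulate(seed=1) == simulate(seed=1))
# Varying one parameter while holding the rest fixed isolates its effect.
print(simulate(p_follow=0.0), simulate(p_follow=0.9))
```

In this toy setting, raising `p_follow` increases the share of consumption captured by the most popular item, a crude proxy for concentration. The result depends entirely on the modelling assumptions, which is the limitation of simulation studies noted above.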
Both empirical and simulation methodologies can employ observational or controlled approaches. Controlled
studies comprise quasi-experiments, randomised controlled trials, and A/B tests [42, 71]. These studies divide
user samples into control and treatment groups exposed to different recommendations. Sample randomisation
may reduce selection biases, ensuring that all participants have an equal chance of receiving the
recommendation. Controlled studies enable researchers to control for various factors and conditions, allowing
the isolation of the effect produced by a specific intervening variable. Their main advantage is establishing causal
relationships and attributing observed effects to the recommendation. However, a limitation stems from the
interaction among individuals [11]. In complex social systems, individuals within the control group can never
be isolated from the indirect effects of recommendations, as they are also influenced by choices made by users
in the treatment group. Therefore, controlled experiments may not satisfy the Stable Unit Treatment Value
Assumption (SUTVA) for causal inference [39] and might not provide unbiased estimates of causal quantities
of interest. Controlled studies also have other important shortcomings: the inclusion and exclusion criteria of
the controlled settings might limit the generalisability of findings, and there is limited flexibility in adapting
to changes occurring during the experiments. Moreover, they are hard to design because they require direct
access to platforms’ users and recommenders [88]. While platforms routinely conduct internal controlled studies
to validate different recommenders and maximise user engagement [6], access of external researchers to the
studies’ results is restricted.
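To make the SUTVA concern concrete, the following sketch simulates a randomised A/B test in which every user's outcome also depends on the fraction of treated users. The outcome model (a direct effect `tau` plus a spillover term `spill`) and all numbers are hypothetical assumptions introduced for illustration; they do not come from the surveyed literature.

```python
import random
from statistics import mean


def run_ab_test(n_users=1000, tau=2.0, spill=1.0, base=10.0, seed=0):
    """Difference in means for a toy A/B test with interference (hypothetical model).

    Outcome of user i = base + tau * treated_i + spill * (fraction of treated users).
    The spillover term reaches control users too, which violates SUTVA.
    """
    rng = random.Random(seed)
    treated = [rng.random() < 0.5 for _ in range(n_users)]  # random assignment
    frac_treated = sum(treated) / n_users
    outcomes = [base + tau * t + spill * frac_treated for t in treated]
    y_treatment = mean(y for y, t in zip(outcomes, treated) if t)
    y_control = mean(y for y, t in zip(outcomes, treated) if not t)
    return y_treatment - y_control


estimate = run_ab_test()
# Under this toy model, treating everyone versus no one changes outcomes by tau + spill.
full_rollout_effect = 2.0 + 1.0
print(estimate, full_rollout_effect)
```

Because the spillover raises outcomes in both groups equally, the difference in means recovers only the direct effect `tau` and misses the indirect component, so it understates the effect of deploying the recommender to the whole population in this toy model.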
Observational studies, whether grounded in empirical or synthetic data, operate without control, assuming a
single recommendation principle for the entire population. These studies include the analysis of Facebook users’
behaviour, Google Maps’ driver suggestions, and data gleaned from browser loggers and platform APIs [5]. While
offering broad insights when data are large and representative, observational studies struggle to establish causal
relationships firmly, often necessitating supplementary evidence. Additionally, they are susceptible to biases,
measurement errors, and issues related to confounding variables, which may compromise their accuracy and
reliability.
To clarify the dierences between these methodologies, we propose some prototypical examples. If we have
access to data reecting users’ behaviour on a platform and solely analyse this data, we conduct an empirical
observational study. However, if only a subset of platform users is exposed to a recommender, and we compare
the behaviours of those exposed to those who are not, this is an empirical controlled study. On the other hand,
when actual platform data are inaccessible, and we generate synthetic data through simulation tools (such as a
digital twin), we conduct a simulation study, which can be observational or controlled, as discussed above. It
is essential to acknowledge that quasi-experiments [
60
], in which an exogenous element splits the population
into two or more groups, according to our denition, must be considered controlled studies. Dierently, when
an exogenous element does not segment the population into dierent groups, studies have to be considered
observational in their methodological approach. A paper may be classied under dierent methodologies if it
employs two or more of them.
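The prototypical examples above amount to a two-question decision rule, sketched below. The function and argument names are hypothetical and serve only to restate the rule; a paper employing several methodologies would simply be classified once per methodology.

```python
def classify_study(uses_platform_data: bool, exogenous_group_split: bool) -> str:
    """Two-level methodology label following the rule described in the text.

    uses_platform_data: True if the data come from real user-recommender
        interactions, False if they are generated by a simulation tool.
    exogenous_group_split: True if an exogenous element (randomisation, an A/B
        test, a quasi-experiment) splits the population into groups.
    """
    data_source = "empirical" if uses_platform_data else "simulation"
    design = "controlled" if exogenous_group_split else "observational"
    return f"{data_source} {design}"


print(classify_study(True, False))   # analysis of platform logs only
print(classify_study(True, True))    # exposed vs non-exposed users on a platform
print(classify_study(False, True))   # digital twin with treatment and control groups
```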
Outcomes. We dene an outcome as the result of a recommender’s inuence on a human-AI ecosystem.
We initially dened outcomes inductively and then rened them deductively. First, we let the team classify
the outcomes using the keywords in the articles. Then, in subsequent meetings, we uniformed the categories
using terms that could broadly cover these keywords. For example, popularity bias and concentration refer to
a similar kind of outcome. We opt for concentration because it ts dierent ecosystems. We also extend the
term concentration to describe situations with consensus around an attribute (e.g., opinions in social media
ecosystems). This parsimonious classication is a crucial contribution of this paper because it allows for the
standardisation of outcomes terminology across dierent elds of study.
In the literature, outcomes are measured at different levels: individual, item, model, and systemic. Individual
outcomes refer to the effects of recommenders on users. Users may be drivers and passengers in the urban
mapping ecosystem and sellers and buyers in the online retail ecosystem. Item outcomes refer to the effects of
recommenders on specific objects. Items may include posts on social media, products on online retail platforms,
rides in urban mapping platforms, or generated content on generative AI platforms. Systemic outcomes refer to
collective effects of recommenders.
Fig. 2. Two-level categorization of methodologies of the surveyed articles. At a first level, we categorize articles into empirical
or simulation; at the second level we classify them as controlled or observational.
Table 1 provides the denition and analytical level of the recommenders’ outcomes investigated in the literature
(see Figure 3 for the frequency of outcomes encountered in these studies). In our taxonomy, each outcome is
associated with a single analytical level, except for diversity and volume, which can be individual, item, and
systemic. To illustrate this point, we showcase examples from the online retail ecosystem. Various studies examine
changes in revenue and purchased products, measuring whether users spend more (individual level), specic
items are purchased more (item), and aggregate consumption increases or decreases (systemic). Similarly, they
explore whether users engage with a more diverse range of products (individual), items are purchased by a more
diverse set of users (item), and aggregate consumption diversity increases or decreases (systemic). The other
recommenders’ outcomes are: radicalisation in the social media ecosystem (individual), lter bubbles in social media
and online retail ecosystems (individual); model collapse in the generative AI ecosystem (model); concentration,
echo chamber,inequality, and polarisation in social media, online retail, and urban mapping ecosystems (systemic).
The distinction between dierent analytical levels allows us to disambiguate some outcomes that are often
confused in the literature. For example, the terms lter bubble and echo chambers are often used interchangeably
in the social media ecosystem.
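The three analytical levels of volume and diversity in the online retail example can be sketched on a small purchase log. The log is invented for illustration, and counting distinct items or users is only one simple proxy for diversity; the surveyed studies use a variety of measures.

```python
from collections import defaultdict

# Hypothetical purchase log of (user, item) pairs; not data from any surveyed study.
purchases = [
    ("u1", "a"), ("u1", "b"), ("u1", "a"),
    ("u2", "a"),
    ("u3", "c"), ("u3", "a"),
]

items_by_user = defaultdict(list)
users_by_item = defaultdict(list)
for user, item in purchases:
    items_by_user[user].append(item)
    users_by_item[item].append(user)

# Individual level: how much each user buys, and how many distinct items.
individual_volume = {u: len(items) for u, items in items_by_user.items()}
individual_diversity = {u: len(set(items)) for u, items in items_by_user.items()}

# Item level: how often each item is bought, and by how many distinct users.
item_volume = {i: len(users) for i, users in users_by_item.items()}
item_diversity = {i: len(set(users)) for i, users in users_by_item.items()}

# Systemic level: aggregate consumption and number of distinct items consumed overall.
systemic_volume = len(purchases)
systemic_diversity = len(users_by_item)

print(individual_volume, individual_diversity)
print(item_volume, item_diversity)
print(systemic_volume, systemic_diversity)
```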
Some recommenders’ outcomes appear in the literature with a different nomenclature. This happens both
within the same ecosystem and across different ecosystems; some examples help clarify this point. The terms
polarisation and fragmentation are slightly different: polarisation has been developed to capture different opinions,
mainly in contexts with a clear division into two groups; fragmentation instead indicates the presence of more
poles of polarisation. This distinction mainly comes from historical reasons, as seminal studies on polarisation
originated in the US two-party system. The term polarisation has been extended to cover the study of political
attitudes in multi-party systems. Consequently, the term polarisation has been stretched to describe a wide array
of scenarios (at present, there are at least twelve definitions of polarisation in the literature [23]). For the sake of
parsimony, we merge studies on fragmentation and polarisation into one category labelled polarisation. Within
the urban mapping ecosystem, we devised another simplification for the terms concentration and congestion. We consider
the latter to be an extreme case of the former.
It is important to acknowledge that some outcomes in the classification may partially overlap. For example, the
concentration of user purchases can also be associated with a reduction of diversity at the systemic level. However,
the opposite is not always true: diversity reduction does not imply an increase in concentration. Therefore, to
provide a detailed picture of the surveyed studies, we consider all pertinent outcomes.
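A small numerical sketch can illustrate why a reduction in diversity need not imply higher concentration. It assumes two simple measures that are not prescribed by the survey: diversity as the number of items with a non-zero consumption share, and concentration as the Herfindahl-Hirschman index (the sum of squared shares). The share vectors are invented for illustration.

```python
def richness(shares):
    """Number of items with a non-zero consumption share (a simple diversity proxy)."""
    return sum(1 for s in shares if s > 0)


def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared shares (a simple concentration proxy)."""
    return sum(s * s for s in shares)


# Hypothetical consumption shares over the items of a catalogue.
before = [0.85, 0.05, 0.05, 0.05]  # four items consumed, one dominant
after = [0.5, 0.5]                 # two items consumed, evenly split

print(richness(before), richness(after))  # fewer distinct items are consumed
print(hhi(before), hhi(after))            # yet the concentration index is lower
```

Under these assumed measures, the second scenario has lower diversity and lower concentration at the same time, which is why the two outcomes are tracked separately.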
Level | Outcome | Description | Ecosystems
Individual | Diversity | Variety of users’ behaviour, items consumed and users followed | SM, OR, UM
Individual | Filter Bubble | Conformation of items or contents with own preferences or beliefs | SM, OR
Individual | Radicalization | Items or individual attributes going towards an extreme | SM
Individual | Volume | Quantity value of some users’ attribute | SM, OR, UM
Item | Diversity | Variety of users that consume the item | SM, OR, GAI
Item | Volume | Quantity value of some items’ attribute | SM, OR, UM
Model | Collapse | AI model degradation over time | GAI
Systemic | Concentration | Close gathering of people or things | SM, OR, UM
Systemic | Diversity | Aggregate diversity of users or items | SM, OR, UM
Systemic | Echo Chamber | Environment reinforcing opinions or item choices within a group | SM, OR, UM
Systemic | Inequality | Uneven distribution of resources/opportunities among group members | SM, OR, UM
Systemic | Polarization | Sharp separation of users/items into groups based on some attributes | SM
Systemic | Volume | Aggregate volume of users’ or items’ attributes | SM, OR, UM
Table 1. The table details: the definition of each outcome, its level of analysis, and the ecosystems where it can be found. We
use the following acronyms for each ecosystem: social media (SM), online retail (OR), urban mapping (UM), and generative
AI (GAI).
3 SOCIAL MEDIA ECOSYSTEM
What the ecosystem is about. The social media ecosystem includes social networking platforms, community
and non-community content systems that promote content creation and sharing, and interaction among users.
Social networking platforms include Facebook, Instagram, TikTok, and X (previously, Twitter). Community
content platforms encourage users to join interest-based communities (e.g., Reddit and Gab) or engage in video
consumption (e.g., YouTube). Non-community content platforms, like Google News or Apple News, diffuse their
own content.
Main methodologies employed. There is a marked preference for observational studies (see Figure 4). This is
because empirical research can also be conducted by researchers external to the platform via data sharing or
APIs. Moreover, synthetic data may be easily gathered from agent-based and opinion dynamics models, allowing
subsequent analysis. Typically, empirical observational studies in this ecosystem exploit bots to simulate user
behaviours (sock-puppet studies), collect information about the provided recommendations, or perform user
surveys. Simulation observational studies are primarily based on agent-based modelling, with a minority of works
focusing on a single user. Only a few empirical studies are controlled (see Figure 4), and this is because they
require direct access to users’ data to build control and treatment groups and to enable/disable recommenders’
features to assess effects on users. We do not find simulation-controlled studies as open digital twins are not
available to replicate the main characteristics of real social media platforms. There is, instead, a preference for
abstract simulations. In these studies, different recommenders are tested across various independent experiments.
Fig. 3. Frequency of outcomes in the selected studies. [Bar chart; y-axis: number of papers. Bar labels: Volume-Individual,
Volume-Item, Concentration, Diversity-Individual, Volume-System, Diversity-System, Filter Bubble, Inequality, Polarization,
Diversity-Item, Collapse, Echo Chamber, Radicalization. Bar values: 37, 34, 31, 29, 25, 21, 19, 19, 18, 17, 14, 13, 13.]
Methodology | Social Media | Online Retail | Urban Mapping | Generative AI
Empirical.Controlled | 22.2% | 48.2% | 0.0% | 0.0%
Empirical.Observational | 39.6% | 26.8% | 23.5% | 0.0%
Simulation.Controlled | 0.0% | 0.0% | 26.5% | 0.0%
Simulation.Observational | 38.2% | 25.0% | 50.0% | 100.0%
Fig. 4. Percentage of employed methodologies for each human-AI ecosystem.
Main outcomes. Research within this ecosystem examines nearly all outcomes contained in our taxonomy,
except model collapse (see Table 2 for a comprehensive outlook). As the role of recommenders is highly pervasive
in social media platforms, the literature is more extensive than in the other ecosystems under investigation: 53
out of 144 papers reviewed belong to this ecosystem.
Level | Outcome | Empirical Observational | Empirical Controlled | Simulation Observational | Simulation Controlled
Individual | Filter Bubble | [14, 22, 25, 30, 33, 65, 72, 76, 80, 92, 93, 162] | [17] | [140, 142] | -
Individual | Radicalization | [13, 25, 72, 79, 80, 83, 94, 139, 160] | [113] | [85, 140, 142, 157] | -
Model | Collapse | - | - | - | -
Systemic | Concentration | [76, 87, 94, 153] | [16, 113] | [41, 51, 52, 56, 134, 135, 137] | -
Systemic | Echo Chamber | [14, 25] | [126] | [32, 34, 128, 129, 135, 137, 157, 159] | -
Systemic | Inequality | [76, 87, 94, 145, 153] | [16, 82, 113] | [51, 52, 56, 135, 137] | -
Systemic | Polarization | [33, 65, 162] | [67, 68, 102, 106, 126] | [32, 41, 128, 129, 134, 135, 137, 138, 150, 159] | -
Individual, Item, Systemic | Diversity | individual: [14, 22, 25, 30, 33, 65, 72, 92, 162]; item: [14, 19, 122, 169] | individual: [17]; item: [102, 113, 126] | individual: [41, 62, 138, 140, 157]; systemic: [51, 52, 56, 85, 135, 137, 142] | -
Individual, Item, Systemic | Volume | individual: [83]; item: [13, 19, 22, 25, 72, 76, 87, 94, 145, 153, 160, 169] | individual: [16, 17, 67, 68, 102]; item: [16, 67, 68, 82, 126]; systemic: [163] | individual: [51, 52, 56, 135, 137, 139]; systemic: [62] | -
Table 2. Social Media Ecosystem. Classification of selected papers based on their methodology, outcomes and level of
analysis.
3.1 Empirical studies
Observational studies. Ribeiro et al. [139] audit radicalisation pathways on YouTube’s video and channel
recommendations. By analysing users’ migration patterns across 330k videos from 349 politically related channels,
the study finds a recommendation flow from milder to more extreme (alt-right) content. [Outcomes: Radicalization]
Brown et al. [25] likewise explore whether YouTube’s algorithm pushes users into filter bubbles and echo
chambers or displays biases towards some political content. By analysing videos’ political orientation
and surveying 527 users who navigate the platform according to randomly assigned rules, the study finds
minimal evidence of filter bubbles and echo chambers. However, it identifies a platform bias leading to a stronger
amplification of moderately conservative content.
[Outcomes: Radicalization, Filter Bubble, Echo Chamber, Diversity.Individual, Volume.Item]
Similarly, Santini et al. [145] focus on Brazilian elections to examine YouTube’s promotion of hyperpartisan
content. They use a non-probabilistic sampling technique and analyse the news sources recommended on the
platform by simulating the browsing behaviour of new users. The findings highlight an increase in inequality, with
preferential treatment for right-wing media outlets over similar content from left-wing media outlets.
Inequality, Volume.Item
Haroon et al. [72] investigate the tendency of YouTube’s recommender to generate filter bubbles, radicalisation
pathways, and extremist or problematic content recommendations. They rely on a sock-puppet audit using 100k
accounts designed to represent various political leanings. The study discovers a filter bubble effect, particularly
pronounced for right-leaning users. It also finds an increase in recommendations from channels linked to
extremist or conspiratorial content, particularly for users characterised by views of extreme right-wing content.
Filter Bubble, Diversity.Individual, Volume.Item, Radicalization
Hosseinmardi et al. [79] investigate the role of users’ preferences on received recommendations. They retrieve
browsing histories of 310k users and profile these users based on viewing habits. The researchers then analyse on-
vs. off-platform consumption habits of users, pathways to radical political content, and the effect of session length
on content type exposure. In contrast with previous research, this study finds little evidence of the amplification
of political content and radicalisation pathways.
Radicalization
Ledwich and Zaitsev [94] examine the role that YouTube’s recommender plays in encouraging online
radicalisation. By examining the recommendation patterns among 800 political channels, the research finds that rather
than promoting radical or extremist content, the algorithm amplifies views for mainstream media and politically
neutral content.
Radicalization, Inequality, Concentration, Volume.Item
Heuer et al. [76] investigate the biases behind YouTube’s video recommender. The study selects nine
relevant political topics in Germany and performs a sock-puppet audit based on random walks to select video
recommendations. The findings support the disparities highlighted by Ledwich and Zaitsev [94], showing that
YouTube increases recommendations for popular and mainstream content rather than radical and extreme content,
but these recommendations do not focus on a particular topic. The study also finds an emotional shift effect
in recommendation trails: videos perceived by users as sad and negative are increasingly replaced by videos
conveying happier content.
Filter Bubble, Inequality, Concentration, Volume.Item
Ibrahim et al. [83] focus on the propensity of YouTube’s recommender to create political filter bubbles. The study
collects video recommendations via a sock-puppet audit with 360 bots that represent six personas across the US
political spectrum. The findings show that the recommender steers users away from political extremes toward
more moderate content. This effect is more pronounced for far-right than for far-left content.
Radicalization, Volume.Individual
Cho et al. [33] investigate the impact of YouTube’s personalised recommender on political polarisation. The
researchers conduct a laboratory experiment with 108 undergraduate students, where they manipulate the
participants’ search and watch histories related to the 2016 US presidential election. The findings indicate that
algorithmic recommendations contribute to the creation of a filter bubble, reinforcing individuals’ existing
political beliefs.
Filter Bubble, Polarization, Diversity.Individual
Hosseinmardi et al. [80] estimate the causal impact of YouTube recommendations on the consumption of
highly partisan and radical content. The study compares the behaviours of bots designed to mimic real users’
viewing patterns with those of bots following predefined rule-based trajectories. The findings show that the
recommender does not steer users towards radical content. On the contrary, when users with strong political
views start watching moderate content, the recommender shifts their recommendations after approximately 30
videos, assisting users in breaking out of their filter bubbles.
Radicalization, Filter Bubble
Le Merrer et al. [93] examine the impact of YouTube’s personalised recommendations on generating filter
bubbles and rabbit holes. The researchers conduct a sock-puppet audit to gather video recommendations and
propose a straightforward theoretical model explaining why and how rabbit holes form on YouTube. The results
indicate that user interactions could influence recommendations, but users are not consistently led further into
specialised content. In fact, after a certain number of interactions, YouTube’s recommender may forget previous
user preferences, breaking down users’ filter bubbles.
Filter Bubble
Zhou et al. [169] investigate the impact of various YouTube features on video views, with a focus on the
recommender’s effectiveness in driving video popularity. By analysing metadata, related video lists, and view
statistics for hundreds of thousands of videos, the study finds that recommendations increase video views and
promote a wider variety of videos rather than just promoting the most popular ones.
Volume.Item, Diversity.Item
Kirdemir et al. [87] inspect YouTube’s recommendation biases across different topics, languages, and entry
points. The study analyses the structure of video recommendation networks through PageRank distributions,
covering 257k videos and 803k recommendations. Despite variations based on factors like video language, content
topic, and the source of seed videos, all experiments reveal an increase in recommendations for a small fraction
of videos, fostering inequalities and a “rich get richer” effect.
Volume.Item, Concentration, Inequality
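As an illustration of this kind of analysis, the following minimal sketch computes PageRank by power iteration on a toy recommendation network and reports the share of total rank held by the top-ranked videos. It is not the implementation used in [87]: the function names, the damping factor, the `top_share` concentration measure, and the toy edge list are all assumptions made for the example.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by videos with no outgoing recommendations is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def top_share(rank, fraction=0.1):
    """Share of total PageRank held by the top `fraction` of nodes (at least one node)."""
    values = sorted(rank.values(), reverse=True)
    k = max(1, int(len(values) * fraction))
    return sum(values[:k]) / sum(values)


# Hypothetical toy network: four videos all recommend the same "hub" video.
toy_edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("d", "hub"), ("hub", "a")]
ranks = pagerank(toy_edges)
```

A `top_share` value far above the chosen fraction indicates that recommendations concentrate on a small set of videos, which is the pattern the study describes.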
Yang et al. [162] explore the dynamics of personalised search on Twitter using a sock-puppet audit. The
findings indicate that factors such as following behaviour, cookies, and previous searches have a limited impact
on personalisation. However, when it comes to polarised searches, the results reveal a noticeable bias toward
one-sided views, raising concerns about filter bubbles.
Filter Bubble, Polarization, Diversity.Individual
Using a similar sock-puppet audit methodology, Chen et al. [30] evaluate the impact of Twitter’s content
curation mechanism on the creation of political filter bubbles. The study finds that although the political alignment
of a bot’s initial connections influences its exposure to political content, there is weak evidence to support the
presence of inherent political bias in the recommender.
Filter Bubble, Diversity.Individual
Su et al. [153] examine the impact of Twitter’s “Who-To-Follow” recommender, comparing the social networks
collected before and after the recommender was implemented. The findings reveal that there is a concentration of
“follow” recommendations for the most influential users, leading to a rich-get-richer phenomenon. The study also
identifies a feedback loop where recommendations for popular users often result in more followers, exacerbating
existing network inequalities.
Inequality, Concentration, Volume.Item
Bouchaud [22] explores how Twitter’s engagement-maximising recommender affects the visibility of tweets by
Members of Parliament in users’ timelines. The study uses tunable engagement-prediction models to simulate
users’ timelines, together with a Twitter dataset collected via a browser add-on installed by volunteers. The findings
show that engagement-based timelines display lower ideological diversity, leading to the creation of a political filter
bubble. Additionally, the study uncovers inequalities in reach among political groups, with right-wing parties
being prioritised over left-wing ones.
Filter Bubble, Diversity.Individual, Volume.Item
Bakshy et al. [14] explore the impact of Facebook’s recommender on the formation of filter bubbles and
echo chambers. The researchers examine the news consumption patterns of 10 million users in the US, focusing
on how their political beliefs align with the content they encounter in their news feeds and through their
connections with friends. The study finds that individual choices have a more significant impact than Facebook’s
recommender in limiting exposure to diverse political news. Additionally, the researchers emphasise
that users’ friend networks can serve as a potential source of diverse perspectives.
Echo Chamber, Filter Bubble, Diversity.Individual, Diversity.Item
In a collaborative effort between Meta and a team of external researchers, González-Bailón et al. [65] investigate
the presence of filter bubbles in political news on Facebook during the US 2020 election. The researchers analyse
news content and assess its ideological alignment with the content that 208 million users view and interact
with. In contrast to Bakshy et al. [14], this study reveals that algorithmic curation worsens filter bubbles, with
conservative users exhibiting less diverse consumption patterns than liberals. Additionally, conservatives engage
more with news ecosystems that feature misinformation.
Filter Bubble, Polarization, Diversity.Individual
Boeker and Urman [19] investigate how user actions and their attributes impact recommendations displayed
on TikTok’s “For You” page, using a sock-puppet audit. The research reveals that likes, follows, watch duration,
as well as user language and location settings all play a role in shaping the volume and nature of content
recommended to users. Among these factors, follows, video view rate, and likes are the most influential in
determining the content presented to users.
Volume.Item, Diversity.Item
Baker et al. [13] investigate whether the recommenders on YouTube Shorts and TikTok contribute to the
radicalisation of young males. The study creates fake accounts posing as 16- and 18-year-old boys with different
content interests and analyses 29 hours of recommended videos. The findings reveal that on both platforms, the
prevalence of toxic content such as reactionary right-wing, conspiracy, and manosphere content significantly
increases once users engage with it, eventually constituting over 75% of recommendations. Furthermore, YouTube
Shorts recommends more toxic content than TikTok.
Volume.Item, Radicalization
Using a sock-puppet audit, Le et al. [92] examine whether Google News personalises search results based on a
user’s political browsing history. Sock-puppets, representing distinct political views (pro- and anti-immigration),
browse related content and conduct identical searches on Google News. The findings show significant
personalisation in Google News search results, indicating the presence of a filter bubble that reinforces the assumed
political bias of sock-puppets.
Filter Bubble, Diversity.Individual
Möller et al. [122] explore how various news recommenders affect content and topic diversity. The researchers
analyse data from a Dutch newspaper and assess recommenders based on editor choices, popularity, collaborative
filtering, and semantic filtering. The evaluation includes topic, category, tag, and tone diversity. The findings show
that while standard recommenders maintain topic and sentiment diversity, personalised collaborative filtering
achieves the highest topic diversity.
Diversity.Item
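Topic diversity of a recommendation list can be quantified in several ways. The sketch below uses Shannon entropy over topic labels, a common choice but not necessarily the measure adopted in [122]; the function name and the two example lists are hypothetical.

```python
import math
from collections import Counter


def topic_entropy(topics):
    """Shannon entropy (in bits) of the topic labels in a recommendation list."""
    counts = Counter(topics)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical recommendation lists produced by two different recommenders.
popularity_based = ["politics", "politics", "politics", "sport"]
collaborative = ["politics", "sport", "culture", "economy"]

diversity_gap = topic_entropy(collaborative) - topic_entropy(popularity_based)
```

Entropy is zero when all recommended items share a single topic and grows as the list spreads evenly over more topics, so a positive `diversity_gap` means the second list is more topically diverse.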
Whittaker et al. [160] investigate whether recommenders on YouTube, Reddit, and Gab promote radicalisation
pathways. The researchers use a sock-puppet audit, exposing bots to varying levels of extreme content. The
findings indicate a clear trend towards the promotion of more radical content on YouTube, especially after
interacting with far-right material. Reddit and Gab do not show a significant algorithmic promotion of extremist
content.
Radicalization, Volume.Item
Controlled studies. Liu et al. [102] investigate the impact of algorithmic recommendations on political
polarisation using a custom video platform akin to YouTube, involving 7,851 participants. The researchers collect
videos on gun control and minimum wage, then manipulate YouTube’s recommender, submitting a balanced
version to one group of users and a slanted version to another group. The findings indicate that while the
altered recommendations affect the diversity of videos selected and the volume of user engagement, they do not
significantly influence users’ political attitudes.
Polarization, Diversity.Item, Volume.Individual
Markmann and Grimme [113] investigate whether YouTube’s autoplay recommender leads users to more
radical and extreme content. By using remote control of the browser, the researchers gather data from two groups
of accounts: one with personalised recommendations and one without. The findings suggest that autoplay fosters
engagement-boosting content, which may be sensational or extreme. Overall, the diversity of recommended
content is not significantly affected by personalisation.
Radicalization, Inequality, Concentration, Diversity.Item
Bartley et al. [16] examine how Twitter’s recommender impacts users’ information consumption habits. The
researchers conduct a sock-puppet audit, dividing users into a treatment group that receives personalised tweet
recommendations and a control group where recommendations are provided in reverse-chronological order. The
study finds that personalised recommendations tend to prioritise popular content, leading to a “rich get richer”
effect and increased visibility for a minority of accounts. As a result, there is a strong inequality in the visibility
of content and users on the platform.
Inequality, Concentration, Volume.Individual, Volume.Item
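Inequality in visibility of this kind is often summarised with the Gini coefficient. The sketch below is a generic illustration rather than the measure reported in [16]; the function name and the impression counts for the two feeds are invented for the example.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfect equality."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Formula on sorted data: G = sum_i (2i - n - 1) * x_i / (n * total), i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Hypothetical impressions per account under two feed orderings.
reverse_chronological = [5, 5, 5, 5]
personalised = [17, 1, 1, 1]

inequality_increase = gini(personalised) - gini(reverse_chronological)
```

A positive `inequality_increase` corresponds to the situation described above, in which a personalised feed gives a few accounts most of the visibility.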
Huszár et al. [82] analyse how Twitter’s timeline algorithm affects the amplification of political content,
employing proprietary data and a multi-year experiment with nearly two million users. They compare a control
group with a reverse-chronological feed against a treatment group with personalised feeds. Their findings
reveal that mainstream right-leaning political content is consistently more amplified than left-leaning content.
Additionally, the study indicates that algorithmic amplification generally increases the visibility of mainstream
news sources in the US, while it does not disproportionately boost far-left or far-right groups compared to
moderates.
Inequality, Volume.Item
In a collaborative effort initiated in early 2020, Meta and a team of external researchers launched the US 2020
Facebook and Instagram Election Study¹, which resulted in the publication of four articles [65, 67, 68, 126], three
of which are empirical controlled studies. Guess et al. [67] compare the behaviour of Instagram and Facebook
users in a control group receiving chronologically ordered feeds to a treatment group of users with personalised
recommendations. Chronologically ordered feeds show a decrease in platform engagement and exposure to uncivil
content, yet an increase in access to political and untrustworthy information. Over three months, these changes
do not significantly influence polarisation levels, political knowledge, or other major attitudes. Guess et al. [68]
extend this study by exploring the effects of reshared Facebook content on political news exposure and its impact
on political polarisation and knowledge. By comparing a control group with a standard feed to a treatment group
with reshared content removed, the researchers observe a significant reduction in exposure to political news,
particularly from unreliable sources. This reduction does not affect political polarisation or attitudes but
leads to a noticeable decrease in users’ political knowledge.
Polarization, Volume.Item, Volume.Individual
Nyhan et al. [126] examine the impact of reducing Facebook users’ exposure to like-minded political content during
the 2020 US election. The researchers compare a treatment group of more than 20,000 users subjected to an
algorithm that considerably reduces their exposure to like-minded content to the rest of the population. The
findings reveal that the treatment leads to increased exposure to more diverse content and a decrease in the
use of uncivil language. The study does not find any evidence of shifts in the polarisation of beliefs, nor does it
observe the formation of echo chambers.
Echo Chamber, Polarization, Volume.Item, Diversity.Item
Ludwig et al. [106] investigate the impact of news recommenders on political polarisation. The researchers
conduct an online experiment involving 750 participants and divide them into four groups. Each group is exposed
to a different type of news recommender: content-based, content-based with positive sentiment, content-based
with negative sentiment, or no recommendations. Their findings indicate that recommenders do not significantly
modify polarisation levels. However, prolonged use of content-based recommenders with negative sentiment
increases affective polarisation, while a content-based recommender with balanced sentiment leads to ideological
depolarisation over time.
Polarization
Yang [163] explores the impact of “most-viewed” news recommendations on user engagement with news
stories, recruiting 107 participants who use a website mimicking real news platforms. The participants are divided
into a treatment group that receives popularity-based recommendations and a control group that does not. The
study collects several variables, such as recommended content features and exposure duration. The findings show
that participants in the treatment group view more recommended news stories, engage with such content for
longer periods, and spend less time independently browsing for news stories.
Volume.System
¹ https://research.facebook.com/2020-election-research/
Beam [17] investigates the impact of personalised news recommenders on political news exposure, user
engagement, and political knowledge. The study assigns 490 adult Internet users to either a generic news page or one
of four personalised conditions, differing by recommendation source (computer-generated vs. user-customised)
and content display (recommended stories only vs. all stories). The study reveals that personalisation increases
filter bubbles and reduces viewpoint diversity. However, certain design choices, such as user customisation or
displaying only recommended stories, can partially offset these effects by fostering deeper engagement and
indirectly boosting political knowledge.
Filter Bubble, Diversity.Individual, Volume.Individual
3.2 Simulation studies
Observational studies. Sîrbu et al. [150] examine the impact of biasing interactions towards like-minded
individuals in synthetic social networks. The researchers introduce a recommender parameter that influences
the probability of interacting with users who hold similar opinions. By simulating opinion evolution on a fully
connected network under bounded confidence, the study reveals that stronger semantic bias in the recommender
leads to increased opinion polarisation.
Polarization
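The mechanism described above can be sketched as a bounded-confidence simulation in which the choice of interaction partner is biased towards similar opinions. The sketch is not the authors' code and rests on several assumptions: a Deffuant-style pairwise averaging update, a partner-selection weight proportional to the opinion distance raised to the power `-gamma`, and illustrative parameter names and default values (`epsilon`, `mu`, `gamma`, `gap`).

```python
import random


def simulate(n_agents=100, epsilon=0.3, mu=0.5, gamma=1.0, steps=20000, seed=0):
    """Bounded-confidence opinion dynamics on a fully connected population.

    gamma = 0 gives uniform partner choice; larger gamma favours like-minded partners.
    """
    rng = random.Random(seed)
    opinions = [rng.random() for _ in range(n_agents)]
    for _ in range(steps):
        i = rng.randrange(n_agents)
        others = [j for j in range(n_agents) if j != i]
        # Assumed bias: weight decays as a power law of the opinion distance.
        weights = [max(abs(opinions[i] - opinions[j]), 1e-6) ** -gamma for j in others]
        j = rng.choices(others, weights=weights, k=1)[0]
        # Bounded confidence: only sufficiently close opinions influence each other.
        if abs(opinions[i] - opinions[j]) < epsilon:
            xi, xj = opinions[i], opinions[j]
            opinions[i] = xi + mu * (xj - xi)
            opinions[j] = xj + mu * (xi - xj)
    return opinions


def count_clusters(opinions, gap=0.05):
    """Number of opinion groups separated by more than `gap` on the opinion axis."""
    xs = sorted(opinions)
    return 1 + sum(1 for a, b in zip(xs, xs[1:]) if b - a > gap)
```

Comparing `count_clusters(simulate(gamma=0.0))` with the same call at larger `gamma` values gives a rough view of how the bias parameter changes the fragmentation of opinions in this toy setting.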
Pansanella et al. [128] build upon Sîrbu et al.’s research by exploring various network topologies, including
random, scale-free, and clustered networks. The study reveals that opinion polarisation persists across different
network topologies. Additionally, introducing a certain degree of sparsity in the network amplifies the divisive
impact of recommenders on the distribution of opinions within the population. Furthermore, the researchers
indicate that the presence of homophilic communities, combined with cognitive biases, leads to the formation
of echo chambers. In an expanded version of this model, Pansanella et al. [129] explore the impact of adaptive
topologies, which allow connections to be rewired from conflicting agents to those with similar views. The study
finds that recommenders may intensify polarisation and hinder the formation of echo chambers. This is due to
the homophilic rewiring process and the evolution of opinions.
Polarization, Echo Chamber
Building on a different opinion evolution model, Valensise et al. [159] simulate social network sessions exposed
to a feed algorithm that adjusts the range of opinions viewed by users. The simulation accounts for bounded
confidence and adaptive topologies. The study finds that a strong filtering algorithm increases polarisation, while
milder personalisation is necessary for echo chamber formation.
Polarization, Echo Chamber
Chitra and Musco [32] explore the impact of recommenders on social network polarisation using an opinion
dynamics model. A recommender encourages connections among users with similar viewpoints, thus creating a
similarity bias. The findings show that a greater bias results in increased polarisation and the creation of echo
chambers within clustered networks.
Polarization, Echo Chamber
Perra and Rocha [137] examine the impact of different network topologies and timeline filtering strategies,
such as random, chronological, reverse chronological, semantic ordering, and nudging. The researchers represent
users’ opinions as binary variables, simulating a two-party system, and find that algorithmic filtering exacerbates
initial inequalities and reduces the visibility of minority opinions. The study also highlights that semantic or
temporal biases in highly clustered networks lead to opinion polarisation and the formation of echo chambers.
Additionally, combining semantic filtering and nudging in networks with spatial correlations impedes convergence,
reinforcing echo chambers that resist nudged opinions.
Polarization, Echo Chamber, Inequality, Concentration, Diversity.System, Volume.Individual
Peralta et al. [135] investigate the interactions between semantic filtering and network topology. Semantic
filtering is adjusted using a bias parameter that hides a portion of the population from the agent. The stronger
the bias, the more contrasting opinions are hidden. The study employs mathematical analyses and simulations of
extended binary opinion models considering pairwise and group interactions. The findings show that semantic bias
drives opinion polarisation and echo chamber formation in modular networks, while it fosters only polarisation
in non-modular networks. When the bias is below a certain level, it encourages consensus around a single opinion
after pairwise or small-group interactions, whereas interacting in larger groups encourages polarisation. In
subsequent work, Peralta et al. [134] expand their model to include algorithmic nudging, where the algorithm
exhibits a bias towards one of two opinions. The simulations reveal that if the social platform favours the opinion
of the minority group, it promotes polarisation. Conversely, if the visibility of the minority opinion is hindered, it
leads the population towards consensus.
Concentration, Echo Chamber, Inequality, Polarization, Diversity.System, Volume.Individual
Gausen et al. [62] investigate the impact of different recommenders on the spread of information in news feeds.
Using an agent-based model to simulate information diffusion and opinion evolution, the researchers compare
a random recommender with three filtering strategies: chronological, belief-based, and popularity-based. The
findings reveal that belief-based and popularity-based recommenders increase the spread of information, while
the random recommender decreases the amount of content shared. Additionally, belief-based recommenders lead
to a higher belief purity of agents’ feeds, decreasing content diversity.
Volume.System, Diversity.Individual
Törnberg et al. [157] investigate echo chambers and toxicity using Large Language Models (LLMs) to simulate
social media interactions. The study evaluates three feed algorithms: one that shows posts from followed users,
one that shows posts from all users, and one that ranks posts by likes from the opposite party to bridge the gap
between different viewpoints. In the simulation, LLM agents select and interact with news stories to simulate
a day of activity. The results indicate that the first algorithm reduces toxicity but creates echo chambers; the
second produces the opposite effect, while the bridging algorithm mitigates echo chambers and reduces toxicity.
Radicalization, Echo Chamber, Diversity.Individual
De Marzo et al. [41] investigate how collaborative filtering affects opinion polarisation, specifically focusing
on the impact of recommendations on content exploration. The study examines a non-networked population in
which users are exposed to either user-user collaborative filtering or a matrix factorisation algorithm. The findings
indicate that over time, the population tends to converge towards a state of consensus, where all users become
highly similar. User-user collaborative filtering does not lead to polarisation but increases diversity in clicked
items compared to scenarios without algorithmic assistance. Conversely, the matrix factorisation algorithm
contributes to opinion polarisation within the population.
Concentration, Polarization, Diversity.Individual
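For readers unfamiliar with the first of these two algorithms, the sketch below shows a minimal user-user collaborative filter on binary click data. It is a textbook-style illustration under stated assumptions (cosine similarity between users, scores summed over all other users, ties broken by item index), not the model used in [41]; the function names and the `clicks` matrix are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long click vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(clicks, user, k=2):
    """Return the indices of the top-k unseen items for `user`.

    `clicks` is a list of binary rows (users x items). Each unseen item is scored
    by the summed similarity of the other users who clicked it.
    """
    target = clicks[user]
    scores = [0.0] * len(target)
    for other, row in enumerate(clicks):
        if other == user:
            continue
        sim = cosine(target, row)
        for item, clicked in enumerate(row):
            if clicked and not target[item]:
                scores[item] += sim
    unseen = [i for i in range(len(target)) if not target[i]]
    return sorted(unseen, key=lambda i: (-scores[i], i))[:k]


# Hypothetical click matrix: users 0 and 1 overlap, user 2 is isolated.
clicks = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
suggestion = recommend(clicks, user=0, k=1)
```

Here user 0 is recommended the item clicked by the most similar user, which is the basic mechanism through which user-user filtering can pull similar users towards the same items.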
Cinus et al. [34] examine the long-term evolution of opinions when people recommenders are used on
synthetic networks with tunable levels of homophily and segregation. The findings reveal that when initial
network conditions are homophilic and non-modular, following link recommendations leads to the formation
of echo chambers. This effect becomes absent or reversed if networks are already segregated or heterophilic.
Additionally, the more personalised the recommendations, such as those based on the algorithmic bias proposed
by Sîrbu et al. [150], the more they contribute to the rise of echo chambers.
Echo Chamber
Similarly, Ramaciotti Morales and Cointet [138] explore how the evolution of links, combined with an opinion
evolution model, impacts polarisation. The study shows that when there is no biased assimilation, i.e., the
tendency to be more influenced by similar opinions, some recommenders reduce polarisation, while others
slightly increase it. However, with high levels of biased assimilation, all recommenders lead to smaller increases
in polarisation compared to what is caused by cognitive biases alone on a fixed population. This indicates that
recommenders often expose users to more diverse connections, mitigating polarisation compared to what would
be achieved through user choice alone.
Polarization, Diversity.Individual
Fabbri et al. [51] investigate how homophily and different types of recommenders affect minority groups
in social networks. The researchers compare random link prediction with link prediction algorithms based
on network topology, random walk, and collaborative filtering. They then create recommendations in a social
network divided into a majority cluster and a minority cluster. The findings indicate that recommendations
increase the visibility of the minority group if that group is homophilic. If this is not the case, the majority class
receives increased visibility. Additionally, hubs within the minority group receive even greater visibility, exacerbating
existing inequalities. In a follow-up of this study, Fabbri et al. [52] examine the long-term impacts of
user-recommendation feedback loops. The findings indicate that recommenders increase the visibility of the minority
groups with homophilic initial conditions and exacerbate concentration (rich-get-richer effect) in the long term.
Inequality, Concentration, Diversity.System, Volume.Individual
Ferrara et al. [56] investigate the impact of user recommenders on networks with two distinct groups, one
being the minority category. The researchers consider various recommenders suggesting new connections while
removing existing random links. These include personalised page rank (PPR), egocentric random walks (WTF),
friends-of-friends recommender (2H), common-following (CF) users recommenders, and Node2Vec (N2V). The
findings indicate that networks become more closely connected with repeated recommendations, regardless of
the algorithm used. However, not all tested algorithms exhibit a concentration (rich get richer) effect: Node2Vec
prevents the network from increasing inequalities. Overall, CF can increase or decrease the visibility of the
minority, N2V maintains a balanced impact, and PPR, WTF, and 2H generally maintain the status quo but may
decrease minority visibility.
Volume.System, Inequality, Concentration, Diversity.System, Volume.Individual
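To make one of these recommenders concrete, the sketch below implements a simple friends-of-friends (two-hop) suggestion and a basic visibility measure for the minority group. It is only an illustration: the scoring rule (number of two-hop paths), the tie-breaking, the visibility definition, and the toy `following` graph are assumptions, and the random link removal used in [56] is omitted.

```python
from collections import Counter


def two_hop_recommend(following, user):
    """Return the not-yet-followed account reachable via the most two-hop paths, or None."""
    paths = Counter()
    for friend in following[user]:
        for candidate in following.get(friend, set()):
            if candidate != user and candidate not in following[user]:
                paths[candidate] += 1
    if not paths:
        return None
    # Break ties by name so that the result is deterministic.
    return min(paths, key=lambda c: (-paths[c], c))


def minority_visibility(following, minority):
    """Fraction of all follow links that point to accounts in the minority group."""
    links = [(u, v) for u, targets in following.items() for v in targets]
    if not links:
        return 0.0
    return sum(1 for _, v in links if v in minority) / len(links)


# Hypothetical directed follow graph; "e" is the only minority account.
following = {
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"d", "e"},
    "d": set(),
    "e": set(),
}
suggested = two_hop_recommend(following, "a")
```

Applying such suggestions repeatedly and tracking `minority_visibility` after each round is one simple way to observe whether a recommender preserves, raises, or lowers the minority's share of incoming links.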
Jiang et al. [85] present a framework to study feedback loops between users’ choices and recommendations. In
this framework, individual users engage with content recommenders using various strategies such as random
selection, oracle-based methods, and reinforcement learning algorithms. The study finds that, compared to a
random recommender, both oracle and reinforcement-learning recommenders lead to a fast model degeneration. This
degeneration is characterised by a decrease in item diversity and user interests.
Radicalization, Diversity.Individual
Rossi et al. [142] examine the influence of recommenders on news platforms on user opinions and engagement.
In the simulations, users interact with a popularity-based recommender with random exploration, which suggests
articles supporting or opposing a topic. The findings show that the recommender prioritises articles that align
with the user’s existing opinions and tends to radicalise initially extreme users.
Filter Bubble, Diversity.Individual, Radicalization
Ribeiro et al. [140] examine YouTube’s amplification paradox. This refers to the discovery that sock-puppet
audits reveal amplification of problematic content due to recommenders, while user data suggest recommenders
are not the primary driver of attention towards this content. The researchers build a recommender based on
collaborative filtering to simulate recommendations. Moreover, they develop an agent-based model where users
consume content based on their preferences. The results help explain the paradox: users who blindly follow
recommendations are exposed to more extreme content, while user choices tend to attenuate such content.
Radicalization, Volume.Individual, Filter Bubble, Diversity.Individual
4 ONLINE RETAIL ECOSYSTEM
What the ecosystem is about. The online retail ecosystem includes platforms that allow customers to buy
products or services, e.g., Amazon, eBay or Alibaba for products, Netflix or Spotify for movie and music streaming,
respectively. This ecosystem appears more heterogeneous than the others and includes studies from various
disciplines (e.g., computer science, marketing, management and economics).
Main methodologies employed. Overall, empirical studies outweigh the simulations. This is mainly because
platforms have a strong interest in maximising revenues, and therefore, understanding the impact of
recommenders in real situations is crucial. Most empirical studies analyse users’ activity on e-commerce platforms,
while simulation studies tend to build models of user tastes based on ad-hoc assumptions or data gathered from
platforms. Typically, they also compare content-based recommenders and collaborative filtering. Among empirical
studies, there is a prevalence of controlled over observational studies (see Figure 4), as it is easier than in other
, Vol. 1, No. 1, Article . Publication date: July 2024.
Online Retail | Empirical, Observational | Empirical, Controlled | Simulation, Observational | Simulation, Controlled
Filter Bubble (individual) | [124] | [29] | [125] | -
Radicalization (individual) | - | - | - | -
Model Collapse (model) | - | - | - | -
Concentration (systemic) | [58, 78] | [96, 97, 164] | [57, 59, 111, 161] | -
Echo Chamber (systemic) | [63] | - | - | -
Inequality (systemic) | - | - | - | -
Polarization (systemic) | - | - | - | -
Diversity | individual: [9, 63, 124]; systemic: [28, 130] | individual: [8, 77, 96–98, 100, 164]; item: [118]; systemic: [47, 77, 95, 117, 118, 164] | individual: [10, 57, 59, 125]; item: [74]; systemic: [10, 26, 74, 111] | -
Volume | individual: [58, 78, 124]; item: [28, 44, 130] | individual: [29, 47, 77, 96, 98, 103]; item: [95, 97]; systemic: [9] | - | -
Table 3. Online Retail Ecosystem. Classification of selected papers based on their methodology, outcomes and level of
analysis.
ecosystems to divide users into control and treatment groups. Moreover, interactions between the two groups
are often weak and manageable. We do not find simulation-controlled studies. See Table 3 for a comprehensive
outlook of the outcomes studied in this ecosystem.
Main outcomes. Overarching concerns relate to volumes and diversity of sales, views, and clicks, as well as
implications of customers’ engagement on their decision quality, retention, and product ratings. Concentration is
also investigated, while filter bubbles and echo chambers are considered in the analysis.
4.1 Empirical studies
Observational studies. Dias et al. [44] examine the impact of LeShop’s recommender on sales over 21 months.
The study finds that the amount of money shoppers spend on recommended items increases over time. This leads
to accrued sales at the item level and the growth of direct revenues. Additionally, the study finds an increase in
indirect revenues, i.e., those related to purchases of items recommended in previous sessions and purchases of
non-recommended items from previously recommended categories. Volume.Item
Nguyen et al. [124] explore the impact of item-item collaborative filtering on MovieLens users. The findings
reveal an overall diversity decrease in the movies viewed and purchased. However, this effect is less pronounced
for users who follow recommendations, as they tend to consume a wider variety of movies compared to those who
ignore recommendations. Additionally, the recommendation-following users actively seek out diverse movies,
which helps reduce the risk of creating filter bubbles. These users also tend to give more positive ratings to the
recommended items. Diversity.Individual Filter Bubble Volume.Individual
Ge et al. [63] analyse clicking and purchasing behaviours using real-world data consisting of user clicks,
purchases and browse logs from Alibaba Taobao. To measure the impact of recommenders on users, the researchers
follow the strategy proposed by Nguyen et al. [124] and separate all users into “following” and “ignoring”
groups. The study shows that personalised recommendations reinforce cluster formation in click-behaviors (echo
chambers), i.e., there is a strengthening trend over time for the “following” group of users. Moreover, the set
of suggested products is less diverse for the “following” group in comparison to the “ignoring” group. This is
because personalised recommendations shrink the scope of the offered content, and therefore, the gap further
enlarges over time. Echo Chamber Diversity.Individual
Anderson et al. [9] investigate how Spotify’s recommender impacts the diversity of streaming content users
listen to. The researchers split user streaming behaviour into two categories: user-driven listening, where users
actively seek out specific music or listen to playlists created by other users, and algorithm-driven listening, where
users listen to algorithmically personalised playlists (e.g., Discover Weekly) or radio stations generated by Spotify’s
algorithm. The study finds that personalised recommendations lead to greater diversity in streaming at the
individual level, with user-driven listening showing more diversity than algorithm-driven listening. Furthermore,
users who listen to a diverse range of songs are significantly less likely to leave the platform and more likely to
become paying subscribers. Diversity.Individual
Chen et al. [28] analyse a dataset sourced from Amazon to examine the effects of recommendations and
consumer feedback on sales. The findings indicate that more recommendations are associated with high sales
volume, but consumer ratings do not have a significant impact on sales. However, the number of consumer
reviews positively correlates with sales volume. The study also finds that recommendations lead to increased
diversity at the systemic level, indicating that they are more effective for less-popular books than for popular
ones. Diversity.Systemic Volume.Item
Pathak et al. [130] analyse a dataset from Amazon and Barnes & Noble to explore how the strength of
recommendations (i.e., the number of books pointing to a particular book and their popularity) impacts book
sales and prices. The study finds that stronger recommendations lead to increased sales volume and higher prices.
Additionally, the recommender may contribute to increased diversity in book sales, a phenomenon referred to in
the paper as a long-tail effect. Diversity.Systemic Volume.Item
Fleder et al. [58] analyse consumer behaviour over time in an online music store. The store uses a free software
add-on to Apple’s iTunes to provide personalised recommendations to registered users through a combination of
content- and user-based collaborative filtering. To account for potential confounding factors, the researchers
employ propensity score matching to match registered and unregistered users. The findings show that
recommendations lead to an increase in commonality among consumers. This occurs because individual consumers purchase a
greater volume of songs and a more similar mix of products after receiving the recommendations.
Concentration Volume.Individual
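As an illustration of the matching step mentioned above, the following minimal sketch pairs each treated (registered) user with the control user whose propensity score is closest. It assumes the scores have already been estimated elsewhere; the function names, the greedy 1:1 strategy, the caliper value, and the toy data are all hypothetical and are not taken from [58].

```python
def match_on_propensity(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    `treated` and `control` map a user id to an already-estimated
    propensity score (probability of being treated given covariates).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Visit treated users in a fixed order so the result is reproducible.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control user; ties are broken by id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


def average_effect(pairs, outcome):
    """Mean outcome difference (treated minus control) over matched pairs."""
    diffs = [outcome[t] - outcome[c] for t, c in pairs]
    return sum(diffs) / len(diffs)


# Toy data (hypothetical): propensity scores and number of songs purchased.
treated = {"r1": 0.80, "r2": 0.55}
control = {"u1": 0.78, "u2": 0.50, "u3": 0.10}
songs = {"r1": 12, "r2": 9, "u1": 8, "u2": 7, "u3": 2}

pairs = match_on_propensity(treated, control)
effect = average_effect(pairs, songs)
```

Control user `u3` stays unmatched because no treated user has a comparable score, which is the point of matching: outcomes are compared only between users who were similarly likely to register.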
Hosanagar et al. [78] employ the same research design and reveal that personalised recommendations have two
main effects. On an individual level, personalised recommendations increase sales volume, making it more likely
for users to purchase the same songs. At a systemic level, there is a concentration of purchases as consumers tend
to buy a more similar mix of products after receiving the recommendations. Concentration Volume.Individual
Controlled studies. Anderson et al. [9] investigate the impact of Spotify’s recommender on the diversity of
content that users listen to. The researchers randomly split users into three test groups exposed to different
recommenders: a popularity ranker (sorting songs based on their popularity), a relevance ranker (sorting songs
based on their relevance to the user’s tastes), and a learned ranker (a neural network trained on user preferences).
The study finds that the relevance ranker is more effective than the popularity ranker for generalist users (those
who listen to a wide variety of songs) and specialist users (those who prefer specific genres), resulting in an
overall increase in the number of songs streamed. Although the learned ranker employs a broader set of features,
it performs similarly to the relevance ranker. Volume.Systemic
Yi et al. [164] conduct a laboratory experiment to explore the impact of product recommendations on search
goods (e.g., computers) and experience goods (e.g., music). The participants are divided into a treatment group that
receives recommendations and a control group that does not. The findings show that users in the treatment
group view a wider range of products. However, they end up purchasing fewer products and concentrating
their purchases on the most popular items. This effect is more pronounced for search goods.
Concentration Diversity.Individual Diversity.Systemic
Matt et al. [117] split users on an online music store into four randomised treatment groups exposed to
different recommenders: content-based filtering, collaborative filtering, bestseller recommender, and random
recommender. A control group receives no recommendations. The results indicate that, compared to the baseline,
all recommenders (except content-based filtering) lead to an increase in sales diversity. In a subsequent study,
Matt et al. [118] introduce two additional randomised treatment groups. These groups are exposed to variants of
collaborative filtering and bestseller recommender, which are trained on data describing other users’ ratings and
purchases. Recommending niche products and blockbusters with the same probability, content-based filtering
increases diversity at the item level. At the systemic level, the study finds that recommenders have varying
effects on sales diversity: the random recommender increases it, collaborative filtering decreases it, and other
recommenders (including the collaborative filtering variant) have no effect. Neither of the bestseller
recommender variants differs from the baseline.
Diversity.Item Diversity.Systemic
Alves et al. [8] examine the effects of nudging customers of a book recommendation app towards genres they
do not normally prefer. All participants in the experiment receive book recommendations from their preferred
genres as well as from other genres. Users in the treatment group receive enhanced recommendations with
various types of nudges, such as the popularity or appreciation of the suggestions by other users or experts. Users
in the control group are not provided with any nudging. While these enhanced recommendations diversify the
selection of books, the nudging also decreases the time spent on the app. Diversity.Individual
Lee and Hosanagar [96] investigate the impact of different recommenders on movie sales at a top retailer in
North America. Customers are randomly assigned to one of four groups: a control group with no recommendations
and three treatment groups exposed to purchase-based collaborative filtering (who bought this also bought that),
view-based collaborative filtering (who viewed this also viewed that), and a recently-viewed recommender (recently
viewed items). The study finds that purchase-based collaborative filtering significantly increases the average
number of views per individual and the average number of purchases compared to the control group. In contrast,
the effects of view-based collaborative filtering and the recently-viewed recommender are not statistically
significant. Both collaborative filtering algorithms increase sales diversity at the individual level but decrease
aggregate sales diversity. This indicates that both algorithms encourage users to purchase the same products,
leading to a concentration effect at the systemic level. The recently-viewed recommender decreases sales diversity
at the systemic level but has no effect at the individual level.
Diversity.Individual Volume.Individual Concentration
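The coexistence of higher individual diversity and lower aggregate diversity can look counter-intuitive, so a toy calculation may help. The sketch below uses hypothetical baskets, the number of distinct items per user as the individual measure, and the Herfindahl-Hirschman index of item sales shares as the aggregate measure. These measures are chosen for illustration and are not necessarily those used in [96].

```python
from collections import Counter


def individual_diversity(baskets):
    """Average number of distinct items purchased per user."""
    return sum(len(set(items)) for items in baskets.values()) / len(baskets)


def herfindahl(baskets):
    """Herfindahl-Hirschman index of item sales shares.

    Ranges from 1/n (sales spread evenly over n items) to 1 (one item
    takes all sales); higher values mean more concentrated sales.
    """
    counts = Counter(item for items in baskets.values() for item in items)
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical baskets without recommendations: each user stays in a niche.
before = {"u1": ["a", "b"], "u2": ["c", "d"], "u3": ["e", "f"]}
# Hypothetical baskets with "who bought this also bought that": every user
# adds the same two co-purchased hits, g and h, to one niche item.
after = {"u1": ["a", "g", "h"], "u2": ["c", "g", "h"], "u3": ["e", "g", "h"]}

div_before, div_after = individual_diversity(before), individual_diversity(after)
hhi_before, hhi_after = herfindahl(before), herfindahl(after)
```

In this toy case each user buys more distinct items than before, yet the new items are the same for everyone, so sales shares concentrate on `g` and `h`.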
Liang and Willemsen [100] examine the behaviour of four random groups of Spotify users over six weeks. These
groups are composed on the basis of algorithm personalisation and the visual presentation of recommendations.
In their first session, users are randomly assigned to either a representative or a personalised initial playlist. Then,
they are further assigned to one of two visual presentations. The study reveals an initial increase in the diversity
of music exploration within the playlist, driven by nudging techniques such as default initial playlists and visual
anchors. However, this heightened exploration gradually diminishes over time. The residual effect on the change
in users’ profiles indicates the potential (long-term) benefits of combining nudging with personalisation in
exploration tools. Diversity.Individual
Long et al. [103] employ data from 1.6 million Alibaba customers to examine how the quantity of
recommended products affects consumers’ search and purchase behaviours. The researchers leverage Alibaba’s
recommendation technology and randomly assign consumers to one of four treatment groups, each receiving a
different number of recommended products. The findings reveal that increasing the number of recommended
products boosts the probability of purchasing those items. However, this probability decreases as the number of
recommended products continues to rise. Purchase probability declines mainly because consumers reduce the
number of searches as a consequence of choice overload. Volume.Individual
Lee et al. [95] investigate the impact of recommendations on views and sales of cosmetics and clothes on
mobile and PC channels. The researchers split customers into a treatment group exposed to collaborative filtering
trained on users’ recent views and a control group exposed to best-selling items. The study finds that collaborative
filtering increases views and sales volume when users access the platform through mobile devices. When users
access the platform through PCs, the recommender only increases the volume of views. These outcomes are
particularly pronounced for the most expensive items. Moreover, collaborative filtering increases view diversity on
both mobile and PC platforms, but it has no significant impact on sales diversity.
Diversity.Systemic Volume.Item
Donnelly et al. [47] investigate the impact of personalised recommendations generated by Wayfair’s
collaborative filtering on consumption patterns in the context of online furniture shopping. In the experiment, a treatment
group of 95% of customers exposed to Wayfair’s recommender is compared to a control group of 5% of the
customers exposed to popularity-based recommendations. The study finds that the recommender encourages
users to engage in more searches, increasing the number of clicks and positively influencing purchase probability
at the individual level. Furthermore, the findings indicate that Wayfair’s recommendations increase diversity in
searches and sales at the systemic level. Diversity.Systemic Volume.Individual
Holtz et al. [77] examine the impact of personalised recommendations on podcast consumption among
approximately 900,000 Spotify premium users across seventeen countries. Users in the treatment group are
exposed to personalised recommendations based on their historical listening behaviour, while those in the
control group are exposed to the most popular podcasts. The study finds that at the individual level, personalised
recommendations lead to an increased volume of podcasts listened to, but a decrease in podcast streaming
diversity. However, at the systemic level, personalised recommendations increase podcast streaming diversity.
Diversity.Individual Diversity.Systemic Volume.Individual
Chen et al. [29] examine how recommendations affect the relationship between filter bubbles and consumers’
preferences and decision quality on the e-commerce platforms Jingdong and Taobao. The researchers define
decision quality as the ability of users to select the best products according to domain experts. The study
distinguishes between personalised recommendations for users with personal accounts and non-personalised
recommendations for users without them. The findings show that recommendations reinforce individual consumer
preferences, creating a filter bubble effect and reducing decision quality. Filter bubbles limit the variety of products
available to consumers, potentially leading to a decline in decision quality. Filter Bubble Volume.Individual
Lee and Hosanagar [97] explore the impact of collaborative filtering on sales diversity using data from
a randomised field experiment conducted on top online retailers. The researchers split users into a control
group with no recommendations and two treatment groups exposed to view-based collaborative filtering (who
viewed this also viewed that) and purchase-based collaborative filtering (who purchased this also purchased
that). The study finds that the two recommenders increase sales diversity at the individual level but lead to
a decrease in views and sales diversity at the systemic level. As similar users explore the same products, this
results in a concentration effect. At the item level, both recommenders generate an increase in views and sales.
Diversity.Individual Concentration Volume.Item
Li et al. [98] conduct three experiments in a laboratory and a real-life online bookstore. Each experiment involves
a control group of users receiving no recommendations and treatment groups exposed to recommendations
based on a basket value. In the first laboratory experiment, three treatment groups are exposed to item-based
collaborative filtering, best-selling, and random recommendations. The researchers find that collaborative filtering
provides the highest basket value. In a subsequent laboratory experiment based on collaborative filtering only,
the study finds that recommending three products of the same type is the most effective way to increase basket
value.
In the real-life experiment, the treatment group receives recommendations about three different products from
memory-based collaborative filtering. The findings show that this recommender leads to an increase in diversity
in consumers’ consideration sets, as well as an increase in views and sales.
Diversity.Individual Volume.Individual
4.2 Simulation studies
Observational studies. Noordeh et al. [125] measure the impact of collaborative filtering on content consumption
on MovieLens. The study reveals that prolonged exposure to recommendations decreases content diversity and
fosters the emergence of filter bubbles. Furthermore, once a filter bubble is established, it becomes challenging
for users to break out of it. Diversity.Individual Filter Bubble
Hazrati and Ricci [74] employ log data from three Amazon services (Kindle, Games, and Apps) to analyse the
effects of recommendations on the evolution of users’ choices over time. The simulation combines a choice model
with five recommenders. Three recommenders offer personalised recommendations: popularity-based
collaborative filtering, low popularity-based collaborative filtering (penalising the score with the inverse popularity),
and factor model (mapping users and items into a common latent factor space). Additionally, the study includes
two non-personalised recommenders, namely popularity-based and average rating, as well as a baseline case
with no recommendations. The study finds that personalised recommendations lead to a greater increase in sales
diversity compared to non-personalised recommendations, both at the item and systemic levels. Furthermore, at
the systemic level, the low popularity-based collaborative filtering and the factor model increase sales diversity
for the Kindle dataset. However, for the Games dataset, only the low popularity-based collaborative filtering
increases sales diversity compared to the baseline case. Diversity.Item Diversity.Systemic
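To make the idea of penalising a score with the inverse popularity concrete, here is a minimal sketch. The exact functional form used in [74] is not given in this survey, so the multiplicative penalty `score / popularity`, the function name, and the toy values are assumptions made for illustration only.

```python
def rerank_low_popularity(scores, popularity, k=3):
    """Return the top-k items after scaling each score by 1 / popularity.

    `scores` maps an item to a predicted relevance score and `popularity`
    maps the same item to a positive interaction count. Both inputs and
    the penalty form are hypothetical.
    """
    adjusted = {item: s / popularity[item] for item, s in scores.items()}
    # Sort by adjusted score (descending); ties are broken by item name.
    return sorted(adjusted, key=lambda item: (-adjusted[item], item))[:k]


# Toy values: the bestseller has the highest raw score but is penalised most.
scores = {"bestseller": 0.9, "midlist": 0.8, "niche": 0.6}
popularity = {"bestseller": 1000, "midlist": 100, "niche": 10}

ranking = rerank_low_popularity(scores, popularity)
```

With these toy values the ranking is reversed relative to the raw scores, which shows how such a penalty can shift exposure towards less popular items.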
Mansoury et al. [111] design a method for simulating the feedback loop of user-recommender interactions by
analysing the progressive effects of three different recommenders: user-based collaborative filtering, Bayesian
personalised ranking, and a recommender suggesting the most popular items. The findings reveal that all
recommenders lead to a progressive reduction in diversity and increased concentration. This effect is particularly
pronounced for users who are underrepresented in the training dataset (e.g., female users).
Concentration Diversity.Systemic
Wu et al. [161] compare various recommenders trained on MovieLens data, specifically a user-based
collaborative filtering, a content-based recommender and a baseline condition with no recommendations. The findings
reveal that the content-based recommender decreases sales concentration, whereas user-based collaborative
filtering increases it. Moreover, the size of these effects depends on how well the recommendations align with
consumer awareness. For instance, suggesting popular products to consumers already aware of them has little
impact, whereas recommending niche products could significantly influence consumer behaviour. Concentration
Aridor et al. [10] design a model in which products have both intrinsic and user-specific values. In this model,
users (unaware of item values) make choices on the basis of their beliefs and risk aversion. This baseline condition
is compared to one where users are exposed to recommendations that allow them to combine their value with the
intrinsic value of items. The study shows that the more users become risk-averse, the more they consume items
similar to those they previously considered valuable. This leads to filter bubbles that narrow their consumption
patterns. Recommendations help reduce these filter bubbles, but at the cost of diminishing the diversity of items
consumed at the systemic level. Diversity.Individual Diversity.Systemic
Chaney et al. [26] explore how training recommenders using data from users influenced by automatic
recommendations can lead to algorithmic confounding. The researchers compare the effects of six recommenders
(popularity-based, content filtering, matrix factorisation, social filtering, and random) with an ideal benchmark,
which recommends items based on the true utility of users. The study finds that a single training session leads to
a small homogenisation in user behaviour, which then reverts to the ideal case. However, repeated training causes
a greater homogenisation of user behaviour, with the effect becoming more pronounced with each cycle through
the loop. This homogenisation occurs both at the local level (users behave more like their nearest neighbours)
and population level (users become more similar on average) for all recommenders (except for the random
recommender). Diversity.Systemic
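One simple way to quantify homogenisation at the population level is the mean pairwise overlap between the sets of items users consume. The sketch below uses the Jaccard index for this purpose. The choice of measure, the function names, and the toy data are assumptions for illustration and may differ from the definitions used in [26].

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two item collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def population_homogeneity(consumed):
    """Mean pairwise Jaccard similarity across all pairs of users."""
    pairs = list(combinations(sorted(consumed), 2))
    return sum(jaccard(consumed[u], consumed[v]) for u, v in pairs) / len(pairs)


# Hypothetical consumption under an ideal recommender ...
ideal = {"u1": {"a", "b"}, "u2": {"c", "d"}, "u3": {"a", "c"}}
# ... and after several retraining cycles, where u2 has drifted towards u1.
looped = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"a", "c"}}

h_ideal = population_homogeneity(ideal)
h_looped = population_homogeneity(looped)
```

A higher value after the loop than under the ideal benchmark would indicate that users have become more similar on average.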
Fleder and Hosanagar [59] perform a simulation where users are exposed to collaborative filtering and have a
certain probability of accepting the recommender’s suggestion. The outcome is compared to that resulting from
the same process, except that recommendations are not enabled. The study reveals a concentration
effect towards a few items. A subsequent study [57] employs the same simulation settings to demonstrate
that recommendations can increase sales diversity at the individual level, but decrease it at the systemic level.
Concentration Diversity.Individual
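The logic of comparing a process with and without recommendations can be sketched in a few lines. The simulation below is a deliberate simplification and not the model of [59]: the recommender always suggests the currently best-selling item (a popularity stand-in for collaborative filtering), users otherwise choose uniformly at random, and concentration is measured with the Gini coefficient of item sales. All names and parameter values are hypothetical.

```python
import random


def gini(counts):
    """Gini coefficient of non-negative sales counts (0 = perfectly even)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def simulate(n_items=20, steps=2000, accept_prob=0.5, seed=0):
    """Simulate purchases; return the number of sales per item.

    At each step a user accepts the recommendation with probability
    `accept_prob` and otherwise picks an item uniformly at random.
    Setting accept_prob=0.0 corresponds to recommendations being disabled.
    """
    rng = random.Random(seed)
    counts = [0] * n_items
    for _ in range(steps):
        if rng.random() < accept_prob:
            # Simplified recommender: suggest the current best seller.
            choice = max(range(n_items), key=counts.__getitem__)
        else:
            choice = rng.randrange(n_items)
        counts[choice] += 1
    return counts


gini_without = gini(simulate(accept_prob=0.0))
gini_with = gini(simulate(accept_prob=0.5))
```

Because accepted recommendations feed back into the sales counts that drive later recommendations, the run with recommendations ends up with a higher Gini coefficient than the run without them.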
5 URBAN MAPPING ECOSYSTEM
What the ecosystem is about. The urban mapping ecosystem encompasses a variety of recommenders designed
to satisfy the needs of city dwellers. It includes navigation platforms suggesting travel routes (e.g., Google Maps
or TomTom); house-renting services helping users find accommodation (e.g., Airbnb, Booking.com); e-mobility
platforms providing users with taxi, ride-hailing or car-pooling services (e.g., Uber and Lyft); and platforms
suggesting points of interest to users (e.g., Tripadvisor and Yelp).
Main employed methodologies. There is a predominance of simulation over empirical studies (see Figure
4), mainly because data are typically owned by big-tech companies that are reluctant to share them. As for
navigation and e-mobility platforms, empirical controlled studies are difficult to perform. This is because
it is hard to avoid interactions between users in the control and treatment groups and other vehicles travelling
on the streets. This would mean a violation of the Stable Unit Treatment Value Assumption for causal inference
[39]. Moreover, several exogenous factors (e.g., sudden storms, strikes, accidents) may potentially bias the effect
of the recommender at any time. These factors complicate the attribution of the observed outcomes to the
recommender. Scholars tend to choose simulation-controlled studies to mitigate these issues.
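The no-interference part of the Stable Unit Treatment Value Assumption can be written compactly. In the sketch below, the notation is introduced here for illustration and is not taken from [39]: $z_i \in \{0, 1\}$ is the treatment assignment of user $i$ (e.g., receiving routing recommendations or not), $\mathbf{z}_{-i}$ collects the assignments of all other users, and $Y_i$ is an outcome such as travel time.

```latex
% No interference: the potential outcome of unit i depends only on its own
% assignment z_i, not on the assignments of the other n - 1 units.
Y_i(z_1, \dots, z_i, \dots, z_n) = Y_i(z_i) \quad \text{for all } i .

% On a shared road network this typically fails: the travel time of driver i
% also depends on how many other drivers are rerouted, so in general
Y_i(z_i, \mathbf{z}_{-i}) \neq Y_i(z_i, \mathbf{z}'_{-i})
\quad \text{for some } \mathbf{z}_{-i} \neq \mathbf{z}'_{-i} .
```

When the second condition holds, a simple comparison of treated and control drivers no longer isolates the effect of the recommender on an individual user.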
Main outcomes. Most studies focus on the systemic level, investigating inequality, diversity, and traffic congestion
(extreme urban concentration). Most studies are concerned with volume at all levels of analysis, assessing the
impact of recommenders on various quantities (e.g., CO2 emissions, travel time, and cost for users in ride-hailing
and car-sharing platforms). See Table 4 for a comprehensive outlook.
5.1 Empirical studies
Observational studies. Falek et al. [55] perform a comparative analysis of various routing algorithms, finding
that a strategy without re-routing (the route is established before vehicle departure based on actual travel times)
consistently yields travel times that closely approach the best possible solution. In contrast, a strategy based on
continuous re-routing (the route is adjusted while the vehicle is travelling based on actual travel times) is the
best algorithm for congested areas. Concentration
Urban Mapping | Empirical, Observational | Empirical, Controlled | Simulation, Observational | Simulation, Controlled
Filter Bubble (individual) | - | - | - | -
Radicalisation (individual) | - | - | - | -
Model Collapse (model) | - | - | - | -
Concentration (systemic) | [55, 70, 115] | - | [35, 50, 86] | [36, 136, 156]
Echo Chamber (systemic) | [89] | - | - | -
Inequality (systemic) | [48, 165] | - | [1, 21, 86] | [31]
Polarization (systemic) | - | - | - | -
Diversity | - | - | systemic: [37] | systemic: [38]
Volume | individual: [84, 104, 144, 146, 165]; item: [70, 84, 104]; systemic: [84, 115] | - | individual: [90, 107–110, 123]; item: [2, 7, 37, 61, 108–110, 123, 170]; systemic: [2, 7, 15, 35–37, 50, 53, 54, 90, 121, 152] | individual: [3, 12, 31, 136, 158]; systemic: [3, 12, 36, 38, 136, 156, 158, 170]
Table 4. Urban Mapping Ecosystem. Classification of selected papers based on their methodology, outcomes and level of
analysis.
Schwieterman [146] observes that transportation network companies (e.g., Uber and Lyft) in Chicago contribute
to reducing travel times compared to public transit, but are also slightly more costly on average for users. Moreover,
during peak weekday hours, the prices are marginally higher than at other times, suggesting that transportation
network companies may use surge pricing to respond to mobility demand. Volume.Individual
Santi et al. [144] employ a large dataset of taxi trips in New York City to model the collective benefits of
ride-sharing as a function of prolonged travel time. They find that ride-sharing reduces users’ travel time, cumulative
trip length, and service cost. However, it entails an increase in the number of taxi passengers.
Volume.Individual
Jalali et al. [84] use GPS trajectories from private vehicles to investigate the potential impact of ride-sharing
in a Chinese city. They discover that ride-sharing reduces the number of trips, drivers’ total travelled distance,
and emissions. This is especially true if users are willing to walk to drivers within 3 km.
Volume.Individual Volume.Item Volume.Systemic
Martinez and Viegas [115] develop an agent-based model to examine the impact of moving from private
transportation to a shared and self-driving vehicle fleet (taxis and mini-buses) in Lisbon. Their study reveals that
implementing the full-sharing scenario could substantially reduce CO2 emissions, congestion levels, and travel
distances. Sharing vehicles leads to more intensive vehicle utilisation, significantly increasing vehicles’ daily
usage and travel distances. Concentration Volume.Systemic
Lotze et al. [104] propose a strategy in which bus routes and user stops’ positions change adaptively with traffic
demand. They observe a decrease in buses’ route length and travel times, albeit at the expense of users being
required to walk significant distances during their trips to reach dynamically adjusted stops.
Volume.Individual Volume.Item
Hanna et al. [70] analyse the impact of lifting Jakarta’s "three-in-one" high-occupancy vehicle policy (HOV),
which restricted certain roads at specific hours to vehicles with a minimum of three occupants. By gathering data
on road travel times from Google Maps before and after the policy lifting, the researchers uncover noticeable
effects of HOV on traffic congestion: lifting the policy increased travel times both on high-occupancy roads and
alternative routes, and both during and outside HOV periods. Concentration Volume.Item
A few works [48, 89, 165] focus on the empirical analysis of data from Airbnb. Koh et al. [89] analyse the
diversity of the user base on the Airbnb platform across five cities in three continents. The study observes a
predominantly young, female, and white user base, even in cities with a diverse racial composition. This creates
an echo chamber effect where similar demographics tend to cluster. The authors also observe a similar homophily
tendency between female hosts and guests and a relevant homophily tendency regarding race, while no tendency
is highlighted in age. Echo Chamber
Similarly, Edelman and Luca [48] analyse pictures of New York City landlords on Airbnb and observe revenue
inequalities: non-black hosts’ houses are about 12% more expensive than those of black hosts, even when the
houses have similar attributes like the number of bedrooms, type of room, and user ratings. Inequality
Zhang et al. [165] investigate the impact of Airbnb’s smart-pricing algorithm on racial disparities in daily host
revenue. The researchers collect data on venue prices, host race (inferred from profile pictures), host revenues,
and venue occupancy rates before and after hosts adopt the smart-pricing algorithm. They find that the algorithm
reduces venue prices, increases host revenues, and decreases the revenue gap between white and black hosts.
Inequality Volume.Individual
5.2 Simulation studies
Observational studies. Johnson et al. [86] investigate the impact on urban traffic of three routing criteria: scenic
routing optimises routes for aesthetic enjoyment; safety routing avoids areas with higher rates of accidents or
crime; and simplicity routing reduces route complexity, measured by the number of intersections and
actions needed to traverse a route (i.e., going straight or turning). Simulations in San Francisco, New York City, London,
and Manila show that scenic routing leads to more complex routes, potentially increasing the risk of accidents
and negatively affecting driver safety. Additionally, it diverts traffic from highways to parks, popular areas, tourist
destinations, and slower roads. Safety routing, though to a lesser degree than scenic routing, also generates more
complex routes and redirects traffic away from identified unsafe zones. Simplicity routing amplifies traffic on
highways but does not explicitly favour or avoid any particular region. Concentration Inequality
Mehrvarz et al. [121] compare the impact of vehicle routing incorporating sustainability variables (e.g., fuel
consumption, engine load, acceleration rate, speed, road slope) with traditional routing that prioritises travel
time or distance. The study finds that the fastest routes are not necessarily the most sustainable and that sustainable
routing might reduce fuel consumption by about 5%. Volume.Systemic
Barth et al. [15] introduce a method for reducing energy consumption and emissions in navigation services. The
method combines mobile-source energy and emission models with advanced route optimisation algorithms. The
study applies this method in several case studies across Southern California, showing substantial energy savings
and reduced emissions compared to navigation services that minimise distance or travel time.
Volume.Systemic
Colak et al. [35] introduce a centralised strategy that optimises route choices to alleviate urban congestion
while considering varying levels of social good awareness. The study shows that routing solutions mimicking
socially optimal configurations decrease time lost in congestion by up to 30%, with individual travel time reduction
ranging between one and three minutes. Concentration Volume.Systemic
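To make the gap between selfish and socially optimal routing concrete, the following minimal sketch uses a textbook two-route network (Pigou's example), not the model or data of Colak et al. It assumes one route with a fixed travel time of 1 and one congestible route whose travel time equals the share of drivers using it; the function name and these cost functions are illustrative assumptions.

```python
def average_travel_time(share_on_congestible_route: float) -> float:
    """Mean travel time when a share x of drivers uses the congestible route.

    Assumed toy costs: the fixed route always takes 1 time unit,
    the congestible route takes x time units.
    """
    x = share_on_congestible_route
    fixed_cost = 1.0
    congestible_cost = x
    return (1 - x) * fixed_cost + x * congestible_cost


# Selfish choice: the congestible route is never slower than the fixed one,
# so every driver takes it (x = 1).
selfish = average_travel_time(1.0)

# Coordinated choice: the mean (1 - x) + x**2 is minimised at x = 0.5.
coordinated = average_travel_time(0.5)

# Relative reduction in mean travel time achieved by coordination.
saving = 1 - coordinated / selfish
print(selfish, coordinated, saving)  # 1.0 0.75 0.25
```

In this toy network, coordination lowers the mean travel time by a quarter. The figure is a property of the assumed cost functions only and is unrelated to the empirical reductions reported in the study.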
Cornacchia et al. [37] introduce METIS, a traffic assignment algorithm designed to optimise vehicle routing
by offering diverse alternatives. The study employs a traffic simulator (SUMO) to conduct a simulation across
Florence, Rome, and Milan, evaluating the impact of METIS on various urban metrics, including CO2 emissions
and road coverage. The study reveals that METIS produces a more equitable distribution of traffic on the road
network than other state-of-the-art routing algorithms, increases road coverage and mitigates CO2 emissions
considerably. Diversity.Systemic Volume.Item Volume.Systemic
Maciejewski et al. [108, 110] employ floating car data and a traffic simulator (MATSim) to investigate the
impact of taxi fleets on traffic in Berlin and Barcelona. The study evaluates two dispatching strategies: the
"nearest-idle-taxi" approach, where the closest available taxi is dispatched to the first available request; and the
"demand-supply balancing strategy," which classifies system states into oversupply and undersupply conditions.
The demand-supply balancing strategy outperforms the nearest-idle-taxi approach, considerably reducing waiting
time for both drivers and passengers. Volume.Individual Volume.Item
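The nearest-idle-taxi rule described above can be sketched in a few lines. This is a simplified illustration and not the MATSim implementation used in the study: it assumes straight-line distances, static taxi positions, and taxis that stay busy once assigned. The function name `dispatch_nearest_idle` and the data layout are hypothetical.

```python
from math import hypot


def dispatch_nearest_idle(taxis, requests):
    """Serve requests in arrival order, each with the closest idle taxi.

    taxis:    dict mapping taxi id -> (x, y) position of an idle taxi
    requests: list of (request id, (x, y) pickup point) in arrival order
    Returns a dict mapping request id -> taxi id, or None if no taxi is idle.
    """
    idle = dict(taxis)  # copy, so the caller's dict is left unchanged
    assignment = {}
    for request_id, (rx, ry) in requests:
        if not idle:
            assignment[request_id] = None  # request stays unserved
            continue
        # Euclidean distance stands in for network travel time (assumption).
        nearest = min(idle, key=lambda t: hypot(idle[t][0] - rx, idle[t][1] - ry))
        assignment[request_id] = nearest
        del idle[nearest]  # the taxi is busy from now on
    return assignment


taxis = {"t1": (0.0, 0.0), "t2": (5.0, 5.0)}
requests = [("r1", (4.0, 4.0)), ("r2", (1.0, 0.0)), ("r3", (9.0, 9.0))]
result = dispatch_nearest_idle(taxis, requests)
print(result)  # {'r1': 't2', 'r2': 't1', 'r3': None}
```

The rule is greedy: each request is matched without regard to later requests or to the overall balance of supply and demand, which is the aspect the demand-supply balancing strategy is designed to address.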
In another work, Maciejewski [107] evaluates three taxi dispatching strategies: a “no-scheduling strategy”
(NOS) that assigns the nearest empty taxi to each request; a “one-time schedule strategy” (OTS) that assigns new
customers to the taxi soonest available after current trips; and a “re-scheduling strategy” (RES) that recalculates
assignments after each drop-off. Although NOS performs well under light system loads, slightly outperforming the
other strategies in reducing passenger waiting times, RES is more effective as demand increases.
Volume.Individual
Erhardt et al. [50] examine the effects of ride-hailing on San Francisco’s traffic congestion using simulation
software (SF-CHAMP). They compare traffic volumes in 2010 before significant ride-hailing activity with
those in 2016 when such services were available. Findings highlight that ride-hailing increases congestion, mainly
because about 50% of vehicles’ miles travelled are with no passengers. The study also finds a 62% increase in
weekday vehicle hours of delay from 2010 to 2016 versus a 22% increase under a hypothetical scenario without
ride-hailing. Concentration Volume.Systemic
Zhu and Prabhakar [170] introduce a combinatorial optimisation model for long-term taxi trip assignment to
minimise the number of taxis required and idle time. Simulations in New York City demonstrate that the model
reduces the taxi fleet size needed to complete all trips by 28% and cuts the average idle time per taxi by 32%.
Volume.Individual
Kucharski et al. [90] show how ride-pooling services can significantly accelerate the spread of COVID-19 and
similar diseases. The study finds that a small number of infected travellers can transmit the virus to hundreds
of users. The authors therefore propose a mitigation strategy that contains the virus within smaller groups
and breaks up the dense contact network by implementing fixed matches among co-travellers.
Volume.Individual Volume.Systemic
Fagnant and Kockelman [53] use agent-based modelling to evaluate the environmental impact of shared
autonomous vehicles (SAV) compared to conventional vehicle ownership and usage patterns. Their simulations
indicate that a single SAV can replace eleven traditional vehicles. Despite a projected 10% increase in travel
distances, the overall impact of SAVs remains favourable for reducing emissions compared to non-SAV trips.
Additionally, the study suggests that centralised global strategies for SAV relocation are more effective in
mitigating environmental impacts than localised approaches. In subsequent research, Fagnant and Kockelman [54]
investigate the impact of SAV on travel costs and service times in Austin employing a Dynamic Ride-Sharing
strategy (DRS). DRS brings together multiple users with similar origin and destination points at the same time.
The findings show that DRS reduces average service times and travel costs for SAV users, presenting potential
benefits for both autonomous taxis and travellers. Volume.Systemic
Afèche et al. [1] use a game-theoretic model to analyse ride-hailing services, focusing on passenger-driver
matches in a spatial network. They assess the impact of admission control (accepting or rejecting ride requests
based on destination) and positioning control (relocating drivers to high-demand areas). The study compares three
approaches: centralised control with strict admission and repositioning; minimal control with open admission
and decentralised repositioning; and optimal admission control which combines centralised and decentralised
repositioning. Results show that while decentralised repositioning can result in drivers idling in low-demand
zones, admission control reduces such inefficiencies. However, this approach can also lead to rejecting requests
from less busy areas. This may exacerbate inequality in service access and potentially decrease driver satisfaction.
Inequality
Agarwal et al. [2] explore the effects of ride-hailing surge pricing on the demand for traditional taxi services in
Singapore, focusing on the interaction between ride-hailing apps’ dynamic pricing and taxi bookings. They find
that a 10% increase in ride-hailing surge prices results in a 2.6% increase in taxi bookings within the same region
and time interval. Furthermore, including surge pricing factors into demand prediction models enhances the
accuracy by 12-15%, underscoring the practical utility of surge data beyond pricing adjustments.
Volume.Item Volume.Systemic
Bokányi and Hannák [21] conduct an agent-based simulation to study the impact of ride-hailing matching
algorithms. The study finds that the “nearest algorithm”, which assigns passengers to the closest vehicle,
exacerbates inequality among drivers’ gains and is affected by the spatial location of pick-ups and drop-offs.
Conversely, a “poorest algorithm”, which prioritises drivers with lower earnings, reduces gain disparities. Moreover, with
outward flows, it also boosts average driver gains. Inequality
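The difference between the two matching rules can be illustrated with a minimal sketch. It is a simplification and not the authors' simulation: it assumes straight-line distances and lets the "poorest" rule consider every driver regardless of distance. The function name `pick_driver` and the driver records are hypothetical.

```python
from math import hypot


def pick_driver(drivers, pickup, policy):
    """Choose a driver for one pickup under a given matching policy.

    drivers: dict mapping driver id -> {"pos": (x, y), "earnings": float}
    pickup:  (x, y) location of the passenger
    policy:  "nearest" (closest driver) or "poorest" (lowest earnings so far)
    """
    if policy == "nearest":
        def key(d):
            x, y = drivers[d]["pos"]
            return hypot(x - pickup[0], y - pickup[1])
    elif policy == "poorest":
        def key(d):
            return drivers[d]["earnings"]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return min(drivers, key=key)


drivers = {
    "a": {"pos": (0.0, 0.0), "earnings": 100.0},  # close to the pickup, high earner
    "b": {"pos": (3.0, 4.0), "earnings": 20.0},   # farther away, low earner
}
pickup = (0.0, 1.0)
nearest_choice = pick_driver(drivers, pickup, "nearest")
poorest_choice = pick_driver(drivers, pickup, "poorest")
print(nearest_choice, poorest_choice)  # a b
```

Under the nearest rule, a driver who happens to be well placed keeps receiving trips, whereas the poorest rule redirects trips towards drivers who have earned less so far.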
Alonso-Mora et al. [7] present a ride-sharing algorithm for assigning passenger requests to a fleet of vehicles
of varying capacity (i.e., number of passengers), validating its performance using New York City taxi data. The
results show that 2000 vehicles (15% of the taxi fleet) of capacity ten or 3000 of capacity four can serve 98% of the
demand within a mean waiting time of 2.8 min and a mean trip delay of 3.5 min. Moreover, the study finds that
increasing vehicle capacity improves service rate and reduces the mean distance travelled by vehicles in the fleet.
Volume.Item Volume.Systemic
Mori et al. [123] explore the advantages of integrating ride-sharing taxis with traditional taxi services through
traffic simulation and dynamic vehicle allocation. The findings indicate that increasing the number of vehicles
decreases the average time from booking to arrival. This effect is especially pronounced for ride-sharing taxis,
although it reduces vehicle occupancy rates. Volume.Individual Volume.Item
Storch et al. [152] employ a game-theoretic approach to investigate the incentives (financial discounts, expected
detours and trip uncertainty, and the inconvenience of sharing a vehicle with strangers) affecting ride-sharing
adoption. The study identifies two distinct adoption regimes: one characterised by decreased sharing as demand
rises and another by consistent sharing regardless of demand levels. The simulation reveals a discontinuous
transition <