Nudging Towards Health in a Conversational Food Recommender System Using Multi-Modal Interactions and Nutrition Labels

Giovanni Castiglia¹, Ayoub El Majjodi², Federica Calò¹, Yashar Deldjoo¹, Fedelucio Narducci¹, Alain Starke²,³ and Christoph Trattner²
¹ Polytechnic University of Bari, Bari, Italy
² Department of Information Science and Media Studies, University of Bergen, Bergen, Norway
³ Marketing and Consumer Behaviour Group, Wageningen University & Research, Wageningen, The Netherlands
Abstract
Humans engage with other humans and their surroundings through various modalities, most notably speech, sight, and touch. In a conversation, all these inputs provide an overview of how another person is feeling. When translating these modalities to a digital context, most of them are unfortunately lost. The majority of existing conversational recommender systems (CRSs) rely solely on natural language or basic click-based interactions.
This work is one of the first studies to examine the influence of multi-modal interactions in a conversational food recommender system. In particular, we examined the effect of three distinct interaction modalities: pure textual, multi-modal (text plus visuals), and multi-modal supplemented with nutritional labeling. We conducted a user study (N = 195) to evaluate the three interaction modalities in terms of how effectively they supported users in selecting healthier foods. Structural equation modelling revealed that users engaged more extensively with the multi-modal system that was annotated with labels, compared to the system with a single modality, and in turn evaluated it as more effective.
Keywords
Personalization, Health, Food recommendation, Digital Nudges, Nutrition labels
1. Introduction and Context
Conversational recommender systems (CRSs) represent a hotly debated area of study in the field of information seeking [1, 2]. They combine the power of recommendation algorithms with conversational strategies. Using multi-turn conversations, CRSs are able to collect users’ nuanced and dynamic preferences in more depth, which can enhance recommendation outcomes and user experience. CRSs are utilized in a variety of domains, including medical diagnosis [3], e-commerce [4], and entertainment [5, 6]. Only a few studies have investigated their merit for food recommendation [7], and in particular for encouraging users to make healthier food decisions.
Over 60% of all deaths are caused by non-communicable diseases, which are preventable by tackling risk factors, such as attaining a healthy food intake [8]. While our food decisions are driven by our overall preferences, the food selection process is extremely contextual and influenced by a variety of factors, such as the user’s mood and dietary constraints. Moreover, many of the decisions are made spontaneously and consumers’ judgments are influenced by factors unrelated to the food content, such as their perception of the food’s visual characteristics [9]. For instance, the packaging of items with nutritional labels can serve to highlight the nutritious nature of the food (cf. [10]). People also generally prefer food that is presented in a visually appealing way [11]. They are willing to pay extra for food whose ingredients are attractively arranged, and restaurants strive to generate Instagram-friendly photographs by enhancing the color composition of their plates.

RecSys’22: 4th Workshop of Knowledge-aware and Conversational Recommender Systems, Seattle, WA, USA
Email: g.castiglia@studenti.poliba.it (G. Castiglia); ayoub.majjodiu@uib.no (A. E. Majjodi); f.calo8@studenti.poliba.it (F. Calò); yashar.deldjoo@poliba.it (Y. Deldjoo); fedelucio.narducci@poliba.it (F. Narducci); alain.starke@uib.no (A. Starke); christoph.trattner@uib.no (C. Trattner)
URL: https://www.christophtrattner.info/ (C. Trattner)
ORCID: 0000-0002-7478-5811 (A. E. Majjodi); 0000-0002-6767-358X (Y. Deldjoo); 0000-0002-9255-3256 (F. Narducci); 0000-0002-9873-8016 (A. Starke); 0000-0002-1193-0508 (C. Trattner)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org, http://ceur-ws.org), ISSN 1613-0073
To surface effective and healthy food recommendations it is crucial to understand these underlying decision factors. Regrettably, the large majority of existing conversational recommender systems [12, 13] only consider a single type of interaction, such as natural language or click-based interaction, thereby neglecting the wealth of information contained in images of meals [14]. The goal of the present work is to employ a new conversational model for food recommendation that permits more natural, multi-modal user-system interaction.
To attain this goal, this paper introduces a multi-modal conversational food recommender system (MM-CFRS). It implements different user-system interaction modes, along with nutrition labelling, in order to assist the user in making dietary decisions. Our objective is to examine the effects of three distinct interaction modes: pure textual, multi-modal (text plus visuals), and multi-modal supplemented with nutritional labeling. While multi-modal conversational information seeking (MMCIS) is gaining attention in the RecSys, IR, and HCI communities [15, 1, 16], only a few practical studies have been published, and these focus on topics other than food and health, such as conversational systems for tourism [17] and fashion [18, 19]. In the field of food recommendation, Elsweiler et al. [20] provide a good frame of reference for recent advances in food recommender systems in general. Specifically for conversational systems, Barko-Sherif et al. [21] investigate the possibility of conversational preference elicitation in a food recommender environment, using a Wizard of Oz study design (see also [22]). Using a between-groups approach, they compare spoken and text-input chat interfaces and report that such interfaces are useful for users to describe their needs and preferences. In other studies, Samagaio et al. [23] present a RASA-based chatbot that can recognize and categorize user intentions in a conversation aimed at eliciting food preferences for recommendation purposes. Another study by Samagaio et al. [24] applies more knowledge-based elements based on word embeddings to optimize conversational ingredient retrieval. These studies, however, focus less on aspects pertaining to health, health labelling, or elicitation modalities. El Majjodi et al. [25] recently indicated that nutritional labels can reduce users’ choice difficulty in a conventional, non-conversational recommender system. The primary distinction between our work and previous studies is that the latter lack multiple modalities (typically only text is used) and that only a few of them (e.g., [25]) have used nutrition labelling.
To summarize, the goal of this study is to compare the impact of three user-system interaction and explanation modalities (textual, multi-modal, and multi-modal with nutritional labels) on both behavioral aspects (what type of recipe is chosen? How healthy is that recipe?) and evaluation aspects (how does the user evaluate the system or their chosen recipe?). Using a mediation analysis (structural equation modelling), we answer the following research question:
RQ: To what extent do different interaction modalities affect a user’s recipe choices and evaluation in a conversational food recommendation scenario?
To address this question, we consider different dimensions of analysis. These include system interaction length, presentation time, healthiness of recipes chosen, and a user’s level of choice satisfaction and experienced system effectiveness.
2. System Design
In this section we describe the features of our conversational food recommender system, which supports users in making healthier choices.¹ We designed a system-driven conversation in which the system requires user feedback (response/input) to continue. The main steps of the conversational flow are shown in Figure 1. Users can interact with the system using both buttons and textual messages.² The steps of the interaction are described below.

¹ Code and recipe data used for implementing the chatbot will be released in a GitHub repository after peer review and linked in the paper.
² An anonymized video demo of the three versions of our system is available online at https://tinyurl.com/mtzxr2sw

Figure 1: Our conversational recommender system flow.

Food category acquisition: The user was presented with a choice of the four food categories considered in this work: Pasta, Salad, Dessert, and Snack.

User constraints acquisition: The user was then prompted to indicate any potential dietary constraints. Initially, the system used an interface with a single checkbox for each of the most prevalent intolerances and allergies: Lactose, Meat, Alcohol, Seafood, Reflux, Cholesterol, Diabetes. Afterwards, the system asked the user to disclose a list of ingredients she could not consume.

Preference elicitation: Taking the specified constraints into account, the system prompted the user to submit preferences for five of the dishes it proposed. Each dish was accompanied by two buttons: “Like” and “Skip”. The skip option was provided to encourage users to inspect an additional dish, which was retrieved from the randomly sorted menu. The retrieval was based on a random active learning strategy. This way, users were encouraged to like five dishes they were interested in, after which the user profile was built by the system.

Processing: The system constructed the user profile by analyzing the user’s five preferences from the previous stage. The cosine similarity was computed between the user profile and each of the available foods in the catalog, to provide a list of dishes from which recommendations would be selected. The algorithm also provided a list of dishes ranked according to their healthiness (based on their FSA score; see Section 3). For each food category we built a matrix containing the TF-IDF representation (dish vs. ingredient) of dishes in the catalog. The higher the TF-IDF score, the greater the ingredient’s significance to this dish (as opposed to other dishes).

Recommendation and explanation: The system provided two personalized recommendations, based on the user’s preferences. The system constrained the retrieval to ensure that the two options differed in terms of healthiness, so that one option was healthier than the other. The system also provided a description of the suggested dishes. Specifically, it explained why the second dish was healthier than the first and why the advice was made. The user would then be prompted to select one or request a new recommendation. The two recommended dishes were chosen using the following strategy: The first dish would be the most similar to the user profile, while the second dish (the healthier alternative) was selected from a list of most similar dishes ranked on their FSA scores, selecting the healthiest one (i.e., with the lowest FSA score).
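The Processing and Recommendation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy catalog, the function names, the smoothed IDF formula, averaging the liked dishes' vectors into a profile, excluding already-liked dishes from the candidates, and the `top_k` size of the "most similar" list are all hypothetical choices that the paper does not specify.

```python
import math
from collections import Counter

# Hypothetical toy catalog for one food category: dish -> (ingredients, FSA score).
# A lower FSA score means a healthier dish (see Section 3).
CATALOG = {
    "tuna pasta": (["pasta", "tuna", "lemon", "olive oil"], 7),
    "lemon tuna pasta bake": (["pasta", "tuna", "lemon", "cheese"], 9),
    "tomato pasta": (["pasta", "tomato", "olive oil"], 5),
    "carbonara": (["pasta", "egg", "bacon", "cheese"], 11),
    "pesto pasta": (["pasta", "basil", "olive oil", "pine nuts"], 8),
}


def tfidf_vectors(catalog):
    """Build a sparse dish-by-ingredient TF-IDF matrix as a dict of dicts."""
    n_dishes = len(catalog)
    doc_freq = Counter(ing for ings, _ in catalog.values() for ing in set(ings))
    vectors = {}
    for dish, (ings, _) in catalog.items():
        counts = Counter(ings)
        vectors[dish] = {
            # Smoothed IDF (an assumption) keeps ubiquitous ingredients non-zero.
            ing: (c / len(ings)) * (math.log((1 + n_dishes) / (1 + doc_freq[ing])) + 1)
            for ing, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_profile(liked, vectors):
    """User profile = mean TF-IDF vector of the liked dishes (an assumption)."""
    profile = Counter()
    for dish in liked:
        for ing, weight in vectors[dish].items():
            profile[ing] += weight / len(liked)
    return dict(profile)


def recommend(liked, catalog=CATALOG, top_k=3):
    """Return (most similar dish, healthier alternative or None)."""
    vectors = tfidf_vectors(catalog)
    profile = build_profile(liked, vectors)
    candidates = [d for d in catalog if d not in liked]  # assumption: liked dishes excluded
    ranked = sorted(candidates, key=lambda d: cosine(profile, vectors[d]), reverse=True)
    first = ranked[0]
    fsa = {d: catalog[d][1] for d in catalog}
    # Among the next most similar dishes, keep those healthier than the first pick.
    healthier = [d for d in ranked[1:top_k + 1] if fsa[d] < fsa[first]]
    second = min(healthier, key=fsa.get) if healthier else None
    return first, second


print(recommend(["tuna pasta"]))
```

With this toy catalog the call returns `('lemon tuna pasta bake', 'tomato pasta')`: the first dish shares the most distinctive ingredients with the liked dish, and the second is the lowest-FSA dish among the remaining similar candidates. Returning `None` when no healthier candidate exists is a design choice of the sketch; the paper does not say how that case was handled.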
Three different interaction modes were implemented by modifying the values associated with the two manipulated variables: interaction I and explanation E, according to Table 1.

Table 1
Differences between three implementations of the system.
Interaction Mode                      I     E
Pure text (T)                         T     T
Multi-modal (MM)                      MM    T
Multi-modal with labels (MM-Label)    MM    MM

In the Pure text version (T + T), the system communicates with the user solely through text, displaying simply the dish titles and offering textual explanations of the food recommendations. In the Multi-modal version (MM + T), the system engages the user in a multi-modal manner by displaying the name and image of each dish throughout the dialogue. However, the supplied explanation remains textual. For the first dish, the explanation can be like “I recommend these dish because I know that you have diet constraints due to: meat, zucchini. The first dish I proposed contains ingredients that you might like: carrot, lemon, tuna, olive oil”. For the second recommendation, the explanation further provides information about the macronutrient quantities of the two recommended dishes and can be in the form of “The second dish I proposed has less calories (54 Kcal) than the first one (123 Kcal) and has less fats than the first one”. The third version, MM-Label (MM + MM), likewise employs a multi-modal interaction approach, but it also makes use of nutritional explanations in the form of a front-of-package nutrition label with FSA’s Multiple Traffic Lights (MTL) [25]. MTL nutrition labels depicted the intake adequacy of a dish in terms of energy and nutritional content, along five dimensions: energy (kcal), fat, saturates, sugars, and salt. This adequacy, per serving and per 100g, was depicted using the colors green, yellow and red, where green indicated that a dish adhered to the nutritional intake guideline, while red indicated that the content was unacceptable. These labels were generated for each dish by following the directives of the Food Standards Agency and the UK Department of Health [26].

Figure 2: The three implementations of the system. Some details displayed on the interface, such as the chatbot’s and authors’ names, are anonymized and will be added after peer review.

Figure 2 depicts a snapshot of the chatbot prototype, visualizing the different interaction phases.
In the Textual (T) version, the user received recommendations identified by only the names of the dishes (e.g., Cupcake Princess’ Vanilla Cupcakes, Floating Island II). The recommendations were followed by textual explanations, based on the ingredients in the dish that the user likes. A comparative analysis of the nutritional facts (e.g., ‘less sugars’) would also be provided. In the Multi-modal (MM) version, the system additionally provided images of the recommended dishes. The explanation was similar to the one presented in the T version. Finally, the Multi-modal with labels (MM-Label) version provided nutritional labels that were attached to the depicted images (e.g., Sugar 2.3g, Fat 10.7g, etc.), presented with red, yellow, and/or green colors according to the FSA score. As stated previously, following the presentation of the recommendations, we provide the user with an explanation that helps her comprehend the health benefits of the second alternative over the first, which is the dish that best matches her preferences. This is accomplished either by text (T and MM variants) or a multiple traffic light nutritional label (MM-Label).
The user can accept one of the two dishes proposed or can ask for another recommendation.
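The textual explanations used in the T and MM versions can be thought of as filled-in templates. The sketch below is only an illustration: the function names and the nutrient dictionary keys are hypothetical, the wording is copied from the two example messages quoted in this section, and the fat values in the usage line are invented for demonstration.

```python
def explain_preferences(constraints, liked_ingredients):
    """Explain the first dish via the user's constraints and liked ingredients."""
    parts = []
    if constraints:
        parts.append(
            "I recommend these dish because I know that you have diet "
            "constraints due to: " + ", ".join(constraints) + "."
        )
    parts.append(
        "The first dish I proposed contains ingredients that you might like: "
        + ", ".join(liked_ingredients)
    )
    return " ".join(parts)


def explain_health(first, second):
    """Compare two dishes; each is a dict with hypothetical keys 'kcal' and 'fat_g'."""
    claims = []
    if second["kcal"] < first["kcal"]:
        claims.append(
            f"has less calories ({second['kcal']} Kcal) "
            f"than the first one ({first['kcal']} Kcal)"
        )
    if second["fat_g"] < first["fat_g"]:
        claims.append("has less fats than the first one")
    if not claims:
        return ""  # nothing favourable to report about the second dish
    return "The second dish I proposed " + " and ".join(claims)


print(explain_preferences(["meat", "zucchini"], ["carrot", "lemon", "tuna", "olive oil"]))
print(explain_health({"kcal": 123, "fat_g": 6.0}, {"kcal": 54, "fat_g": 2.0}))
```

Only nutrients on which the second dish compares favourably are mentioned, which mirrors the comparative phrasing (e.g., ‘less sugars’) described above; how the actual system selected the nutrients to mention is not stated in the paper.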
3. Experimental Evaluation
To evaluate the extent to which different versions of the chatbot affected users’ evaluations and decisions, we recruited 195 participants from Amazon MTurk to use our system. Participants had to have a HIT approval rate of at least 95% and were compensated with 2 dollars. On average, users required around 15 minutes to complete the study.³ Users performed the processes outlined in Section 2, interacting with our chatbot for preference elicitation, evaluating recipe recommendations, selecting one recipe, and evaluating the experience. A user’s experience was evaluated through choice satisfaction and system effectiveness, using questionnaire items that were evaluated on 5-point Likert scales.

³ The research conformed to the ethical standards of the Norwegian Centre for Research Data (NSD). The collected data will also be released online in the project’s GitHub repository after peer review.

Table 2
Questionnaire items used in the confirmatory factor analysis. Alpha denotes Cronbach’s Alpha, AVE denotes the Average Variance Explained, indicating construct validity if AVE > 0.5. Items in gray and without loading were omitted from analysis. Choice Satisfaction did not form a sensible aspect, because of a lack of construct validity.
Aspect                  Item                                                                            Loading
                        I think, I would enjoy eating the dish I have chosen in the end
Choice Satisfaction     I would recommend the dish I’ve chosen in the end to others
                        My chosen dish could become my favorite
                        It was easy to make my final choice on the dish                                 0.737
                        I interacted a lot with the system before getting the dish of my choice
System Effectiveness    The explanation influenced my final choice of dish
                        I think, that I would use this system frequently
Alpha = 0.740           I found the system easy to use and understand                                   0.724
AVE = 0.534             I felt very confident using the system                                          0.661
                        I would imagine that most people would learn to use this system very quickly    0.722
Chosen recipes were evaluated according to their healthiness. This was evaluated using the FSA score [27]. Each recipe was scored between 4 and 12, where 4 indicated that all four nutrients (sugar, fat, saturated fat, salt) adhered to nutritional guidelines per 100g [9, 28], while 12 would indicate that a recipe was unhealthy because of all nutritional contents being too high.
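A score in the 4 to 12 range arises when each of the four nutrients contributes 1 (green), 2 (amber) or 3 (red) points. The sketch below illustrates this. The per-100g cut-offs are an assumption: they are the values for foods as we understand the UK front-of-pack guidance, they are not quoted in this paper, and the guidance's additional per-portion criteria are ignored. Function and variable names are hypothetical.

```python
# Assumed per-100g cut-offs in grams: (green if <= low, red if > high).
# Taken from the UK front-of-pack guidance for foods, not from this paper.
THRESHOLDS = {
    "fat": (3.0, 17.5),
    "saturates": (1.5, 5.0),
    "sugars": (5.0, 22.5),
    "salt": (0.3, 1.5),
}


def nutrient_points(nutrient, grams_per_100g):
    """Return 1 (green), 2 (amber) or 3 (red) for a single nutrient."""
    low, high = THRESHOLDS[nutrient]
    if grams_per_100g <= low:
        return 1
    if grams_per_100g <= high:
        return 2
    return 3


def fsa_score(per_100g):
    """Sum the points of the four nutrients: 4 is healthiest, 12 is least healthy."""
    return sum(nutrient_points(name, per_100g[name]) for name in THRESHOLDS)


# Hypothetical dish; the sugar and fat values reuse the label example from Section 2.
dish = {"fat": 10.7, "saturates": 1.0, "sugars": 2.3, "salt": 2.0}
print(fsa_score(dish))  # 2 (fat) + 1 (saturates) + 1 (sugars) + 3 (salt) = 7
```

Because a lower score is healthier, ranking candidate dishes by ascending `fsa_score` puts the healthiest alternative first.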
The responses to the evaluation questionnaire items were submitted to a confirmatory factor analysis (CFA; see Table 2). Unfortunately, we could not infer a reliable construct for choice satisfaction, as the variance explained by the questionnaire items was too low, while Cronbach’s Alpha was only acceptable (0.60). Other items were dropped from the system effectiveness aspect because of low factor loadings.
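For readers unfamiliar with the reliability statistic reported here, Cronbach's Alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the respondents' total scores. The sketch below shows the computation on made-up 5-point Likert responses; the data and the function name are hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of responses per questionnaire item, respondents in the same order."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # total score per respondent
    summed_item_variance = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))


# Hypothetical responses of six participants to three items.
responses = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [3, 5, 2, 4, 3, 5],
]
print(round(cronbach_alpha(responses), 3))
```

Items that move together across respondents inflate the variance of the total score relative to the individual item variances, which pushes Alpha towards 1.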
We organized the different factors (e.g., conversation time, condition factors) and aspects (i.e., system effectiveness) into a path model using Structural Equation Modelling. Figure 3 depicts the resulting model, which had decent fit statistics: χ²(17) = 28.064, p < 0.05, CFI = 0.969, TLI = 0.954, RMSEA = 0.058, 90% CI: [0.009, 0.095]. The relevant AVEs of the aspects were sufficiently high to form a path model [29].
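As a quick plausibility check, the reported RMSEA can be recomputed from the reported chi-square, its degrees of freedom and the sample size. The sketch assumes the common definition RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))); some software divides by N instead of N − 1, and the paper does not state which variant was used.

```python
import math


def rmsea(chi_square, df, n):
    """Root mean square error of approximation (N - 1 variant, an assumption)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported in the text: chi-square(17) = 28.064 with N = 195 participants.
print(round(rmsea(28.064, 17, 195), 3))  # 0.058
```

The result matches the reported value of 0.058, and using N in the denominator gives the same figure after rounding.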
Our analysis revealed that the condition with nutrition labels (MM-Label) stood out in terms of how long users interacted with our chatbot. Figure 3 illustrates this, while the use of multi-modal approaches alone had no effect on the interaction or evaluation factors considered. Our mediation analysis suggested that in the MM-Label condition, the conversation duration was significantly longer (p < 0.05) than in the text-based condition. This indicated that the usage of nutrition labels affected conversation time, on top of the other modalities.
The duration of the conversation affected, in turn, the evaluation of the user. For the system effectiveness aspect inferred from our confirmatory factor analysis (cf. Table 2), users who interacted with the chatbot for longer periods of time indicated greater levels of system effectiveness (p < 0.01). This indicated that an extended engagement did not frustrate users. Instead, it indicated that they were enthusiastic about using the system. Figure 3 also shows that the healthiness of chosen recipes was not significantly related to any of the other aspects or factors. Note that the MM-Label condition led to the healthiest recipe choices, but the differences with the other conditions were not significant.
4. Conclusion and Future Work
We have presented a novel chatbot-like recommender system that introduces multi-modality in the interaction with the user, the presentation of results, and the explanation of the recommendations with nutrition labels in a conversational scenario. We have designed and analyzed the impact of three distinct versions of our chatbot: pure textual, multi-modal (use of text and images), and multi-modal supplemented with nutritional labels.
Our experimental evaluation reveals that our chatbot is the most effective when accompanied by explanatory labels. This is indicated by the length of conversation, as well as by the user’s evaluation of the system effectiveness.
Limitations to this study could be viewed from different viewpoints. In terms of analysis, we have been unable to infer the choice satisfaction evaluation aspect. Other research has demonstrated that decision satisfaction is a good predictor of post-interaction engagement with a selected item, such as for household energy conservation [30]. Moreover, rather than relying solely on system-driven interaction, it might be intriguing and natural to investigate user-driven scenarios in which users might query the system with an image and textual query. The food categories considered in this work (pasta, salad, dessert, snack) could additionally be expanded to include more meal categories and their combinations, such as to create a complete meal (first dish, second dish, and vegetables). On top of that, the distinctions between various label modalities are an additional intriguing topic we wish to investigate more in-depth [31].

Figure 3: Structural Equation Model (SEM). Numbers on the arrows represent the β-coefficients, standard errors are denoted between brackets. Effects between the subjective constructs are standardized and can be considered as correlations, other effects show regression coefficients. Aspects are grouped by color: Objective system aspects are purple, behavioral indicators are blue (note: the FSA score represents recipe unhealthiness) and experience aspects are orange. The thinner arrows are non-significant relations, in addition: *** p < 0.001, ** p < 0.01, * p < 0.05.
References
[1] H. Zamani, J. R. Trippas, J. Dalton, F. Radlinski, Conversational information seeking, arXiv preprint arXiv:2201.08808 (2022).
[2] D. Jannach, A. Manzoor, W. Cai, L. Chen, A survey on conversational recommender systems, ACM Computing Surveys 54 (2022) 1–36. doi:10.1145/3453154.
[3] P. Cordero, M. Enciso, D. López, A. Mora, A conversational recommender system for diagnosis using fuzzy rules, Expert Systems with Applications 154 (2020) 113449. doi:10.1016/j.eswa.2020.113449.
[4] D. Griol, J. Milina, From voicexml to multimodal mobile apps: development of practical conversational interfaces, ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 5 (2016) 43.
[5] F. Narducci, P. Basile, M. de Gemmis, P. Lops, G. Semeraro, An investigation on the user interaction modes of conversational recommender systems for the music domain, User Model. User Adapt. Interact. 30 (2020) 251–284. URL: https://doi.org/10.1007/s11257-019-09250-7. doi:10.1007/s11257-019-09250-7.
[6] A. Iovine, F. Narducci, G. Semeraro, Conversational recommender systems and natural language: A study through the converse framework, Decis. Support Syst. 131 (2020) 113250. URL: https://doi.org/10.1016/j.dss.2020.113250. doi:10.1016/j.dss.2020.113250.
[7] C. Trattner, D. Elsweiler, Food recommendations, in: Collaborative recommendations: Algorithms, practical challenges and applications, World Scientific, 2019, pp. 653–685.
[8] R. Y. Toledo, A. A. Alzahrani, L. Martinez, A food recommender system considering nutritional information and user preferences, IEEE Access 7 (2019) 96695–96711.
[9] A. D. Starke, M. C. Willemsen, C. Trattner, Nudging healthy choices in food search through visual attractiveness, Frontiers in Artificial Intelligence 4 (2021) 621743.
[10] E. J. Van Loo, C. Grebitus, J. Roosen, Explaining attention and choice for origin labeled cheese by means of consumer ethnocentrism, Food Quality and Preference 78 (2019) 103716.
[11] Y. Peng, J. B. Jemmott III, Feast for the eyes: Effects of food perceptions and computer vision features on food photo popularity, International Journal of Communication (19328036) 12 (2018).
[12] C. Zhou, Y. Jin, K. Zhang, J. Yuan, S. Li, X. Wang, Musicrobot: Towards conversational context-aware music recommender system, in: International Conference on Database Systems for Advanced Applications, Springer, 2018, pp. 817–820.
[13] J. Schaffer, T. Hollerer, J. O’Donovan, Hypothetical recommendation: A study of interactive profile manipulation behavior for recommender systems, in: The Twenty-Eighth International Flairs Conference, 2015, pp. 507–512.
[14] Y. Deldjoo, M. Schedl, P. Cremonesi, G. Pasi, Recommender systems leveraging multimedia content, ACM Computing Surveys (CSUR) 53 (2020) 1–38.
[15] Y. Deldjoo, J. R. Trippas, H. Zamani, Towards multi-modal conversational information seeking, in: Proceedings of the 44th International ACM SIGIR conference on research and development in Information Retrieval, 2021, pp. 1577–1587.
[16] R. G. Sousa, P. M. Ferreira, P. M. Costa, P. Azevedo, J. P. Costeira, C. Santiago, J. Magalhaes, D. Semedo, R. Ferreira, A. I. Rudnicky, et al., ifetch: Multimodal conversational agents for the online fashion marketplace, in: Proceedings of the 2nd ACM Multimedia Workshop on Multimodal Conversational AI, 2021, pp. 25–26.
[17] L. Liao, L. H. Long, Z. Zhang, M. Huang, T.-S. Chua, Mmconv: an environment for multimodal conversational search across multiple domains, in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 675–684.
[18] S. Moon, S. Kottur, P. A. Crook, A. De, S. Poddar, T. Levin, D. Whitney, D. Difranco, A. Beirami, E. Cho, et al., Situated and interactive multimodal conversations, arXiv preprint arXiv:2006.01460 (2020).
[19] Y. Yuan, W. Lam, Conversational fashion image retrieval via multiturn natural language feedback, in: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021, pp. 839–848.
[20] D. Elsweiler, H. Hauptmann, C. Trattner, Food recommender systems, in: Recommender Systems Handbook, Springer, 2022, pp. 871–925.
[21] S. Barko-Sherif, D. Elsweiler, M. Harvey, Conversational agents for recipe recommendation, in: Proceedings of the 2020 Conference on Human Information Interaction and Retrieval, 2020, pp. 73–82.
[22] A. Steinfeld, O. C. Jenkins, B. Scassellati, The oz of wizard: simulating the human for interaction research, in: Proceedings of the 4th ACM/IEEE international conference on Human robot interaction, 2009, pp. 101–108.
[23] Á. Mendes Samagaio, H. Lopes Cardoso, D. Ribeiro, A chatbot for recipe recommendation and preference modeling, in: EPIA Conference on Artificial Intelligence, Springer, 2021, pp. 389–402.
[24] Á. M. Samagaio, H. Lopes Cardoso, D. Ribeiro, Enriching word embeddings with food knowledge for ingredient retrieval, in: 3rd Conference on Language, Data and Knowledge (LDK 2021), Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2021.
[25] A. El Majjodi, A. D. Starke, C. Trattner, Nudging towards health? examining the merits of nutrition labels and personalization in a recipe recommender system, in: Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization, 2022, pp. 48–56.
[26] Department of Health and Social Care UK, Front of Pack nutrition labelling guidance, 2016. URL: https://www.gov.uk/government/publications/front-of-pack-nutrition-labelling-guidance.
[27] D. of Health UK, F. S. Agency, Guide to creating a front of pack (fop) nutrition label for pre-packed products sold through retail outlets (2016). URL: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/566251/FoP_Nutrition_labelling_UK_guidance.pdf.
[28] C. Trattner, D. Elsweiler, Investigating the healthiness of internet-sourced recipes: implications for meal planning and recommender systems, in: Proceedings of the 26th international conference on world wide web, ACM, New York, NY, USA, 2017, pp. 489–498.
[29] B. P. Knijnenburg, M. C. Willemsen, Evaluating recommender systems with user experiments, in: Recommender systems handbook, Springer, 2015, pp. 309–352.
[30] A. Starke, M. Willemsen, C. Snijders, Effective user interface designs to increase energy-efficient behavior in a rasch-based energy recommender system, in: Proceedings of the eleventh ACM conference on recommender systems, 2017, pp. 65–73.
[31] Y. Deldjoo, M. Schedl, B. Hidasi, Y. Wei, X. He, Multimedia recommender systems: Algorithms and challenges, in: Recommender systems handbook, Springer, 2022, pp. 973–1014.
Article
Recommender systems have transformed our digital experiences in many regards. We enumerate six of their positive effects on the economy and humans, such as greater user satisfaction, time savings, broadening user horizons, and positive behavioral nudging. However, it is crucial to acknowledge the potential downsides inherent in their design. One significant concern is that these algorithms often prioritize the interests of the company deploying them, aiming to maximize profits and user engagement rather than solely focusing on enhancing user experience. Therefore, we also list and consider two use cases and six negative long-term impacts on humans, including addiction, reduced ability to think critically, less autonomy, and weakened human relationships caused by more and more human-like virtual assistants. Despite the undeniable utility of recommender systems, it is imperative to approach them critically, advocating for transparency, ethical considerations, and user empowerment to ensure that they serve as tools for enrichment rather than exploitation. To accomplish this, the idea and challenges of responsible recommender systems (RRSs) are presented. RRSs extend common recommender systems with components related to individual human values and goals as well as widely accepted well-being and lifestyle guidelines.
Conference Paper
Full-text available
p>Food recommender systems show personalized recipes to users based on content liked previously. Despite their potential, often recommended (popular) recipes in previous studies have turned out to be unhealthy, negatively contributing to prevalent obesity problems worldwide. Changing how foods are presented through digital nudges might help, but these are usually examined in non-personalized contexts, such as a brick-and-mortar supermarket. This study seeks to support healthy food choices in a personalized interface by adding front-of-package nutrition labels to recipes in a food recommender system. After performing an offline evaluation, we conducted an online study (N = 600) with six different recommender interfaces, based on a 2 (non-personalized vs. personalized recipe advice) x 3 (No Label, Multiple Traffic Light, Nutri-Score) between-subjects design. We found that recipe choices made in the non-personalized scenario were healthier, while the use of nutrition labels (our digital nudge) reduced choice difficulty when the content was personalized.</p
Article
Recommender systems are software applications that help users to find items of interest in situations of information overload. Current research often assumes a one-shot interaction paradigm, where the users’ preferences are estimated based on past observed behavior and where the presentation of a ranked list of suggestions is the main, one-directional form of user interaction. Conversational recommender systems (CRS) take a different approach and support a richer set of interactions. These interactions can, for example, help to improve the preference elicitation process or allow the user to ask questions about the recommendations and to give feedback. The interest in CRS has significantly increased in the past few years. This development is mainly due to the significant progress in the area of natural language processing, the emergence of new voice-controlled home assistants, and the increased use of chatbot technology. With this article, we provide a detailed survey of existing approaches to conversational recommendation. We categorize these approaches in various dimensions, e.g., in terms of the supported user intents or the knowledge they use in the background. Moreover, we discuss technological approaches, review how CRS are evaluated, and finally identify a number of gaps that deserve more research in the future.
Article
Recipe websites are becoming increasingly popular to support people in their home cooking. However, most of these websites prioritize popular recipes, which tend to be unhealthy. Drawing upon research on visual biases and nudges, this paper investigates whether healthy food choices can be supported in food search by depicting attractive images alongside recipes, as well as by re-ranking search results on health. After modelling the visual attractiveness of recipe images, we asked 239 users to search for specific online recipes and to select those they liked the most. Our analyses revealed that users tended to choose a healthier recipe if a visually attractive image was depicted alongside it, as well as if it was listed at the top of a list of search results. Even though less popular recipes were promoted this way, it did not come at the cost of a user’s level of satisfaction.
Conference Paper
Recent research on conversational information seeking mostly focuses on uni-modal interactions and information items. In this perspective paper, we highlight the importance of moving towards developing and evaluating multi-modal conversational information seeking (MMCIS) systems as they enable us to leverage richer context, overcome errors, and increase accessibility. We bridge the gap between the multi-modal and conversational information seeking research and provide a formal definition for MMCIS. We also discuss potential opportunities and research challenges in designing, implementing, and evaluating MMCIS systems. Based on this research, we also propose and implement a practical open-source framework for facilitating research on the topic.
Article
Recommender systems have become a popular and effective means to manage the ever-increasing amount of multimedia content available today and to help users discover interesting new items. Today's recommender systems suggest items of various media types, including audio, text, visual (images), and videos. In fact, scientific research related to the analysis of multimedia content has made possible effective content-based recommender systems capable of suggesting items based on an analysis of the features extracted from the item itself. The aim of this survey is to present a thorough review of the state-of-the-art of recommender systems that leverage multimedia content, by classifying the reviewed papers with respect to their media type, the techniques employed to extract and represent their content features, and the recommendation algorithm. Moreover, for each media type, we discuss various domains in which multimedia content plays a key role in human decision-making and is therefore considered in the recommendation process. Examples of the identified domains include fashion, tourism, food, media streaming, and e-commerce.
Article
Graded implications in the framework of Fuzzy Formal Concept Analysis are used as the knowledge guiding the recommendations. An automated engine based on fuzzy Simplification Logic is proposed to make the suggestions to the users. Conversational recommender systems have proven to be a good approach in telemedicine, building a dialogue between the user and the recommender based on user preferences provided at each step of the conversation. Here, we propose a conversational recommender system for medical diagnosis using fuzzy logic. Specifically, fuzzy implications in the framework of Formal Concept Analysis are used to store the knowledge about symptoms and diseases and Fuzzy Simplification Logic is selected as an appropriate engine to guide the conversation to a final diagnosis. The recommender system has been used to provide differential diagnosis between schizophrenia and schizoaffective and bipolar disorders. In addition, we have enriched the conversational strategy with two strategies (namely critiquing and elicitation mechanism) for a better understanding of the knowledge-driven conversation, allowing user’s feedback in each step of the conversation and improving the performance of the method.
Article
Digital Assistants (DA) such as Amazon Alexa, Siri, or Google Assistant are now becoming widespread, since they allow users to execute a wide range of actions through messages in natural language. Even though DAs are able to complete tasks such as sending texts, making phone calls, or playing songs, they do not yet implement recommendation facilities. In this paper, we investigate the combination of Digital Assistants and Conversational Recommender Systems (CoRSs) by designing and implementing a framework named ConveRSE (Conversational Recommender System framEwork), for building chatbots that can recommend items from different domains and interact with the user through natural language. Since a CoRS architecture is generally composed of different elements, we performed an in-vitro experiment with two synthetic datasets, to investigate the impact that each component has on the CoRS in terms of recommendation accuracy. Additionally, an in-vivo experiment was carried out to understand how natural language influences both the cost of interaction and recommendation accuracy of a CoRS. Experimental results have revealed the most critical components in a CoRS architecture, especially in cold-start situations, and the main issues of the natural-language-based interaction. All the dialogues have been collected in a publicly available dataset.