Miriam Redi’s research while affiliated with Grenoble Alpes University and other places


Publications (27)


Quantifying Engagement with Citations on Wikipedia. (Part 2) (The translation and original text of the article are presented)
  • Article

December 2020 · 4 Reads · 3 Citations

Scientific and Technical Libraries · R. West · M. Redi

Wikipedia is one of the most visited sites on the Web and a common source of information for many users. As an encyclopedia, Wikipedia was not conceived as a source of original information, but as a gateway to secondary sources: according to Wikipedia’s guidelines, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the heart of Wikipedia, little is known about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers’ interactions with citations. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). Matched observational studies of the factors associated with reference clicking reveal that clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings deepen our understanding of Wikipedia’s role in a global information economy where reliability is ever less certain, and source attribution ever more vital.


Quantifying Engagement with Citations on Wikipedia. (Part 1)

October 2020 · 9 Reads · 2 Citations

Scientific and Technical Libraries

Wikipedia is one of the most visited sites on the internet and a common source of information for many users. As an encyclopedia, Wikipedia was conceived not as a source of original (definitive) scientific information but rather as a gateway to deeper and more precise sources. According to Wikipedia's core principles, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the heart of how Wikipedia works, little is known so far about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English-language Wikipedia articles to cited references during one month, and conducted the first analysis of readers' interactions with citations. The results show that overall engagement with citations is low. About 300 page views lead to one reference click, which amounts to just 0.29%, including 0.56% on desktop and 0.13% on mobile devices. A comparison of the factors associated with reference clicks shows that clicks occur more often on shorter pages and on pages of relatively low quality. This suggests that references are most often needed when Wikipedia does not contain the information the user is looking for. In addition, we observed that open access sources and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings deepen the understanding of Wikipedia's role in a global information economy where reliability is becoming ever less certain and the importance of sources ever greater. ACM Reference Format: Tiziano Piccardi, Miriam Redi, Giovanni Colavizza, and Robert West. 2020. Quantifying Engagement with Citations on Wikipedia.
In Proceedings of The Web Conference 2020 (WWW '20), April 20–24, 2020, Taipei, Taiwan. ACM, New York, NY, USA. 12 pages. https://doi.org/10.1145/3366423.3380300.



Figure 2: Distribution of Wikipedia articles by (a) popularity (number of pageviews), (b) page length (number of characters in wikicode), and (c) quality (increasing from left to right; "GA" for "Good Articles", "FA" for "Featured Articles") (Sec. 3.5).
Figure 4: Relative frequency of citation-related events (Sec. 3.2), split into desktop (green, left bars) and mobile (blue, right bars) in April 2019 (Sec. 4.1).
Figure 5: Relative position in page of clicked vs. unclicked references, for references with hyperlinks (Sec. 4.3).
Figure 9: Comparison of page-specific click-through rate for low-(yellow) vs. high-quality (blue) articles, as function of popularity (Sec. 5.2). Error bands: bootstrapped 95% CIs.
Figure 10: Comparison of page-specific click-through rate for short (yellow) vs. long (blue) articles, as function of popularity (Sec. 5.3). Error bands: bootstrapped 95% CIs.


Quantifying Engagement with Citations on Wikipedia
  • Preprint
  • File available

January 2020 · 421 Reads

Wikipedia, the free online encyclopedia that anyone can edit, is one of the most visited sites on the Web and a common source of information for many users. As an encyclopedia, Wikipedia is not a source of original information, but was conceived as a gateway to secondary sources: according to Wikipedia's guidelines, facts must be backed up by reliable sources that reflect the full spectrum of views on the topic. Although citations lie at the very heart of Wikipedia, little is known about how users interact with them. To close this gap, we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers' interaction with citations on Wikipedia. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). Matched observational studies of the factors associated with reference clicking reveal that clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular. Taken together, our findings open the door to a deeper understanding of Wikipedia's role in a global information economy where reliability is ever less certain, and source attribution ever more vital.


Image Recommendation for Wikipedia Articles

January 2020 · 1,069 Reads

Multimodal learning, the simultaneous learning from different data sources such as audio, text, and images, is a rapidly emerging field of machine learning. It is also considered to be learning at the next level of abstraction, which will allow us to tackle more complicated problems such as creating cartoons from a plot or recognizing speech from lip movement. In this paper, we introduce a basic model to recommend the most relevant images for a Wikipedia article based on state-of-the-art multimodal techniques. We also introduce the Wikipedia multimodal dataset, containing more than 36,000 high-quality articles



Fig. 1. Three types of frameworks for deep multimodal representation. (a) Joint representation aims to learn a shared semantic subspace. (b) Coordinated representation framework learns separate but coordinated representations for each modality under some constraints. (c) Intermediate representation framework translates one modality into another and keeps their semantics consistent. [1]
Fig. 2. Example of Coordinated Representation learning pipeline[6]
Time Plan
Image Recommendation for Wikipedia Articles [Preprint 2]

December 2019 · 58 Reads

Multimodal learning, the simultaneous learning from different data sources such as audio, text, and images, is a rapidly emerging field of machine learning. It is also considered to be learning at the next level of abstraction, which will allow us to tackle more complicated problems such as creating cartoons from a plot or recognizing speech from lip movement. In this paper, we propose to investigate whether state-of-the-art multimodal learning techniques can solve the problem of recommending the most relevant images for a Wikipedia article. In other words, we need to create a shared text-image representation of the abstract notion that the article describes, so that, given only a text description, a machine would "understand" which images visualize the same notion accurately.


Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability

May 2019 · 75 Reads · 57 Citations

Wikipedia is playing an increasingly central role on the web, and the policies its contributors follow when sourcing and fact-checking content affect millions of readers. Among these core guiding principles, verifiability policies have a particularly important role. Verifiability requires that information included in a Wikipedia article be corroborated against reliable secondary sources. Because of the manual labor needed to curate Wikipedia at scale, however, its contents do not always evenly comply with these policies. Citations (i.e., references to external sources) may not conform to verifiability requirements or may be missing altogether, potentially weakening the reliability of specific topic areas of the free encyclopedia. In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. First, we construct a taxonomy of reasons why inline citations are required, by collecting labeled data from editors of multiple Wikipedia language editions. We then crowdsource a large-scale dataset of Wikipedia sentences annotated with categories derived from this taxonomy. Finally, we design algorithmic models to determine if a statement requires a citation, and to predict the citation reason. We evaluate the accuracy of such models across different classes of Wikipedia articles of varying quality, and on external datasets of claims annotated for fact-checking purposes.


Figure 2: Citation reason distribution from the small-scale (166 sentences) crowdsourcing experiment.
Figure 3: Confusion matrix indicating the agreement between Mechanical Turk workers ("non-experts") and Wikipedia editors ("experts"). The darker the square, the higher the percent agreement between the two groups
Figure 4: Citation Need model with RNN and global attention, using both word and section representations.
Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability

February 2019 · 183 Reads

Wikipedia is playing an increasingly central role on the web, and the policies its contributors follow when sourcing and fact-checking content affect millions of readers. Among these core guiding principles, verifiability policies have a particularly important role. Verifiability requires that information included in a Wikipedia article be corroborated against reliable secondary sources. Because of the manual labor needed to curate and fact-check Wikipedia at scale, however, its contents do not always evenly comply with these policies. Citations (i.e., references to external sources) may not conform to verifiability requirements or may be missing altogether, potentially weakening the reliability of specific topic areas of the free encyclopedia. In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. First, we construct a taxonomy of reasons why inline citations are required by collecting labeled data from editors of multiple Wikipedia language editions. We then collect a large-scale crowdsourced dataset of Wikipedia sentences annotated with categories derived from this taxonomy. Finally, we design and evaluate algorithmic models to determine if a statement requires a citation, and to predict the citation reason based on our taxonomy. We evaluate the robustness of such models across different classes of Wikipedia articles of varying quality, as well as on an additional dataset of claims annotated for fact-checking purposes.



Citations (12)


... Some researchers have focused on user behavior related to reference usage on Wikipedia. Piccardi et al. (2020) found that engagement with citations on Wikipedia is generally low, but references are more frequently looked up when the information is not included. ...

Reference:

Open access improves the dissemination of science: insights from Wikipedia
Quantifying Engagement with Citations on Wikipedia
  • Citing Conference Paper
  • April 2020

... As a fundamental principle of Wikipedia, verifiability refers to the practice of ensuring that claims are supported by reliable sources (Petroni et al. 2023;Redi et al. 2019;Wong et al. 2021). Wong et al. (2021) highlight how the reliability and quality of Wikipedia's content are crucial, not only for human users but also for AI systems that utilize Wikipedia as training data or source of information. ...

Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia's Verifiability
  • Citing Conference Paper
  • May 2019

... Additionally, the latest research findings in the topics of this article can be found in survey papers. Zhu et al. [19] and Rudinac et al. [20] discuss the latest achievements in story summarising. Baltrušaitis et al. [21] and Guo et al. [22] survey the recent advances in multimodal machine learning and present them in a common taxonomy. ...

Rethinking Summarization and Storytelling for Modern Social Multimedia
  • Citing Chapter
  • January 2018

Lecture Notes in Computer Science · Tat-Seng Chua · [...]

... Online Video Delivery: Our work overlaps with studies looking at user generated (UGC) video platforms, e.g., [3,6,30]. These include user behaviour analysis [14,35] to understand consumption patterns [7,8,11,13,18]. Facebook Live departs from the traditional UGC model of interaction, with users live broadcasting through their mobile device and online social network (rather than simply uploading videos). ...

Like at First Sight: Understanding User Engagement with the World of Microvideos

Lecture Notes in Computer Science

... Therefore, this mentioned factor is selected for aesthetic analysis (moderate complexity) of the university library websites. The feature of moderate complexity is reported as an important factor of aesthetics in several studies [9,11,13,14,15,16,17,18,19]. ...

Bridging the Aesthetic Gap: The Wild Beauty of Web Imagery
  • Citing Conference Paper
  • June 2017

... Public opinion has become a crucial source in an individual's decision-making process regarding a product (Edara et al., 2023;Kaur & Sharma, 2023). Furthermore, sentiment analysis is a popular field of research, as it offers benefits for various aspects, ranging from social media and public opinion (Bhargav, 2022;Keakde, 2022;Talaat, 2023;Uma, 2022), stock markets and finance (Chong, 2022), brand management and marketing (Kumar, 2022;Win, 2022), elections and political analysis (Keakde, 2022;Sutriawan, 2023;Talaat, 2023), customer service and feedback (Bharathi, 2023;Bhargav, 2022), health and medical research (Che, 2023;Chong, 2022), entertainment and film (Zheng, 2019), education (Derisma, 2020;Keakde, 2022), environmental issues and natural disasters (Behl, 2021;Navarro, 2023;Nguyen, 2023;Pappas, 2017;V. Priya, 2016;Ragini, 2018), and travel and tourism (Gholipour, 2020;Luo, 2021;P. ...

Multilingual visual sentiment concept clustering and analysis

International Journal of Multimedia Information Retrieval

... Regarding color preferences, Skelton and Franklin (2020) also argued that infant looking behavior, as well as adult color preferences, are at least partially rooted in the sensory mechanisms of color vision. Especially with older age, there are systematic changes in the physiology of the eye that affect how images are perceived but we could assume that the changes due to historic context-dependent "image cultures" (Redi et al., 2016) might have a stronger influence on aesthetic preferences compared to physiological aging effects. Differences in digital affinity and peer-group effects might also exert stronger influences than research so far revealed. ...

What Makes Photo Cultures Different?
  • Citing Conference Paper
  • October 2016

... We coded two major variables, valence and arousal of an image, using image sentiment analysis software based on machine learning tools (Complura [46] and IBM Watson) for each video's thumbnail (image). The Complura System applied visual sentiment ontology consisting of more than 15,630 adjective-noun pairs (ANP) [46,47] and constructed a dataset by inputting tags and meta-data for over seven million images. ...

Complura: Exploring and Leveraging a Large-scale Multilingual Visual Sentiment Ontology

... There is no explicit paring between languages/cultures and images. Pappas et al. (2016) conduct a crowdsourcing experiment to annotate the sentiment score of visual concepts from 11 languages associated with 16,000 multilingual visual concepts. The MVSO dataset (Jou et al. 2015) is used as the source of visual concepts, and the photo-sharing service Flickr is used as the source of images. ...

Multilingual Visual Sentiment Concept Matching

... Photographs containing clothing fashions are necessary to visual-based computer recognition, with research in clothes modeling [6,7,13,53], computer-aided fashion design [48], similar style retrieval [4,29,34,52], fashion style recommendation [5,30], Internet shopping [21,31], person identification [9,39,49], clothing tracking in surveillance videos [15,45,54], and content-based image retrieval [11,31,35]. ...

Novel methods for semantic and aesthetic multimedia retrieval
  • Citing Article
  • May 2013