Victoria Rubin

The University of Western Ontario | UWO · Faculty of Information and Media Studies

MA Linguistics / PhD Information Science & Technology

About

81
Publications
393,911
Reads
2,622
Citations
Introduction
I specialize in information retrieval and natural language processing techniques that enable analyses of texts to identify, extract, and organize structured knowledge. I study complex human information behaviors that are, at least partly, expressed through language such as deception, uncertainty, credibility, and emotions. Multilingual information access and information organization are my other research interests: http://victoriarubin.fims.uwo.ca/
Additional affiliations
July 2006 - present
The University of Western Ontario
Position
  • Associate Professor, Director of the Language & Information Technology Research Lab (LiT.RL)
September 2001 - May 2004
Syracuse University
Position
  • Research Assistant
Education
September 2001 - May 2006
Syracuse University
Field of study
  • Information Studies

Publications

Publications (81)
Conference Paper
Full-text available
This research examines the concept of 'fake news' in the context of information literacy (IL) in a post-secondary educational setting. Educators' perceptions shape both IL curricula and classroom discussions with students. We conducted 18 interviews with members of 3 integral groups implementing IL education (8 professors, 6 librarians, 4 departmen...
Conference Paper
Full-text available
Native ads are ubiquitous in the North American digital news context. Their form, content and presentational style are practically indistinguishable from regular news editorials, and thus are often mistaken for informative content by newsreaders. This advertising practice is deceptive, in that it exploits loopholes in human digital literacy. Despit...
Article
Full-text available
The LiT.RL News Verification Browser is a research tool for news readers, journalists, editors or information professionals. The tool analyzes the language used in digital news web pages to determine if they are clickbait, satirical news, or falsified news, and visualizes the results by highlighting content in color-coded categories. Although the c...
Article
Abstract: Automatic clickbait detection is a relatively novel task in natural language processing (NLP) and machine learning (ML). "Clickbait" is a hyperlink created primarily to attract attention to its target content. This article introduces a binary classifier, the Language and Information Technology Research Lab (LiT.RL, pronounced "literal") C...
Book
Full-text available
Only the Introductory Chapter is freely available. The full book/e-book/individual chapters/e-chapters are available via Springer Publishers https://link.springer.com/book/10.1007/978-3-030-95656-1 or via any other major book retailer including Amazon or Barnes & Noble (in the US or Canada) or Lehmanns (in Europe). Introductory Chapter Abstract: How do...
Chapter
The complexity in finding solutions to the socio-technological problem of mis- and disinformation lies in our human nature. The mind requires practical skills and digital literacy in order to overcome this problem. Socio-political and economic systems incentivize the spread of the infodemic across toxic digital media environments and require the pu...
Chapter
Chapter 7 focuses on artificially intelligent (AI) systems that can help the human eye identify fakes of several kinds and call them out for the benefit of the public good. I explain, in plain language, the principles behind the AI-based methodologies employed by automated deception detectors, clickbait detectors, satirical fake detectors, rumor de...
Chapter
Chapter 2 focuses on deception as a communicative behavior and establishes, in broad strokes, what deceptive strategies can be used by deceivers and mass disinformers, and what motivates deceptive communication. We consider definitions of deception and typological distinctions, assuming some deceptive strategies have found their way into the digita...
Chapter
Chapter 1 frames the problem of deceptive, inaccurate, and misleading information in the digital media content and information technologies as an infodemic. Mis- and disinformation proliferate online, yet the solution remains elusive and many of us run the risk of being woefully misinformed in many aspects of our lives including health, finances, a...
Chapter
Many practices in marketing, advertising, and public relations, presented in Chapter 6, have the intent to persuade and manipulate the public opinion from the onset of their endeavors. I lay out marketing communications strategies and dissect the anatomy of the ad revenue model. I review key ideas in advertising standards and self-regulation polici...
Chapter
Chapter 4 establishes that truth can be seen from different philosophical perspectives, and our methods for connecting beliefs to reality and establishing facts matter for determining the resulting cumulative knowledge. We may not all agree on what truth is, but there is little doubt that truth matters. It is essential to us—as individuals and as a...
Chapter
Chapter 5 focuses on empirical knowledge as it is applied in three sample professions using stepwise procedures to establish facts, detect lies, or discern truth. Law enforcement, scientific inquiry, and investigative reporting each use well-established traditions for truth-seeking, systematic ways of collecting strong supportive evidence, and cond...
Chapter
Chapter 3 surveys credibility and trust research in information science, human–computer interaction, psychology, communication, and other social sciences. Several models explain the process of credibility assessment, dissecting it into stages and offering components for online content evaluation. Multiple predictive indicators have been considered...
Article
Artificially Intelligent (AI) systems are pervasive, but poorly understood by their users and, at times, developers. It is often unclear how and why certain algorithms make choices, predictions, or conclusions. What does AI transparency mean? What explanations do AI system users desire? This panel discusses AI opaqueness with examples in applied co...
Conference Paper
Full-text available
Artificially Intelligent (AI) systems are pervasive, but poorly understood by their users and, at times, developers. It is often unclear how and why certain algorithms make choices, predictions, or conclusions. What does AI transparency mean? What explanations do AI system users desire? This panel discusses AI opaqueness with examples in applied co...
Chapter
This chapter describes a study that interviewed 18 participants (8 professors, 6 librarians, and 4 department chairs) about their perceptions of ‘fake news’ in the context of their educational roles in information literacy (IL) within a large Canadian university. Qualitative analysis of the interviews reveals a substantial overlap in these educator...
Article
Purpose The purpose of this paper is to treat disinformation and misinformation (intentionally deceptive and unintentionally inaccurate misleading information, respectively) as a socio-cultural technology-enabled epidemic in digital news, propagated via social media. Design/methodology/approach The proposed disinformation and misinformation triang...
Conference Paper
Full-text available
With the problem of ‘fake news’ in the digital media, there are efforts at creation of awareness, automation of ‘fake news’ detection and news literacy. This research is descriptive as it pulls evidence from the content of online fabricated news for the features that distinguish fabrications from the legitimate political news around the time of the...
Article
Full-text available
This paper offers a conceptual basis and describes elements for a multi-layered system to provide information users (newsreaders) with credible information and improve the work processes of the online news (content) producers. I overview criteria of excellence (what editors consider newsworthy) and how reporters (and traditional newsroom profession...
Article
Native advertising, paid for by corporate funding, may fool news readers into thinking that they are reading investigative journalism editorials. Such misleading practice constitutes an internal threat to the profession of journalism and may further deteriorate mainstream media trust. If information users are unaware of the Native Ads original prom...
Article
Full-text available
Clickbait is a class of internet content characterized by attention-grabbing headlines, but is criticized for being shallow, misleading, or deceptive. Information sciences can offer a range of solutions to clickbaiting, but the field lacks a concrete, unifying definition of the phenomenon. This poster addresses this need by investigating perceptions...
Article
Full-text available
This research examines the concept of ‘fake news’ in the context of information literacy (IL) in a post‐secondary educational setting. Educators' perceptions shape both IL curricula and classroom discussions with students. We conducted 18 interviews with members of 3 integral groups implementing IL education (8 professors, 6 librarians, 4 departmen...
Conference Paper
Full-text available
The News Verification Suite aims to provide users with a set of functions to verify information in the news. This paper offers a conceptual basis and a vision of system elements towards automated fact-checking in news production, curation, and consumption. The traditional model of journalism is compared to 'news sharing a.s.a.p.', highlighting simi...
Conference Paper
Full-text available
Native advertising, paid for by corporate funding, may fool news readers into thinking that they are reading investigative journalism editorials. Such misleading practice constitutes an internal threat to the profession of journalism and may further deteriorate mainstream media trust. If information users are unaware of the Native Ads original prom...
Presentation
Full-text available
I conclude that social media requires content verification analysis with a combination of previously known approaches for deception detection, as well as novel techniques for debunking rumors, credibility assessment, factivity analysis and opinion mining. Hybrid approaches may include text analytics with machine learning for deception detection, ne...
Chapter
An op-ed commissioned by Tom Zeller Jr, a former New York Times editor, now the Editor-in-Chief of the Undark Magazine, out of MIT. The article was published on 23 November 2016: http://undark.org/article/education-and-automation-tools-for-navigating-a-sea-of-fake-news/. Introduced as: “Every man should have a built-in automatic crap detector opera...
Chapter
The main premise of this chapter is that the time is ripe for more extensive research and development of social media tools that filter out intentionally deceptive information such as deceptive memes, rumors and hoaxes, fake news or other fake posts, tweets and fraudulent profiles. Social media users’ awareness of intentional manipulation of online...
Conference Paper
Full-text available
Satire is an attractive subject in deception detection research: it is a type of deception that intentionally incorporates cues revealing its own deceptiveness. Whereas other types of fabrications aim to instill a false sense of truth in the reader, a successful satirical hoax must eventually be exposed as a jest. This paper provides a conceptual o...
Conference Paper
Full-text available
Tabloid journalism is often criticized for its propensity for exaggeration, sensationalization, scare-mongering, and otherwise producing misleading and low quality news. As the news has moved online, a new form of tabloidization has emerged: ‘clickbaiting.’ ‘Clickbait’ refers to “content whose main purpose is to attract attention and encourage visi...
Conference Paper
Full-text available
A fake news detection system aims to assist users in detecting and filtering out varieties of potentially deceptive news. The prediction of the chances that a particular news item is intentionally deceptive is based on the analysis of previously seen truthful and deceptive news. A scarcity of deceptive news, available as corpora for predictive mode...
Conference Paper
Full-text available
This research surveys the current state-of-the-art technologies that are instrumental in the adoption and development of fake news detection. "Fake news detection" is defined as the task of categorizing news along a continuum of veracity, with an associated measure of certainty. Veracity is compromised by the occurrence of intentional deceptions....
Conference Paper
Full-text available
Widespread adoption of internet technologies has changed the way that news is created and consumed. The current online news environment is one that incentivizes speed and spectacle in reporting, at the cost of fact-checking and verification. The line between user generated content and traditional news has also become increasingly blurred. This post...
Article
Full-text available
Purpose – The purpose of this paper is to respond to Urquhart and Urquhart’s critique of the previous work entitled “Discourse structure differences in lay and professional health communication”, published in this journal in 2012 (Vol. 68 No. 6, pp. 826-851, doi: 10.1108/00220411211277064). Design/methodology/approach – The authors examine Urquhar...
Conference Paper
Full-text available
News verification is a process of determining whether a particular news report is truthful or deceptive. Deliberately deceptive (fabricated) news creates false conclusions in the readers' minds. Truthful (authentic) news matches the writer's knowledge. How do you tell the difference between the two in an automated way? To investigate this question,...
Article
This paper furthers the development of methods to distinguish truth from deception in textual data. We use rhetorical structure theory (RST) as the analytic framework to identify systematic differences between deceptive and truthful stories in terms of their coherence and structure. A sample of 36 elicited personal stories, self-ranked as truthful...
Article
Full-text available
In hopes of sparking a discussion, I argue for much needed research on automated deception detection in Asian languages. The task of discerning truthful texts from deceptive ones is challenging, but a logical sequel to opinion mining. I suggest that applied computational linguists pursue broader interdisciplinary research on cultural differences an...
Conference Paper
Full-text available
This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference error...
Article
Full-text available
This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies which need to be identified and accounted for to reduce inference error...
Article
We investigate health care provider and lay consumer perspectives in online health communication, information sharing, and use to improve communication that supports healthy everyday life behavior. With Rhetorical Structure Theory analysis, we differentiate discourse structure patterns and communicative goals in provider and consumer answers regard...
Article
The geographic clues embedded in MARC records have the potential to transform the ways in which library materials are searched and accessed. Focussing on cartographic materials, this study examines the MARC fields used to catalogue maps within two Canadian university library systems. Les indices géographiques que contiennent les fiches MARC ont le p...
Article
Full-text available
This research presents the results of a case study on potential users of Cross Language Information Retrieval (CLIR) systems – international students at the University of Western Ontario. The study is designed to test their awareness of Multi-Lingual Information Access (MLIA) tools on the internet and in select electronic databases. The study also i...
Article
Though not new to online gamers, griefing – an act of play intended to cause grief to game players – is understudied in LIS scholarship. We expand on the definition of griefing for library contexts by considering its deceptive elements and examining gamers’ attitudes in a gaming forum and an e-mail survey. Bien que connu des joueurs de jeux vidéo, l...
Article
This paper analyzes naturally occurring descriptions of chance encounters as found in blogs. We develop a model of serendipity that describes facets of the phenomenon and their interconnections, and examine the applicability of this model to accounts of everyday chance encounters. Cet article analyse les occurrences naturelles des descriptions de ha...
Article
Being innovative is a popular but ambiguous maxim in LIS. To elucidate how institutions use, and what they mean by the concept, we examine white literature and survey website features of 160 libraries across US and Canada. We identify patterns in the language and ethos of modern innovative librarianship. Être novateur est une maxime populaire bien q...
Conference Paper
Full-text available
This study analyses 545 sample fanfiction stories (fics) in their stylistic feature variation by popularity and across eleven 'fandoms' in creative writing forums. Lexical richness, average sentence and paragraph lengths are isolated as promising measures for a text classifier to use in predicting a fic's likely popularity in its fandom. Cette ét...
Article
Full-text available
Purpose – Though not new to online gamers, griefing – an act of play intended to cause grief to game players – is fairly understudied in LIS scholarship. The purpose of this paper is to expand the inventory of griefing varieties, consider their deceptive elements and examine attitudes towards the phenomenon. Design/methodology/approach – The author...
Data
Information Manipulation is an umbrella term we use for a variety of distortions that occur in the process of transmitting information in the information channel (between human agents via artifacts and various presentation formats). Extending the classical Shannon-Weaver's model of information transmission, we consider alternative outcomes of the t...
Article
Full-text available
An earlier version of this paper was presented at the 2011 Canadian Association for Information Science conference. The authors wish to thank the Everydayhelth.com forum participants whose publicly available questions and answers illuminate new perspectives on lay and professional health communication. The authors are also grateful for suggestions...
Article
Full-text available
This paper reviews advances in geospatial information systems and applications involving geospatial information and natural language. We discuss the role of geographically aware information access in human information behaviours such as information seeking, retrieval, and use, and highlight the role of automation in enriching current geospatial met...
Article
Full-text available
Recent improvements in effectiveness and accuracy of the emerging field of automated deception detection and the associated potential of language technologies have triggered increased interest in mass media and general public. Computational tools capable of alerting users to potentially deceptive content in computer–mediated messages are invaluable...
Conference Paper
Full-text available
Some researchers have suggested that opportunities for serendipitous discovery of information may be limited in the online environment as a result of technological facilitation of information behavior. In response, they suggest building tools that enhance opportunities for serendipity. Based on our model of everyday serendipity, we offer design sug...
Conference Paper
Full-text available
This research presents the results of a case study on potential users of Cross Language Information Retrieval (CLIR) systems --- international students at a Canadian University. The study is designed to test their awareness of Multi-Lingual Information Access (MLIA) tools on the internet and in select electronic databases. The study investigates ho...
Article
One of the novel research directions in Natural Language Processing and Machine Learning involves creating and developing methods for automatic discernment of deceptive messages from truthful ones. Mistaking intentionally deceptive pieces of information for authentic ones (true to the writer's beliefs) can create negative consequences, since our ev...
Article
This paper extends information quality (IQ) assessment methodology by arguing that veracity/deception should be one of the components of intrinsic IQ dimensions. Since veracity/deception differs contextually from accuracy and other well-studied components of intrinsic IQ, the inclusion of veracity/deception in the set of IQ dimensions has its own c...
Article
Full-text available
The Information Manipulation Classification Theory offers a systematic approach to understanding the differences and similarities among various types of information manipulation (such as falsification, exaggeration, concealment, misinformation or hoax). We distinguish twelve salient factors that manipulation varieties differ by (such as intentional...
Article
Full-text available
Though innovation is a popular theme of LIS literature, its specific meaning for libraries remains obscure. Clarifying the implicit definition of innovation in librarianship can facilitate a more meaningful use of the term. To do so, we employ a ground-up exploration of innovation through the white literature in conjunction with a detailed survey o...
Article
Full-text available
Introduction. This paper explores serendipity in the context of everyday life by analyzing naturally occurring accounts of chance encounters in blogs. Method. We constructed forty-four queries related to accidental encounters to retrieve accounts from GoogleBlog. From among the returned results, we selected fifty-six accounts that provided a rich d...
Article
Full-text available
Though innovation is a popular theme of LIS literature, its specific meaning for libraries remains obscure. Clarifying the implicit definition of innovation in librarianship can facilitate a more meaningful use of the term. To do so, we employ a ground-up exploration of innovation through the white literature in conjunction with a detailed survey...
Article
Deception detection remains novel, challenging, and important in natural language processing, machine learning, and the broader LIS community. Computational tools capable of alerting users to potentially deceptive content in computer-mediated messages are invaluable for supporting undisrupted, computer-mediated communication, information seeking, c...
Article
Full-text available
In this panel we will discuss the importance of knowledge organization and information organization in library and information science curricula and the emerging trends both inside and outside of library and information science which will affect the curriculum in coming years.
Article
Full-text available
Purpose – Conversational agents are natural language interaction interfaces designed to simulate conversation with a real person. This paper seeks to investigate current development and applications of these systems worldwide, while focusing on their availability in Canadian libraries. It aims to argue that it is both timely and conceivable for Can...
Article
Serendipity has received much attention from library and information science, psychology, and computer science. Yet not much is known about serendipity in the context of everyday information behavior. In general, a key challenge in the study of serendipity is obtaining accounts of serendipitous experiences that provide insight into the phenomenon....
Article
We present a comparative study of abstracts and machine-generated summaries. This study bridges two hitherto independent lines of research: the descriptive analyses of abstracts as a genre and the testing of summaries produced by automatic text summarization (ATS). A pilot sample of eight articles was gathered from Library and Information Science A...
Article
Deception in computer-mediated communication is defined as a message knowingly and intentionally transmitted by a sender to foster a false belief or conclusion by the perceiver. Stated beliefs about deception and deceptive messages or incidents are content analyzed in a sample of 324 computer-mediated communications. Relevant stated beliefs are obt...
Article
This article introduces a type of uncertainty that resides in textual information and requires epistemic interpretation on the information seeker’s part. Epistemic modality, as defined in linguistics and natural language processing, is a writer’s estimation of the validity of propositional content in texts. It is an evaluation of chances that a cer...
Conference Paper
Full-text available
This paper defines a concept of “trust incident accounts” as verbal reports of empirical episodes in which a trustor has reached a state of positive or negative expectations of a trustee’s behavior under associated risks. Such expectations are equated to trust and distrust. Correspondingly, and present a sharp contrast with hypocritical use of trus...
Conference Paper
Full-text available
Texts exhibit subtle yet identifiable modality about writers' estimation of how true each statement is (e.g., definitely true or somewhat true). This study is an analysis of such explicit certainty and doubt markers in epistemically modalized statements for a written news discourse. The study systematically accounts for five levels of writer's...
Conference Paper
Full-text available
Thesis
Full-text available
This study empirically derives a framework for analyzing certainty about written propositions. CERTAINTY, or EPISTEMIC MODALITY, is a linguistic expression of an estimation of the likelihood that a particular state of affairs is, has been, or will be true. The study describes how explicitly marked certainty can be predictably and dependably identi...
Chapter
Full-text available
This chapter presents a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. Our contribution is in a proposed categorization model and analytical framework for certainty identification. Certainty is presented as a type of subjective information available in text...
Conference Paper
Full-text available
Credibility is a perceived quality and is evaluated with at least two major components: trustworthiness and expertise. Weblogs (or blogs) are a potentially fruitful genre for exploration of credibility assessment due to public disclosure of information that might reveal trustworthiness and expertise by webloggers (or bloggers) and availability of a...
Article
Full-text available
The huge increase in volume of online literature has led to a parallel surge in research into methods for retrieving meaningful information from this textual data—"content extraction" has emerged as a prominent field in natural language computing. However, little progress has as yet been made in determining the pragmatic content of a document, 'hi...
Article
Full-text available
We present an empirically verified model of discernable emotions, Watson and Tellegen’s Circumplex Theory of Affect from social and personality psychology, and suggest its usefulness in NLP as a potential model for an automation of an eight-fold categorization of emotions in written English texts. We developed a data collection tool based on the mo...
Article
Full-text available
We present a theoretical framework and preliminary results for manual categorization of explicit certainty information in 32 English newspaper articles. The explicit certainty markers were identified and categorized according to the four hypothesized dimensions – perspective, focus, timeline, and level of certainty. One hundred twenty one sentences...
Article
Full-text available
The authors describe the difficulties of translating classifications from a source language and culture to another language and culture. To demonstrate these problems, kinship terms and concepts from native speakers of fourteen languages were collected and analyzed to find differences between their terms and structures and those used in English. Us...

Questions

Questions (30)
Question
What's your go-to book on misinformation, disinformation, or fake news? If you have one, please share. Also, why do you like it? How recent is it? Is there anything amiss in that favourite book of yours?
Do you care to read about broader or more specific issues in how information pollution, toxicity, or manipulations in online news and/or social media relate to your professional or everyday life?
I'm very curious about your perspective, if you'd like to share.
Thanks much!
Question
I'm curious to hear from practitioners in the newsroom. 1. Are there any mundane tasks that you are tired of doing and wish you could off-load to some "magic wand" technology? 2. What would you imagine it doing for you? If you had a "crystal ball", what would you wish it were able to tell you to do your job better? (No, not the future! Rather, from the realm of what's known as of now, or from the past, no matter how recent.) Thanks for your insights!
Question
While the media hype about "fake news" may have died down in North America a bit, the issue of various kinds of viral deception in digital environments (or CMC) has not gone away. Not in North America, nor worldwide. I'm looking to see which research institutions, labs, or perhaps IT or social media companies (beyond the obvious major players like Facebook and Twitter), have picked up the issue and made it their priority for research and development since the 2016 US Elections made it "big news". If you have seen (or produced) any reports or publications on fake news identification, fact-checking or news verification (in both scientific conferences/journals and media reports of those), I'd very much appreciate being alerted! Are you aware of any ongoing projects which are relevant to combating "fake news" and improving news quality and credibility? Please send me a link, no matter what language the information you find is in. Thank you very much!
Question
I'm looking for cases of justified deception. If you see anything reported in the news, would you kindly provide a link? If you have a story to tell or an opinion to share, I'd also be very curious.
Is it always (morally) wrong to lie?
An example may include an episode on This American Life "In Defense of Ignorance" in which "Lulu Wang tells the story of an elaborate attempt to keep someone ignorant — her grandmother — and how her family pulled it off". The grandmother (who lived in China) was not told of her terminal illness. One important fact was concealed from her: she had cancer and her doctor predicted she only had 6 months to live. This morally debatable act of withholding the diagnosis (at least in the North American context of the 21st century) is apparently customary in China, and some other parts of the world (Russia, for instance). Apparently it was common in patient care in Canada in the 1950s as well. The lie is presented as justifiable since the Chinese grandmother lived another 3 years after the diagnosis, but who knows how the knowledge of her terminal illness would have impacted her, had she been told the prognosis. This is just one example.
Another one is found in the Guardian, by an American philosopher (based in the UK), James Garvey, in his article "Peter Gleick lied, but was it justified by the wider good?" (Feb. 27, 2012).
I'm aware of the philosophical debate on whether lies are justifiable (e.g., the murderer at the doorstep question: would you lie about your family members sleeping in the house?). But what I'm looking for is recent examples documented in the press in which lying may be acceptable for a reason. I would much appreciate the help of the ResearchGate community to track them down. Thank you very much!
VR.

Projects

Project (1)
Project
How do you tell when a text is deceptive? How do you tell when news is fake?