Nicholas Vincent's research while affiliated with Northwestern University and other places

Publications (18)

Preprint
Retracted scientific articles about COVID-19 vaccines have proliferated false claims about vaccination harms and discouraged vaccine acceptance. Our study analyzed the topical content of 4,876 English-language tweets about retracted COVID-19 vaccine research and found that 27.4% of tweets contained retraction-related misinformation. Misinformed twe...
Preprint
Retracted research discussed on social media can spread misinformation, yet we lack an understanding of how retracted articles are mentioned by academic and non-academic users. This is especially relevant on Twitter due to the platform's prominent role in science communication. Here, we analyze the pre- and post-retraction differences in Twitter eng...
Article
In this preview, we highlight what we believe to be the major contributions of the review and discuss opportunities to build on the work, including by closely examining the incentive structures that contribute to our dataset culture and by further engaging with other disciplines.
Article
A growing body of work has highlighted the important role that Wikipedia's volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of hundreds of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pa...
Article
Tech users currently have limited ability to act on concerns regarding the negative societal impacts of large tech companies. However, recent work suggests that users can exert leverage using their role in the generation of valuable data, for instance by withholding their data contributions to intelligent technologies. We propose and evaluate a new...
Preprint
Many powerful computing technologies rely on data contributions from the public. This dependency suggests a potential source of leverage: by reducing, stopping, redirecting, or otherwise manipulating data contributions, people can influence and impact the effectiveness of these technologies. In this paper, we synthesize emerging research that helps...
Preprint
A growing body of work has highlighted the important role that Wikipedia's volunteer-created content plays in helping search engines achieve their core goal of addressing the information needs of millions of people. In this paper, we report the results of an investigation into the incidence of Wikipedia links in search engine results pages (SERPs)....
Preprint
Identifying strategies to more broadly distribute the economic winnings of AI technologies is a growing priority in HCI and other fields. One idea gaining prominence centers on "data dividends", or sharing the profits of AI technologies with the people who generated the data on which these technologies rely. Despite the rapidly growing discussion a...
Article
Researchers and the media have become increasingly interested in protest users, or people who change (protest use) or stop (protest non-use) their use of a company's products because of the company's values and/or actions. Past work has extensively engaged with the phenomenon of technology non-use but has not focused on non-use (nor changed use) in...
Article
Search engines are some of the most popular and profitable intelligent technologies in existence. Recent research, however, has suggested that search engines may be surprisingly dependent on user-created content like Wikipedia articles to address user information needs. In this paper, we perform a rigorous audit of the extent to which Google levera...
Preprint
Search engines are some of the most popular and profitable intelligent technologies in existence. Recent research, however, has suggested that search engines may be surprisingly dependent on user-created content like Wikipedia articles to address user information needs. In this paper, we perform a rigorous audit of the extent to which Google levera...
Conference Paper
The public is increasingly concerned about the practices of large technology companies with regard to privacy and many other issues. To force changes in these practices, there have been growing calls for “data strikes.” These new types of collective action would seek to create leverage for the public by starving business-critical models (e.g. reco...
Article
In many traditional labor markets, women earn less on average compared to men. However, it is unclear whether this discrepancy persists in the online gig economy, which bears important differences from the traditional labor market (e.g., more flexible work arrangements, shorter-term engagements, reputation systems). In this study, we collected self...
Conference Paper
The extensive Wikipedia literature has largely considered Wikipedia in isolation, outside of the context of its broader Internet ecosystem. Very recent research has demonstrated the significance of this limitation, identifying critical relationships between Google and Wikipedia that are highly relevant to many areas of Wikipedia-based research and...

Citations

... One of the most studied platforms are web search engines - almost half of the auditing works reviewed by Bandy [5] were focused on Google alone - as a plethora of concerns have been raised about representation, biases, copyrights, transparency and discrepancies in their outputs. Research has analysed issues in areas such as elections [6][7][8][9][10][11], filter bubbles [12][13][14][15][16][17], personalised results [18,19], gender and race biases [20][21][22], health [23][24][25], source concentration [10,26,27,28,29], misinformation [30], historical information [31,32] and dependency on user-generated content [33,34]. ...
... Even after the dataset is withdrawn, the dataset can still be found on the Internet, and the dataset can still potentially be used by machine learning systems [30]. A great deal of machine learning systems using the withdrawn dataset has been criticized by the academic community [6], [9], [36]. However, in the application community of machine learning systems, or instead the industry, perceptions related to the disappearance of datasets due to licensing of datasets to the unavailability of the system have proliferated [2], [38]. ...
... In the last 12 months, more than a dozen AFR tools have been proposed (e.g., [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]). While most are constrained to research prototypes, a few of these tools have produced public software releases and gained significant media attention [19], [22], [33]. ...
... During the following years, Google introduced other content highly dependable on the search topic and variable in coverage and quality [37]. It is common to have this panel populated by Wikipedia information [60]. The graphics of this element was stable over time, following the improvements in Google's interface design. ...
... Ensuring fairness requires a critical look at how inclusive is the approach in factoring demographics and local context in the development process. Through data generated leverage [31], the public can exert a certain degree of power to tackle algorithmic unfairness by demanding changes or neutralising societal power imbalances [9,12,25,24,32]. ...
... While many volunteer-supported, for-profit technologies achieve great financial success, they set up inequitable power structures in the technology sector. Online volunteers who provide the crucial labor supporting these companies are subject to worsening working conditions (Matias, 2016), monetization without consent (Arrieta-Ibarra et al., 2018; Li et al., 2019; Vincent et al., 2021), and potentially exploitation (Terranova, 2000). More broadly, online volunteers often have little power to shape the technology they co-create with for-profit companies (Vincent et al., 2021). ...
... Creager and Zemel [2021] show how algorithmic recourse can be improved through coordination. Vincent et al. [2019] examine the effectiveness of data strikes. Extending this work to the notion of data leverage, describe various ways of "reducing, stopping, redirecting, or otherwise manipulating data contributions" for different purposes. ...
... With the exception of the persona in Scenario 3, who reflects the common characteristics of food deliverers (i.e. male, young, and of an immigrant background [84]), the demographics of characters in our stories are intentionally non-representative of the general gig work population to encourage the consideration of more marginalized populations of laborers (women, elders, etc.), who often face issues such as bias, harassment, and pay gaps, all of which intersect with algorithmic control [9,41,42,61,84]. ...
... In recent years, through methods including API and web scraping, we have been able to acquire longitudinal data on online communities and study the long-term institutional development of many thousands of groups at the same time [45,47,48,49,50,51]. In this research, we monitored over 20,000 Minecraft servers, which allow various user activities, including building with blocks, gathering resources, and interacting with each other. ...