Christopher M. Danforth's research while affiliated with University of Vermont and other places

Publications (272)

Article
Full-text available
The COVID-19 pandemic disrupted the mobility patterns of a majority of Americans beginning in March 2020. Despite the beneficial, socially distanced activity offered by outdoor recreation, confusing and contradictory public health messaging complicated access to natural spaces. Working with a dataset comprising the locations of roughly 50 million d...
Preprint
While recent studies have focused on quantifying word usage to find the overall shapes of narrative emotional arcs, certain features of narratives within narratives remain to be explored. Here, we characterize the narrative time scale of sub-narratives by finding the length of text at which fluctuations in word usage begin to be relevant. We repres...
Article
Full-text available
The U.S. stock market is one of the largest and most complex marketplaces in the global financial system. Over the past several decades, this market has evolved at multiple structural and temporal scales. New exchanges became active, and others stopped trading, regulations have been introduced and adapted, and technological innovations have pushed...
Article
Medical systems in general, and patient treatment decisions and outcomes in particular, can be affected by bias based on gender and other demographic elements. As language models are increasingly applied to medicine, there is a growing interest in building algorithmic fairness into processes impacting patient care. Much of the work addressing this...
Preprint
Full-text available
The COVID-19 pandemic disrupted the mobility patterns of a majority of Americans beginning in March 2020. Despite the beneficial, socially distanced activity offered by outdoor recreation, confusing and contradictory public health messaging complicated access to natural spaces. Working with a dataset comprising the locations of roughly 50 million d...
Article
Background: Mental health challenges are thought to affect approximately 10% of the global population each year, with many of those affected going untreated because of the stigma and limited access to services. As social media lowers the barrier for joining difficult conversations and finding supportive groups, Twitter is an open source of languag...
Article
Full-text available
The relationship between nature contact and mental well-being has received increasing attention in recent years. While a body of evidence has accumulated demonstrating a positive relationship between time in nature and mental well-being, there have been few studies comparing this relationship in different locations over long periods of time. In thi...
Article
Full-text available
When building a global brand of any kind—a political actor, clothing style, or belief system— developing widespread awareness is a primary goal. Short of knowing any of the stories or products of a brand, being talked about in whatever fashion—raw fame—is, as Oscar Wilde would have it, better than not being talked about at all. Here, we measure, ex...
Article
Full-text available
We explore the relationship between context and happiness scores in political tweets using word co-occurrence networks, where nodes in the network are the words, and the weight of an edge is the number of tweets in the corpus for which the two connected words co-occur. In particular, we consider tweets with hashtags #imwithher and #crookedhillary,...
Article
Objective: Consumption of traditional and social media markedly increased at the start of the COVID-19 pandemic as new information about the virus and safety guidelines evolved. Much of the information concerned restrictions on daily living activities and the risk posed by the virus. The term doomscrolling is used to describe the phenomenon of ele...
Article
Full-text available
Sentiment-aware intelligent systems are essential to a wide array of applications. These systems are driven by language models which broadly fall into two paradigms: Lexicon-based and contextual. Although recent contextual models are increasingly dominant, we still see demand for lexicon-based models because of their interpretability and ease of us...
Article
Full-text available
Measuring the specific kind, temporal ordering, diversity, and turnover rate of stories surrounding any given subject is essential to developing a complete reckoning of that subject’s historical impact. Here, we use Twitter as a distributed news and opinion aggregation source to identify and track the dynamics of the dominant day-scale stories arou...
Article
Full-text available
Working from a dataset of 118 billion messages running from the start of 2009 to the end of 2019, we identify and explore the relative daily use of over 150 languages on Twitter. We find that eight languages comprise 80% of all tweets, with English, Japanese, Spanish, Arabic, and Portuguese being the most dominant. To quantify social spreading in e...
Article
Full-text available
A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content. However, collapsing the texts’ rich stories into a single number is often conceptually perilous, and it is difficult to confidently interpret interesting or unexpected textual patterns...
Preprint
We define `ousiometrics' to be the study of essential meaning in whatever context that meaningful signals are communicated, and `telegnomics' as the study of remotely sensed knowledge. From work emerging through the middle of the 20th century, the essence of meaning has become generally accepted as being well captured by the three orthogonal dimens...
Preprint
Full-text available
We explore the relationship between context and happiness scores in political tweets using word co-occurrence networks, where nodes in the network are the words, and the weight of an edge is the number of tweets in the corpus for which the two connected words co-occur. In particular, we consider tweets with hashtags #imwithher and #crookedhillary,...
Preprint
BACKGROUND Mental health challenges are thought to affect approximately 10% of the global population each year, with many of those affected going untreated because of the stigma and limited access to services. As social media lowers the barrier for joining difficult conversations and finding supportive groups, Twitter is an open source of language...
Preprint
Sentiment-aware intelligent systems are essential to a wide array of applications including marketing, political campaigns, recommender systems, behavioral economics, social psychology, and national security. These sentiment-aware intelligent systems are driven by language models which broadly fall into two paradigms: 1. Lexicon-based and 2. Contex...
Article
Full-text available
Sleep loss has been linked to heart disease, diabetes, cancer, and an increase in accidents, all of which are among the leading causes of death in the United States. Population-scale sleep studies have the potential to advance public health by helping to identify at-risk populations, changes in collective sleep patterns, and to inform policy change...
Preprint
Proverbs are an essential component of language and culture, and though much attention has been paid to their history and currency, there has been comparatively little quantitative work on changes in the frequency with which they are used over time. With wider availability of large corpora reflecting many diverse genres of documents, it is now poss...
Preprint
The forecasting of political, economic, and public health indicators using internet activity has demonstrated mixed results. For example, while some measures of explicitly surveyed public opinion correlate well with social media proxies, the opportunity for profitable investment strategies to be driven solely by sentiment extracted from social medi...
Article
Full-text available
In real time, Twitter strongly imprints world events, popular culture, and the day-to-day, recording an ever-growing compendium of language change. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing curation of ov...
Preprint
Full-text available
The murder of George Floyd by police in May 2020 sparked international protests and renewed attention in the Black Lives Matter movement. Here, we characterize ways in which the online activity following George Floyd's death was unparalleled in its volume and intensity, including setting records for activity on Twitter, prompting the saddest day in...
Article
Full-text available
In September 2017, Hurricane Maria made landfall across the Caribbean region as a category 4 storm. In the aftermath, many residents of Puerto Rico were without power or clean running water for nearly a year. Using both English and Spanish tweets from September 16 to October 15 2017, we investigate discussion of Maria both on and off the island, co...
Preprint
Full-text available
Data scientists across disciplines are increasingly in need of exploratory analysis tools for data sets with a high volume of features. We expand upon graph mining approaches for exploratory analysis of high-dimensional data to introduce Sirius, a visualization package for researchers to explore feature relationships among mixed data types using mu...
Preprint
Mental health challenges are thought to afflict around 10% of the global population each year, with many going untreated due to stigma and limited access to services. Here, we explore trends in words and phrases related to mental health through a collection of 1- , 2-, and 3-grams parsed from a data stream of roughly 10% of all English tweets since...
Article
Full-text available
We study collective attention paid towards hurricanes through the lens of n -grams on Twitter, a social media platform with global reach. Using hurricane name mentions as a proxy for awareness, we find that the exogenous temporal dynamics are remarkably similar across storms, but that overall collective attention varies widely even among storms cau...
Preprint
Evolving out of a gender-neutral framing of an involuntary celibate identity, the concept of `incels' has come to refer to an online community of men who bear antipathy towards themselves, women, and society-at-large for their perceived inability to find and maintain sexual relationships. By exploring incel language use on Reddit, a global online m...
Article
Full-text available
The past decade has witnessed a marked increase in the use of social media by politicians, most notably exemplified by the 45th President of the United States (POTUS), Donald Trump. On Twitter, POTUS messages consistently attract high levels of engagement as measured by likes, retweets, and replies. Here, we quantify the balance of these activities...
Preprint
Consumption of traditional and social media markedly increased at the start of the COVID-19 pandemic as new information about the virus and safety guidelines evolved. Much of the information concerned restrictions on daily living activities and the risk posed by the virus. The term ``doomscrolling'' was used to describe the phenomenon of elevated n...
Article
Full-text available
Human mortality is in part a function of multiple socioeconomic factors that differ both spatially and temporally. Adjusting for other covariates, the human lifespan is positively associated with household wealth. However, the extent to which mortality in a geographical region is a function of socioeconomic factors in both that region and its neigh...
Preprint
Medical systems in general, and patient treatment decisions and outcomes in particular, are affected by bias based on gender and other demographic elements. As language models are increasingly applied to medicine, there is a growing interest in building algorithmic fairness into processes impacting patient care. Much of the work addressing this que...
Preprint
Full-text available
Sleep loss has been linked to heart disease, diabetes, cancer, and an increase in accidents, all of which are among the leading causes of death in the United States. Population-scale sleep studies have the potential to advance public health by helping to identify at-risk populations, changes in collective sleep patterns, and to inform policy change...
Article
Full-text available
In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as...
Article
Full-text available
Abstract We introduce a qualitative, shape-based, timescale-independent time-domain transform used to extract local dynamics from sociotechnical time series—termed the Discrete Shocklet Transform (DST)—and an associated similarity search routine, the Shocklet Transform And Ranking (STAR) algorithm, that indicates time windows during which panels of...
Article
Chimera states — the coexistence of synchrony and asynchrony in a nonlocally-coupled network of identical oscillators — are often used as a model framework for epileptic seizures. Here, we explore the dynamics of chimera states in a network of modified Hindmarsh–Rose neurons, configured to reflect the graph of the mesoscale mouse connectome. Our mo...
Preprint
Real-world complex systems often comprise many distinct types of elements as well as many more types of networked interactions between elements. When the relative abundances of types can be measured well, we further observe heavy-tailed categorical distributions for type frequencies. For the comparison of type frequency distributions of two systems...
Preprint
Maintaining the integrity of long-term data collection is an essential scientific practice. As a field evolves, so too will that field's measurement instruments and data storage systems, as they are invented, improved upon, and made obsolete. For data streams generated by opaque sociotechnical systems which may have episodic and unknown internal ru...
Preprint
Measuring the specific kind, temporal ordering, diversity, and turnover rate of stories surrounding any given subject is essential to developing a complete reckoning of that subject's historical impact. Here, we use Twitter as a distributed news and opinion aggregation source to identify and track the dynamics of the dominant day-scale stories arou...
Preprint
A common task in computational text analyses is to quantify how two corpora differ according to a measurement like word frequency, sentiment, or information content. However, collapsing the texts' rich stories into a single number is often conceptually perilous, and it is difficult to confidently interpret interesting or unexpected textual patterns...
Preprint
In real-time, Twitter strongly imprints world events, popular culture, and the day-to-day; Twitter records an ever growing compendium of language use and change; and Twitter has been shown to enable certain kinds of prediction. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spread...
Preprint
In September 2017, Hurricane Mar\'ia made landfall across the Caribbean region as a category 4 storm. In the aftermath, many residents of Puerto Rico were without power or clean running water for nearly a year. Using both English and Spanish tweets from September 16 to October 15 2017, we investigate discussion of Mar\'ia both on and off the island...
Preprint
Full-text available
The relationship between nature contact and mental well-being has received increasing attention in recent years. While a body of evidence has accumulated demonstrating a positive relationship between time in nature and mental well-being, there have been few studies comparing this relationship in different locations over long periods of time. In thi...
Preprint
Human mortality is in part a function of multiple socioeconomic factors that differ both spatially and temporally. Adjusting for other covariates, the human lifespan is positively associated with household wealth. However, the extent to which mortality in a geographical region is a function of socioeconomic factors in both that region and its neigh...
Preprint
The past decade has witnessed a marked increase in the use of social media by politicians, most notably exemplified by the 45th President of the United States (POTUS), Donald Trump. On Twitter, POTUS messages consistently attract high levels of engagement as measured by likes, retweets, and replies. Here, we quantify the balance of these activities...
Article
Full-text available
Stretched words like ‘heellllp’ or ‘heyyyyy’ are a regular feature of spoken language, often used to emphasize or exaggerate the underlying meaning of the root word. While stretched words are rarely found in formal written language and dictionaries, they are prevalent within social media. In this paper, we examine the frequency distributions of ‘st...
Preprint
Sleep loss has been linked to heart disease, diabetes, cancer, and an increase in accidents, all of which are among the leading causes of death in the United States. Population-scale sleep studies have the potential to advance public health by helping to identify at-risk populations, changes in collective sleep patterns, and to inform policy change...
Preprint
Using a random 10% sample of tweets authored from 2019-09-01 through 2020-03-25, we analyze the dynamic behavior of words (1-grams) used on Twitter to describe the ongoing COVID-19 pandemic. Across 24 languages, we find two distinct dynamic regimes: One characterizing the rise and subsequent collapse in collective attention to the initial Coronavir...
Preprint
We study collective attention paid towards hurricanes through the lens of $n$-grams on Twitter, a social media platform with global reach. Using hurricane name mentions as a proxy for awareness, we find that the exogenous temporal dynamics are remarkably similar across storms, but that overall collective attention varies widely even among storms ca...
Preprint
In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as...
Preprint
Working from a dataset of 118 billion messages running from the start of 2009 to the end of 2019, we identify and explore the relative daily use of over 150 languages on Twitter. We find that eight languages comprise 80% of all tweets, with English, Japanese, Spanish, and Portuguese being the most dominant. To quantify each language's level of bein...
Preprint
Complex systems often comprise many kinds of components which vary over many orders of magnitude in size: Populations of cities in countries, individual and corporate wealth in economies, species abundance in ecologies, word frequency in natural language, and node degree in complex networks. Comparisons of component size distributions for two compl...
Article
Foreign power interference in domestic elections is an existential threat to societies. Manifested through myriad methods from war to words, such interference is a timely example of strategic interaction between economic and political agents. We model this interaction between rational game players as a continuous-time differential game, constructin...
Article
Full-text available
Using the most comprehensive source of commercially available data on the US National Market System, we analyze all quotes and trades associated with Dow 30 stocks in calendar year 2016 from the vantage point of a single and fixed frame of reference. We find that inefficiencies created in part by the fragmentation of the equity marketplace are rela...
Article
Objective: Serious illness conversations are complex clinical narratives that remain poorly understood. Natural Language Processing (NLP) offers new approaches for identifying hidden patterns within the lexicon of stories that may reveal insights about the taxonomy of serious illness conversations. Methods: We analyzed verbatim transcripts from...
Preprint
When building a global brand of any kind---a political actor, clothing style, or belief system---developing widespread awareness is a primary goal. Short of knowing any of the stories or products of a brand, being talked about in whatever fashion---raw fame---is, as Oscar Wilde would have it, better than not being talked about at all. Here, we meas...
Article
Full-text available
With more people living in cities, we are witnessing a decline in exposure to nature. A growing body of research has demonstrated an association between nature contact and improved mood. Here, we used Twitter and the Hedonometer, a world analysis tool, to investigate how sentiment, or the estimated happiness of the words people write, varied before...
Preprint
Foreign power interference in domestic elections is an age-old, existential threat to societies. Manifested through myriad methods from war to words, such interference is a timely example of strategic interaction between economic and political agents. We model this interaction between rational game players as a continuous-time differential game, co...
Preprint
Full-text available
Chimera states---the coexistence of synchrony and asynchrony in a nonlocally-coupled network of identical oscillators---are often used as a model framework for epileptic seizures. Here, we explore the dynamics of chimera states in a network of modified Hindmarsh-Rose neurons, configured to reflect the graph of the mesoscale mouse connectome. Our mo...
Preprint
Full-text available
This project examined perceptions of the vegan lifestyle using surveys and social media to explore barriers to choosing veganism. A survey of 510 individuals indicated that non-vegans did not believe veganism was as healthy or difficult as vegans. In a second analysis, Instagram posts using #vegan suggest content is aimed primarily at the female ve...
Preprint
Stretched words like `heellllp' or `heyyyyy' are a regular feature of spoken language, often used to emphasize or exaggerate the underlying meaning of the root word. While stretched words are rarely found in formal written language and dictionaries, they are prevalent within social media. In this paper, we examine the frequency distributions of `st...
Preprint
We introduce an unsupervised pattern recognition algorithm termed the Discrete Shocklet Transform (DST) by which local dynamics of time series can be extracted. Time series that are hypothesized to be generated by underlying deterministic mechanisms have significantly different DSTs than do purely random null models. We apply the DST to a sociotech...
Article
Full-text available
Natural hazards are becoming increasingly expensive as climate change and development are exposing communities to greater risks. Preparation and recovery are critical for climate change resilience, and social media are being used more and more to communicate before, during, and after disasters. While there is a growing body of research aimed at und...
Data
Time-series frequency of all selected terms during Hurricane Irene at 1, 3, 12, and 24-hour resolutions. (PDF)
Data
Time-series frequency of all selected terms during Louisiana flooding at 1, 3, 12, and 24-hour resolutions. (PDF)
Data
Time-series frequency of all selected terms during Midwest/Southeast tornadoes at 1, 3, 12, and 24-hour resolutions. (PDF)