Cecilia R. Aragon

University of Washington | UW · Department of Human Centered Design and Engineering

PhD

About

258
Publications
56,841
Reads
8,737
Citations
Introduction
Cecilia R. Aragon is a Professor in the Department of Human Centered Design and Engineering, University of Washington, Seattle. Her research focuses on human-centered data science/AI, enabling humans to explore and gain insight from vast data sets. Current projects include visual analytics of large text data, games for good, and aviation safety.
Additional affiliations
January 2005 - December 2011
Education
December 2004 - December 2004
University of California, Berkeley
Field of study
  • Computer Science

Publications (258)
Conference Paper
Full-text available
There has been much recent interest in the development of tools to foster remote collaboration and shared creative work. An open question is: what are the guidelines for this process? What are the key socio-technical preconditions required for a geographically distributed group to collaborate effectively on creative work, and are they different fro...
Conference Paper
Full-text available
A randomized strategy for maintaining balance in dynamically changing search trees that has optimal expected behavior is presented. In particular, in the expected case an update takes logarithmic time and requires fewer than two rotations. Moreover, the update time remains logarithmic, even if the cost of a rotation is taken to be proportional to t...
Conference Paper
Full-text available
Computational and experimental sciences produce and collect ever-larger and complex datasets, often in large-scale, multi-institution projects. The inability to gain insight into complex scientific phenomena using current software tools is a bottleneck facing virtually all endeavors of science. In this paper, we introduce Sunfall, a collaborative...
Article
The Latinx diaspora in the United States is a rapidly growing and complex demographic who face intersectional harms and marginalizations in sociotechnical systems and are currently underserved in CSCW research. While the field understands that algorithms and digital content are experienced differently by marginalized populations, more investigation...
Preprint
Full-text available
The Latinx diaspora in the United States is a rapidly growing and complex demographic who face intersectional harms and marginalizations in sociotechnical systems and are currently underserved in CSCW research. While the field understands that algorithms and digital content are experienced differently by marginalized populations, more investigation...
Article
The success of online communities is driven by continuous active community participation, but motivating silent members (lurkers) to participate might push them faster than they are prepared for and demand extra labor from active members. This article explores user participation in online fan fiction communities through empirical interviews of once...
Article
Full-text available
In this article we discuss workplace contexts for post-secondary STEM faculty and staff with disabilities and provide pragmatic and immediate steps we can all take to advance equitable disability-informed policies and practices. Findings: Faculty with apparent and/or unseen disabilities experience myriad barriers in their academic workplaces, barri...
Chapter
Why have online artist communities largely rejected AI image generators when they have embraced other technologies? We focus on cooperative design and community values to first frame these communities as digital counterculture akin to transformative fandom. Then, we use this framework to explore the online art community’s core principles in order t...
Chapter
Machine Learning is a powerful tool, but it also has a great potential to cause harm if not approached carefully. Designers must be reflexive and aware of their algorithms’ impacts, and one such way of reflection is known as human-centered machine learning. In this paper, we approach a classical problem that has been approached through ML - sentime...
Chapter
Qualitative coding of large datasets has been a valuable tool for qualitative researchers. In terms of inter-rater reliability, existing metrics have not evolved to fit current approaches, presenting a variety of restrictions. In this paper, we propose Generalized Cohen’s kappa, a novel IRR metric that can be applied in a variety of qualitative cod...
Chapter
As people continue to develop friendships over the Internet in greater numbers than in-person, the complex factors behind them become important to study. One such factor is emotional expression, and we are motivated to better understand how it plays a role in both continuing existing and building new friendships. In this study, we examined the role...
Article
Full-text available
We apply the color–magnitude intercept calibration method (CMAGIC) to the Nearby Supernova Factory SNe Ia spectrophotometric data set. The currently existing CMAGIC parameters are the slope and intercept of a straight line fit to the linear region in the color–magnitude diagram, which occurs over a span of approximately 30 days after maximum bright...
Article
Full-text available
We calibrate spectrophotometric optical spectra of 32 stars commonly used as standard stars, referenced to 14 stars already on the Hubble Space Telescope–based CALSPEC flux system. Observations of CALSPEC and non-CALSPEC stars were obtained with the SuperNova Integral Field Spectrograph over the wavelength range 3300–9400 Å as calibration for the N...
Article
This exploratory study investigates the use of game development as a speculative activity to teach data science ethics incorporating the Directed Research Groups (DRG) format, which decentralizes classroom dynamics, emulates real‐life working environments, and offers students creative choices driven by their own interests. This DRG focuses on creati...
Preprint
We apply the color-magnitude intercept calibration method (CMAGIC) to the Nearby Supernova Factory SNe Ia spectrophotometric dataset. The currently existing CMAGIC parameters are the slope and intercept of a straight line fit to the first linear region in the color-magnitude diagram, which occurs over a span of approximately 30 days after maximum b...
Article
Full-text available
We construct a physically parameterized probabilistic autoencoder (PAE) to learn the intrinsic diversity of Type Ia supernovae (SNe Ia) from a sparse set of spectral time series. The PAE is a two-stage generative model, composed of an autoencoder that is interpreted probabilistically after training using a normalizing flow. We demonstrate that the...
Preprint
Full-text available
We construct a physically-parameterized probabilistic autoencoder (PAE) to learn the intrinsic diversity of type Ia supernovae (SNe Ia) from a sparse set of spectral time series. The PAE is a two-stage generative model, composed of an Auto-Encoder (AE) which is interpreted probabilistically after training using a Normalizing Flow (NF). We demonstra...
Preprint
Full-text available
We calibrate spectrophotometric optical spectra of 32 stars commonly used as standard stars, referenced to 14 stars already on the HST-based CALSPEC flux system. Observations of CALSPEC and non-CALSPEC stars were obtained with the SuperNova Integral Field Spectrograph over the wavelength range 3300 Å to 9400 Å as calibration for the Nearby Supernov...
Article
Full-text available
While there is increasing global attention to data privacy, most of their current theoretical understanding is based on research conducted in a few countries. Prior work argues that people's cultural backgrounds might shape their privacy concerns; thus, we could expect people from different world regions to conceptualize them in diverse ways. We co...
Book
Best practices for addressing the bias and inequality that may result from the automated collection, analysis, and distribution of large datasets. Human-centered data science is a new interdisciplinary field that draws from human-computer interaction, social science, statistics, and computational techniques. This book, written by founders of the f...
Preprint
Full-text available
While there is increasing global attention to data privacy, most of their current theoretical understanding is based on research conducted in a few countries. Prior work argues that people's cultural backgrounds might shape their privacy concerns; thus, we could expect people from different world regions to conceptualize them in diverse ways. We co...
Chapter
The recent rise in online education and the accompanying difficulties encountered by both students and educators demonstrate the value of better understanding how online environments can facilitate learning and community. Fanfiction websites contain an enormous amount of original creative writing, primarily written by young people, offering an oppo...
Preprint
Full-text available
Relationships form the core of connected learning. In this study, we apply and extend social network analysis methods to uncover the layered network structure of relationships among Fanfiction.net authors and reviewers. Fanfiction.net, one of the world's largest fanfiction communities, is a space where millions of young people engage with written m...
Preprint
We show how spectra of Type Ia supernovae (SNe Ia) at maximum light can be used to improve cosmological distance estimates. In a companion article, we used manifold learning to build a three-dimensional parameterization of the intrinsic diversity of SNe Ia at maximum light that we call the "Twins Embedding". In this article, we discuss how the Twin...
Preprint
We study the spectral diversity of Type Ia supernovae (SNe Ia) at maximum light using high signal-to-noise spectrophotometry of 173 SNe Ia from the Nearby Supernova Factory. We decompose the diversity of these spectra into different extrinsic and intrinsic components, and we construct a nonlinear parameterization of the intrinsic diversity of SNe I...
Article
We study the spectral diversity of Type Ia supernovae (SNe Ia) at maximum light using high signal-to-noise spectrophotometry of 173 SNe Ia from the Nearby Supernova Factory. We decompose the diversity of these spectra into different extrinsic and intrinsic components, and we construct a nonlinear parameterization of the intrinsic diversity of SNe I...
Article
We show how spectra of Type Ia supernovae (SNe Ia) at maximum light can be used to improve cosmological distance estimates. In a companion article, we used manifold learning to build a three-dimensional parameterization of the intrinsic diversity of SNe Ia at maximum light that we call the “Twins Embedding.” In this article, we discuss how the Twin...
Article
Educational games, particularly those that encourage collaboration with peers and focus on social and ethical issues, may be powerful in improving retention of human computer interaction (HCI) and human centered data science (HCDS) concepts among young people by providing strong emotional experiences. Further, games have the potential of reachin...
Article
Full-text available
As part of an on-going effort to identify, understand and correct for astrophysics biases in the standardization of Type Ia supernovae (SN Ia) for cosmology, we have statistically classified a large sample of nearby SNe Ia into those that are located in predominantly younger or older environments. This classification is based on the specific star f...
Article
The Nearby Supernova Factory presents an interim data release of spectrophotometric timeseries of 210 SNe Ia. Two slightly different versions of the data are included, corresponding to the training data sets used for the SNEMO and SUGAR Type Ia models. The data has been shifted to the restframe and is blinded with respect to cosmology.
Preprint
Full-text available
The Nearby Supernova Factory has made spectrophotometric observations of Type Ia supernovae since 2004. This work presents an interim version of the data produced, including 210 supernovae observed between 2004 and 2013.
Article
Full-text available
Context. Type Ia supernovae (SNe Ia) are widely used to measure the expansion of the Universe. Improving distance measurements of SNe Ia is one technique to better constrain the acceleration of expansion and determine its physical nature. Aims. This document develops a new SNe Ia spectral energy distribution (SED) model, called the SUpernova Genera...
Conference Paper
Social media platforms and social network sites generate a multitude of digital trace behavioral data, the scale of which often necessitates the use of computational data science methods. On the other hand, the socio-behavioral and often relational nature of the social media data requires the attention to context of user activity traditionally asso...
Preprint
Full-text available
The Cambridge Analytica scandal triggered a conversation on Twitter about data practices and their implications. Our research proposes to leverage this conversation to extend the understanding of how information privacy is framed by users worldwide. We collected tweets about the scandal written in Spanish and English between April and July 2018. We...
Conference Paper
Full-text available
The Cambridge Analytica scandal triggered a conversation on Twitter about data practices and their implications. Our research proposes to leverage this conversation to extend the understanding of how information privacy is framed by users worldwide. We collected tweets about the scandal written in Spanish and English between April and July 2018. We...
Preprint
Full-text available
Currently, there is a limited understanding of how data privacy concerns vary across the world. The Cambridge Analytica scandal triggered a wide-ranging discussion on social media about user data collection and use practices. We conducted an inter-language study of this online conversation to compare how people speaking different languages react to...
Preprint
Full-text available
Type Ia Supernovae (SNe Ia) are widely used to measure the expansion of the Universe. Improving distance measurements of SNe Ia is one technique to better constrain the acceleration of expansion and determine its physical nature. This document develops a new SNe Ia spectral energy distribution (SED) model, called the SUpernova Generator And Reconst...
Book
An in-depth examination of the novel ways young people support and learn from each other through participation in online fanfiction communities. Over the past twenty years, amateur fanfiction writers have published an astonishing amount of fiction in online repositories. More than 1.5 million enthusiastic fanfiction writers—primarily young people in...
Conference Paper
Full-text available
Currently, there is a limited understanding of how data privacy concerns vary across the world. The Cambridge Analytica scandal triggered a wide-ranging discussion on social media about user data collection and use practices. We conducted a cross-language study of this online conversation to compare how people speaking different languages react to...
Conference Paper
Raiding is a format in digital gaming that requires groups of people to collaborate and/or compete for a common goal. In 2017, the raiding format was introduced in the location-based mobile game Pokémon GO, which offers a mixed reality experience to friends and strangers coordinating for in-person raids. To understand this technology-mediated socia...
Conference Paper
Full-text available
The process of qualitative coding often involves multiple coders coding the same data to ensure reliable codes and a consistent understanding of the codebook. One aspect of qualitative coding includes resolving disagreements, where coders discuss differences in coding to reach a consensus. We conduct a case study to evaluate four strategies of disa...
Conference Paper
Inspired by the ACM SIGCHI Across Borders Initiative, this workshop focuses on ongoing CSCW research in, or about, Latin America (LATAM). We seek to position LATAM as the common context that unites students, academic and industry researchers who participate in the workshop. Our goals are: (1) to discuss the opportunities and challenges of doing CSC...
Conference Paper
Communities focused on creating and sharing fanworks have existed since before the internet, but have thrived in the presence of online platforms. From young people learning literacy skills through writing fanfiction about their favorite media to complex infrastructural work in fans creating their own platforms to longstanding social norms that pro...
Preprint
FanFiction.net provides an informal learning space for young writers through distributed mentoring, networked giving and receiving of feedback. In this paper, we quantify the cumulative effect of feedback on lexical diversity for 1.5 million authors.
Article
Full-text available
Machine learning (ML) has become increasingly influential to human society, yet the primary advancements and applications of ML are driven by research in only a few computational disciplines. Even applications that affect or analyze human behaviors and social structures are often developed with limited input from experts outside of computational fi...
Preprint
Full-text available
As part of an on-going effort to identify, understand and correct for astrophysics biases in the standardization of Type Ia supernovae (SNIa) for cosmology, we have statistically classified a large sample of nearby SNeIa into those located in predominantly younger or older environments. This classification is based on the specific star formation ra...
Conference Paper
Full-text available
Collaborative qualitative coding often involves coders assigning different labels to the same instance, leading to ambiguity. We refer to such an instance of ambiguity as disagreement in coding. Analyzing reasons for such a disagreement is essential, both for purposes of bolstering user understanding gained from coding and reinterpreting the dat...
Article
Full-text available
Context. Type Ia supernovae (SNe Ia) are widely used to measure the expansion of the Universe. To perform such measurements the luminosity and cosmological redshift (z) of the SNe Ia have to be determined. The uncertainty on z includes an unknown peculiar velocity, which can be very large for SNe Ia in the virialized cores of massive clusters. Ai...
Preprint
Type Ia Supernovae (SNe Ia) are widely used to measure the expansion of the Universe. To perform such measurements the luminosity and cosmological redshift (z) of the SNe Ia have to be determined. The uncertainty on z includes an unknown peculiar velocity, which can be very large for SNe Ia in the virialized cores of massive clusters. We determ...
Article
Full-text available
Context. Observations of type Ia supernovae (SNe Ia) can be used to derive accurate cosmological distances through empirical standardization techniques. Despite this success neither the progenitors of SNe Ia nor the explosion process are fully understood. The U-band region has been less well observed for nearby SNe, due to technical challenges, bu...
Preprint
Context. Observations of Type Ia supernovae (SNe Ia) can be used to derive accurate cosmological distances through empirical standardization techniques. Despite this success neither the progenitors of SNe Ia nor the explosion process are fully understood. The U-band region has been less well observed for nearby SNe, due to technical challenges, but...
Article
Full-text available
Computing affects how scientific knowledge is constructed, verified, and validated. Rapid changes in hardware capability, and software flexibility, are coupled with a volatile tool and skill set, particularly in the interdisciplinary scientific contexts of oceanography. Existing research considers the role of scientists as both users and producers...
Article
Many successful digital interfaces employ visual metaphors to convey features or data properties to users, but the characteristics that make a visual metaphor effective are not well understood. We used a theoretical conception of metaphor from cognitive linguistics to design an interactive system for viewing the citation network of the corpora of l...
Article
The study of emoticon use in text communication is in its early stages (Aragon, Feldman, Chen & Kroll, 2014), with even less known about how emoticons function in multilingual environments. We describe a preliminary longitudinal analysis of text communication in an online bilingual scientific work environment and demonstrate how patterns of emotico...
Conference Paper
Social media has become a fruitful platform on which to study human behavior and social phenomena. However, social media data are usually messy, disorganized, and noisy, which makes finding patterns in such data a challenging task. Visualization can help with the exploration of such massive data. Researchers studying social media often begin by rev...
Article
With its roots dating to popular television shows of the 1960s such as Star Trek, fanfiction has blossomed into an extremely widespread form of creative expression. The transition from printed zines to online fanfiction repositories has facilitated this growth in popularity, with millions of fans writing stories and adding daily to sites such as Ar...
Article
We draw parallels between emoticons in textual communication and gesture in signed language with respect to the interdependence of codes by describing two contexts under which the behavior of emoticons in textual communication resembles that of gesture in speech. Generalizing from those findings, we propose that gesture is likely characterized by a...
Conference Paper
A distance cartogram (DC) is a technique that alters distances between a user-specified origin and the other locations in a map with respect to travel time. With DC, users can weigh the relative travel time costs between the origin and potential destinations at a glance because travel times are projected in a linearly interpolated time space from t...
Conference Paper
Affect has been identified as an important component of the communication practices of distributed teams. Our emerging theory of distributed affect moves beyond the individual as the primary unit of analysis, focusing instead on affect as a dynamic group process. Drawing upon a data set of over four years of chat logs from a distributed scientific...
Article
From Harry Potter to American Horror Story, fanfiction is extremely popular among young people. Sites such as Fanfiction.net host millions of stories, with thousands more posted each day. Enthusiasts are sharing their writing and reading stories written by others. Exactly how does a generation known more for videogame expertise than long-form writi...
Preprint
From Harry Potter to American Horror Story, fanfiction is extremely popular among young people. Sites such as Fanfiction.net host millions of stories, with thousands more posted each day. Enthusiasts are sharing their writing and reading stories written by others. Exactly how does a generation known more for videogame expertise than long-form writi...
Article
As the era of ‘big data’ unfolds, researchers are increasingly engaging with large, complex data sets compiled from heterogeneous sources and distributed across networked technologies. The nature of these data sets makes it difficult to grasp and manipulate their materiality. We argue that moments of breakdown – points at which progress is stopped...
Conference Paper
The study and analysis of large and complex data sets offer a wealth of insights in a variety of applications. Computational approaches provide researchers access to broad assemblages of data, but the insights extracted may lack the rich detail that qualitative approaches have brought to the understanding of sociotechnical phenomena. How do we pres...
Conference Paper
Full-text available
Software is fundamental to academic research work, both as part of the method and as the result of research. In June 2016, 25 people gathered at Schloss Dagstuhl for a week-long Perspectives Workshop and began to develop a manifesto which places emphasis on the scholarly value of academic software and on personal responsibility. Twenty pledges cover...
Article
Full-text available
We introduce a method for identifying "twin" Type Ia supernovae, and using them to improve distance measurements. This novel approach to Type Ia supernova standardization is made possible by spectrophotometric time series observations from the Nearby Supernova Factory (SNfactory). We begin with a well-measured set of supernovae, find pairs whose sp...
Article
Full-text available
High performance computing (HPC) has driven collaborative science discovery for decades. Exascale computing platforms, currently in the design stage, will be deployed around 2022. The next generation of supercomputers is expected to utilize radically different computational paradigms, necessitating fundamental changes in how the community of scient...