About
42 Publications
6,647 Reads
440 Citations
Publications (42)
Background
Although the COVID-19 pandemic has persisted for over 3 years, reinfections with SARS-CoV-2 are not well understood. We aim to characterize reinfection, understand development of Long COVID after reinfection, and compare severity of reinfection with initial infection.
Methods
We use an electronic health record study cohort of over 3 mil...
Background. In 2021, we used the National COVID Cohort Collaborative (N3C) as part of the NIH RECOVER Initiative to develop a machine learning (ML) pipeline to identify patients with a high probability of having post-acute sequelae of SARS-CoV-2 infection (PASC), or Long COVID. However, the increased home testing, missing documentation, and reinfec...
Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time-consuming since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatG...
Although the COVID-19 pandemic has persisted for over 2 years, reinfections with SARS-CoV-2 are not well understood. We use the electronic health record (EHR)-based study cohort from the National COVID Cohort Collaborative (N3C) as part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative to characterize reinfection, understand dev...
Frequently, Machine Learning (ML) algorithms are trained on human-labeled data. Although often seen as a “gold standard,” human labeling is anything but error free. Decisions in the design of labeling tasks can lead to distortions of the resulting labeled data and impact predictions. Building on insights from survey methodology, a field that studies the...
Introduction
The current study analyzes age-differentiated Reddit conversations about Electronic Nicotine Delivery Systems (ENDS).
Methods
The current study combines two methods to 1) classify Reddit users’ ages into two categories (13-20 [Underage], 21-54 [Of Legal Age]) using a machine learning algorithm and 2) qualitatively code ENDS-related Redd...
BACKGROUND
Mass shootings result in widespread psychological trauma for survivors and members of the affected community. However, less is known about the broader effects of indirect exposure (e.g., media) to mass shootings. Crisis lines offer a unique opportunity to examine real-time data on the widespread psychological effects of mass shootings....
Background:
Mass shootings result in widespread psychological trauma for survivors and members of the affected community. However, less is known about the broader effects of indirect exposure (eg, media) to mass shootings. Crisis lines offer a unique opportunity to examine real-time data on the widespread psychological effects of mass shootings....
Background
Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups.
Objectiv...
Background
Treatment outcomes after pelvic organ prolapse (POP) surgery are often presented as dichotomous ‘success or failure’ based upon anatomic and symptom criteria. However, clinical experience suggests some women with outcome ‘failures’ are asymptomatic and perceive their surgery to be successful, while others have anatomic resolution, but co...
BACKGROUND
Previous qualitative studies and data science studies using Reddit for tobacco research are limited by the lack of available demographic information. Social media investigations are often limited to manual qualitative coding or machine learning classification in isolation.
OBJECTIVE
This study combines both machine learning methods and...
BACKGROUND
Electronic nicotine delivery system (ENDS) brands, such as JUUL, used social media as a key component of their marketing strategy, which led to massive sales growth from 2015 to 2018. During this time, ENDS use rapidly increased among youths and young adults, with flavored products being particularly popular among these groups.
OBJECTIV...
[This corrects the article DOI: 10.2196/25807.]
Background
Social media are important for monitoring perceptions of public health issues and for educating target audiences about health; however, limited information about the demographics of social media users makes it challenging to identify conversations among target audiences and limits how well social media can be used for public health surve...
BACKGROUND
Social media are important for monitoring perceptions of public health issues and for educating target audiences about health; however, limited information about the demographics of social media users makes it challenging to identify conversations among target audiences and limits how well social media can be used for public health surve...
This chapter discusses the basis for these population estimates, scope, and limitations based on experiences in development of massive global population datasets and usage of these datasets as a basis for sampling design. It presents tools and approaches for using these georeferenced population estimates for complex household survey sampling. The c...
Accurate projections of seasonal agricultural output are essential for improving food security. However, the collection of agricultural information through seasonal agricultural surveys is often not timely enough to inform public and private stakeholders about crop status during the growing season. Acquiring timely and accurate crop estimates can b...
Exposure assessment studies are the primary means for understanding links between exposure to chemical and physical agents and adverse health effects. Recently, researchers have proposed using wearable monitors during exposure assessment studies to obtain higher fidelity readings of exposures actually experienced by subjects. However, limited resea...
The results of many large-scale federal or multi-site evaluations are typically compiled into long reports that end up sitting on policymakers' shelves. Moreover, the information policymakers need from these reports is often buried in the report, may not be remembered, understood, or readily accessible to the policymaker when it is needed. This is...
JUUL is the most popular electronic nicotine delivery system (ENDS) in the United States. JUUL’s discreet design, availability in flavors such as mango, and use of nicotine salt solutions that deliver a high dose of nicotine with minimal harshness may explain its appeal. Nearly 64.2% of high school students have ever used JUUL, and 47.1% currently...
Social media data are increasingly used by researchers to gain insights on individuals’ behaviors and opinions. Platforms like Twitter provide access to individuals’ postings, networks of friends and followers, and the content to which they are exposed. This article presents the methods and results of an exploratory study to supplement survey data...
SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates...
While governments, researchers, and NGOs are exploring ways to leverage big data sources for sustainable development, household surveys are still a critical source of information for dozens of the 232 indicators for the Sustainable Development Goals (SDGs) in low- and middle-income countries (LMICs). Though some countries’ statistical agencies main...
BACKGROUND
JUUL is an electronic nicotine delivery system (ENDS) resembling a USB device that has rapidly become popular among youth. Recent studies suggest that social media may be contributing to its popularity. The JUUL company claims its products are targeted at current adult smokers, but recent surveillance suggests youth may be exposed to JUUL...
As a larger proportion of society participates in social media, public health organizations are increasingly using digital campaigns to engage and educate their target audiences. Computational methods such as social network analysis and machine learning can provide social media campaigns with a rare opportunity to better understand their followers...
Social networks play a critical role in the formation of criminal and radical groups. However, understanding of these formations relies on difficult-to-collect data. We present an approach where narrative data from the trial of the 1995 Paris Metro and RER bombings were used to extract actors, places, groups and actions that led to the formation of...
Background:
Conducting surveys in low- and middle-income countries is often challenging because many areas lack a complete sampling frame, have outdated census information, or have limited data available for designing and selecting a representative sample. Geosampling is a probability-based, gridded population sampling method that addresses some o...
Background:
Tumor testing for mutations in the epidermal growth factor receptor (EGFR) gene is indicated for all newly diagnosed, metastatic lung cancer patients, who may be candidates for first-line treatment with an EGFR tyrosine kinase inhibitor. Few studies have analyzed population-level testing.
Methods:
We identified clinical, demographic,...
Background:
Despite concerns about their health risks, e-cigarettes have gained popularity in recent years. Concurrent with the recent increase in e-cigarette use, social media sites such as Twitter have become a common platform for sharing information about e-cigarettes and for marketing them. Monitoring the trends in e-cigarett...
Health organizations are increasingly using social media, such as Twitter, to disseminate health messages to target audiences. Determining the extent to which the target audience (e.g., age groups) was reached is critical to evaluating the impact of social media education campaigns. The main objective of this study was to examine the separate and j...
Tests of different classifiers. (A) Based on accuracy. (B) Based on F1 score. (DOCX)
Description of metadata and linguistic features. (DOCX)
Background
Twitter represents a social media platform through which medical cannabis dispensaries can rapidly promote and advertise a multitude of retail products. Yet, to date, no studies have systematically evaluated Twitter behavior among dispensaries and how these behaviors influence the formation of social networks.
Objectives
This study soug...