
Frauke Kreuter, Professor (Full) at University of Maryland, College Park
230
Publications
51,692
Reads
7,038
Citations
Publications (230)
Decision-making inherently involves cause–effect relationships that introduce causal challenges. We argue that reliable algorithms for decision-making need to build upon causal reasoning. Addressing these causal challenges requires explicit assumptions about the underlying causal structure to ensure identifiability and estimatability, which means t...
Recent generative AI technologies, particularly Large Language Models (LLMs), have increased interest in Natural Language Processing (NLP) methods for scientists and practitioners across disciplines. In this position paper, we highlight one such discipline — survey methodology, which not only uses more and more NLP techniques, e.g., using LLMs to s...
Objectives
This study evaluates the association between trust in health care professionals and health care delays across 21 countries.
Methods
We apply logistic regression models to survey data of over 621,000 individuals collected in Spring 2023.
Results
Results show 44.5% of respondents with medical conditions experienced delays in accessing he...
Generative AI (GenAI) is increasingly used in survey contexts to simulate human preferences. While many research endeavors evaluate the quality of synthetic GenAI data by comparing model-generated responses to gold-standard survey results, fundamental questions about the validity and reliability of using LLMs as substitutes for human respondents re...
Understanding pragmatics, the use of language in context, is crucial for developing NLP systems capable of interpreting nuanced language use. Despite recent advances in language technologies, including large language models, evaluating their ability to handle pragmatic phenomena such as implicatures and references remains challenging. To advance prag...
Models trained on crowdsourced labels may not reflect broader population views when annotator pools are not representative. Since collecting representative labels is challenging, we propose Population-Aligned Instance Replication (PAIR), a method to address this bias through statistical adjustment. Using a simulation study of hate speech and offens...
Occupational data play a vital role in research, official statistics, and policymaking, yet their collection and accurate classification remain a persistent challenge. This study investigates the effects of occupational question wording on data variability and the performance of automatic coding tools. Through a series of survey experiments conduct...
High-quality annotations are a critical success factor for machine learning (ML) applications. To achieve this, we have traditionally relied on human annotators, navigating the challenges of limited budgets and the varying task-specific expertise, costs, and availability. Since the emergence of Large Language Models (LLMs), their popularity for gen...
In recent research, large language models (LLMs) have been increasingly used to investigate public opinions. This study investigates the algorithmic fidelity of LLMs, i.e., the ability to replicate the socio-cultural context and nuanced opinions of human participants. Using open-ended survey data from the German Longitudinal Election Studies (GLES)...
This study explores the effectiveness of Germany’s rent control regulation in the context of Munich's housing crisis. The rent brake, aimed at preventing excessive rents in tense housing markets, was implemented in 2015 in Germany but has faced challenges in practical application. This research evaluates the number of tenants eligible to use the re...
AI-driven decision-making systems are becoming instrumental in the public sector, with applications spanning areas like criminal justice, social welfare, financial fraud detection, and public health. While these systems offer great potential benefits to institutional decision-making processes, such as improved efficiency and reliability, these syst...
The Competence Center for Data Quality in the Social Sciences (Kompetenzzentrum Datenqualität in den Sozialwissenschaften, KODAQS) is developing and implementing an approach for teaching best practices for measuring data quality in the quantitative social sciences. Through the curriculum developed and the modeling of a teaching and learning process for acquiring competence in methods and knowledge for...
Automated decision-making (ADM) systems are being deployed across a diverse range of critical problem areas such as social welfare and healthcare. Recent work highlights the importance of causal ML models in ADM systems, but implementing them in complex social environments poses significant challenges. Research on how these challenges impact the pe...
R package PracTools v.1.5. This package is also posted to the Comprehensive R Archive Network (CRAN).
Functions and datasets to support Valliant, Dever, and Kreuter, Practical Tools for Designing and Weighting Survey Samples (2nd edition, 2018). Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-,...
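To make the kind of sample size calculation mentioned above concrete, here is a minimal Python sketch of a textbook formula for estimating a proportion under simple random sampling, with an optional finite population correction. It is not the PracTools API (that package is written in R). The function name `srs_sample_size_for_proportion` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def srs_sample_size_for_proportion(p, margin, z=1.96, population=None):
    """Hypothetical helper: sample size for a proportion under simple random sampling.

    p          -- anticipated population proportion (0 < p < 1)
    margin     -- desired half-width of the confidence interval
    z          -- normal quantile (1.96 corresponds to roughly 95% confidence)
    population -- finite population size, or None for an effectively infinite one
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("margin must be positive")

    # Sample size ignoring the finite population correction.
    n0 = z ** 2 * p * (1 - p) / margin ** 2

    # Apply the finite population correction when a population size is given.
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)

    # Round up, since a fractional respondent cannot be sampled.
    return math.ceil(n0)


if __name__ == "__main__":
    print(srs_sample_size_for_proportion(0.5, 0.05))
    print(srs_sample_size_for_proportion(0.5, 0.05, population=10_000))
```

Stratified, clustered, and multistage designs of the kind the abstract lists need additional inputs (stratum variances, intraclass correlations, unit costs) that this single-stage sketch deliberately leaves out.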
Recent advances in Large Language Models (LLMs) have sparked wide interest in validating and comprehending the human-like cognitive-behavioral traits LLMs may have. These cognitive-behavioral traits typically include Attitudes, Opinions, and Values (AOV). However, measuring AOV embedded within LLMs remains opaque, and different evaluation methods may y...
Conversational Large Language Models are trained to refuse to answer harmful questions. However, emergent jailbreaking techniques can still elicit unsafe outputs, presenting an ongoing challenge for model alignment. To better understand how different jailbreak types circumvent safeguards, this paper analyses model activations on different jailbreak...
Online surveys are a widely used mode of data collection. However, as no interviewer is present, respondents face any difficulties they encounter alone, which may lead to measurement error and biased or (at worst) invalid conclusions. Detecting response difficulty is therefore vital. Previous research has predominantly focused on response times to...
Help Manual for the R package PracTools v.1.4.3. Functions and datasets to support Valliant, Dever, and Kreuter (2018), <doi:10.1007/978-3-319-93632-1>, "Practical Tools for Designing and Weighting Survey Samples". Contains functions for sample size calculation for survey samples using stratified or clustered one-, two-, and three-stage sample de...
The open-ended nature of language generation makes the evaluation of autoregressive large language models (LLMs) challenging. One common evaluation approach uses multiple-choice questions (MCQ) to limit the response space. The model is then evaluated by ranking the candidate answers by the log probability of the first token prediction. However, fir...
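The first-token evaluation approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: no real model or library API is called, the log probabilities are made-up numbers, and the function names `rank_by_first_token_logprob` and `normalized_choice_probs` are hypothetical.

```python
import math


def rank_by_first_token_logprob(first_token_logprobs):
    """Return answer labels ordered from highest to lowest first-token log probability."""
    return sorted(first_token_logprobs, key=first_token_logprobs.get, reverse=True)


def normalized_choice_probs(first_token_logprobs):
    """Renormalize the log probabilities over the candidate labels only (a softmax)."""
    peak = max(first_token_logprobs.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {label: math.exp(lp - peak) for label, lp in first_token_logprobs.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical log probabilities of the first generated token being each option label.
    logprobs = {"A": -2.3, "B": -0.4, "C": -1.9, "D": -3.1}
    ranking = rank_by_first_token_logprob(logprobs)
    print("ranking:", ranking)
    print("selected answer:", ranking[0])
    print("renormalized:", normalized_choice_probs(logprobs))
```

The model's answer under this scheme is simply the top-ranked label. Nothing in the sketch inspects the text the model would go on to generate after that first token.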
Machine Learning (ML) systems are becoming instrumental in the public sector, with applications spanning areas like criminal justice, social welfare, financial fraud detection, and public health. While these systems offer great potential benefits to institutional decision-making processes, such as improved efficiency and reliability, they still fac...
Statistical profiling of job seekers is an attractive option to guide the activities of public employment services. Many hope that algorithms will improve both efficiency and effectiveness of employment services’ activities that are so far often based on human judgment. Against this backdrop, we evaluate regression and machine-learning models for p...
Survey participants’ mouse movements provide a rich, unobtrusive source of paradata, offering insight into the response process beyond the observed answers. However, the use of mouse tracking may require participants’ explicit consent for their movements to be recorded and analyzed. Thus, the question arises of how its presence affects the willingn...
Advanced large language models like ChatGPT have gained considerable attention recently, including among students. However, while the debate on ChatGPT in academia is making waves, more understanding is needed among lecturers and teachers on how students use and perceive ChatGPT. To address this gap, we analyzed the content on ChatGPT available on...
Survey research aims to collect robust and reliable data from respondents. However, despite researchers’ efforts in designing questionnaires, survey instruments may be imperfect, and question structure not as clear as could be, thus creating a burden for respondents. If it were possible to detect such problems, this knowledge could be used to predi...
Frequently, Machine Learning (ML) algorithms are trained on human-labeled data. Although often seen as a “gold standard,” human labeling is anything but error free. Decisions in the design of labeling tasks can lead to distortions of the resulting labeled data and impact predictions. Building on insights from survey methodology, a field that studies the...
Employment relationships are embedded in a network of social norms that provide an implicit framework for desired behaviour, especially if contractual solutions are weak. The COVID-19 pandemic has brought about major changes that have led to situations, such as the scope of short-time work or home-based work in a firm. Against this backdrop, our st...
Prediction algorithms are regularly used to support and automate high-stakes policy decisions about the allocation of scarce public resources. However, data-driven decision-making raises problems of algorithmic fairness and justice. So far, fairness and justice are frequently conflated, with the consequence that distributive justice concerns are no...
Objectives: Real-time data analysis during a pandemic is crucial. This paper aims to introduce a novel interactive tool called Covid-Predictor-Tracker using several sources of COVID-19 data, which allows examining developments over time and across countries. Exemplified here by investigating relative effects of vaccination to non-pharmaceutical int...
Human perceptions of fairness in (semi-)automated decision-making (ADM) constitute a crucial building block toward developing human-centered ADM solutions. However, measuring fairness perceptions is challenging because various context and design characteristics of ADM systems need to be disentangled. Particularly, ADM applications need to use the r...
The COVID-19 pandemic has spotlighted the importance of high-quality data for empirical health research and evidence-based political decision-making. To leverage the full potential of these data, a better understanding of the determinants and conditions under which people are willing to share their health data is critical. Building on the privacy t...
Linking digital trace data to existing panel survey data may increase the overall analysis potential of the data. However, producing linked products often requires additional engagement from survey participants through consent or participation in additional tasks. Panel operators may worry that such additional requests may backfire and lead to lowe...
Academic and public debates are increasingly concerned with the question of whether and how algorithmic decision-making (ADM) may reinforce social inequality. Most previous research on this topic originates from computer science. The social sciences, however, have great potential to contribute to research on social consequences of ADM. Based on a proc...
We propose new ensemble models for multivariate functional data classification as combinations of semi-metric-based weak learners. Our models extend current semi-metric-type methods from the univariate to the multivariate case, propose new semi-metrics to compute distances between functions, and consider more flexible options for combining weak lea...
Research apps allow researchers to administer survey questions and passively collect smartphone data, thus providing rich information on individual and social behaviours. Agreeing to this novel form of data collection requires multiple consent steps, and little is known about the effect of non-participation. We invited 4,293 Android smartphone owners from the...
Survey participants' mouse movements provide a rich, unobtrusive source of paradata, and offer insight into the response process beyond the observed answers. However, the use of mouse-tracking may require participants' explicit consent that their movements are recorded and analyzed. Thus, the fundamental question arises how this affects the willing...
Objectives: To examine the association of non-pharmaceutical interventions (NPIs) with anxiety and depressive symptoms among adults and determine if these associations varied by gender and age.
Methods: We combined survey data from 16,177,184 adults from 43 countries who participated in the daily COVID-19 Trends and Impact Survey via Facebook with...
As smartphones become increasingly prevalent, social scientists are recognizing the ubiquitous data generated by the sensors built into these devices as an innovative data source. Passively collected data from sensors that measure geolocation or movement provide an unobtrusive way to observe participants in everyday situations and are free from rea...
Significance
We revisit the problem of ensuring statistically valid inferences across diverse target populations from a single source of training data. Our approach builds a surprising technical connection between the inference problem and a technique developed for algorithmic fairness, called “multicalibration.” We derive a correspondence between...
The free formation of public opinion is the heart of democracy. But digital communication and the data-driven curation of content are changing democracy's own concept of the public sphere and call for new legal frameworks. In this edited volume, experts from law, political science, sociology...
Significance
The University of Maryland Global COVID Trends and Impact Survey (UMD-CTIS), launched April 2020, is the largest remote global health monitoring system. This study includes ∼30 million responses through December 2020 from all 114 countries/territories with survey weights to adjust for nonresponse and demographics. Using self-reported c...
Significance
The US COVID-19 Trends and Impact Survey (CTIS) has operated continuously since April 6, 2020, collecting over 20 million responses. As the largest public health survey conducted in the United States to date, CTIS was designed to facilitate detailed demographic and geographic analyses, track trends over time, and accommodate rapid revi...
Increasing the sample size of a survey is often thought to increase the accuracy of the results. However, an analysis of big surveys on the uptake of COVID-19 vaccines shows that larger sample sizes do not protect against bias. Estimates of vaccine uptake from big surveys show bias.
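A small worked calculation shows why this holds. The root mean squared error of an estimate combines sampling variance, which shrinks as the sample grows, with bias, which does not. The sketch below is a generic illustration with made-up numbers, not a reproduction of the study's analysis. The function name `rmse_of_biased_proportion` and the fixed additive bias are assumptions.

```python
import math


def rmse_of_biased_proportion(p_true, bias, n):
    """Root mean squared error of a sample proportion whose expectation is p_true + bias.

    Assumes a fixed additive bias (for example from selective participation) and
    binomial sampling variance around the biased expectation.
    """
    expected = p_true + bias
    variance = expected * (1 - expected) / n
    return math.sqrt(bias ** 2 + variance)


if __name__ == "__main__":
    # Hypothetical values: true uptake of 60%, estimates biased upward by 10 points.
    for n in (1_000, 100_000, 1_000_000):
        print(n, round(rmse_of_biased_proportion(0.6, 0.10, n), 4))
    # An unbiased sample of only 1,000 respondents, for comparison.
    print("unbiased, n=1000:", round(rmse_of_biased_proportion(0.6, 0.0, 1_000), 4))
```

Under these assumptions the error of the biased estimate levels off at the size of the bias however large the sample becomes, while the small unbiased sample ends up with a lower error.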
Background
Guidelines and recommendations from public health authorities related to face masks have been essential in containing the COVID-19 pandemic. We assessed the prevalence and correlates of mask usage during the pandemic.
Methods
We examined a total of 13,723,810 responses to a daily cross-sectional online survey in 38 countries of people w...
A major concern arising from ubiquitous tracking of individuals’ online activity is that algorithms may be trained to predict personal sensitive information, even for users who do not wish to reveal such information. Although previous research has shown that digital trace data can accurately predict sociodemographic characteristics, little is known...
Social media are becoming more popular as a source of data for social science researchers. These data are plentiful and offer the potential to answer new research questions at smaller geographies and for rarer subpopulations. When deciding whether to use data from social media, it is useful to learn as much as possible about the data and its source...
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic is posing a global public health burden. These consequences have been shown to increase the risk of mental distress, but the underlying protective and risk factors for mental distress and trends over different waves of the pandemic are largely unknown. Furthermore, it is larg...
Algorithmic profiling is increasingly used in the public sector as a means to allocate limited public resources effectively and objectively. One example is the prediction-based statistical profiling of job seekers to guide the allocation of support measures by public employment services. However, empirical evaluations of potential side-effects such...
The U.S. COVID-19 Trends and Impact Survey (CTIS) is a large, cross-sectional, Internet-based survey that has operated continuously since April 6, 2020. By inviting a random sample of Facebook active users each day, CTIS collects information about COVID-19 symptoms, risks, mitigating behaviors, mental health, testing, vaccination, and other key pri...
The International Program in Survey and Data Science (IPSDS) is an online educational program, which can be attended through the Joint Program in Survey Methodology (JPSM) at the University of Maryland (UMD) and a part-time Master of Applied Data Science & Measurement (MDM) at the University of Mannheim and Mannheim Business School (MBS). It is tar...
Simultaneously tracking the global COVID-19 impact across multiple populations is challenging due to regional variation in resources and reporting. Leveraging self-reported survey outcomes via an existing international social media network has the potential to provide reliable and standardized data streams to support monitoring and decision-making...
Smartphones have become a natural part of everyday life for many people. Besides communication, entertainment, and information, they are also used for job searching and in everyday working life (Perrin 2017). This offers opportunities to use smartphones as a data collection instrument for scientific research...
Aims
To examine changes in drinking behavior among US adults between March 10 and July 21, 2020, a critical period during the COVID-19 pandemic.
Design
Longitudinal, internet-based panel survey.
Setting
The Understanding America Study (UAS), a nationally-representative panel of US adults aged 18 or older.
Participants
4,298 US adults who reporte...
The ability to ‘sense’ the social environment and thereby to understand the thoughts and actions of others allows humans to fit into their social worlds, communicate and cooperate, and learn from others’ experiences. Here we argue that, through the lens of computational social science, this ability can be used to advance research into human sociali...
In the wake of the digital revolution and connected technologies, societies store an ever-increasing amount of data on humans, their preferences, and behavior. These modern technologies create a trust challenge, insofar as individuals have to trust data collectors such as private organizations, government institutions, and researchers that their da...
The advent of powerful prediction algorithms led to increased automation of high-stake decisions regarding the allocation of scarce resources such as government spending and welfare support. This automation bears the risk of perpetuating unwanted discrimination against vulnerable and historically disadvantaged groups. Research on algorithmic discri...
While the COVID-19 pandemic has been devastating, data collected in this context has unprecedented opportunities for data scientists. The stunning breadth of data obtained through new gathering systems put in place to manage the pandemic offers a richly textured view of a transformed world. Looking forward, privacy researchers worry that these new...
Background
Cross-sectional studies have found that the coronavirus disease 2019 (COVID-19) pandemic has negatively affected population-level mental health. Longitudinal studies are necessary to examine trajectories of change in mental health over time and identify sociodemographic groups at risk for persistent distress.
Purpose
To examine the traj...
Across survey organizations around the world, there is increasing pressure to augment survey data with administrative data. In many settings, obtaining informed consent from respondents is required before administrative data can be linked. A key question is whether respondents understand the linkage consent request and if consent is correlated with...
Background: Guidelines and recommendations from public health authorities related to face masks have been essential in containing the COVID-19 pandemic. We assessed the prevalence and correlates of mask usage during the pandemic. Methods: We examined a total of 13,723,810 responses to a daily cross-sectional representative online survey in 38 count...
Within the survey context, a geofence can be defined as a geographical area that triggers a survey invitation when an individual enters the area, dwells in the area for a defined amount of time or exits the area. Geofences may be used to administer context-specific surveys, such as an evaluation survey of a shopping experience at a specific retail...
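The three triggers named in this definition (entering, dwelling, exiting) can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study. It assumes a circular geofence, a time-ordered list of location fixes, and hypothetical names (`haversine_m`, `geofence_events`) and coordinates.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geofence_events(track, center, radius_m, dwell_s):
    """Return (timestamp, event) pairs for a circular geofence.

    track  -- list of (timestamp_s, lat, lon) tuples sorted by time (assumed input format)
    center -- (lat, lon) of the geofence center
    Events are "enter", "dwell" (once per visit, after dwell_s seconds inside), and "exit".
    """
    events = []
    inside = False
    entered_at = None
    dwell_fired = False
    for t, lat, lon in track:
        now_inside = haversine_m(lat, lon, center[0], center[1]) <= radius_m
        if now_inside and not inside:
            events.append((t, "enter"))
            entered_at = t
            dwell_fired = False
        elif not now_inside and inside:
            events.append((t, "exit"))
            entered_at = None
        # Fire the dwell trigger once the respondent has stayed long enough.
        if now_inside and not dwell_fired and t - entered_at >= dwell_s:
            events.append((t, "dwell"))
            dwell_fired = True
        inside = now_inside
    return events


if __name__ == "__main__":
    center = (48.1500, 11.5800)  # hypothetical geofence center
    track = [
        (0, 48.1600, 11.5800),    # roughly 1.1 km away: outside
        (60, 48.1500, 11.5800),   # at the center: enter
        (120, 48.1501, 11.5800),  # still inside
        (400, 48.1500, 11.5801),  # inside for 340 s: dwell
        (460, 48.1600, 11.5800),  # outside again: exit
    ]
    print(geofence_events(track, center, radius_m=100, dwell_s=300))
```

In a survey app, each returned event would be a candidate point for sending an invitation. Which of the three triggers to use is a design decision this sketch leaves open.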
Audio computer-assisted self-interviewing (ACASI) has been widely used to collect sensitive information from respondents in face-to-face interviews. Interviewers ask questions that are not sensitive or only moderately sensitive and then allow respondents to self-administer more sensitive questions, listening to audio recordings of the questions and...
The new European General Data Protection Regulation (GDPR) imposes enhanced requirements on digital data collection. This article reports from a 2018 German nationwide population-based probability app study in which participants were asked through a GDPR compliant consent process to share a series of digital trace data, including geolocation, accel...
Objectives. To assess the impact of the COVID-19 pandemic on mental distress in US adults.
Methods. Participants were 5065 adults from the Understanding America Study, a probability-based Internet panel representative of the US adult population. The main exposure was survey completion date (March 10–16, 2020). The outcome was mental distress measur...
This chapter analyzes the effects of different incentive schemes on participation rates in a study combining self‐reports and passive data collection using smartphones, as well as breaks out these effects by economic subgroups. Providing some form of incentive, whether monetary or some other kind of token of appreciation, is common for studies recr...
Official statistics could be produced more quickly and less expensively than currently possible using surveys. Big Data may also provide statistics for small areas to make informed policy and program decisions. This chapter uses social media data to produce similar attitude distributions as survey data. It explores each distribution more deeply to d...
This chapter reviews existing literature on concern with and willingness to engage in active and passive forms of mobile data collection. It describes four online surveys conducted in two countries that all administered a similar set of questions on concern with five different forms of mobile data collection. The chapter uses these data to analyze...
Background: The SARS-CoV-2 pandemic brings burdens through an overstretched health care system, economic lockdown, contact restrictions and curfews, and quarantine measures. This paper gives an overview of psychological distress in the current pandemic and identifies protective and risk factors.
Methods: ...
Most individuals in the United States have no history of a mental health condition yet are at risk for psychological distress due to the COVID-19 pandemic. The objective of this study was to assess the frequency and risk and protective factors of psychological distress, during the beginning of the COVID-19 pandemic, in this group. Data comes from t...
Background: The COVID-19 pandemic is the greatest public health crisis of the last 100 years. Countries have responded with various levels of lockdown to save lives and stop health systems from being overwhelmed. At the same time, lockdowns entail large socio-economic costs. One exit strategy under consideration is a mobile phone app that traces cl...
Researchers are increasingly combining self-reports from mobile surveys with passive data collection using sensors and apps on smartphones. While smartphones are commonly used in some groups of individuals, smartphone penetration is significantly lower in other groups. In addition, different operating systems (OSs) limit how mobile data...
Interviewer-respondent rapport is generally considered to be beneficial for the quality of the data collected in survey interviews; however, the relationship between rapport and data quality has rarely been directly investigated. We conducted a laboratory experiment in which eight professional interviewers interviewed 125 respondents to see how the...
Demand is rising continuously for well-trained data scientists who have the skills both to collect and analyze data in the "traditional way" and to work with large semi-structured or even unstructured datasets. In this contribution, we describe which competencies social and market researchers...
Web surveys have become a standard mode of survey administration, in part because they offer greater technological capabilities so that aspects of the questionnaire's design can be dynamically controlled. Designers often use these features to help guide respondents through a survey, but web‐based designs also allow researchers to collect and analyz...
Appendix 8A Additional Evaluation of Derived NSFG Classes Figure A8A.1 Figure A8A.2 Table A8A.1 Table A8A.2 Table A8A.3 Appendix 8B Additional Details on ESS Items
Table A7A.1 Number of identical response patterns
The COVID-19 pandemic and associated government lockdown restrictions have fueled a high demand for survey data on how individuals and establishments are coping with the restrictions. However, the pandemic has also dramatically affected surveys themselves, forcing research institutes to adapt their fieldwork operations to the uncertain and evolving...
Interviewer-administered surveys are a primary method of collecting information from populations across the United States and the world. Various types of interviewer-administered surveys exist, including large-scale government surveys that monitor populations (e.g., the Current Population Survey), surveys used by the academic community to understan...
Survey researchers are increasingly seeking opportunities to link interview data with administrative records. However, obtaining consent from all survey respondents (or certain subgroups) remains a barrier to performing record linkage in many studies. We experimentally investigated whether emphasizing different benefits of record linkage to respond...