Ian Lundberg's research while affiliated with Cornell University and other places

Publications (14)

Article
Computational power and big data have created new opportunities to explore and understand the social world. A special synergy is possible when social scientists combine human attention to certain aspects of the problem with the power of algorithms to automate other aspects of the problem. We review selected exemplary applications where machine lear...
Preprint
Computational power and digital data have created new opportunities to explore and understand the social world. A special synergy is possible when social scientists combine human attention to certain aspects of the problem with the power of algorithms to automate other aspects of the problem. We review selected exemplary applications where machine...
Article
Disparities across race, gender, and class are important targets of descriptive research. But rather than only describe disparities, research would ideally inform interventions to close those gaps. The gap-closing estimand quantifies how much a gap (e.g., incomes by race) would close if we intervened to equalize a treatment (e.g., access to college...
Article
We make only one point in this article. Every quantitative study must be able to answer the question: what is your estimand? The estimand is the target quantity—the purpose of the statistical analysis. Much attention is already placed on how to do estimation; a similar degree of care should be given to defining the thing we are estimating. We advoc...
Article
Studies of economic mobility summarize the distribution of offspring incomes for each level of parent income. Mitnik and Grusky (2020) highlight that the conventional intergenerational elasticity (IGE) targets the geometric mean and propose a parametric strategy for estimating the arithmetic mean. We decompose the IGE and their proposal into two ch...
Article
Sibling (cousin) correlations are empirically straightforward: they capture the degree to which siblings’ (cousins’) socioeconomic outcomes are similar. At face value, these quantities seem to summarize something about how families constrain opportunity. Their meaning, however, is complicated. One empirical set of sibling and cousin correlations ca...
Article
A lack of affordable housing is a pressing issue for many low‐income American families and can lead to eviction from their homes. Housing assistance programs to address this problem include public housing and other assistance, including vouchers, through which a government agency offsets the cost of private market housing. This paper assesses wheth...
Article
How predictable are life trajectories? We investigated this question with a scientific mass collaboration using the common task method; 160 teams built predictive models for six life outcomes using data from the Fragile Families and Child Wellbeing Study, a high-quality birth cohort study. Despite using a rich dataset and applying machine-learning...
Preprint
The link between theory and quantitative empirical evidence is a longstanding hurdle in sociological research. Ambiguity about the role that statistical evidence plays in an argument may produce misleading conclusions and poor methodological practice. This ambiguity could be reduced if researchers would state the theoretical estimand: the central...
Article
The Fragile Families Challenge is a scientific mass collaboration designed to measure and understand the predictability of life trajectories. Participants in the Challenge created predictive models of six life outcomes using data from the Fragile Families and Child Wellbeing Study, a high-quality birth cohort study. This Special Collection includes...
Article
Stewards of social data face a fundamental tension. On one hand, they want to make their data accessible to as many researchers as possible to facilitate new discoveries. At the same time, they want to restrict access to their data as much as possible to protect the people represented in the data. In this article, we provide a case study addressing...
Article
A growing body of research suggests that housing eviction is more common than previously recognized and may play an important role in the reproduction of poverty. The proportion of children affected by housing eviction, however, remains largely unknown. We estimate that one in seven children born in large U.S. cities in 1998–2000 experienced at lea...
Preprint
Stewards of social science data face a fundamental tension. On one hand, they want to make their data accessible to as many researchers as possible to facilitate new discoveries. At the same time, they want to restrict access to their data as much as possible in order to protect the people represented in the data. In this paper, we provide a case s...
Article
Recent research has shown that men’s wages rise more rapidly than expected prior to marriage, but interpretations diverge on whether this indicates selection or a causal effect of anticipating marriage. We seek to adjudicate this debate by bringing together literatures on (1) the male marriage wage premium; (2) selection into marriage based on men’...

Citations

... In conclusion, machine learning has potential and offers excellent opportunities in the social sciences. In addition, technological advances are making machine learning tools an attractive alternative to classical methods, and the technical barriers to using ML are decreasing thanks to available open-source software [30]. ...
... There is a venerable tradition across health and the social sciences of quantifying between-group disparities, and in trying to understand the extent to which observable characteristics explain these differences. For overview discussions, see, among others: Fortin et al. (2011) in economics, Jackson (2020) in health, and Lundberg (2021) in sociology. In this section, we review this approach as applied to racial disparities in EGS outcomes. ...
... Second, we also demonstrate empirically that political and science populism still have crucial differences: Factor analyses showed that both variants are perceived as distinct sets of ideas and have different antecedents, which is in line with theoretical premises that political and science populism conceptualize 'elites' differently and attribute the alleged virtuousness of ordinary people to different reasons. Researchers should thus not consider science populism as 'old wine in a new bottle' (Geurkink et al. 2020), i.e. as a phenomenon that is already included in political populism (see Oliver and Rahn 2016), but should rather employ conceptual arguments based on which populism variant is (or whether both are) relevant to their research questions, and acknowledge that both variants may behave differently depending on the study setting, topical context, and choice of covariates (see Lundberg, Johnson, and Stewart 2021). This may require future scholarship on political and science populism to develop different theoretical and statistical models to explain support for either form of populism. ...
... One of the most recent debates over how to estimate intergenerational mobility correctly unfolded in the pages of the journal Sociological Methodology (Lundberg, Stewart, 2020) in response to an article by Stanford University professor P. Mitnik and his colleagues (Mitnik, Bryant, Weber, 2019). The authors pointed out that, despite the enormous number of studies, there are still no correct estimates of intergenerational mobility. ...
... Milardo, 2009). Social stratification research has explored the possibility of social advantages being nested within kinship structures and found that one's own socioeconomic outcomes are associated with characteristics of proximate as well as more remote kin (Lundberg, 2020; Mare, 2011). ...
... Median annual household income ranges from $18,668 to $34,511, compared to about $55,000 for the city [21]. Percent of the population in poverty ranges from 28 [21]. These characteristics typify neighborhoods with high transiency rates, both for rental homes with federally subsidized rents and for those with non-subsidized rental units [22][23][24][25]. ...
... ML is the study of how algorithms can learn from data (e.g., past social events) with no or little human guidance, thereby predicting new data instances (e.g., future social events) (Hastie et al., 2009). As ML excels in predicting social events (Bail, 2014; Boelaert & Ollion, 2018; DiMaggio et al., 2013; Kino et al., 2021; Molina & Garip, 2019; Mullainathan & Spiess, 2017; Salganik, Lundberg, et al., 2020), some scholars propose using ML algorithms for pure prediction problems "where causal inference is not central, or even necessary" (Kleinberg et al., 2015, p. 491). ...
... The two choices we raise (distributional summary and functional form) are applicable in any regression context. Researchers often equate the research goal with the coefficient of a regression model, but we advocate a more conscious choice of estimand (Lundberg, Johnson, and Stewart 2020). Research constrained to the study of parameterized means risks obscuring important sources of evidence. ...
... The data collection procedures were overseen by the Institutional Review Board of Princeton University (Protocol 8061, Lundberg et al., 2018). Parents provided informed consent to join the study and made this agreement on behalf of their children. ...
... While establishing a common task framework to evaluate causality remains a challenge in many disciplines, especially in the social sciences, several common task frameworks focus on an observable estimand: predicting outcomes. For example, the Fragile Families Challenge is a scholarly mass collaboration tailored to predict six life outcomes for children age 15 (Salganik et al., 2019; Salganik, Lundberg, et al., 2020; Salganik, Maffeo, et al., 2020). These outcomes are child grade point average (GPA), child grit, household eviction, household material hardship, caregiver layoff, and caregiver participation in job training. ...