Stephen W. Raudenbush's research while affiliated with University of Chicago and other places

Publications (158)

Article
In 2003, Chicago Public Schools introduced double-dose algebra, requiring two periods of math—one period of algebra and one of algebra support—for incoming ninth graders with eighth-grade math scores below the national median. Using a regression discontinuity design, earlier studies showed promising results from the program: For median-skill studen...
Article
Early linguistic input is a powerful predictor of children’s language outcomes. We investigated two novel questions about this relationship: Does the impact of language input vary over time, and does the impact of time-varying language input on child outcomes differ for vocabulary and for syntax? Using methods from epidemiology to account for basel...
Article
Social inequality in mathematical skill is apparent at kindergarten entry and persists during elementary school. To level the playing field, we trained teachers to assess children's numerical and spatial skills every 10 wk. Each assessment provided teachers with information about a child's growth trajectory on each skill, information designed to he...
Article
For educational researchers, Al-Ubaydli et al. raise a crucial question: How can the science of scaling experimental innovations contribute to school improvement? By assessing how particular innovative programs work, why, for whom and under what conditions, experimenters test theories: about how children and youth learn, about how adults can collab...
Article
Education research has experienced a methodological renaissance over the past two decades, with a new focus on large-scale randomized experiments. This wave of experiments has made education research an even more exciting area for statisticians, unearthing many lessons and challenges in experimental design, causal inference, and statistics more bro...
Article
The present paper considers a fundamental question in evaluation research: “By how much do program effects vary across sites?” The paper first presents a theoretical model of cross-site impact variation and a related estimation model with a random treatment coefficient and fixed site-specific intercepts. This approach eliminates several biases that...
Article
In 2003, Chicago launched “Double-Dose Algebra,” requiring students with pretest scores below the national median to take two periods of math: algebra and supplemental coursework. In many schools, assignment to Double Dose changed the peer composition of the algebra classroom. Using school-specific instrumental variables within a regression-disconti...
Article
The present article provides a synthesis of the conceptual and statistical issues involved in using multisite randomized trials to learn about and from a distribution of heterogeneous program impacts across individuals and/or program sites. Learning about such a distribution involves estimating its mean value, detecting and quantifying its variatio...
Chapter
Surveys of student perceptions produce multiple measures of classroom quality. This chapter explores which of these measures are most useful in predicting student learning. It introduces the Multilevel Variable Selection Model (MVSM) as standard methods of statistical prediction do not help much and can produce highly misleading results when the pr...
Article
Does experience in school increase or reduce social inequality in skills? Sociologists have long debated this question. Drawing from the counterfactual account of causality, we propose that the impact of going to school on a given skill depends on the quality of the instructional regime a child will experience at school compared with the quality of...
Article
The present paper, which is intended for a diverse audience of evaluation researchers, applied social scientists, and research funders, provides a broad overview of the conceptual and statistical issues involved in using multisite randomized trials to learn about and from variation in program effects across individuals, across policy-relevant and t...
Article
We review findings from a four-year longitudinal study of language learning conducted on two samples: a sample of typically developing children whose parents vary substantially in socioeconomic status, and a sample of children with pre- or perinatal brain injury. This design enables us to study language development across a wide range of language l...
Article
Differences in vocabulary that children bring with them to school can be traced back to the gestures they produced at the age of 1;2, which, in turn, can be traced back to the gestures their parents produced at the same age (Rowe & Goldin-Meadow, 2009a)...
Article
The increasing availability of data from multisite randomized trials provides a potential opportunity to use instrumental variables (IV) methods to study the effects of multiple hypothesized mediators of the effect of a treatment. We derive nine assumptions needed to identify the effects of multiple mediators when using site-by-treatment interactio...
Article
Many youth development programs aim to improve youth outcomes by raising the quality of social interactions occurring in groups such as classrooms, athletic teams, therapy groups, after-school programs, or recreation centers. As a result, evaluators are increasingly interested in determining whether such programs significantly improve “group qualit...
Chapter
Most causal analyses in the social sciences depend on the assumption that each participant possesses a single potential outcome under each possible treatment assignment. Rubin (J Am Stat Assoc 81:961–962, 1986) labeled this the “stable unit treatment value assumption” (SUTVA). Under SUTVA, the individual-specific impact of a treatment depends neith...
Article
This article extends single-level missing data methods to efficient estimation of a Q-level nested hierarchical general linear model given ignorable missing data with a general missing pattern at any of the Q levels. The key idea is to reexpress a desired hierarchical model as the joint distribution of all variables including the outcome t...
Article
Social scientists are frequently interested in assessing the qualities of social settings such as classrooms, schools, neighborhoods, or day care centers. The most common procedure requires observers to rate social interactions within these settings on multiple items and then to combine the item responses to obtain a summary measure of setting qual...
Article
Multisite trials can clarify the average impact of a new program and the heterogeneity of impacts across sites. Unfortunately, in many applications, compliance with treatment assignment is imperfect. For these applications, we propose an instrumental variable (IV) model with person-specific and site-specific random coefficients. Site-specific IV co...
Article
For decades, social scientists have been trying to answer causal questions about the effectiveness of certain programs or policies. The conventional methodology for answering such causal questions relies on the “no interference between different units” assumption; that is, a unit’s outcome depends solely on the treatment that the unit is assigned t...
Article
Children vary widely in the rate at which they acquire words--some start slow and speed up, others start fast and continue at a steady pace. Do early developmental variations of this sort help predict vocabulary skill just prior to kindergarten entry? This longitudinal study starts by examining important predictors (socioeconomic status [SES], pare...
Article
This article addresses three questions: Does reduced class size cause higher academic achievement in reading, mathematics, listening, and word recognition skills? If it does, how large are these effects? Does the magnitude of such effects vary significantly across schools? The authors analyze data from Tennessee’s Student/Teacher Achievement Ratio...
Article
To provide a method for any hospital to evaluate patient mortality using a hierarchical risk-adjustment equation derived from a reference sample. American College of Surgeons National Trauma Data Bank (NTDB). Hierarchical logistic regression models predicting mortality were estimated from NTDB data. Risk-adjusted hospital effects obtained directly...
Article
In organizational studies involving multiple levels, the association between a covariate and an outcome often differs at different levels of aggregation, giving rise to widespread interest in ‘‘contextual effects models.’’ Such models partition the regression into within- and between-cluster components. The conventional approach uses each cluster’s...
Article
Fixed effects models are often useful in longitudinal studies when the goal is to assess the impact of teacher or school characteristics on student learning. In this article, I introduce an alternative procedure: adaptive centering with random effects. I show that this procedure can replicate the fixed effects analysis while offering several compar...
Article
The ability of school (or teacher) value-added models to provide unbiased estimates of school (or teacher) effects rests on a set of assumptions. In this article, we identify six assumptions that are required so that the estimands of such models are well defined and the models are able to recover the desired parameters from observable data. These a...
Article
This article examines the power analyses for the first wave of group-randomized trials funded by the Institute of Education Sciences. Specifically, it assesses the precision and technical accuracy of the studies. The authors identified the appropriate experimental design and estimated the minimum detectable standardized effect size (MDES) for each...
Article
A number of recent studies have used surveys of neighborhood informants and direct observation of city streets to assess aspects of community life such as collective efficacy, the density of kin networks, and social disorder. Raudenbush and Sampson (1999a) have coined the term “ecometrics” to denote the study of the reliability and validity of such...
Article
The gap between Blacks and Whites in educational outcomes has narrowed dramatically over the past 60 years, but progress stopped around 1990. The author reviews research suggesting that increasing the quantity and quality of schooling can play a powerful role in overcoming racial inequality. To achieve that goal, he reasons, our knowledge of best i...
Article
This volume considers the problem of quantitatively summarizing results from a stream of studies, each testing a common hypothesis. In the simplest case, each study yields a single estimate of the impact of some intervention. Such an estimate will deviate from the true effect size as a function of random error because each study uses a finite sampl...
Article
The authors propose a strategy for studying the effects of time-varying instructional treatments on repeatedly observed student achievement. This approach responds to three challenges: (a) The yearly reallocation of students to classrooms and teachers creates a complex structure of dependence among responses; (b) a child’s learning outcome under a...
Article
This paper provides practical guidance for researchers who are designing studies that randomize groups to measure the impacts of interventions on children. To do so, the paper: (1) provides new empirical information about the values of parameters that influence the precision of impact estimates (intra-class correlations and R-squares); (2) examines...
Chapter
Many youth development programs are designed to improve youth outcomes by improving the quality of social interactions occurring in classrooms, athletic teams, therapy groups, after-school programs, recreation centers, or other group settings. In evaluating such programs, it becomes essential to assess the impact of the program on the "group qualit...
Article
A dramatic shift in research priorities has recently produced a large number of ambitious randomized trials in K-12 education. In most cases, the aim is to improve student academic learning by improving classroom instruction. Embedded in these studies are theories about how the quality of classroom must improve if these interventions are to succeed...
Article
Understanding the impact of “instructional regimes” on student learning is central to advancing educational policy. Research on instructional regimes has parallels with clinical trials in medicine yet poses unique challenges because of the social nature of instruction: A child’s potential outcome under a given regime depends on peers and teachers,...
Article
Disparities in verbal ability, a major predictor of later life outcomes, have generated widespread debate, but few studies have been able to isolate neighborhood-level causes in a developmentally and ecologically appropriate way. This study presents longitudinal evidence from a large-scale study of >2,000 children ages 6–12 living in Chicago, along...
Article
The development of model-based methods for incomplete data has been a seminal contribution to statistical practice. Under the assumption of ignorable missingness, one estimates the joint distribution of the complete data for θ ∈ Θ from the incomplete or observed data y_obs. Many interesting models involve one-to-one transformations of θ...
Article
Many youth development programs are designed to improve youth outcomes by improving the quality of social interactions occurring in classrooms, athletic teams, therapy groups, after-school programs, recreation centers, or other group settings. In evaluating such programs, it becomes essential to assess the impact of the program on the “group qualit...
Chapter
Hierarchical data from many small clusters arise by necessity and by design. They arise by necessity when the aim is to study married couples [1], identical twins [25], siblings [12], paired comparison tasks [2], cooperative learning groups [36], multiple informants of child social behavior [20], and studies of animal reproduction [35]. They arise...
Article
Interest has rapidly increased in studies that randomly assign classrooms or schools to interventions. When well implemented, such studies eliminate selection bias, providing strong evidence about the impact of the interventions. However, unless expected impacts are large, the number of units to be randomized needs to be quite large to achieve adeq...
Article
The age at which a child receives a cochlear implant seems to be one of the more important predictors of his or her speech and language outcomes. However, understanding the association between age at implantation and child outcomes is complex because a child's age, length of device use, and age at implantation are highly related. In this study, we...
Article
In many surveys, responses to earlier questions determine whether later questions are asked. The probability of an affirmative response to a given item is therefore nonzero only if the participant responded affirmatively to some set of logically prior items, known as “filter items.” In such surveys, the usual conditional independence assumption of...
Article
This article considers the policy of retaining low-achieving children in kindergarten rather than promoting them to first grade. Under the stable unit treatment value assumption (SUTVA) as articulated by Rubin, each child at risk of retention has two potential outcomes: Y(1) if retained and Y(0) if promoted. But SUTVA is questionable, because a chi...
Article
We examined whether retail tobacco outlet density was related to youth cigarette smoking after control for a diverse range of neighborhood characteristics. Data were gathered from 2116 respondents (aged 11 to 23 years) residing in 178 census tracts in Chicago, Ill. Propensity score stratification methods for continuous exposures were used to adjust...
Article
Applications of group trajectory modeling summarize individual histories in a language that is broadly accessible to clinicians. This strength depends on the belief that a population consists, at least roughly, of a small number of subgroups whose members display similar records of behavior. In this view, the purpose of longitudinal research is to...
Article
Several hypotheses in family psychology involve comparisons of sociocultural groups. Yet the potential for cross-cultural inequivalence in widely used psychological measurement instruments threatens the validity of inferences about group differences. Methods for dealing with these issues have been developed via the framework of item response theory...
Article
In most visual preference surveys, citizens are shown a sample of scenes and asked to rate them on a preference scale. Scenes are then classified by type, and for each scene type, statistics are computed. In the end, results may suggest that one scene type is preferred to another, but that is about all that can be said. In this article, we offer an...
Article
Grade retention has been controversial for many years, and current calls to end social promotion have lent new urgency to this issue. On the one hand, a policy of retaining in grade those students making slow progress might facilitate instruction by making classrooms more homogeneous academically. On the other hand, grade retention might harm high-...
Article
In recent years, several studies in the medical and health service research literature have advocated the use of hierarchical statistical models (multilevel models or random-effects models) to analyze data that are nested (eg, patients nested within hospitals). However, these models are computer-intensive and complicated to perform. There is virtua...
Article
Education research is an interdisciplinary effort long characterized by methodological diversity. Why, then, do we hear an urgent call for mixed methods now? Apparently, a recent shift in the applied research agenda has fostered concern that methodological pluralism is at risk. In this article, the author argues that (a) a focus on evaluating the e...
Article
We analyzed key individual, family, and neighborhood factors to assess competing hypotheses regarding racial/ethnic gaps in perpetrating violence. From 1995 to 2002, we collected 3 waves of data on 2974 participants aged 8 to 25 years living in 180 Chicago neighborhoods, augmented by a separate community survey of 8782 Chicago residents...
Article
This article reveals the grounds on which individuals form perceptions of disorder. Integrating ideas about implicit bias and statistical discrimination with a theoretical framework on neighborhood racial stigma, our empirical test brings together personal interviews, census data, police records, and systematic social observations situated within...
Article
Global propositions about the problems and prospects of program evaluation obscure the circumstances in which social science research can contribute to social problem solving.
Article
The noncentrality parameter for the noncentral F is a precision-weighted sum of squares of treatment means, which is closely related to the test statistic and effect size. The two common effect size estimates are not based on the uniformly minimum variance unbiased (UMVU) estimate of the noncentrality parameter. The UMVU estimate of the noncentrali...
Article
The question of how to estimate school and teacher contributions to student learning is fundamental to educational policy and practice, and the three thoughtful articles in this issue represent a major advance. The current level of public confusion about these issues is so severe and the consequences for schooling so great that it is a big relief t...
Article
Multilevel statistical models have become increasingly popular among public health researchers over the past decade. Yet the enthusiasm with which these models are being adopted may obscure rather than solve some problems of statistical and substantive inference. We discuss the three most common applications of multilevel models in public health: (...
Article
To determine the relationship between urban sprawl, health, and health-related behaviors. Cross-sectional analysis using hierarchical modeling to relate characteristics of individuals and places to levels of physical activity, obesity, body mass index (BMI), hypertension, diabetes, and coronary heart disease. U.S. counties (448) and metropolitan ar...
Article
Purpose: To determine the relationship between urban sprawl, health, and health-related behaviors. Design: Cross-sectional analysis using hierarchical modeling to relate characteristics of individuals and places to levels of physical activity, obesity, body mass index (BMI), hypertension, diabetes, and coronary heart disease. Setting: U.S. cou...
Article
In studying correlates of social behavior, attitudes, and beliefs, a measurement model is required to combine information across a large number of item responses. Multiple constructs are often of interest, and covariates are often multilevel (e.g., measured at the person and neighborhood level). Some item–level missing data can be expected. This pa...
Article
Many researchers who study the relations between school resources and student achievement have worked from a causal model, which typically is implicit. In this model, some resource or set of resources is the causal variable and student achievement is the outcome. In a few recent, more nuanced versions, resource effects depend on intervening influen...
Chapter
As interest in the social sciences and public health increasingly turns to the integration of individual, family, and neighborhood processes, a potential mismatch arises in the quality of measures. Standing behind individual measurement are decades of research, producing measures that often have excellent statistical properties. In contrast, much l...
Article
Differences in maternal characteristics only partially explain the lower birth weights of infants of African-American women. It is hypothesized that economic and social features of urban neighborhoods may further account for these differences. The authors conducted a household survey of 8,782 adults residing in 343 Chicago, Illinois, neighborhoods...
Article
This paper considers the quantitative assessment of ecological settings such as neighborhoods and schools. Available administrative data typically provide useful but limited information on such settings. We demonstrate how more complete information can be reliably obtained from surveys and observational studies. Survey-based assessments are constru...
Article
Much social and behavioral research involves hierarchical data structures. . . . Recent developments in the statistical theory of hierarchical linear models now afford an integrated set of methods for such applications. This introductory text explicates the theory and use of hierarchical linear models (HLM) through rich, illustrative examples and...
Article
Consider a study in which 2 groups are followed over time to assess group differences in the average rate of change, rate of acceleration, or higher degree polynomial effect. In designing such a study, one must decide on the duration of the study, frequency of observation, and number of participants. The authors consider how these choices affect st...
Article
This article investigates the efficiency and robustness of alternative estimators of regression coefficients for three-level data. To study student achievement, researchers might formulate a standard regression model or a hierarchical model with a two- or three-level structure. Having chosen the model, the researchers might employ either a model-ba...
Article
Highlighting resource inequality, social processes, and spatial interdependence, this study combines structural characteristics from the 1990 census with a survey of 8,872 Chicago residents in 1995 to predict homicide variations in 1996–1998 across 343 neighborhoods. Spatial proximity to homicide is strongly related to increased homicide rates, adj...
Article
This review considers statistical analysis of data from studies that obtain repeated measures on each of many participants. Such studies aim to describe the average change in populations and to illuminate individual differences in trajectories of change. A person-specific model for the trajectory of each participant is viewed as the foundation of a...
Article
Places the hierarchical linear models into a broader context. Arguing that restrictions imposed by software and estimation ought not to limit the researchers' imagination, the author discusses a variety of models for various types of change, in various metrics. Among these, similarities are considered between hierarchical linear and structural equat...
Article
This article considers an analytic strategy for measuring and modeling child and adolescent problem behaviors. The strategy embeds an item response model within a hierarchical model to define an interval scale for the outcomes, to assess dimensionality, and to study how individual and contextual factors relate to multiple dimensions of problem beha...
Article
This article compares the degree to which educational attainment and cognitive skill, individually and together, serve to explain labor force outcomes (occupational status and earnings). Although the same antecedent factors affect both of them and they both are associated with labor force outcomes, they are not redundant measures. They are affected...
Article
The multisite trial, widely used in mental health research and education, enables experimenters to assess the average impact of a treatment across sites, the variance of treatment impact across sites, and the moderating effect of site characteristics on treatment efficacy. Key design decisions include the sample size per site and the number of site...
Article
In accelerated longitudinal design, one samples multiple age cohorts and then collects longitudinal data on members of each cohort. The aim is to study age-outcome trajectories over a broad age span during a study of short duration. A threat to valid inference is the Age x Cohort interaction effect. S. W. Raudenbush and W. S. Chan (1993) developed...
Article
Nested random effects models are often used to represent similar processes occurring in each of many clusters. Suppose that, given cluster-specific random effects b, the data y are distributed according to f(y|b, Θ), while b follows a density p(b|Θ). Likelihood inference requires maximization of ∫ f(y|b, Θ) p(b|Θ) db with respect to Θ. Evaluation of t...
Article
Using data collected under the Trial State Assessment (TSA) of the National Assessment of Educational Progress (NAEP), this article describes and illustrates a two-stage statistical model for investigating state-to-state variation in mathematics achievement. At the first stage, within each state, a two-level hierarchical linear model is estimated v...
Article
Researchers commonly ask whether relationships between exogenous predictors, X, and outcomes, Y, are mediated by a third set of variables, Z. Simultaneous equations decompose the relationship between X and Y into an indirect component, operating through Z, and a direct component, the relationship between X and Y given Z. Often, X, Y, and/or Z are m...
Article
This article assesses the sources and consequences of public disorder. Based on the videotaping and systematic rating of more than 23,000 street segments in Chicago, highly reliable scales of social and physical disorder for 196 neighborhoods are constructed. Census data, police records, and an independent survey of more than 3,500 residents are th...
Article
This article considers social and ethnic inequality in access to resources for mathematics learning in eighth grade: favorable school disciplinary climate, advanced course offerings, teacher subject-matter preparation, and emphasis on reasoning during classroom discourse. Data are from 41 states and territories participating in the 1992 Trial Stat...
Article
Bayesian analysis of hierarchically structured data with random intercept and heterogeneous within-group (Level-1) variance is presented. Inferences about all parameters, including the Level-1 variance and intercept for each group, are based on their marginal posterior distributions approximated via the Gibbs sampler. Analysis of artificial data wit...
Article
Few would deny that the civil rights and women's movements have substantially changed U.S. society. Yet ethnic and gender inequality in employment and earnings remain large. Even when comparisons are confined to persons of similar educational attainment, African Americans and Hispanic Americans earn less than European Americans, women earn less tha...
Article
This study reports on the development of a structured interview, My Exposure to Violence (My ETV), that was designed to assess child and youth exposure to violence. Eighty participants between the ages of 9 and 24 were assessed. Data from My ETV were fit to a Rasch model for rating scales, a technique that generates interval level measures and allo...
Article
Empirical researchers commonly invoke instrumental variable (IV) assumptions to identify treatment effects. This paper considers what can be learned under two specific violations of those assumptions: contaminated and corrupted data. Either of these violations prevents point identification, but sharp bounds of the treatment effect remain feasible....

Citations

... In [36], mixed linear regression analysis is applied to a large database to establish the correlation between the student's relationship and their academic performance. Finally, ref. [37] evaluated the impact of sequences of parents' input on children's language outcomes. Although all these applications are longitudinal or level studies, they are focused on selecting the best model for prediction, but no one addresses the issue of optimizing data recollection along time. ...