Sociological Methods & Research

Published by SAGE Publications
Print ISSN: 0049-1241
A simultaneous latent class analysis of survey data from 1978 and 1983 is used to clarify the current controversy over whether opposition to abortion reflects a conservative sexual morality or pro-life values. Results indicate that the “pro-life” and “pro-choice” dichotomy represents an incomplete characterization of the American public—a third group, characterized by a conservative sexual morality and opposition to discretionary abortion, but not to nondiscretionary abortion, must also be included in the classification. The evidence indicates that from 1978 to 1983 this sexually conservative group decreased from one-third to one-fourth of the U.S. adult population and that the “pro-choice” increased to approximately one-half of the population over the same period. The proportion of the population characterized as the “pro-life” class remained stable at approximately 1 of 4 over the same time period.
Multiequation models that contain observed or latent variables are common in the social sciences. To determine whether unique parameter values exist for such models, one needs to assess model identification. In practice analysts rely on empirical checks that evaluate the singularity of the information matrix evaluated at sample estimates of parameters. The discrepancy between estimates and population values, the limitations of numerical assessments of ranks, and the difference between local and global identification make this practice less than perfect. In this paper we outline how to use computer algebra systems (CAS) to determine the local and global identification of multiequation models with or without latent variables. We demonstrate a symbolic CAS approach to local identification and develop a CAS approach to obtain explicit algebraic solutions for each of the model parameters. We illustrate the procedures with several examples, including a new proof of the identification of a model for handling missing data using auxiliary variables. We present an identification procedure for Structural Equation Models that makes use of CAS and that is a useful complement to current methods.
This paper reports an analysis of nonparticipation and bias in a survey research project conducted among seniors in 18 high schools under the federal "informed consent" regulations. Three major findings emerge. First, the use of voluntary participation and (among students under 18) parental consent procedures reduced the participation rate sharply from that obtained in a similar survey in 1964. Second, however, the reduced participation did not introduce much bias into three criterion measures for which population data were available: mean intelligence score, mean GPA, and the intelligence-GPA correlation. Third, we found that bias in these measures on a school-by-school basis was not strongly correlated with the participation rate, suggesting that researchers need to consider factors other than the response rate in assessing the amount of bias in survey research on high school populations.
Life course perspectives focus on the variation in trajectories, generally to identify differences in variation dynamics and classify trajectories accordingly. Our goal here is to develop methods to gauge the discontinuity characteristics trajectories exhibit and demonstrate how these measures facilitate analyses that aim to evaluate, compare, aggregate, and classify behaviors based on the event discontinuity they manifest. We restrict ourselves here to binary event sequences, providing directions for extending the methods in future research. We illustrate our techniques with data on older drug users. It should be noted, though, that these techniques are not restricted to drug use; they can be applied to a wide range of trajectory types. We suggest that the innovative measures of discontinuity presented can be further developed to provide additional analytical tools in social science research and in future applications. Our novel discontinuity measure visualizations have the potential to be valuable assessment strategies for interventions, prevention efforts, and other social services utilizing life course data.
Sociology is pluralist in subject matter, theory, and method, and thus a good place to entertain ideas about causation associated with their use under the law. I focus on two themes of their article: (1) the legal lens on causation that "considers populations in order to make statements about individuals" and (2) the importance of distinguishing between effects of causes and causes of effects.
Observed recruitment chains. Seeds at top; size proportional to number directly recruited.
A. Distribution of V-H estimates of proportion of middle-/low-tier sex workers resulting from the bootstrap procedure under various recruitment regimes. B. Distribution of V-H estimates of proportions of high-tier sex workers resulting from the bootstrap procedure under various recruitment regimes.
Actual and Observed Cross-tier Recruitment Ties and Distribution of All Alters Known by Recruiters by Tier for 174 Middle/Low Recruiters and 100 High-tier Recruiters.
Fixed-effects Logistic Regression Models: Odds of Passing a Coupon to an Alter by Relationship Attributes between Recruiter and Alters and Tier of Recruiter.
Distribution of Reasons Provided by Recruiting Participants for Why They Offered Coupons to Their Alters, Why They Withheld Coupons, and Why They Thought Their Alters Rejected the Invitations.
Respondent-driven sampling (RDS) is a method for recruiting "hidden" populations through a network-based, chain and peer referral process. RDS recruits hidden populations more effectively than other sampling methods and promises to generate unbiased estimates of their characteristics. RDS's faithful representation of hidden populations relies on the validity of core assumptions regarding the unobserved referral process. With empirical recruitment data from an RDS study of female sex workers (FSWs) in Shanghai, we assess the RDS assumption that participants recruit nonpreferentially from among their network alters. We also present a bootstrap method for constructing the confidence intervals around RDS estimates. This approach uniquely incorporates real-world features of the population under study (e.g., the sample's observed branching structure). We then extend this approach to approximate the distribution of RDS estimates under various peer recruitment scenarios consistent with the data as a means to quantify the impact of recruitment bias and of rejection bias on the RDS estimates. We find that the hierarchical social organization of FSWs leads to recruitment biases by constraining RDS recruitment across social classes and introducing bias in the RDS estimates.
Definitions of direct and indirect effects are given for settings in which individuals are clustered in groups or neighborhoods and in which treatments are administered at the group level. A particular intervention may affect individual outcomes both through its effect on the individual and by changing the group or neighborhood itself. Identification conditions are given for controlled direct effects and for natural direct and indirect effects. The interpretation of these identification conditions is discussed within the context of neighborhood research and multilevel modeling. Interventions at a single point in time and time-varying interventions are both considered. The definition of direct and indirect effects requires certain stability or no-interference conditions; some discussion is given as to how these no-interference conditions can be relaxed.
Within-survey multiple imputation (MI) methods are adapted to pooled-survey regression estimation where one survey has more regressors, but typically fewer observations, than the other. This adaptation is achieved through: (1) larger numbers of imputations to compensate for the higher fraction of missing values; (2) model-fit statistics to check the assumption that the two surveys sample from a common universe; and (3) specifying the analysis model completely from variables present in the survey with the larger set of regressors, thereby excluding variables never jointly observed. In contrast to the typical within-survey MI context, cross-survey missingness is monotonic and easily satisfies the Missing At Random (MAR) assumption needed for unbiased MI. Large efficiency gains and substantial reduction in omitted variable bias are demonstrated in an application to sociodemographic differences in the risk of child obesity estimated from two nationally representative cohort surveys.
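As a rough illustration of why the number of imputations matters when the fraction of missing values is high, the sketch below pools estimates across imputed data sets with Rubin's standard combining rules. The function name `pool_rubin` and all numbers are hypothetical and are not taken from the article; the article's own cross-survey procedure involves more than this pooling step.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (Rubin's rules).

    Returns (pooled estimate, total variance, share of the total variance
    attributable to missing data).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    share_missing = (1 + 1 / m) * between / total
    return q_bar, total, share_missing

# Hypothetical coefficient estimates from m = 5 imputed data sets.
est = [0.50, 0.54, 0.46, 0.52, 0.48]
var = [0.010] * 5
q, t, lam = pool_rubin(est, var)
print(round(q, 3), round(t, 4), round(lam, 3))
```

The `(1 + 1 / m)` factor inflates the between-imputation variance for a finite number of imputations, so its cost is largest when the between-imputation variance is large, which is the situation the abstract describes for cross-survey imputation.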
The authors consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on his or her behavior or other measurable responses. The authors show that generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular the authors demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual's enduring traits and his or her choices, even when there is no intrinsic affinity between them. The authors also suggest some possible constructive responses to these results.
This article is an empirical evaluation of the choice of fixed cutoff points in assessing the root mean square error of approximation (RMSEA) test statistic as a measure of goodness-of-fit in Structural Equation Models. Using simulation data, the authors first examine whether there is any empirical evidence for the use of a universal cutoff, and then compare the practice of using the point estimate of the RMSEA alone versus that of using it jointly with its related confidence interval. The results of the study demonstrate that there is little empirical support for the use of .05 or any other value as universal cutoff values to determine adequate model fit, regardless of whether the point estimate is used alone or jointly with the confidence interval. The authors' analyses suggest that to achieve a certain level of power or Type I error rate, the choice of cutoff values depends on model specifications, degrees of freedom, and sample size.
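To make the dependence on degrees of freedom and sample size concrete, the sketch below computes the usual RMSEA point estimate from a model chi-square. It assumes the common formula with N - 1 in the denominator (some software uses N instead); the function name and the input values are hypothetical and are not drawn from the article's simulations.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Same chi-square and degrees of freedom, two hypothetical sample sizes.
print(round(rmsea(85.0, 40, 500), 3))  # 0.047, below a .05 cutoff
print(round(rmsea(85.0, 40, 200), 3))  # 0.075, above a .05 cutoff
```

With an identical chi-square and df, a fixed .05 cutoff gives opposite verdicts at the two sample sizes, which is the kind of sensitivity the abstract points to.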
The Displaced New Orleans Residents Pilot Study was designed to examine the current location, well-being, and plans of people who lived in the City of New Orleans when Hurricane Katrina struck on 29 August 2005. The study is based on a representative sample of pre-Katrina dwellings in the city. Respondents were administered a short paper-and-pencil interview by mail, by telephone, or in person. The pilot study was fielded in the fall of 2006, approximately one year after Hurricane Katrina. In this paper, we describe the motivation for the pilot study, outline its design, and describe the fieldwork results using a set of fieldwork outcome rates and multivariate logistic models. We end with a discussion of the lessons learned from the pilot study for future studies of the effects of Hurricane Katrina on the population of New Orleans. The results point to the challenges and opportunities of studying this unique population.
Average Absolute Difference Between Empirical Standard Errors and Formula-Based Standard Errors of MI Estimates ( ~ u) Across 15 Conditions (5 Missing Data Proportions × 3 Sample Sizes)
Normal-distribution-based maximum likelihood (ML) and multiple imputation (MI) are the two major procedures for missing data analysis. This article compares the two procedures with respect to bias and efficiency of parameter estimates. It also compares formula-based standard errors (SEs) for each procedure against the corresponding empirical SEs. The results indicate that parameter estimates by MI tend to be less efficient than those by ML; and the estimates of variance-covariance parameters by MI are also more biased. In particular, when the population for the observed variables possesses heavy tails, estimates of variance-covariance parameters by MI may contain severe bias even at relatively large sample sizes. Although performing a lot better, ML parameter estimates may also contain substantial bias at smaller sample sizes. The results also indicate that, when the underlying population is close to normally distributed, SEs based on the sandwich-type covariance matrix and those based on the observed information matrix are very comparable to empirical SEs with either ML or MI. When the underlying distribution has heavier tails, SEs based on the sandwich-type covariance matrix for ML estimates are more reliable than those based on the observed information matrix. Both empirical results and analysis show that neither SEs based on the observed information matrix nor those based on the sandwich-type covariance matrix can provide consistent SEs in MI. Thus, ML is preferable to MI in practice, although parameter estimates by MI might still be consistent.
This article extends the Blinder-Oaxaca decomposition method to the decomposition of changes in the wage gap between white and black men over time. The previously implemented technique, in which the contributions of two decomposition components are estimated by subtracting those at time 0 from the corresponding ones at time 1, can yield an untenable conclusion about the extent to which the contributions of the coefficient and endowment effects account for changes in the wage gap over time. This article presents a modified version of Smith and Welch's (1989) decomposition method through which the sources of the change over time are decomposed into five components. The extents to which the education, age, region, metro residence, and marital status variables contribute to the rising racial wage gap between white and black men from 1980 to 2005 are estimated using the five-component detailed decomposition method and are contrasted with the results of the old simple subtraction decomposition technique. In conclusion, this article shows that changes in the racial wage gap between 1980 and 2005 result from many contradicting forces and cannot be reduced to one explanation.
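For readers unfamiliar with the underlying technique, the sketch below shows the basic two-fold Blinder-Oaxaca decomposition at a single time point with one regressor. It is not the five-component over-time method the article presents; the choice of group B coefficients as the reference, the function names, and the data are all assumptions made for illustration.

```python
from statistics import mean

def ols_simple(x, y):
    """Intercept and slope of the least-squares line y = a + b * x."""
    xb, yb = mean(x), mean(y)
    sxx = sum((xi - xb) ** 2 for xi in x)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope

def twofold_decomposition(x_a, y_a, x_b, y_b):
    """Split the mean gap (A minus B) into endowment and coefficient parts."""
    a_a, b_a = ols_simple(x_a, y_a)
    a_b, b_b = ols_simple(x_b, y_b)
    xbar_a, xbar_b = mean(x_a), mean(x_b)
    # Gap due to different mean characteristics, priced at group B's slope.
    endowment = (xbar_a - xbar_b) * b_b
    # Gap due to different intercepts and slopes, evaluated at group A's mean.
    coefficient = (a_a - a_b) + xbar_a * (b_a - b_b)
    gap = mean(y_a) - mean(y_b)
    return gap, endowment, coefficient

# Hypothetical years of schooling (x) and log wages (y) for two groups.
gap, endow, coef = twofold_decomposition(
    [10, 12, 14, 16], [2.0, 2.3, 2.5, 2.8],
    [9, 11, 12, 14], [1.8, 1.9, 2.1, 2.2],
)
print(round(gap, 4), round(endow, 4), round(coef, 4))
```

Because each fitted line passes through its group's means, the two parts sum to the observed gap. The simple subtraction approach the abstract criticizes applies this decomposition at two dates and differences the components.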
This paper explores the implications of possible bias cancellation using Rubin-style matching methods with complete and incomplete data. After reviewing the naïve causal estimator and the approaches of Heckman and Rubin to the causal estimation problem, we show how missing data can complicate the estimation of average causal effects in different ways, depending upon the nature of the missing mechanism. While, contrary to published assertions in the literature, bias cancellation does not generally occur when the multivariate distribution of the errors is symmetric, bias cancellation has been observed to occur for the case where selection into training is the treatment variable, and earnings is the outcome variable. A substantive rationale for bias cancellation is offered, which conceptualizes bias cancellation as the result of a mixture process based on two distinct individual-level decision-making models. While the general properties are unknown, the existence of bias cancellation appears to reduce the average bias in both OLS and matching methods relative to the symmetric distribution case. Analysis of simulated data under a set of different scenarios suggests that matching methods do better than OLS in reducing that portion of bias that comes purely from the error distribution (i.e., from “selection on unobservables”). This advantage is often found also for the incomplete data case. Matching appears to offer no advantage over OLS in reducing the impact of bias due purely to selection on unobservable variables when the error variables are generated by standard multivariate normal distributions, which lack the bias-cancellation property.
The model comparison framework of Levy and Hancock for covariance and mean structure models is extended to treat multiple-group models, both in cases in which group membership is known and in those in which it is unknown (i.e., finite mixtures). The framework addresses questions of distinguishability as well as difference in fit of the models with respect to data, first by determining the nature of the models’ relation in terms of the families of distributions that constitute the models and then by conducting the appropriate statistical tests. In the case of latent mixtures of groups, the standard likelihood ratio theory does not apply, and a bootstrapping approach is used to facilitate the tests. Illustrations demonstrate the procedures.
This article reports on an extension of group-based trajectory modeling to address nonrandom participant attrition or truncation due to death that varies across trajectory groups. The effects of the model extension are explored in both simulated and real data. The analyses of simulated data establish that estimates of trajectory group size as measured by group membership probabilities can be badly biased by differential attrition rates across groups if the groups are initially not well separated. Differential attrition rates also imply that group sizes will change over time, which in turn has important implications for using the model parameter estimates to make population-level projections. Analyses of longitudinal data on disability levels in a sample of very elderly individuals support both of these conclusions.
If a researcher wants to estimate the individual age, period, and cohort coefficients in an age-period-cohort (APC) model, the method of choice is constrained regression, which includes the intrinsic estimator (IE) recently introduced by Yang and colleagues. To better understand these constrained models, the author shows algebraically how each constraint is associated with a specific generalized inverse that is associated with a particular solution vector that (when the model is just identified under the constraint) produces the least squares solution to the APC model. The author then discusses the geometry of constrained estimators in terms of solutions being orthogonal to constraints, solutions to various constraints all lying on a single line in multidimensional space, the distance on that line between various solutions, and the crucial role of the null vector. This provides insight into what characteristics all constrained estimators share and what is unique about the IE. The first part of the article focuses on constrained estimators in general (including the IE), and the latter part compares and contrasts the properties of traditionally constrained APC estimators and the IE. The author concludes with some cautions and suggestions for researchers using and interpreting constrained estimators.
Average response rates for different combinations of γ1 and γ2.
Estimated means for unadjusted and adjusted respondent sample, n = 2,500, f = 0. 
Relative ratio of root mean square error for the adjusted respondent mean to the unadjusted respondent mean, n = 2,500, f = 0. 
Prior work has shown that effective survey nonresponse adjustment variables should be highly correlated with both the propensity to respond to a survey and the survey variables of interest. In practice, propensity models are often used for nonresponse adjustment with multiple auxiliary variables as predictors. These auxiliary variables may be positively or negatively associated with survey participation, they may be correlated with each other, and they can have positive or negative relationships with the survey variables. Yet the consequences of these conditions for nonresponse adjustment are not known to survey practitioners. Simulations are used here to examine the effects of multiple auxiliary variables with opposite relationships with survey participation and the survey variables. The results show that bias and mean square error of adjusted respondent means are substantially different when the predictors have relationships of the same direction compared to when they have opposite directions with either propensity or the survey variables. Implications for nonresponse adjustment and responsive designs are discussed.
gives relbiases of the GREG for Volunteer sample sizes of 250, 500, and 1000. 
Percentage relative biases over 10,000 samples of propensity-weighted estimates and the general regression estimator for Volunteer and Reference sample sizes of 500. Probability of volunteering depends on covariates and analysis variables.
Panels of persons who volunteer to participate in Web surveys are used to make estimates for entire populations, including persons who have no access to the Internet. One method of adjusting a volunteer sample to attempt to make it representative of a larger population involves randomly selecting a reference sample from the larger population. The act of volunteering is treated as a quasi-random process where each person has some probability of volunteering. One option for computing weights for the volunteers is to combine the reference sample and Web volunteers and estimate probabilities of being a Web volunteer via propensity modeling. There are several options for using the estimated propensities to estimate population quantities. Careful analysis to justify these methods is lacking. The goals of this article are (a) to identify the assumptions and techniques of estimation that will lead to correct inference under the quasi-random approach, (b) to explore whether methods used in practice are biased, and (c) to illustrate the performance of some estimators that use estimated propensities. Two of our main findings are (a) that estimators of means based on estimates of propensity models that do not use the weights associated with the reference sample are biased even when the probability of volunteering is correctly modeled and (b) if the probability of volunteering is associated with analysis variables collected in the volunteer survey, propensity modeling does not correct bias.
Partition Structure for Newcomb Final Week. K = 4
Simulation Results: A Summary for the Levels of Each Feature.
Partition Structure for the McKinney Data: K = 4  
Understanding social phenomena with the help of mathematical models requires a coherent combination of theory, models, and data together with using valid data analytic methods. The study of social networks through the use of mathematical models is no exception. The intuitions of structural balance were formalized and led to a pair of remarkable theorems giving the nature of partition structures for balanced signed networks. Algorithms for partitioning signed networks, informed by these formal results, were developed and applied empirically. More recently, “structural balance” was generalized to “relaxed structural balance,” and a modified partitioning algorithm was proposed. Given the critical interplay of theory, models, and data, it is important that methods for partitioning signed networks in terms of the relaxed structural balance model are appropriate. The authors consider two algorithms for establishing partitions of signed networks in terms of relaxed structural balance. One is an older heuristic relocation algorithm, and the other is a new exact solution procedure. The former can be used both inductively and deductively. When used deductively, it requires some prespecification incorporating substantive insights. The new branch-and-bound algorithm is used inductively and requires no prespecification of an image matrix in terms of ideal blocks. Both procedures are demonstrated using several examples from the literature, and their contributions are discussed. Together, the two algorithms provide a sound foundation for partitioning signed networks and yield optimal partitions. Issues of network size and density are considered in terms of their consequences for algorithm performance.
The list experiment is used to detect latent beliefs when researchers suspect a substantial degree of social desirability bias from respondents. This methodology has been used in areas ranging from racial attitudes to political preferences. Meanwhile, social psychologists interested in the salience of physical attributes to social behavior have provided respondents with experimentally altered photographs to test the influence of particular visual cues or traits on social evaluations. This experimental research has examined the effect of skin blemishes, hairlessness, and particular racial attributes on respondents' evaluation of these photographs. While this approach isolates variation in particular visual characteristics from other visual aspects that tend to covary with the traits in question, it fails to adequately deal with social desirability bias. This shortcoming is particularly important when concerned with potentially charged visual cues, such as body mass index (BMI). The present article describes a novel experiment that combines the digital alteration of photographs with the list experiment approach. When tested on a nationally representative sample of Internet respondents, results suggest that when shown photographs of women, male respondents report differences in levels of attractiveness based on the perceived BMI of the photographed confederate. Overweight individuals are less likely than their normal weight peers to report different levels of attractiveness between high-BMI and low-BMI photographs. Knowing that evaluations of attractiveness influence labor market outcomes, the findings are particularly salient in a society with rising incidence of obesity.
3.1. Relative efficiency versus the values of the coefficient of variation of the scrambling variable.
3.2. Relative efficiency versus the values of the coefficient of variation of the sensitive variable.
3.1. Relative efficiency of the proposed alternative randomized response model with respect to the BBB model.
In this article, an alternative randomized response model is proposed. The proposed model is found to be more efficient than the randomized response model studied by Bar-Lev, Bobovitch, and Boukai (2004). The relative efficiency of the proposed model is studied with respect to the Bar-Lev et al. (2004) model under various situations.
In a recent article, Zhang and Hoffman discuss the use of discrete choice logit models in sociological research. In the present article, the authors estimate a multinomial logit model of U.K. Magistrates Courts sentencing using a data set collected by the National Association for the Care and Resettlement of Offenders (NACRO) and test the independence of irrelevant alternatives (IIA) property using six tests. Conducting the tests with the appropriate large sample critical values, the authors find that the acceptance or rejection of IIA depends both on which test and which variant of a given test is used. The authors then use simulation techniques to assess the size and power performance of the tests. The empirical example is revisited with the inferences performed using empirical critical values obtained by simulation, and the resultant inferences are compared. The results show that empirical workers should exercise caution when testing for IIA.
Characteristics of the Multigroup Confirmatory Factor Analysis (MCFA), Item Response Theory (IRT), and Latent Class Factor Analysis (LCFA) Models for Measurement Equivalence (ME)
Fit Statistics for the Estimated Multigroup Confirmatory Factor Analysis (MCFA), Item Response Theory (IRT), and Latent Class Factor Analysis (LCFA) Models. Panel a: MCFA analysis (columns: Npar, LL, LL, df, Significance, BIC(LL), AIC(LL)).
Mean and Standard Deviation of the Number of Samples in Which Inequivalence Was Detected by Form of Inequivalence
Mean and Standard Deviation of the Number of Samples in Which Inequivalence Was Detected by Fit Statistic
Mean and Standard Deviation of the Number of Samples in Which Inequivalence Was Detected by Number of Inequivalent Items per Scale
Three distinctive methods of assessing measurement equivalence of ordinal items, namely, confirmatory factor analysis, differential item functioning using item response theory, and latent class factor analysis, make different modeling assumptions and adopt different procedures. Simulation data are used to compare the performance of these three approaches in detecting the sources of measurement inequivalence. For this purpose, the authors simulated Likert-type data using two nonlinear models, one with categorical and one with continuous latent variables. Inequivalence was set up in the slope parameters (loadings) as well as in the item intercept parameters in a form resembling agreement and extreme response styles. Results indicate that the item response theory and latent class factor models can relatively accurately detect and locate inequivalence in the intercept and slope parameters both at the scale and the item levels. Confirmatory factor analysis performs well when inequivalence is located in the slope parameters but wrongfully indicates inequivalence in the slope parameters when inequivalence is located in the intercept parameters. Influences of sample size, number of inequivalent items in a scale, and model fit criteria on the performance of the three methods are also analyzed.
Visual research is still a rather dispersed and ill-defined domain within the social sciences. Despite a heightened interest in using visuals in research, efforts toward a more unified conceptual and methodological framework for dealing vigilantly with the specifics of this (relatively) new way of scholarly thinking and doing remain sparse and limited in scope. In this article, the author proposes a more encompassing and refined analytical framework for visual methods of research. The "Integrated Framework" tries to account for the great variety within each of the currently discerned types or methods. It does so by moving beyond the more or less arbitrary and often very hybridly defined modes and techniques, with a clear focus on what connects or transcends them. The second part of the article discusses a number of critical issues that have been raised while unfolding the framework. These issues continue to pose a challenge to a more visual social science, but can be turned into opportunities for advancement when dealt with appropriately.
In this article, the authors demonstrate the utility of an extended latent Markov model for analyzing temporal configurations in the behaviors of a sample of 550 domestic violence batterers. Domestic violence research indicates that victims experience a constellation of abusive behaviors rather than a single type of violent outcome. There is also evidence that observed behaviors are highly dynamic, with batterers cycling back and forth between periods of no abuse and violent or controlling behavior. These issues pose methodological challenges for social scientists. The extended latent Markov method uses multiple indicators to characterize batterer behaviors and relates the trajectories of violent states to predictors of abuse at baseline. The authors discuss both methodological refinements of the latent Markov models and policy implications of the data analysis.
Example of grouped versus interleafed questionnaire format
Filter Response Regressed on Section Placement for the Three Randomly Ordered Sections (Clothing, Leisure, and Voting)
Mean Percentage of Don't Knows or Refusals to Follow-up Items by Filter Format
When filter questions are asked to determine respondent eligibility for follow-up items, they are administered either interleafed (follow-up items immediately after the relevant filter) or grouped (follow-up items after multiple filters). Experiments with mental health items have found the interleafed form produces fewer yeses to later filters than the grouped form. Given the sensitivity of mental health, it is unclear whether this is due to respondent desire to avoid sensitive issues or simply the desire to shorten the interview. The absence of validation data in these studies also means the nature of the measurement error associated with the filter types is unknown. We conducted an experiment using mainly nonsensitive topics of varying cognitive burden with a sample that allowed validation of some items. Filter format generally had an effect, which grew as the number of filters increased and was larger when the follow-up questions were more difficult. Surprisingly, there was no evidence that measurement error for filters was reduced in the grouped version; moreover, missing data for follow-up items was increased in that version.
The author discusses the general problem of evaluating differences in adjusted survivor functions and develops a heuristic approach to generate the expected events that would occur under a Cox proportional hazards model. Differences in the resulting expected survivor distributions can be tested using generalized log rank tests. This method should prove useful for making other kinds of comparisons and generating adjusted life tables. The author also discusses alternative specifications of the classical Cox model that allow time-varying effects and thus permit a more direct assessment of group differences at various points in time. He implements recently developed semiparametric approaches for estimating time-varying effects, which permit statistical tests of group difference in effects as well as tests of time-invariant effects. He shows that these approaches can provide insight into the nature of time-varying effects and can help reveal the temporal dynamic of group differences.
Configurational comparative methods constitute promising methodological tools that narrow the gap between variable-oriented and case-oriented research. Their infancy, however, means that the limits and advantages of these techniques are not clear. Tests on the sensitivity of qualitative comparative analysis (QCA) results have been sparse in previous empirical studies, and so has the provision of guidelines for doing this. Therefore this article uses data from a textbook example to discuss and illustrate various robustness checks of results based on the employment of crisp-set QCA and fuzzy-set QCA. In doing so, it focuses on three issues: the calibration of raw data into set-membership values, the frequency of cases linked to the configurations, and the choice of consistency thresholds. The study emphasizes that robustness tests, using systematic procedures, should be regarded as an important, and maybe even indispensable, analytical step in configurational comparative analysis.
Over the past decades there has been an increasing use of panel surveys at the household or individual level. Panel data have important advantages compared to independent cross sections, but also two potential drawbacks: attrition bias and panel conditioning effects. Attrition bias arises if dropping out of the panel is correlated with a variable of interest. Panel conditioning arises if responses are influenced by participation in the previous wave(s); the experience of the previous interview(s) may affect the answers to questions on the same topic, such that these answers differ systematically from those of respondents interviewed for the first time. In this study the authors discuss how to disentangle attrition and panel conditioning effects and develop tests for panel conditioning allowing for nonrandom attrition. First, the authors consider a nonparametric approach with assumptions on the sample design only, leading to interval identification of the measures for the attrition and panel conditioning effects. Second, the authors introduce additional assumptions concerning the attrition process, which lead to point estimates and standard errors for both the attrition bias and the panel conditioning effect. The authors illustrate their method on a variety of repeated questions in two household panels. The authors find significant panel conditioning effects in knowledge questions, but not in other types of questions. The examples show that the bounds can be informative if the attrition rate is not too high. In most but not all of the examples, point estimates of the panel conditioning effect are similar for different additional assumptions on the attrition process.
In this article the authors draw attention to the most recent and promising developments of sequence analysis. Taking methodological developments in life course sociology as the starting point, the authors detail the complementary strength in sequence analysis in this field. They argue that recent advantages of sequence analysis were developed in response to criticism of the original work, particularly optimal matching analysis. This debate arose over the past two decades and culminated in the 2000 exchange in Sociological Methods & Research. The debate triggered a "second wave" of sequence techniques that led to new technical implementations of old ideas in sequence analysis. The authors bring these new technical approaches together, demonstrate selected advances with synthetic example data, and show how they conceptually contribute to life course research. This article demonstrates that in less than a decade, the field has made much progress toward fulfilling the prediction that Andrew Abbott made in 2000, that "anybody who believes that pattern search techniques are not going to be basic to social sciences over the next 25 years is going to be very much surprised" (p. 75).
Classification of Target Population 
In recent years, social scientists have increasingly turned to matching as a method for drawing causal inferences from observational data. Matching compares those who receive a treatment to those with similar background attributes who do not receive a treatment. Researchers who use matching frequently tout its ability to reduce bias, particularly when applied to data sets that contain extensive background information. Drawing on a randomized voter mobilization experiment, the authors compare estimates generated by matching to an experimental benchmark. The enormous sample size enables the authors to exactly match each treated subject to 40 untreated subjects. Matching greatly exaggerates the effectiveness of pre-election phone calls encouraging voter participation. Moreover, it can produce nonsensical results: Matching suggests that another pre-election phone call that encouraged people to wear their seat belts also generated huge increases in voter turnout. This illustration suggests that caution is warranted when applying matching estimators to observational data, particularly when one is uncertain about the potential for biased inference.
Root mean square error (RMSE) for observed-data estimators of the mean, variance, and standard deviation of a normal variable. To standardize the results, the value of has been set to 1. 
RMSE for MI estimators of the mean, variance, and standard deviation of a normal variable. The number of observations increases along the horizontal axis, while the number of imputations is held constant at D=5. To standardize the results, the value of has been set to 1. 
Single imputation (SI) estimators.
RMSE for MI estimators of the mean, variance, and standard deviation of a normal variable. The number of imputations D increases along the horizontal axis, while the number of observed and missing values is held constant at 20. To standardize the results, the value of has been set to 1.
Widely used methods for analyzing missing data can be biased in small samples. To understand these biases, we evaluate in detail the situation where a small univariate normal sample, with values missing at random, is analyzed using either observed-data maximum likelihood (ML) or multiple imputation (MI). We evaluate two types of MI: the usual Bayesian approach, which we call posterior draw (PD) imputation, and a little-used alternative, which we call ML imputation, in which values are imputed conditionally on an ML estimate. We find that observed-data ML is more efficient and has lower mean squared error than either type of MI. Between the two types of MI, ML imputation is more efficient than PD imputation, and ML imputation also has less potential for bias in small samples. The bias and efficiency of PD imputation can be improved by a change of prior.
Missing data are common in observational studies due to self-selection of subjects. Missing data can bias estimates of linear regression and related models. The nature of selection bias and econometric methods for correcting it are described. The econometric approach relies upon a specification of the selection mechanism. We extend this approach to binary logit and probit models and provide a simple test for selection bias in these models. An analysis of candidate preference in the 1984 U.S. presidential election illustrates the technique.
Web surveys have several advantages compared to more traditional surveys with in-person interviews, telephone interviews, or mail surveys. Their most obvious potential drawback is that they may not be representative of the population of interest because the sub-population with access to the Internet is quite specific. This paper investigates propensity scores as a method for dealing with selection bias in web surveys. The authors' main example has an unusually rich sampling design, where the Internet sample is drawn from an existing much larger probability sample that is representative of the US 50+ population and their spouses (the Health and Retirement Study). They use this to estimate propensity scores and to construct weights based on the propensity scores to correct for selectivity. They investigate whether propensity weights constructed on the basis of a relatively small set of variables are sufficient to correct the distribution of other variables so that these distributions become representative of the population. If this is the case, information about these other variables could be collected over the Internet only. Using a backward stepwise regression they find that at a minimum all demographic variables are needed to construct the weights. The propensity adjustment works well for many but not all variables investigated. For example, they find that correcting on the basis of socio-economic status by using education level and personal income is not enough to get a representative estimate of stock ownership. This casts some doubt on the common procedure of using a few basic variables to blindly correct for selectivity in convenience samples drawn over the Internet. Alternatives include providing non-Internet users with access to the Web or conducting web surveys in the context of mixed mode surveys.
In this note, the Cramer-Rao lower bound of variance for randomized response sampling using two decks of cards is developed. The lower bound of variance is compared with that of the recent estimator proposed by Odumade and Singh at equal protection of respondents. Real face-to-face interview data collected using two decks of cards are analyzed and the results are discussed.
This article uses longitudinal data from the British Cohort Study to examine the early labor market trajectories (the careers) of more than 5,000 women aged 16 to 29 years. Conventional event history approaches focus on particular transitions, the return to work after childbirth, for example, whereas the authors treat female careers more holistically, using sequence methods and cluster analysis to arrive at a rich but readily interpretable description of the data. The authors' typology presents a fuller picture of the underlying heterogeneity of female career paths that may not be revealed by more conventional transition-focused methods. Furthermore, the authors contribute to the small but growing literature on sequence analysis of female labor force participation by using their typology to show how careers are related to family background and school experiences.
DI Question Protocol on Current Employment Characteristics. PDI: Last time we interviewed you, on <INTDATE>, you said your job was <OCCUP>. Are you still in that same occupation?
Models of respondent cognition problems or conversational behaviour
The authors examine how questionnaire structure affects survey interaction in the context of dependent interviewing (DI). DI is widely used in panel surveys to reduce observed spurious change in respondent circumstances. Although a growing literature generally finds beneficial measurement properties, little is known about how DI functions in interviews. The authors systematically observed survey interaction using behavior coding and analyzed an application of DI to obtain respondent employment characteristics. The authors found respondents indicated change in circumstances through a number of verbal machinations, including mismatch answers and explanations. Assessing whether these behaviors influenced subsequent question administration, the authors found qualitative evidence that the information disclosed when negating a DI question leads to subsequent interviewing errors. Quantitative analyses supported this evidence, suggesting that standardized interviewing deteriorates as respondents struggle to identify change in their circumstances. This analysis suggests that the reliability of detail about changed circumstances may not be improved using DI.
Plot of C(x) (horizontal axis) vs. C(x, t_x) (vertical axis)
Smoothed distributions of C(x, t_x) from 5000 boot-
Plot of C(x, t_x) (horizontal axis) vs. C(x′) (vertical axis)
Categorical time series, covering comparable time spans, are often quite different in a number of aspects: the number of distinct states, the number of transitions, and the distribution of durations over states. Each of these aspects contributes to an aggregate property of such series that is called complexity. Among sociologists and demographers, complexity is believed to systematically differ between groups as a result of social structure or social change. Such groups differ in, for example, age, gender, or status. The author proposes quantifications of complexity, based upon the number of distinct subsequences in combination with, in case of associated durations, the variance of these durations. A simple algorithm to compute these coefficients is provided and some of the statistical properties of the coefficients are investigated in an application to family formation histories of young American females.
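The count of distinct subsequences on which the proposed coefficients build can be illustrated with a short sketch. This is not the author's published algorithm: the function name `count_distinct_subsequences` is hypothetical, the empty subsequence is included in the count, and the duration-variance component mentioned in the abstract is left out.

```python
def count_distinct_subsequences(states):
    """Count distinct subsequences of a categorical sequence (empty one included)."""
    total = 1  # only the empty subsequence exists before any state is read
    seen_before = {}  # state -> count that existed just before its latest occurrence
    for state in states:
        previous_total = total
        # Appending `state` to every existing subsequence doubles the count;
        # those already generated at the previous occurrence are subtracted.
        total = 2 * total - seen_before.get(state, 0)
        seen_before[state] = previous_total
    return total


# Hypothetical three-spell histories: a repeated state yields fewer
# distinct subsequences than three different states.
repeated = count_distinct_subsequences("aba")
all_different = count_distinct_subsequences("abc")
```

Under this counting rule a history that revisits a state scores lower than one of equal length that passes through all-different states, which is the intuition behind using the count as an ingredient of complexity.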
As an extension of hierarchical linear models (HLMs), cross-classified random effects models (CCREMs) are used for analyzing multilevel data that do not have strictly hierarchical structures. Proportional reduction in prediction error, a multilevel version of the R2 in ordinary multiple regression, measures the predictive ability of a model and is useful in model selection. However, such a measure is not yet available for CCREMs. Using a two-level random-intercept CCREM, the authors have investigated how the estimated variance components change when predictors are added and have extended the measures of proportional reduction in prediction error from HLMs to CCREMs. The extended measures are generally unbiased for both balanced and unbalanced designs. An example is provided to illustrate the computation and interpretation of these measures in CCREMs.
Comparison of parameter estimates and standard errors (in parentheses)
Clogg and Eliason (1987) proposed a simple method for taking account of survey weights when fitting log-linear models to contingency tables. This article investigates the properties of this method. A rationale is provided for the method when the weights are constant within the cells of the table. For more general cases, however, it is shown that the standard errors produced by the method are invalid, contrary to claims in the literature. The method is compared to the pseudo maximum likelihood method both theoretically and through an empirical study of social mobility relating daughter's class to father's class using survey data from France. The method of Clogg and Eliason is found to underestimate standard errors systematically. The article concludes by recommending against the use of this method, despite its simplicity. The limitations of the method may be overcome by using the pseudo maximum likelihood method.
Definitions of direct and indirect effects are given for settings in which individuals are clustered in groups or neighborhoods and in which treatments are administered at the group level. A particular intervention may affect individual outcomes both through its effect on the individual and by changing the group or neighborhood itself. Identification conditions are given for controlled direct effects and for natural direct and indirect effects. The interpretation of these identification conditions is discussed within the context of neighborhood research and multilevel modeling. Interventions at a single point in time and time-varying interventions are both considered. The definition of direct and indirect effects requires certain stability or no-interference conditions; some discussion is given as to how these no-interference conditions can be relaxed.
In studying temporally ordered rates of events, epidemiologists, demographers, and social scientists often find it useful to distinguish three different temporal dimensions, namely, age (age of the participants involved), time period (the calendar year or other temporal period for recording the events of interest), and cohort (birth cohort or generation). Age-period-cohort (APC) analysis aims to analyze age-year-specific archived event rates to capture temporal trends in the events under investigation. However, in the context of tables of rates, the well-known relationship among these three factors, Period - Age = Cohort, makes the parameter estimation of the APC multiple classification model difficult. The identification problem of the parameter estimation has been studied since the 1970s and still remains in debate. Recent developments in this regard include the intrinsic estimator (IE) method, the autoregressive cohort model, the age-period-cohort-characteristic (APCC) model, the regression splines model, the smoothing cohort model, and the hierarchical APC model. O'Brien (2011; pp. 419-452, this issue) makes a further contribution in studying constrained estimators, particularly the IE, in the APC models. The authors, however, have important disagreements with O'Brien as to what the statistical properties of the IE are and how the estimates from the IE should be interpreted. The authors point out these disagreements to conclude the article.
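The identification problem created by Period - Age = Cohort can be made concrete with a small sketch. It is only an illustration of the linear dependency, not of the intrinsic estimator or any other method named above; the function names, the grid of ages and periods, and the slope values are all hypothetical.

```python
def linear_predictor(age, period, slopes):
    """Linear APC predictor with cohort defined as period - age."""
    age_slope, period_slope, cohort_slope = slopes
    cohort = period - age
    return age_slope * age + period_slope * period + cohort_slope * cohort


def shift_slopes(slopes, k):
    """Move an arbitrary amount k between the three linear trends."""
    age_slope, period_slope, cohort_slope = slopes
    return (age_slope + k, period_slope - k, cohort_slope + k)


# Hypothetical age-by-period table of cells.
cells = [(age, period) for age in range(20, 60, 5) for period in range(1980, 2010, 5)]
original = (0.25, -0.5, 0.125)
shifted = shift_slopes(original, 2.0)

# Largest difference in predictions between the two sets of slopes.
max_gap = max(
    abs(linear_predictor(a, p, original) - linear_predictor(a, p, shifted))
    for a, p in cells
)
```

Because the two slope sets give the same prediction in every cell, the data alone cannot choose between them; some constraint has to be imposed, and the debate summarized above concerns which constraint and how to interpret the result.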
Time use data (TUD) are distinctive, being episodic in nature and consisting of both continuous and discrete (exact zeros) values. TUD are non-negative and generally right skewed. To analyze such data, the Tobit, and to a lesser extent, linear regression models are often used. Tobit models assume the zeros represent censored values of an underlying normally distributed latent variable that theoretically includes negative values. Both the linear regression and Tobit models have normality as a key assumption. The Poisson-gamma distribution is a distribution with both a point mass at zero (corresponding to zero time spent on a given activity) and a continuous component. Using generalized linear models, TUD can be modeled utilizing the Poisson-gamma distribution. Using TUD, Tobit and linear regression models are compared to the Poisson-gamma with respect to the interpretation of the model, the model fit (analysis of residuals), and model performance through the use of a simulated data experiment. The Poisson-gamma is found to be theoretically and empirically more sound in many circumstances.
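How a Poisson-gamma variable produces exact zeros alongside a continuous positive part can be sketched by simulation. The sketch assumes the usual compound construction (a Poisson number of episodes, each with a gamma-distributed duration); the function names and the parameter values are hypothetical and are not taken from the article.

```python
import math
import random


def draw_poisson(rng, rate):
    """Draw a Poisson count by Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def draw_poisson_gamma(rng, rate, shape, scale):
    """Total time: the sum of a Poisson number of gamma-distributed episodes."""
    episodes = draw_poisson(rng, rate)
    return sum(rng.gammavariate(shape, scale) for _ in range(episodes))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rate, shape, scale = 1.2, 2.0, 30.0  # hypothetical: episodes per day, minutes per episode
sample = [draw_poisson_gamma(rng, rate, shape, scale) for _ in range(20000)]

share_zero = sum(1 for value in sample if value == 0) / len(sample)
expected_zero = math.exp(-rate)  # probability of zero episodes under the Poisson part
```

In this construction the simulated share of exact zeros should sit close to exp(-rate), while every nonzero value is strictly positive, so no latent negative values are needed to account for the zeros.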
Consent rates by consent type 
Propensity to consent, by BHPS respondent, interview and interviewer characteristics (bivariate probit regressions).
In the UK, in order to link individual-level administrative records to survey responses, a respondent needs to give their written consent. This paper explores whether characteristics of the respondent, the interviewer or survey design features influence consent. We use the BHPS combined with a survey of interviewers to model the probability that respondents consent to adding health and social security records to their survey responses. We find that some respondent characteristics and characteristics of the interview process within the household matter. By contrast, interviewer characteristics, including personality and attitudes to persuading respondents, are not associated with consent.
Analyses of social network data have suggested that obesity, smoking, happiness, and loneliness all travel through social networks. Individuals exert “contagion effects” on one another through social ties and association. These analyses have come under critique because of the possibility that homophily from unmeasured factors may explain these statistical associations and because similar findings can be obtained when the same methodology is applied to height, acne, and headaches, for which the conclusion of contagion effects seems somewhat less plausible. The author uses sensitivity analysis techniques to assess the extent to which supposed contagion effects for obesity, smoking, happiness, and loneliness might be explained away by homophily or confounding and the extent to which the critique using analysis of data on height, acne, and headaches is relevant. Sensitivity analyses suggest that contagion effects for obesity and smoking cessation are reasonably robust to possible latent homophily or environmental confounding; those for happiness and loneliness are somewhat less so. Supposed effects for height, acne, and headaches are all easily explained away by latent homophily and confounding. The methodology that has been used in past studies for contagion effects in social networks, when used in conjunction with sensitivity analysis, may prove useful in establishing social influence for various behaviors and states. The sensitivity analysis approach can be used to address the critique of latent homophily as a possible explanation of associations interpreted as contagion effects.
Gaining valid answers to so-called sensitive questions is an age-old problem in survey research. Various techniques have been developed to guarantee anonymity and minimize the respondent's feelings of jeopardy. Two such techniques are the randomized response technique (RRT) and the unmatched count technique (UCT). In this study we evaluate the effectiveness of different implementations of the RRT (using a forced-response design) in a computer-assisted setting and also compare the use of the RRT to that of the UCT. The techniques are evaluated according to various quality criteria, such as the prevalence estimates they provide, the ease of their use, and respondent trust in the techniques. Our results indicate that the RRTs are problematic with respect to several domains, such as the limited trust they inspire and non-response, and that the RRT estimates are unreliable due to a strong false "no" bias, especially for the more sensitive questions. The UCT, however, performed well compared to the RRTs on all the evaluated measures. The UCT estimates also had more face validity than the RRT estimates. We conclude that the UCT is a promising alternative to RRT in self-administered surveys and that future research should be directed towards evaluating and improving the technique.
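The prevalence estimates that the two techniques deliver can be sketched with the standard moment estimators. This is a minimal illustration under assumed designs, not the study's implementation: the function names, the design probabilities, and the example figures are hypothetical.

```python
def forced_response_prevalence(share_yes, p_truth, p_forced_yes):
    """Solve share_yes = p_forced_yes + p_truth * prevalence for the prevalence."""
    return (share_yes - p_forced_yes) / p_truth


def unmatched_count_prevalence(long_list_counts, short_list_counts):
    """Difference in mean item counts between the two randomly assigned groups."""
    mean_long = sum(long_list_counts) / len(long_list_counts)
    mean_short = sum(short_list_counts) / len(short_list_counts)
    return mean_long - mean_short


# Hypothetical forced-response design: 75% answer truthfully,
# 12.5% are forced to say "yes", 12.5% are forced to say "no".
rrt_estimate = forced_response_prevalence(share_yes=0.25, p_truth=0.75, p_forced_yes=0.125)

# Hypothetical item counts: the long list adds the sensitive item to the short list.
uct_estimate = unmatched_count_prevalence([2, 3, 1, 2], [1, 2, 2, 1])
```

The forced-response formula also shows why a false "no" bias is damaging: respondents who answer "no" when instructed to say "yes" lower the observed share of yeses, and the estimator subtracts the full forced-yes probability regardless, pulling the estimate down.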
Top-cited authors
Kenneth A Bollen
  • University of North Carolina at Chapel Hill
J Scott Long
  • Indiana University Bloomington
Chih-Ping Chou
  • University of Southern California
Bobby L Jones
Paul D Allison
  • University of Pennsylvania