Sociological Methods & Research

Published by SAGE

Online ISSN: 1552-8294 · Print ISSN: 0049-1241

Articles


Sexual Morality, Pro-Life Values, and Attitudes toward Abortion: A Simultaneous Latent Structure Analysis for 1978-1983
  • Article

December 1987 · 40 Reads

A simultaneous latent class analysis of survey data from 1978 and 1983 is used to clarify the current controversy over whether opposition to abortion reflects a conservative sexual morality or pro-life values. Results indicate that the “pro-life” and “pro-choice” dichotomy represents an incomplete characterization of the American public—a third group, characterized by a conservative sexual morality and opposition to discretionary abortion, but not to nondiscretionary abortion, must also be included in the classification. The evidence indicates that from 1978 to 1983 this sexually conservative group decreased from one-third to one-fourth of the U.S. adult population and that the “pro-choice” class increased to approximately one-half of the population over the same period. The proportion of the population in the “pro-life” class remained stable at approximately one-fourth over the same time period.

Figures: Bollen and Hoyle (1990) model; Duncan (1975) model; Enders (2008) model (Panels A and B).

Model Identification and Computer Algebra
  • Article
  • Full-text available

October 2010 · 502 Reads

Multiequation models that contain observed or latent variables are common in the social sciences. To determine whether unique parameter values exist for such models, one needs to assess model identification. In practice, analysts rely on empirical checks that assess the singularity of the information matrix evaluated at sample estimates of the parameters. The discrepancy between estimates and population values, the limitations of numerical assessments of rank, and the difference between local and global identification make this practice less than perfect. In this paper we outline how to use computer algebra systems (CAS) to determine the local and global identification of multiequation models with or without latent variables. We demonstrate a symbolic CAS approach to local identification and develop a CAS approach to obtain explicit algebraic solutions for each of the model parameters. We illustrate the procedures with several examples, including a new proof of the identification of a model for handling missing data using auxiliary variables. We present an identification procedure for Structural Equation Models that makes use of CAS and is a useful complement to current methods.
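
As a minimal sketch of the CAS idea (not the authors' code), the snippet below uses SymPy on a toy one-factor, three-indicator model: it solves the implied covariance equations for every parameter and checks the generic rank of the Jacobian. The model, symbol names, and parameterization are illustrative assumptions.

```python
# Sketch: checking identification of a one-factor, three-indicator model with SymPy.
# Variable names (lam2, lam3, phi, theta1..3, sigma11..sigma23) are illustrative.
import sympy as sp

lam2, lam3, phi, th1, th2, th3 = sp.symbols('lam2 lam3 phi theta1 theta2 theta3')
s11, s22, s33, s12, s13, s23 = sp.symbols('sigma11 sigma22 sigma33 sigma12 sigma13 sigma23')

# Implied moments of x1 = xi + e1, x2 = lam2*xi + e2, x3 = lam3*xi + e3
equations = [
    sp.Eq(s11, phi + th1),
    sp.Eq(s22, lam2**2 * phi + th2),
    sp.Eq(s33, lam3**2 * phi + th3),
    sp.Eq(s12, lam2 * phi),
    sp.Eq(s13, lam3 * phi),
    sp.Eq(s23, lam2 * lam3 * phi),
]

# Explicit algebraic solutions for every parameter indicate global identification
solution = sp.solve(equations, [lam2, lam3, phi, th1, th2, th3], dict=True)
for sol in solution:
    for par, expr in sol.items():
        print(par, '=', sp.simplify(expr))

# Local identification check: generic rank of the Jacobian of the moment equations
theta = sp.Matrix([lam2, lam3, phi, th1, th2, th3])
moments = sp.Matrix([phi + th1, lam2**2 * phi + th2, lam3**2 * phi + th3,
                     lam2 * phi, lam3 * phi, lam2 * lam3 * phi])
print('Jacobian rank:', moments.jacobian(theta).rank())
```

Explicit solutions for every parameter suggest global identification; a full-rank symbolic Jacobian suggests local identification, which is the two-step logic the abstract describes.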

The Impact of Informed Consent Regulations On Response Rate and Response Bias

December 1977 · 32 Reads

This paper reports an analysis of nonparticipation and bias in a survey research project conducted among seniors in 18 high schools under the federal "informed consent" regulations. Three major findings emerge. First, the use of voluntary participation and (among students under 18) parental consent procedures reduced the participation rate sharply from that obtained in a similar survey in 1964. Second, the reduced participation nevertheless did not introduce much bias into three criterion measures for which population data were available: mean intelligence score, mean GPA, and the intelligence-GPA correlation. Third, we found that bias in these measures on a school-by-school basis was not strongly correlated with the participation rate, suggesting that researchers need to consider factors other than the response rate in assessing the amount of bias in survey research on high school populations.

Measuring Discontinuity in Binary Longitudinal Data: Applications to Drug Use Trajectories

May 2014 · 65 Reads

Life course perspectives focus on the variation in trajectories, generally to identify differences in variation dynamics and classify trajectories accordingly. Our goal here is to develop methods to gauge the discontinuity characteristics trajectories exhibit and to demonstrate how these measures facilitate analyses aimed at evaluating, comparing, aggregating, and classifying behaviors based on the event discontinuity they manifest. We restrict ourselves here to binary event sequences, providing directions for extending the methods in future research. We illustrate our techniques with data on older drug users. It should be noted, though, that these techniques are not restricted to drug use but can be applied to a wide range of trajectory types. We suggest that the innovative measures of discontinuity presented here can be further developed to provide additional analytical tools in social science research and in future applications. Our novel discontinuity measure visualizations have the potential to be valuable assessment strategies for interventions, prevention efforts, and other social services utilizing life course data.

Figures and tables: observed recruitment chains (seeds at top, size proportional to the number directly recruited); bootstrap distributions of V-H estimates of the proportions of middle-/low-tier and high-tier sex workers under various recruitment regimes; actual and observed cross-tier recruitment ties for 174 middle-/low-tier and 100 high-tier recruiters; fixed-effects logistic regressions of the odds of passing a coupon to an alter; reasons participants gave for offering or withholding coupons and for alters' rejections.
An Empirical Analysis of the Impact of Recruitment Patterns on RDS Estimates among a Socially Ordered Population of Female Sex Workers in China

August 2013 · 216 Reads

Respondent-driven sampling (RDS) is a method for recruiting "hidden" populations through a network-based, chain and peer referral process. RDS recruits hidden populations more effectively than other sampling methods and promises to generate unbiased estimates of their characteristics. RDS's faithful representation of hidden populations relies on the validity of core assumptions regarding the unobserved referral process. With empirical recruitment data from an RDS study of female sex workers (FSWs) in Shanghai, we assess the RDS assumption that participants recruit nonpreferentially from among their network alters. We also present a bootstrap method for constructing the confidence intervals around RDS estimates. This approach uniquely incorporates real-world features of the population under study (e.g., the sample's observed branching structure). We then extend this approach to approximate the distribution of RDS estimates under various peer recruitment scenarios consistent with the data as a means to quantify the impact of recruitment bias and of rejection bias on the RDS estimates. We find that the hierarchical social organization of FSWs leads to recruitment biases by constraining RDS recruitment across social classes and introducing bias in the RDS estimates.

Direct and Indirect Effects for Neighborhood-Based Clustered and Longitudinal Data

May 2010 · 140 Reads

Definitions of direct and indirect effects are given for settings in which individuals are clustered in groups or neighborhoods and in which treatments are administered at the group level. A particular intervention may affect individual outcomes both through its effect on the individual and by changing the group or neighborhood itself. Identification conditions are given for controlled direct effects and for natural direct and indirect effects. The interpretation of these identification conditions is discussed within the context of neighborhood research and multilevel modeling. Interventions at a single point in time and time-varying interventions are both considered. The definition of direct and indirect effects requires certain stability or no-interference conditions; some discussion is given as to how these no-interference conditions can be relaxed.

Multiple Imputation For Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys

November 2013 · 78 Reads

Within-survey multiple imputation (MI) methods are adapted to pooled-survey regression estimation where one survey has more regressors, but typically fewer observations, than the other. This adaptation is achieved through: (1) larger numbers of imputations to compensate for the higher fraction of missing values; (2) model-fit statistics to check the assumption that the two surveys sample from a common universe; and (3) specifying the analysis model completely from variables present in the survey with the larger set of regressors, thereby excluding variables never jointly observed. In contrast to the typical within-survey MI context, cross-survey missingness is monotonic and easily satisfies the Missing At Random (MAR) assumption needed for unbiased MI. Large efficiency gains and substantial reduction in omitted variable bias are demonstrated in an application to sociodemographic differences in the risk of child obesity estimated from two nationally representative cohort surveys.
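
The pooling step any MI analysis relies on is Rubin's combining rules; the sketch below shows the generic arithmetic with made-up numbers and is not the authors' cross-survey procedure.

```python
# Sketch of Rubin's rules for pooling estimates from m imputed data sets.
# The estimates and standard errors below are made-up illustrations.
import numpy as np

estimates = np.array([0.42, 0.45, 0.40, 0.47, 0.43])       # point estimate per imputation
variances = np.array([0.010, 0.011, 0.009, 0.012, 0.010])  # squared SE per imputation
m = len(estimates)

q_bar = estimates.mean()                 # pooled point estimate
w_bar = variances.mean()                 # within-imputation variance
b = estimates.var(ddof=1)                # between-imputation variance
t = w_bar + (1 + 1 / m) * b              # total variance
fmi = (1 + 1 / m) * b / t                # approximate fraction of missing information

print(f"pooled estimate = {q_bar:.3f}, SE = {np.sqrt(t):.3f}, FMI ~ {fmi:.2f}")
```

The fraction of missing information computed this way is what motivates adaptation (1) above: more imputations are warranted when the fraction of missing values is high.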

Homophily and Contagion Are Generically Confounded in Observational Social Network Studies

May 2011 · 251 Reads

The authors consider processes on social networks that can potentially involve three factors: homophily, or the formation of social ties due to matching individual traits; social contagion, also known as social influence; and the causal effect of an individual's covariates on his or her behavior or other measurable responses. The authors show that generically, all of these are confounded with each other. Distinguishing them from one another requires strong assumptions on the parametrization of the social process or on the adequacy of the covariates used (or both). In particular the authors demonstrate, with simple examples, that asymmetries in regression coefficients cannot identify causal effects and that very simple models of imitation (a form of social contagion) can produce substantial correlations between an individual's enduring traits and his or her choices, even when there is no intrinsic affinity between them. The authors also suggest some possible constructive responses to these results.

An Empirical Evaluation of the Use of Fixed Cutoff Points in RMSEA Test Statistic in Structural Equation Models

May 2008 · 1,718 Reads

This article is an empirical evaluation of the choice of fixed cutoff points in assessing the root mean square error of approximation (RMSEA) test statistic as a measure of goodness-of-fit in Structural Equation Models. Using simulation data, the authors first examine whether there is any empirical evidence for the use of a universal cutoff, and then compare the practice of using the point estimate of the RMSEA alone versus that of using it jointly with its related confidence interval. The results of the study demonstrate that there is little empirical support for the use of .05 or any other value as universal cutoff values to determine adequate model fit, regardless of whether the point estimate is used alone or jointly with the confidence interval. The authors' analyses suggest that to achieve a certain level of power or Type I error rate, the choice of cutoff values depends on model specifications, degrees of freedom, and sample size.
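
For reference, the RMSEA point estimate and its 90 percent confidence interval follow standard formulas based on the noncentral chi-square distribution; the sketch below implements those textbook formulas with illustrative inputs and is not code from the study.

```python
# Standard RMSEA point estimate and 90% confidence interval from a model chi-square.
# Numeric inputs are illustrative.
import numpy as np
from scipy.stats import ncx2, chi2 as chi2_dist
from scipy.optimize import brentq

def rmsea_point(chi2, df, n):
    return np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def rmsea_ci(chi2, df, n, level=0.90):
    """Confidence limits via the noncentrality parameter of the chi-square distribution."""
    lo_p, hi_p = (1 + level) / 2, (1 - level) / 2   # 0.95 and 0.05 for a 90% interval

    def bound(target):
        # Noncentrality lambda with P(X <= chi2) = target; 0 if that level is unattainable.
        if chi2_dist.cdf(chi2, df) < target:
            return 0.0
        return brentq(lambda nc: ncx2.cdf(chi2, df, nc) - target, 1e-9, 10 * chi2 + 100)

    scale = df * (n - 1)
    return np.sqrt(bound(lo_p) / scale), np.sqrt(bound(hi_p) / scale)

print(rmsea_point(85.3, 60, 250), rmsea_ci(85.3, 60, 250))
```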

Tracing the Effects of Hurricane Katrina on the Population of New Orleans: The Displaced New Orleans Residents Pilot Study

August 2009 · 51 Reads

The Displaced New Orleans Residents Pilot Study was designed to examine the current location, well-being, and plans of people who lived in the City of New Orleans when Hurricane Katrina struck on 29 August 2005. The study is based on a representative sample of pre-Katrina dwellings in the city. Respondents were administered a short paper-and-pencil interview by mail, by telephone, or in person. The pilot study was fielded in the fall of 2006, approximately one year after Hurricane Katrina. In this paper, we describe the motivation for the pilot study, outline its design, and describe the fieldwork results using a set of fieldwork outcome rates and multivariate logistic models. We end with a discussion of the lessons learned from the pilot study for future studies of the effects of Hurricane Katrina on the population of New Orleans. The results point to the challenges and opportunities of studying this unique population.

Table 13. Average absolute difference between empirical and formula-based standard errors of MI estimates across 15 conditions (5 missing data proportions × 3 sample sizes).
ML Versus MI for Missing Data With Violation of Distribution Conditions

November 2012 · 315 Reads

Normal-distribution-based maximum likelihood (ML) and multiple imputation (MI) are the two major procedures for missing data analysis. This article compares the two procedures with respect to bias and efficiency of parameter estimates. It also compares formula-based standard errors (SEs) for each procedure against the corresponding empirical SEs. The results indicate that parameter estimates by MI tend to be less efficient than those by ML, and the estimates of variance-covariance parameters by MI are also more biased. In particular, when the population for the observed variables possesses heavy tails, estimates of variance-covariance parameters by MI may contain severe bias even at relatively large sample sizes. Although they perform much better, ML parameter estimates may also contain substantial bias at smaller sample sizes. The results also indicate that, when the underlying population is close to normally distributed, SEs based on the sandwich-type covariance matrix and those based on the observed information matrix are very comparable to empirical SEs with either ML or MI. When the underlying distribution has heavier tails, SEs based on the sandwich-type covariance matrix for ML estimates are more reliable than those based on the observed information matrix. Both empirical results and analysis show that neither SEs based on the observed information matrix nor those based on the sandwich-type covariance matrix can provide consistent SEs in MI. Thus, ML is preferable to MI in practice, although parameter estimates by MI might still be consistent.

Decomposing the Change in the Wage Gap Between White and Black Men Over Time, 1980-2005: An Extension of the Blinder-Oaxaca Decomposition Method

June 2010 · 135 Reads

This article extends the Blinder-Oaxaca decomposition method to the decomposition of changes in the wage gap between white and black men over time. The previously implemented technique, in which the contributions of two decomposition components are estimated by subtracting those at time 0 from the corresponding ones at time 1, can yield an untenable conclusion about the extent to which the contributions of the coefficient and endowment effects account for changes in the wage gap over time. This article presents a modified version of Smith and Welch's (1989) decomposition method through which the sources of the change over time are decomposed into five components. The extents to which the education, age, region, metro residence, and marital status variables contribute to the rising racial wage gap between white and black men from 1980 to 2005 are estimated using the five-component detailed decomposition method and are contrasted with the results of the old simple subtraction decomposition technique. In conclusion, this article shows that changes in the racial wage gap between 1980 and 2005 result from many contradicting forces and cannot be reduced to one explanation.
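
The single-period two-fold Blinder-Oaxaca decomposition that the five-component extension builds on can be sketched in a few lines; the data, covariates, and coefficients below are simulated, not the wage data analyzed in the article.

```python
# Sketch of a basic two-fold Blinder-Oaxaca decomposition of a mean wage gap at one
# point in time (the building block the article extends over time). Data are simulated.
import numpy as np

rng = np.random.default_rng(0)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def design(n, mean_educ):
    educ = rng.normal(mean_educ, 2.0, n)
    age = rng.uniform(25, 55, n)
    return np.column_stack([np.ones(n), educ, age])

X_w = design(4000, 13.5)                     # group 1 (illustrative)
X_b = design(1000, 12.5)                     # group 2 (illustrative)
y_w = X_w @ np.array([1.0, 0.08, 0.01]) + rng.normal(0, 0.4, len(X_w))
y_b = X_b @ np.array([0.8, 0.07, 0.01]) + rng.normal(0, 0.4, len(X_b))

beta_w, beta_b = ols(X_w, y_w), ols(X_b, y_b)
xbar_w, xbar_b = X_w.mean(axis=0), X_b.mean(axis=0)

gap = y_w.mean() - y_b.mean()
endowments = (xbar_w - xbar_b) @ beta_b      # part of the gap due to characteristics
coefficients = xbar_w @ (beta_w - beta_b)    # part of the gap due to different returns
print(gap, endowments + coefficients)        # the two components sum to the gap
```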

Estimating Causal Effects With Matching Methods in the Presence and Absence of Bias Cancellation

January 2001 · 30 Reads

This paper explores the implications of possible bias cancellation using Rubin-style matching methods with complete and incomplete data. After reviewing the naïve causal estimator and the approaches of Heckman and Rubin to the causal estimation problem, we show how missing data can complicate the estimation of average causal effects in different ways, depending upon the nature of the missing-data mechanism. While, contrary to published assertions in the literature, bias cancellation does not generally occur when the multivariate distribution of the errors is symmetric, bias cancellation has been observed to occur for the case where selection into training is the treatment variable and earnings is the outcome variable. A substantive rationale for bias cancellation is offered, which conceptualizes it as the result of a mixture process based on two distinct individual-level decision-making models. While the general properties are unknown, the existence of bias cancellation appears to reduce the average bias in both OLS and matching methods relative to the symmetric distribution case. Analysis of simulated data under a set of different scenarios suggests that matching methods do better than OLS in reducing the portion of bias that comes purely from the error distribution (i.e., from “selection on unobservables”). This advantage is often found for the incomplete data case as well. Matching appears to offer no advantage over OLS in reducing the impact of bias due purely to selection on unobservable variables when the error variables are generated by standard multivariate normal distributions, which lack the bias-cancellation property.

An Extended Model Comparison Framework for Covariance and Mean Structure Models, Accommodating Multiple Groups and Latent Mixtures

May 2011 · 24 Reads

The model comparison framework of Levy and Hancock for covariance and mean structure models is extended to treat multiple-group models, both in cases in which group membership is known and in those in which it is unknown (i.e., finite mixtures). The framework addresses questions of distinguishability as well as difference in fit of the models with respect to data, first by determining the nature of the models’ relation in terms of the families of distributions that constitute the models and then by conducting the appropriate statistical tests. In the case of latent mixtures of groups, the standard likelihood ratio theory does not apply, and a bootstrapping approach is used to facilitate the tests. Illustrations demonstrate the procedures.

Group-Based Trajectory Modeling Extended to Account for Nonrandom Participant Attrition

May 2011 · 1,430 Reads

This article reports on an extension of group-based trajectory modeling to address nonrandom participant attrition or truncation due to death that varies across trajectory groups. The effects of the model extension are explored in both simulated and real data. The analyses of simulated data establish that estimates of trajectory group size as measured by group membership probabilities can be badly biased by differential attrition rates across groups if the groups are initially not well separated. Differential attrition rates also imply that group sizes will change over time, which in turn has important implications for using the model parameter estimates to make population-level projections. Analyses of longitudinal data on disability levels in a sample of very elderly individuals support both of these conclusions.

Intrinsic Estimators as Constrained Estimators in Age-Period-Cohort Accounting Models

August 2011 · 38 Reads

If a researcher wants to estimate the individual age, period, and cohort coefficients in an age-period-cohort (APC) model, the method of choice is constrained regression, which includes the intrinsic estimator (IE) recently introduced by Yang and colleagues. To better understand these constrained models, the author shows algebraically how each constraint is associated with a specific generalized inverse that is associated with a particular solution vector that (when the model is just identified under the constraint) produces the least squares solution to the APC model. The author then discusses the geometry of constrained estimators in terms of solutions being orthogonal to constraints, solutions to various constraints all lying on a single line in multidimensional space, the distance on that line between various solutions, and the crucial role of the null vector. This provides insight into what characteristics all constrained estimators share and what is unique about the IE. The first part of the article focuses on constrained estimators in general (including the IE), and the latter part compares and contrasts the properties of traditionally constrained APC estimators and the IE. The author concludes with some cautions and suggestions for researchers using and interpreting constrained estimators.
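
The geometry described here, with all constrained solutions lying on a single line generated by the null vector and the IE being the particular solution orthogonal to it, can be illustrated numerically. The toy design below (3 age groups by 3 periods, effect coding, arbitrary outcome) is an assumption for illustration, not the author's derivation.

```python
# Toy illustration (not the author's code): an APC accounting design matrix has a
# one-dimensional null space, every constrained least-squares solution is the same
# point plus a multiple of the null vector, and the intrinsic estimator (IE) is the
# particular solution orthogonal to that null vector.
import numpy as np

A, P = 3, 3                                   # 3 age groups x 3 periods -> 5 cohorts
C = A + P - 1

def effect_code(level, n_levels):
    row = np.zeros(n_levels - 1)
    if level < n_levels - 1:
        row[level] = 1.0
    else:
        row[:] = -1.0
    return row

rows = []
for a in range(A):
    for p in range(P):
        c = p - a + (A - 1)
        rows.append(np.concatenate(([1.0], effect_code(a, A),
                                    effect_code(p, P), effect_code(c, C))))
X = np.array(rows)

print("rank:", np.linalg.matrix_rank(X), "of", X.shape[1])   # rank deficiency of one
_, _, vt = np.linalg.svd(X)
null = vt[-1]                                 # the null vector of the design

y = np.arange(X.shape[0], dtype=float)        # arbitrary outcome vector for illustration
b_ie = np.linalg.pinv(X) @ y                  # minimum-norm solution = intrinsic estimator
b_other = b_ie + 2.5 * null                   # another constrained solution, same fit
print(np.allclose(X @ b_ie, X @ b_other))     # identical fitted values
print(round(float(null @ b_ie), 10))          # IE is orthogonal to the null vector: 0
```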

Figures: average response rates for different combinations of γ1 and γ2; estimated means for the unadjusted and adjusted respondent samples; relative ratio of the root mean square error of the adjusted to the unadjusted respondent mean (n = 2,500, f = 0).
Multiple Auxiliary Variables in Nonresponse Adjustment

April 2013 · 341 Reads

Prior work has shown that effective survey nonresponse adjustment variables should be highly correlated with both the propensity to respond to a survey and the survey variables of interest. In practice, propensity models are often used for nonresponse adjustment with multiple auxiliary variables as predictors. These auxiliary variables may be positively or negatively associated with survey participation, they may be correlated with each other, and they can have positive or negative relationships with the survey variables. Yet the consequences of these conditions for nonresponse adjustment are not known to survey practitioners. Simulations are used here to examine the effects of multiple auxiliary variables with opposite relationships with survey participation and the survey variables. The results show that the bias and mean square error of adjusted respondent means differ substantially when the predictors have relationships in the same direction versus opposite directions with either the response propensity or the survey variables. Implications for nonresponse adjustment and responsive designs are discussed.

Tables: percentage relative biases, over 10,000 samples, of propensity-weighted estimates and the general regression (GREG) estimator for Volunteer sample sizes of 250, 500, and 1,000 and for Volunteer and Reference sample sizes of 500, where the probability of volunteering depends on covariates and analysis variables.
Estimating Propensity Adjustments for Volunteer Web Surveys

January 2011 · 1,423 Reads

Panels of persons who volunteer to participate in Web surveys are used to make estimates for entire populations, including persons who have no access to the Internet. One method of adjusting a volunteer sample to attempt to make it representative of a larger population involves randomly selecting a reference sample from the larger population. The act of volunteering is treated as a quasi-random process where each person has some probability of volunteering. One option for computing weights for the volunteers is to combine the reference sample and Web volunteers and estimate probabilities of being a Web volunteer via propensity modeling. There are several options for using the estimated propensities to estimate population quantities. Careful analysis to justify these methods is lacking. The goals of this article are (a) to identify the assumptions and techniques of estimation that will lead to correct inference under the quasi-random approach, (b) to explore whether methods used in practice are biased, and (c) to illustrate the performance of some estimators that use estimated propensities. Two of our main findings are (a) that estimators of means based on estimates of propensity models that do not use the weights associated with the reference sample are biased even when the probability of volunteering is correctly modeled and (b) if the probability of volunteering is associated with analysis variables collected in the volunteer survey, propensity modeling does not correct bias.
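
A minimal sketch of the quasi-random adjustment under evaluation, assuming a simple logistic propensity model, an equal-probability reference sample, and simulated data; it is a generic illustration, not one of the article's specific estimators.

```python
# Minimal sketch of a propensity-style adjustment for a volunteer web panel using a
# probability reference sample. Data are simulated; the (1 - p)/p weighting is one
# common choice, not the article's estimator.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Covariate (standardized) and an analysis variable related to it
x_ref = rng.normal(0.0, 1, 2000)                    # probability-based reference sample
x_vol = rng.normal(0.8, 1, 2000)                    # volunteers over-represent high x
y_vol = 1.0 + 0.5 * x_vol + rng.normal(0, 1, 2000)  # outcome observed for volunteers only

# Fit P(volunteer | x) on the combined file
X = np.concatenate([x_ref, x_vol]).reshape(-1, 1)
z = np.concatenate([np.zeros_like(x_ref), np.ones_like(x_vol)])
p = LogisticRegression().fit(X, z).predict_proba(x_vol.reshape(-1, 1))[:, 1]

w = (1 - p) / p                                     # weight volunteers toward the reference
print("unadjusted volunteer mean:", y_vol.mean())
print("propensity-adjusted mean: ", np.average(y_vol, weights=w))
print("reference-implied target:  ", 1.0 + 0.5 * x_ref.mean())
```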

Figures and table: partition structures for the Newcomb final week and the McKinney data (K = 4); simulation results summarized by the levels of each feature.
Two Algorithms for Relaxed Structural Balance Partitioning: Linking Theory, Models, and Data to Understand Social Network Phenomena

February 2011 · 364 Reads

Understanding social phenomena with the help of mathematical models requires a coherent combination of theory, models, and data together with using valid data analytic methods. The study of social networks through the use of mathematical models is no exception. The intuitions of structural balance were formalized and led to a pair of remarkable theorems giving the nature of partition structures for balanced signed networks. Algorithms for partitioning signed networks, informed by these formal results, were developed and applied empirically. More recently, “structural balance” was generalized to “relaxed structural balance,” and a modified partitioning algorithm was proposed. Given the critical interplay of theory, models, and data, it is important that methods for the partitioning of signed networks in terms of the relaxed structural balance model are appropriate. The authors consider two algorithms for establishing partitions of signed networks in terms of relaxed structural balance. One is an older heuristic relocation algorithm, and the other is a new exact solution procedure. The former can be used both inductively and deductively. When used deductively, this requires some prespecification incorporating substantive insights. The new branch-and-bound algorithm is used inductively and requires no prespecification of an image matrix in terms of ideal blocks. Both procedures are demonstrated using several examples from the literature, and their contributions are discussed. Together, the two algorithms provide a sound foundation for partitioning signed networks and yield optimal partitions. Issues of network size and density are considered in terms of their consequences for algorithm performance.

Body Mass Index and Physical Attractiveness: Evidence From a Combination Image-Alteration/List Experiment

January 2011 · 74 Reads

The list experiment is used to detect latent beliefs when researchers suspect a substantial degree of social desirability bias from respondents. This methodology has been used in areas ranging from racial attitudes to political preferences. Meanwhile, social psychologists interested in the salience of physical attributes to social behavior have provided respondents with experimentally altered photographs to test the influence of particular visual cues or traits on social evaluations. This experimental research has examined the effect of skin blemishes, hairlessness, and particular racial attributes on respondents' evaluation of these photographs. While this approach isolates variation in particular visual characteristics from other visual aspects that tend to covary with the traits in question, it fails to adequately deal with social desirability bias. This shortcoming is particularly important when concerned with potentially charged visual cues, such as body mass index (BMI). The present article describes a novel experiment that combines the digital alteration of photographs with the list experiment approach. When tested on a nationally representative sample of Internet respondents, results suggest that when shown photographs of women, male respondents report differences in levels of attractiveness based on the perceived BMI of the photographed confederate. Overweight individuals are less likely than their normal weight peers to report different levels of attractiveness between high-BMI and low-BMI photographs. Knowing that evaluations of attractiveness influence labor market outcomes, the findings are particularly salient in a society with rising incidence of obesity.

Figures: relative efficiency of the proposed alternative randomized response model versus the coefficient of variation of the scrambling variable and of the sensitive variable, and relative efficiency with respect to the Bar-Lev, Bobovitch, and Boukai (BBB) model.
An Alternative to the Bar-Lev, Bobovitch, and Boukai Randomized Response Model

October 2010 · 88 Reads

In this article, an alternative randomized response model is proposed. The proposed model is found to be more efficient than the randomized response model studied by Bar-Lev, Bobovitch, and Boukai (2004). The relative efficiency of the proposed model is studied with respect to the Bar-Lev et al. (2004) model under various situations.

Testing for Independence of Irrelevant Alternatives: Some Empirical Results

February 1994 · 834 Reads

In a recent article, Zhang and Hoffman discuss the use of discrete choice logit models in sociological research. In the present article, the authors estimate a multinomial logit model of U.K. Magistrates Courts sentencing using a data set collected by the National Association for the Care and Resettlement of Offenders (NACRO) and test the independence of irrelevant alternatives (IIA) property using six tests. Conducting the tests with the appropriate large sample critical values, the authors find that the acceptance or rejection of IIA depends both on which test and which variant of a given test is used. The authors then use simulation techniques to assess the size and power performance of the tests. The empirical example is revisited with the inferences performed using empirical critical values obtained by simulation, and the resultant inferences are compared. The results show that empirical workers should exercise caution when testing for IIA.




Tables: characteristics and fit statistics of the estimated multigroup confirmatory factor analysis (MCFA), item response theory (IRT), and latent class factor analysis (LCFA) models for measurement equivalence; mean and standard deviation of the number of samples in which inequivalence was detected, by form of inequivalence, by fit statistic, and by number of inequivalent items per scale.
Measurement Equivalence of Ordinal Items: A Comparison of Factor Analytic, Item Response Theory, and Latent Class Approaches

May 2011 · 498 Reads

Three distinctive methods of assessing measurement equivalence of ordinal items, namely, confirmatory factor analysis, differential item functioning using item response theory, and latent class factor analysis, make different modeling assumptions and adopt different procedures. Simulation data are used to compare the performance of these three approaches in detecting the sources of measurement inequivalence. For this purpose, the authors simulated Likert-type data using two nonlinear models, one with categorical and one with continuous latent variables. Inequivalence was set up in the slope parameters (loadings) as well as in the item intercept parameters in a form resembling agreement and extreme response styles. Results indicate that the item response theory and latent class factor models can relatively accurately detect and locate inequivalence in the intercept and slope parameters both at the scale and the item levels. Confirmatory factor analysis performs well when inequivalence is located in the slope parameters but wrongfully indicates inequivalence in the slope parameters when inequivalence is located in the intercept parameters. Influences of sample size, number of inequivalent items in a scale, and model fit criteria on the performance of the three methods are also analyzed.

Visual Sociology Reframed: An Analytical Synthesis and Discussion of Visual Methods in Social and Cultural Research

June 2010 · 4,726 Reads

Visual research is still a rather dispersed and ill-defined domain within the social sciences. Despite a heightened interest in using visuals in research, efforts toward a more unified conceptual and methodological framework for dealing vigilantly with the specifics of this (relatively) new way of scholarly thinking and doing remain sparse and limited in scope. In this article, the author proposes a more encompassing and refined analytical framework for visual methods of research. The "Integrated Framework" tries to account for the great variety within each of the currently discerned types or methods. It does so by moving beyond the more or less arbitrary and often very hybridly defined modes and techniques, with a clear focus on what connects or transcends them. The second part of the article discusses a number of critical issues that have been raised while unfolding the framework. These issues continue to pose a challenge to a more visual social science, but they can be turned into opportunities for advancement when dealt with appropriately.

Latent Markov Model for Analyzing Temporal Configuration for Violence Profiles and Trajectories in a Sample of Batterers

October 2010 · 26 Reads

In this article, the authors demonstrate the utility of an extended latent Markov model for analyzing temporal configurations in the behaviors of a sample of 550 domestic violence batterers. Domestic violence research indicates that victims experience a constellation of abusive behaviors rather than a single type of violent outcome. There is also evidence that observed behaviors are highly dynamic, with batterers cycling back and forth between periods of no abuse and violent or controlling behavior. These issues pose methodological challenges for social scientists. The extended latent Markov method uses multiple indicators to characterize batterer behaviors and relates the trajectories of violent states to predictors of abuse at baseline. The authors discuss both methodological refinements of the latent Markov models and policy implications of the data analysis.

Figure and tables: example of grouped versus interleafed questionnaire format; filter responses regressed on section placement for the three randomly ordered sections (clothing, leisure, and voting); mean percentage of don't knows or refusals to follow-up items by filter format.
The Effects of Asking Filter Questions in Interleafed Versus Grouped Format

January 2011 · 1,117 Reads

When filter questions are asked to determine respondent eligibility for follow-up items, they are administered either interleafed (follow-up items immediately after the relevant filter) or grouped (follow-up items after multiple filters). Experiments with mental health items have found the interleafed form produces fewer yeses to later filters than the grouped form. Given the sensitivity of mental health, it is unclear whether this is due to respondent desire to avoid sensitive issues or simply the desire to shorten the interview. The absence of validation data in these studies also means the nature of the measurement error associated with the filter types is unknown. We conducted an experiment using mainly nonsensitive topics of varying cognitive burden with a sample that allowed validation of some items. Filter format generally had an effect, which grew as the number of filters increased and was larger when the follow-up questions were more difficult. Surprisingly, there was no evidence that measurement error for filters was reduced in the grouped version; moreover, missing data for follow-up items was increased in that version.

Assessing Group Differences in Estimated Baseline Survivor Functions From Cox Proportional Hazards Models

October 2010 · 15 Reads

The author discusses the general problem of evaluating differences in adjusted survivor functions and develops a heuristic approach to generate the expected events that would occur under a Cox proportional hazards model. Differences in the resulting expected survivor distributions can be tested using generalized log rank tests. This method should prove useful for making other kinds of comparisons and generating adjusted life tables. The author also discusses alternative specifications of the classical Cox model that allow time-varying effects and thus permit a more direct assessment of group differences at various points in time. He implements recently developed semiparametric approaches for estimating time-varying effects, which permit statistical tests of group difference in effects as well as tests of time-invariant effects. He shows that these approaches can provide insight into the nature of time-varying effects and can help reveal the temporal dynamic of group differences.

Assessing the Robustness of Crisp-Set and Fuzzy-Set QCA Results

May 2011 · 494 Reads

Configurational comparative methods constitute promising methodological tools that narrow the gap between variable-oriented and case-oriented research. Their infancy, however, means that the limits and advantages of these techniques are not clear. Tests on the sensitivity of qualitative comparative analysis (QCA) results have been sparse in previous empirical studies, and so has the provision of guidelines for doing this. Therefore this article uses data from a textbook example to discuss and illustrate various robustness checks of results based on the employment of crisp-set QCA and fuzzy-set QCA. In doing so, it focuses on three issues: the calibration of raw data into set-membership values, the frequency of cases linked to the configurations, and the choice of consistency thresholds. The study emphasizes that robustness tests, using systematic procedures, should be regarded as an important, and maybe even indispensable, analytical step in configurational comparative analysis.

Nonparametric Tests of Panel Conditioning and Attrition Bias in Panel Surveys

January 2011 · 66 Reads

Over the past decades there has been an increasing use of panel surveys at the household or individual level. Panel data have important advantages compared to independent cross sections, but also two potential drawbacks: attrition bias and panel conditioning effects. Attrition bias arises if dropping out of the panel is correlated with a variable of interest. Panel conditioning arises if responses are influenced by participation in the previous wave(s); the experience of the previous interview(s) may affect the answers to questions on the same topic, such that these answers differ systematically from those of respondents interviewed for the first time. In this study the authors discuss how to disentangle attrition and panel conditioning effects and develop tests for panel conditioning allowing for nonrandom attrition. First, the authors consider a nonparametric approach with assumptions on the sample design only, leading to interval identification of the measures for the attrition and panel conditioning effects. Second, the authors introduce additional assumptions concerning the attrition process, which lead to point estimates and standard errors for both the attrition bias and the panel conditioning effect. The authors illustrate their method on a variety of repeated questions in two household panels. The authors find significant panel conditioning effects in knowledge questions, but not in other types of questions. The examples show that the bounds can be informative if the attrition rate is not too high. In most but not all of the examples, point estimates of the panel conditioning effect are similar for different additional assumptions on the attrition process.

New Life for Old Ideas: The "Second Wave" of Sequence Analysis Bringing the "Course" Back Into the Life Course

April 2010 · 1,342 Reads

In this article the authors draw attention to the most recent and promising developments of sequence analysis. Taking methodological developments in life course sociology as the starting point, the authors detail the complementary strengths of sequence analysis in this field. They argue that recent advances in sequence analysis were developed in response to criticism of the original work, particularly optimal matching analysis. This debate arose over the past two decades and culminated in the 2000 exchange in Sociological Methods & Research. The debate triggered a "second wave" of sequence techniques that led to new technical implementations of old ideas in sequence analysis. The authors bring these new technical approaches together, demonstrate selected advances with synthetic example data, and show how they conceptually contribute to life course research. This article demonstrates that in less than a decade, the field has made much progress toward fulfilling the prediction that Andrew Abbott made in 2000, that "anybody who believes that pattern search techniques are not going to be basic to social sciences over the next 25 years is going to be very much surprised" (p. 75).

Table: classification of the target population.
A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark

April 2010 · 305 Reads

In recent years, social scientists have increasingly turned to matching as a method for drawing causal inferences from observational data. Matching compares those who receive a treatment to those with similar background attributes who do not receive a treatment. Researchers who use matching frequently tout its ability to reduce bias, particularly when applied to data sets that contain extensive background information. Drawing on a randomized voter mobilization experiment, the authors compare estimates generated by matching to an experimental benchmark. The enormous sample size enables the authors to exactly match each treated subject to 40 untreated subjects. Matching greatly exaggerates the effectiveness of preelection phone calls encouraging voter participation. Moreover, it can produce nonsensical results: Matching suggests that another pre-election phone call that encouraged people to wear their seat belts also generated huge increases in voter turnout. This illustration suggests that caution is warranted when applying matching estimators to observational data, particularly when one is uncertain about the potential for biased inference.
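
For readers unfamiliar with the estimator being benchmarked, the sketch below implements generic one-to-one nearest-neighbor covariate matching for the effect of treatment on the treated on simulated data; it is not the authors' voter mobilization analysis and matches on far fewer covariates.

```python
# Generic sketch of nearest-neighbor covariate matching for the average effect of
# treatment on the treated (ATT); simulated data, not the voter mobilization study.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x = rng.normal(size=(n, 3))                               # background covariates
p_treat = 1 / (1 + np.exp(-(x[:, 0] + 0.5 * x[:, 1])))    # selection on observables
d = rng.binomial(1, p_treat)
y = 0.3 * d + x @ np.array([0.8, 0.4, 0.0]) + rng.normal(0, 1, n)   # true effect 0.3

treated, control = np.where(d == 1)[0], np.where(d == 0)[0]
effects = []
for i in treated:
    dist = np.sum((x[control] - x[i]) ** 2, axis=1)       # Euclidean distance in x
    j = control[np.argmin(dist)]                          # single nearest untreated match
    effects.append(y[i] - y[j])

print("naive difference in means:", y[d == 1].mean() - y[d == 0].mean())
print("matching estimate of ATT:", np.mean(effects))
```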

Figures and table: root mean square error of observed-data, multiple imputation (MI), and single imputation (SI) estimators of the mean, variance, and standard deviation of a normal variable, as the number of observations increases (with D = 5 imputations) and as the number of imputations D increases (with 20 observed and 20 missing values); to standardize the results, the relevant value has been set to 1.
The Bias and Efficiency of Incomplete-Data Estimators in Small Univariate Normal Samples

April 2012 · 95 Reads

Widely used methods for analyzing missing data can be biased in small samples. To understand these biases, we evaluate in detail the situation where a small univariate normal sample, with values missing at random, is analyzed using either observed-data maximum likelihood (ML) or multiple imputation (MI). We evaluate two types of MI: the usual Bayesian approach, which we call posterior draw (PD) imputation, and a little-used alternative, which we call ML imputation, in which values are imputed conditionally on an ML estimate. We find that observed-data ML is more efficient and has lower mean squared error than either type of MI. Between the two types of MI, ML imputation is more efficient than PD imputation, and ML imputation also has less potential for bias in small samples. The bias and efficiency of PD imputation can be improved by a change of prior.

Cramer-Rao Lower Bound of Variance in Randomized Response Sampling

August 2011 · 115 Reads

In this note, the Cramer-Rao lower bound on the variance of estimators based on two decks of cards in randomized response sampling is derived. The lower bound is compared with the recent estimator proposed by Odumade and Singh at equal protection of respondents. Real face-to-face interview data collected using two decks of cards are analyzed and the results discussed.

My Brilliant Career: Characterizing the Early Labor Market Trajectories of British Women From Generation X

April 2010 · 48 Reads

This article uses longitudinal data from the British Cohort Study to examine the early labor market trajectories (the careers) of more than 5,000 women aged 16 to 29 years. Conventional event history approaches focus on particular transitions, the return to work after childbirth, for example, whereas the authors treat female careers more holistically, using sequence methods and cluster analysis to arrive at a rich but readily interpretable description of the data. The authors' typology presents a fuller picture of the underlying heterogeneity of female career paths that may not be revealed by more conventional transition-focused methods. Furthermore, the authors contribute to the small but growing literature on sequence analysis of female labor force participation by using their typology to show how careers are related to family background and school experiences.

Tables: the dependent interviewing question protocol on current employment characteristics ("Last time we interviewed you, on <INTDATE>, you said your job was <OCCUP>. Are you still in that same occupation?"); models of respondent cognition problems or conversational behaviour.
When Change Matters: An Analysis of Survey Interaction in Dependent Interviewing on the British Household Panel Study

May 2011 · 81 Reads

The authors examine how questionnaire structure affects survey interaction in the context of dependent interviewing (DI). DI is widely used in panel surveys to reduce observed spurious change in respondent circumstances. Although a growing literature generally finds beneficial measurement properties, little is known about how DI functions in interviews. The authors systematically observed survey interaction using behavior coding and analyzed an application of DI to obtain respondent employment characteristics. The authors found respondents indicated change in circumstances through a number of verbal machinations, including mismatch answers and explanations. Assessing whether these behaviors influenced subsequent question administration, the authors found qualitative evidence that the information disclosed when negating a DI question leads to subsequent interviewing errors. Quantitative analyses supported this evidence, suggesting that standardized interviewing deteriorates as respondents struggle to identify change in their circumstances. This analysis suggests that the reliability of detail about changed circumstances may not be improved using DI.

Figures: plots of C(x) against C(x, t_x) and of C(x, t_x) against C(x′); smoothed distributions of C(x, t_x) from 5,000 bootstrap replicates.
Complexity of Categorical Time Series

April 2010 · 251 Reads

Categorical time series, covering comparable time spans, are often quite different in a number of aspects: the number of distinct states, the number of transitions, and the distribution of durations over states. Each of these aspects contributes to an aggregate property of such series that is called complexity. Among sociologists and demographers, complexity is believed to systematically differ between groups as a result of social structure or social change. Such groups differ in, for example, age, gender, or status. The author proposes quantifications of complexity, based upon the number of distinct subsequences in combination with, in case of associated durations, the variance of these durations. A simple algorithm to compute these coefficients is provided and some of the statistical properties of the coefficients are investigated in an application to family formation histories of young American females.
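
A toy quantification in the same spirit, counting distinct subsequences with a standard dynamic program and penalizing by the variance of spell durations; the combination rule below is an illustrative assumption, not the article's coefficient.

```python
# Toy sketch: combine the number of distinct subsequences of a state sequence
# (a standard dynamic program) with the variance of spell durations. The specific
# combination used here is illustrative, not the article's measure.
import numpy as np
from itertools import groupby

def distinct_subsequences(states):
    """Count distinct (not necessarily contiguous) subsequences, excluding the empty one."""
    count = 1                          # running count including the empty subsequence
    last = {}
    for s in states:
        prev = count
        count = 2 * count - last.get(s, 0)
        last[s] = prev
    return count - 1

def spell_durations(states):
    return np.array([len(list(g)) for _, g in groupby(states)])

def toy_complexity(states):
    phi = np.log(distinct_subsequences(states))   # subsequence richness
    dur = spell_durations(states)
    penalty = 1.0 / (1.0 + dur.var())             # unequal, long spells lower the score
    return phi * penalty

print(toy_complexity("SSSSSMMMMM"))   # two long spells: lower complexity
print(toy_complexity("SMSMSMSMSM"))   # many transitions: higher complexity
```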

Proportional Reduction of Prediction Error in Cross-Classified Random Effects Models

October 2010 · 65 Reads

As an extension of hierarchical linear models (HLMs), cross-classified random effects models (CCREMs) are used for analyzing multilevel data that do not have strictly hierarchical structures. Proportional reduction in prediction error, a multilevel version of the R2 in ordinary multiple regression, measures the predictive ability of a model and is useful in model selection. However, such a measure is not yet available for CCREMs. Using a two-level random-intercept CCREM, the authors have investigated how the estimated variance components change when predictors are added and have extended the measures of proportional reduction in prediction error from HLMs to CCREMs. The extended measures are generally unbiased for both balanced and unbalanced designs. An example is provided to illustrate the computation and interpretation of these measures in CCREMs.
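
The basic proportional-reduction logic being extended can be shown with plain arithmetic on variance components; the numbers and component labels below are made up for illustration, not taken from the article.

```python
# Toy arithmetic for a proportional-reduction-in-prediction-error measure: compare the
# variance components of an unconditional (null) model with those of a model that adds
# predictors. Variance estimates and labels are illustrative only.
null_model = {"residual": 1.00, "row_effect": 0.30, "column_effect": 0.20}
conditional = {"residual": 0.70, "row_effect": 0.22, "column_effect": 0.18}

def proportional_reduction(null_var, cond_var):
    return (null_var - cond_var) / null_var

# Reduction in level-1 (within-cell) prediction error
print(proportional_reduction(null_model["residual"], conditional["residual"]))

# Reduction in total prediction error across all variance components
print(proportional_reduction(sum(null_model.values()), sum(conditional.values())))
```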

Table 4. Comparison of parameter estimates and standard errors (in parentheses).
Fitting Log-Linear Models to Contingency Tables From Surveys With Complex Sampling Designs: An Investigation of the Clogg-Eliason Approach

July 2010 · 179 Reads

Clogg and Eliason (1987) proposed a simple method for taking account of survey weights when fitting log-linear models to contingency tables. This article investigates the properties of this method. A rationale is provided for the method when the weights are constant within the cells of the table. For more general cases, however, it is shown that the standard errors produced by the method are invalid, contrary to claims in the literature. The method is compared to the pseudo maximum likelihood method both theoretically and through an empirical study of social mobility relating daughter's class to father's class using survey data from France. The method of Clogg and Eliason is found to underestimate standard errors systematically. The article concludes by recommending against the use of this method, despite its simplicity. The limitations of the method may be overcome by using the pseudo maximum likelihood method.

On the Intrinsic Estimator and Constrained Estimators in Age-Period-Cohort Models

August 2011 · 95 Reads

In studying temporally ordered rates of events, epidemiologists, demographers, and social scientists often find it useful to distinguish three different temporal dimensions, namely, age (age of the participants involved), time period (the calendar year or other temporal period for recording the events of interest), and cohort (birth cohort or generation). Age-period-cohort (APC) analysis aims to analyze age-year-specific archived event rates to capture temporal trends in the events under investigation. However, in the context of tables of rates, the well-known relationship among these three factors, Period - Age = Cohort, makes the parameter estimation of the APC multiple classification model difficult. The identification problem of the parameter estimation has been studied since the 1970s and still remains in debate. Recent developments in this regard include the intrinsic estimator (IE) method, the autoregressive cohort model, the age-period-cohort-characteristic (APCC) model, the regression splines model, the smoothing cohort model, and the hierarchical APC model. O'Brien (2011; pp. 419-452, this issue) makes a further contribution in studying constrained estimators, particularly the IE, in the APC models. The authors, however, have important disagreements with O'Brien as to what the statistical properties of the IE are and how the estimates from the IE should be interpreted. The authors point out these disagreements to conclude the article.

Comparisons of Tobit, Linear, and Poisson-Gamma Regression Models: An Application of Time Use Data

August 2011 · 340 Reads

Time use data (TUD) are distinctive, being episodic in nature and consisting of both continuous and discrete (exact zeros) values. TUD are non-negative and generally right skewed. To analyze such data, Tobit and, to a lesser extent, linear regression models are often used. Tobit models assume the zeros represent censored values of an underlying normally distributed latent variable that theoretically includes negative values. Both the linear regression and Tobit models have normality as a key assumption. The Poisson-gamma distribution is a distribution with both a point mass at zero (corresponding to zero time spent on a given activity) and a continuous component. Using generalized linear models, TUD can be modeled utilizing the Poisson-gamma distribution. Using TUD, the Tobit and linear regression models are compared to the Poisson-gamma model with respect to the interpretation of the model, the model fit (analysis of residuals), and model performance through the use of a simulated data experiment. The Poisson-gamma is found to be theoretically and empirically more sound in many circumstances.

Tables: consent rates by consent type; propensity to consent by BHPS respondent, interview, and interviewer characteristics (bivariate probit regressions), with coefficients, margins, and standard errors.
Correlates of Obtaining Informed Consent to Data Linkage: Respondent, Interview and Interviewer Characteristics

September 2010 · 162 Reads

In the UK, in order to link individual-level administrative records to survey responses, a respondent needs to give their written consent. This paper explores whether characteristics of the respondent, the interviewer or survey design features influence consent. We use the BHPS combined with a survey of interviewers to model the probability that respondents consent to adding health and social security records to their survey responses. We find that some respondent characteristics and characteristics of the interview process within the household matter. By contrast, interviewer characteristics, including personality and attitudes to persuading respondents, are not associated with consent.

Sensitive Questions in Online Surveys: Experimental Results for the Randomized Response Technique (RRT) and the Unmatched Count Technique (UCT)

January 2008 · 384 Reads

Gaining valid answers to so-called sensitive questions is an age-old problem in survey research. Various techniques have been developed to guarantee anonymity and minimize the respondent's feelings of jeopardy. Two such techniques are the randomized response technique (RRT) and the unmatched count technique (UCT). In this study we evaluate the effectiveness of different implementations of the RRT (using a forced-response design) in a computer-assisted setting and also compare the use of the RRT to that of the UCT. The techniques are evaluated according to various quality criteria, such as the prevalence estimates they provide, the ease of their use, and respondent trust in the techniques. Our results indicate that the RRTs are problematic with respect to several domains, such as the limited trust they inspire and non-response, and that the RRT estimates are unreliable due to a strong false "no" bias, especially for the more sensitive questions. The UCT, however, performed well compared to the RRTs on all the evaluated measures. The UCT estimates also had more face validity than the RRT estimates. We conclude that the UCT is a promising alternative to RRT in self-administered surveys and that future research should be directed towards evaluating and improving the technique.
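
The prevalence estimators behind the two techniques are standard; the sketch below shows the textbook forced-response RRT and UCT difference-in-means estimators with illustrative inputs, not the study's data, exact designs, or design probabilities.

```python
# Textbook estimators for the two techniques compared in the article: the forced-response
# RRT and the unmatched count technique (UCT). All numbers are illustrative.
import numpy as np

def rrt_forced_response(yes_rate, n, p_forced_yes=1/6, p_forced_no=1/6):
    """Prevalence estimate and SE when a known share of answers is forced 'yes'/'no'."""
    p_truth = 1 - p_forced_yes - p_forced_no
    pi_hat = (yes_rate - p_forced_yes) / p_truth
    se = np.sqrt(yes_rate * (1 - yes_rate) / n) / p_truth
    return pi_hat, se

def uct(counts_treatment, counts_control):
    """Prevalence as the difference in mean item counts between the two list versions."""
    diff = np.mean(counts_treatment) - np.mean(counts_control)
    se = np.sqrt(np.var(counts_treatment, ddof=1) / len(counts_treatment)
                 + np.var(counts_control, ddof=1) / len(counts_control))
    return diff, se

print(rrt_forced_response(yes_rate=0.28, n=600))
rng = np.random.default_rng(3)
print(uct(rng.binomial(4, 0.5, 600) + rng.binomial(1, 0.15, 600),  # 4 neutral items + sensitive
          rng.binomial(4, 0.5, 600)))                              # 4 neutral items only
```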

Tables and figures: OM and OMv distances for example sequences; correlations of OM and OMv distances using real and simulated data; OMv and OM distances for simulated data with low, medium, and high numbers of spells and for mothers' labour market sequences (BHPS) and labour market entrants' sequences (MVAD); OM and OMv 8-cluster solutions, entropy, and spell density.
Optimal Matching Analysis and Life-Course Data: The Importance of Duration

April 2010 · 1,547 Reads

The optimal matching (OM) algorithm is widely used for sequence analysis in sociology. It has a natural interpretation for discrete-time sequences but is also widely used for life-history data, which are continuous in time. Life-history data are arguably better dealt with in terms of episodes rather than as strings of time-unit observations, and in this article, the author examines whether the OM algorithm is unsuitable for such sequences. A modified version of the algorithm is proposed, weighting OM's elementary operations inversely with episode length. In the general case, the modified algorithm produces pairwise distances much lower than the standard algorithm, the more so the more the sequences are composed of long spells in the same state. However, where all the sequences in a data set consist of few long spells, and there is low variability in the number of spells, the modified algorithm generates an overall pattern of distances that is not very different from standard OM.
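
A sketch of the idea as a small change to the standard dynamic-programming edit distance, weighting each position inversely by the length of the spell it belongs to; the exact weighting and cost settings below are illustrative assumptions, not the author's specification.

```python
# Sketch of optimal matching as a dynamic-programming edit distance, with an optional
# duration weight on each position so operations inside long spells count less.
import numpy as np
from itertools import groupby

def spell_lengths(seq):
    """Length of the spell each position belongs to, e.g. 'AABBB' -> [2, 2, 3, 3, 3]."""
    out = []
    for _, g in groupby(seq):
        run = list(g)
        out.extend([len(run)] * len(run))
    return out

def om_distance(a, b, indel=1.0, sub=2.0, duration_weighted=False):
    wa = 1.0 / np.array(spell_lengths(a)) if duration_weighted else np.ones(len(a))
    wb = 1.0 / np.array(spell_lengths(b)) if duration_weighted else np.ones(len(b))
    d = np.zeros((len(a) + 1, len(b) + 1))
    d[1:, 0] = np.cumsum(indel * wa)
    d[0, 1:] = np.cumsum(indel * wb)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub * max(wa[i - 1], wb[j - 1])
            d[i, j] = min(d[i - 1, j] + indel * wa[i - 1],      # delete from a
                          d[i, j - 1] + indel * wb[j - 1],      # insert from b
                          d[i - 1, j - 1] + cost)               # substitute (or match)
    return d[-1, -1]

a, b = "AAAAABBBBB", "AAABBBBBBB"
print(om_distance(a, b), om_distance(a, b, duration_weighted=True))
```

With inverse-duration weights, the two mismatching positions sit inside long spells, so the weighted distance is far smaller than the unweighted one, which is the qualitative behavior the abstract describes.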


Tables and figure: demographic characteristics of respondents in the NISVS pilot, nonresponse, and cell phone studies; weighting design for the combined studies; key survey estimates weighted for selection probabilities, with unweighted standard deviations; associations among log-transformed key survey variables; and key estimates under final poststratified weights for the pilot, pilot plus nonresponse, pilot plus cell phone, and combined studies.
Multiple Sources of Nonobservation Error in Telephone Surveys: Coverage and Nonresponse

January 2011 · 241 Reads

Random digit dialed telephone surveys are facing two serious problems undermining probability-based inference and creating a potential for bias in survey estimates: declining response rates and declining coverage of the landline telephone frame. Optimum survey designs need to focus reduction techniques on errors that cannot be addressed through statistical adjustment. This requires (a) separating and estimating the relative magnitude of different error sources and (b) evaluating the degree to which each error source can be statistically adjusted. In this study, the authors found significant differences in means both for nonrespondents and for the eligible population excluded from the landline frame, which are also in opposite directions. Differences were also found for element variances and associations, which can affect survey results but are rarely examined. Adjustments were somewhat effective in decreasing both sources of bias, although addressing at least one through data collection led to less bias in the adjusted estimates.

