ABSTRACT: Chronic malnutrition, termed stunting and defined as suboptimal linear growth, affects one third of children in developing countries and leads to increased mortality and poor developmental outcomes. The causes of childhood stunting are unknown, and strategies to improve growth and related outcomes have had only modest impacts. Recent studies have shown that the ecosystem of microbes in the human gut, termed the microbiota, can induce changes in weight. However, the specific changes in the gut microbiota that contribute to growth remain unknown, and no studies have investigated the gut microbiota as a determinant of chronic malnutrition.
We performed secondary analyses of data from two well-characterized twin cohorts of children from Malawi and Bangladesh to identify bacterial genera associated with linear growth. In a case-control analysis, we used the graphical lasso to estimate covariance network models of gut microbial interactions from relative genus abundances and used network analysis methods to select genera associated with stunting severity. In longitudinal analyses, we determined associations between these selected microbes and linear growth using between-within twin regression models to adjust for confounding and introduce temporality. Reduced microbiota diversity and increased covariance network density were associated with stunting severity, while increased relative abundance of Acidaminococcus sp. was associated with future linear growth deficits.
We show that length growth in children is associated with community-wide changes in the gut microbiota and with the abundance of the bacterial genus Acidaminococcus. Larger cohorts are needed to confirm these findings and to clarify the mechanisms involved.
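The covariance-network step described above can be illustrated with a short sketch. Everything here is hypothetical: the data are simulated stand-ins for (e.g. log-ratio-transformed) genus abundances, and the estimator is scikit-learn's GraphicalLasso rather than the exact implementation used in the study.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Hypothetical transformed genus abundances: rows are children, columns genera
# (simulated as Gaussian here purely for illustration).
rng = np.random.default_rng(0)
abundances = rng.multivariate_normal(np.zeros(8), np.eye(8), size=60)

# The graphical lasso estimates a sparse inverse covariance; nonzero
# off-diagonal entries are treated as edges of the microbial network.
model = GraphicalLasso(alpha=0.05).fit(abundances)
precision = model.precision_

# Network density: fraction of possible genus-genus edges that are present.
p = precision.shape[0]
n_edges = int((np.abs(precision[np.triu_indices(p, k=1)]) > 1e-8).sum())
density = n_edges / (p * (p - 1) / 2)
```

Comparing this density between cases and controls is the kind of summary the abstract's "covariance network density" refers to.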
ABSTRACT: There have been considerable advances in the methodology for estimating dynamic treatment regimens, and for the design of sequential trials that can be used to collect unconfounded data to inform such regimens. However, relatively little attention has been paid to how such methodology could be used to advance understanding of optimal treatment strategies in a continuous dose setting, even though it is often the case that considerable patient heterogeneity in drug response along with a narrow therapeutic window may necessitate the tailoring of dosing over time. Such is the case with warfarin, a common oral anticoagulant. We propose novel, realistic simulation models based on pharmacokinetic-pharmacodynamic properties of the drug that can be used to evaluate potentially optimal dosing strategies. Our results suggest that this methodology can lead to a dosing strategy that performs well both within and across populations with different pharmacokinetic characteristics, and may assist in the design of randomized trials by narrowing the list of potential dosing strategies to those which are most promising.
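For flavor, here is a minimal one-compartment pharmacokinetic sketch with repeated daily dosing, the simplest building block of this kind of simulation model. All constants (half-life, dose, volume) are illustrative placeholders, not the warfarin model of the paper.

```python
import numpy as np

# Illustrative constants only (not the paper's warfarin model).
half_life_h = 40.0                       # assumed elimination half-life, hours
k = np.log(2) / half_life_h              # first-order elimination rate
dose_mg, interval_h, volume_l = 5.0, 24.0, 10.0

# Superpose one exponential decay per daily dose over a 10-day hourly grid.
t = np.arange(0.0, 240.0, 1.0)
conc = np.zeros_like(t)
for start in np.arange(0.0, t[-1], interval_h):
    after = t >= start
    conc[after] += (dose_mg / volume_l) * np.exp(-k * (t[after] - start))

# Peaks accumulate toward the steady-state value (dose/V) / (1 - exp(-k*tau)).
peak_ss = (dose_mg / volume_l) / (1 - np.exp(-k * interval_h))
```

A dosing strategy in this setting maps the simulated concentration (or a downstream response) at each decision point to the next dose; the paper evaluates candidate strategies against such simulated trajectories.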
ABSTRACT: Background
In the context of infectious disease, sequence clustering can provide important insights into the dynamics of transmission. Cluster analysis is usually performed using a phylogenetic approach whereby clusters are assigned on the basis of sufficiently small genetic distances and high bootstrap support (or posterior probabilities). The computational burden of this phylogenetic threshold approach is a major drawback, especially when a large number of sequences are being considered. In addition, the method requires a skilled user to specify appropriate threshold values, which may vary widely depending on the application.
This paper presents the Gap Procedure, a distance-based clustering algorithm for the classification of DNA sequences sampled from individuals infected with human immunodeficiency virus type 1 (HIV-1). Our heuristic algorithm bypasses the need for phylogenetic reconstruction, thereby supporting the quick analysis of large genetic data sets. Moreover, this fully automated procedure relies on data-driven gaps in sorted pairwise distances to infer clusters, so no user-specified threshold values are required. The clustering results obtained by the Gap Procedure on both real and simulated data closely agree with those found using the threshold approach, while requiring only a fraction of the time to complete the analysis.
Apart from the dramatic gains in computational time, the Gap Procedure is highly effective in finding distinct groups of genetically similar sequences and obviates the need for subjective user-specified values. The clusters of genetically similar sequences returned by this procedure can be used to detect patterns in HIV-1 transmission and thereby aid in the prevention, treatment and containment of the disease.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0791-x) contains supplementary material, which is available to authorized users.
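The largest-gap idea can be sketched in a few lines. This is a simplified stand-in for the published Gap Procedure: toy Euclidean distances replace genetic distances, and a single largest-gap cutoff with single-linkage clustering replaces the full algorithm.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for pairwise genetic distances: two well-separated groups
# (real input would be distances between aligned HIV-1 sequences).
rng = np.random.default_rng(1)
points = np.vstack([rng.normal(0, 0.1, (10, 5)), rng.normal(3, 0.1, (10, 5))])
dists = pdist(points)

# Data-driven threshold: sort all pairwise distances and place the cutoff
# inside the largest gap, so no user-specified value is needed.
d_sorted = np.sort(dists)
i = int(np.argmax(np.diff(d_sorted)))
threshold = (d_sorted[i] + d_sorted[i + 1]) / 2

# Cluster by single linkage, cutting the dendrogram at that threshold.
labels = fcluster(linkage(dists, method="single"), t=threshold, criterion="distance")
```

Because the threshold comes from the distance distribution itself, the same code applies unchanged across data sets, which is the property the abstract emphasizes.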
ABSTRACT: Individualized medicine is a growing area in both clinical and statistical settings; in the latter, personalized treatment strategies are often referred to as dynamic treatment regimens. Estimation of the optimal dynamic treatment regime has focused primarily on semi-parametric approaches, some of which are said to be doubly robust in that they give rise to consistent estimators provided at least one of two models is correctly specified. In particular, locally efficient doubly robust g-estimation is robust to misspecification of the treatment-free outcome model so long as the propensity model is specified correctly, at the cost of an increase in variability. In this paper, we propose data-adaptive weighting schemes that serve to decrease the impact of influential points and thus stabilize the estimator. In doing so, we provide a doubly robust g-estimator that is also robust in the sense of Hampel (15).
The International Journal of Biostatistics 08/2015; DOI:10.1515/ijb-2015-0015 · 0.74 Impact Factor
ABSTRACT: This paper constructs a doubly robust estimator for continuous dose-response estimation. An outcome regression model is augmented with a set of inverse generalized propensity score covariates to correct for potential misspecification bias. From the augmented model we can obtain consistent estimates of mean average potential outcomes for distinct strata of the treatment. A polynomial regression is then fitted to these point estimates to derive a Taylor approximation to the continuous dose-response function. The bootstrap is used for variance estimation. Analytical results and simulations show that our approach can provide a good approximation to linear or nonlinear dose-response functions under various sources of misspecification of the outcome regression or propensity score models. Efficiency in finite samples is good relative to minimum variance consistent estimators.
ABSTRACT: Amblyopia is the commonest visual disorder of childhood in Western societies, predominantly affecting spatial visual function. Treatment typically requires a period of refractive correction ('optical treatment') followed by occlusion: covering the nonamblyopic eye with a fabric patch for varying daily durations. Recent studies have provided insight into the optimal amount of patching ('dose'), leading to the adoption of standardized dosing strategies, which, though an advance on previous ad-hoc regimens, take little account of individual patient characteristics. This trial compares the effectiveness of a standardized dosing strategy (that is, a fixed daily occlusion dose based on disease severity) with a personalized dosing strategy (derived from known treatment dose-response functions), in which an initially prescribed occlusion dose is modulated, in a systematic manner, dependent on treatment compliance.
A total of 120 children aged between 3 and 8 years, diagnosed with amblyopia in association with anisometropia, strabismus, or both, will be randomized to receive either a standardized or a personalized occlusion dose regimen. To avoid confounding by the known benefits of refractive correction, participants will not be randomized until they have completed an optical treatment phase. The primary study objective is to determine whether, at trial endpoint, participants receiving a personalized dosing strategy require fewer hours of occlusion than those in receipt of a standardized dosing strategy. Secondary objectives are to quantify the relationship between observed changes in visual acuity (logMAR, logarithm of the Minimum Angle of Resolution) and age, amblyopia type, and severity of the amblyopic visual acuity deficit.
This is the first randomized controlled trial of occlusion therapy for amblyopia to compare a treatment arm representative of current best practice with an arm representative of an entirely novel treatment regimen based on statistical modelling of previous trial outcome data. Should the personalized dosing strategy demonstrate superiority over the standardized dosing strategy, then its adoption into routine practice could bring practical benefits in reducing the duration of treatment needed to achieve an optimal outcome.
ISRCTN: ISRCTN12292232.
ABSTRACT: Length bias in survival data occurs in observational studies when, for example, subjects with shorter lifetimes are less likely to be present in the recorded data. In this paper, we consider estimating the causal exposure (treatment) effect on survival time from observational data when, in addition to the lack of randomization and the consequent potential for confounding, the data constitute a length-biased sample; we hence term this a double-bias problem. We develop estimating equations that can be used to estimate the causal effect indexing the structural Cox proportional hazards and accelerated failure time models for point exposures in double-bias settings. The approaches rely on propensity score-based adjustments, and we demonstrate that estimation of the propensity score must be adjusted to acknowledge the length-biased sampling. Large sample properties of the estimators are established and their small sample behavior is studied using simulations. We apply the proposed methods to a set of partly synthesized length-biased survival data collected as part of the Canadian Study of Health and Aging (CSHA) to compare survival of subjects with dementia among institutionalized patients versus those recruited from the community, and depict their adjusted survival curves.
The International Journal of Biostatistics 03/2015; 11(1). DOI:10.1515/ijb-2014-0037 · 0.74 Impact Factor
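Why the propensity score must acknowledge length-biased sampling can be shown with a toy weighted-logistic sketch. This illustrates the general idea via inverse-lifetime weights; it is not the paper's estimating-equation approach, and all model details are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
N = 20000
x = rng.normal(size=N)
treat = rng.binomial(1, 1 / (1 + np.exp(-x)))    # true PS: expit(x)
t = rng.exponential(np.exp(0.5 * treat))         # treatment lengthens survival

# Length-biased sample: inclusion probability proportional to lifetime,
# so longer-lived (here, more often treated) subjects are over-represented.
keep = rng.uniform(size=N) < t / t.max()
xs, treats, ts = x[keep], treat[keep], t[keep]

# Naive PS fit on the biased sample vs. a fit weighted by 1/lifetime,
# which undoes the size-biased sampling (C set large to avoid shrinkage).
naive = LogisticRegression(C=1e6).fit(xs[:, None], treats)
adjusted = LogisticRegression(C=1e6).fit(xs[:, None], treats, sample_weight=1 / ts)
```

The naive fit overstates the baseline odds of treatment because treated subjects survive longer and are therefore sampled more often; the inverse-lifetime weights restore the population-level propensity model.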
ABSTRACT: The nuclei of higher eukaryotic cells display compartmentalization and certain nuclear compartments have been shown to follow a degree of spatial organization. To date, the study of nuclear organization has often involved simple quantitative procedures that struggle with both the irregularity of the nuclear boundary and the problem of handling replicate images. Such studies typically focus on inter-object distance, rather than spatial location within the nucleus. The concern of this paper is the spatial preference of nuclear compartments, for which we have developed statistical tools to quantitatively study and explore nuclear organization. These tools combine replicate images to generate 'aggregate maps' which represent the spatial preferences of nuclear compartments. We present two examples of different compartments in mammalian fibroblasts (WI-38 and MRC-5) that demonstrate new knowledge of spatial preference within the cell nucleus. Specifically, the spatial preference of RNA polymerase II is preserved across normal and immortalized cells, whereas PML nuclear bodies exhibit a change in spatial preference from avoiding the centre in normal cells to exhibiting a preference for the centre in immortalized cells. In addition, we show that SC35 splicing speckles are excluded from the nuclear boundary and localize throughout the nucleoplasm and in the interchromatin space in non-transformed WI-38 cells. This new methodology is thus able to reveal the effect of large-scale perturbation on spatial architecture and preferences that would not be obvious from single cell imaging.
Journal of The Royal Society Interface 03/2015; 12(104). DOI:10.1098/rsif.2014.0894 · 3.92 Impact Factor
ABSTRACT: Road network capacity expansions are frequently proposed as solutions to urban traffic congestion but are controversial because it is thought that they can directly “induce” growth in traffic volumes. This article quantifies causal effects of road network capacity expansions on aggregate urban traffic volume and density in U.S. cities using a mixed model propensity score (PS) estimator. The motivation for this approach is that we seek to estimate a dose-response relationship between capacity and volume but suspect confounding from both observed and unobserved characteristics. Analytical results and simulations show that a longitudinal mixed model PS approach can be used to adjust effectively for time-invariant unobserved confounding via random effects (RE). Our empirical results indicate that network capacity expansions can cause substantial increases in aggregate urban traffic volumes such that even major capacity increases can actually lead to little or no reduction in network traffic densities. This result has important implications for optimal urban transportation strategies. Supplementary materials for this article are available online.
Journal of the American Statistical Association 10/2014; 109(508). DOI:10.1080/01621459.2014.956871 · 1.98 Impact Factor
ABSTRACT: To detect hGH doping in sport, the World Anti-Doping Agency (WADA)-accredited laboratories use the ratio of the concentrations of recombinant hGH (‘rec’) versus other ‘natural’ pituitary-derived isoforms of hGH (‘pit’), measured with two different kits developed specifically to detect the administration of exogenous hGH. The current joint compliance decision limits (DLs) for ratios derived from these kits, designed so that they would both be exceeded in fewer than 1 in 10,000 samples from non-doping athletes, are based on data accrued in anti-doping labs up to March 2010, and later confirmed with data up to February–March 2011. In April 2013, WADA asked the authors to analyze the now much larger set of ratios collected in routine hGH testing of athletes, and to document in the peer-reviewed literature a statistical procedure for establishing DLs, so that it can be re-applied as more data become available.
ABSTRACT: The pervasive use of prevalent cohort studies on disease duration increasingly calls for appropriate methodologies to account for the biases that invariably accompany samples formed by such data. It is well known, for example, that subjects with shorter lifetimes are less likely to be present in such studies. Moreover, certain covariate values could be preferentially selected into the sample, being linked to the long-term survivors. The existing methodology for estimation of the propensity score using data collected on prevalent cases requires the correct conditional survival/hazard function given the treatment and covariates. This requirement can be alleviated if the disease under study has stationary incidence, the so-called stationarity assumption. We propose a nonparametric adjustment technique based on a weighted estimating equation for estimating the propensity score which does not require modeling the conditional survival/hazard function when the stationarity assumption holds. Large sample properties of the estimator are established and its small sample behavior is studied via simulation.
ABSTRACT: Due to the cost and complexity of conducting a sequential multiple assignment randomized trial (SMART), it is desirable to pre-define a small number of personalized regimes to study.
We propose a simulation-based approach to studying personalized dosing strategies in contexts where a therapeutic agent's pharmacokinetic and pharmacodynamic properties are well understood, taking warfarin dosing as a case study. We consider a SMART in which there are five intervention points at which dosing may be modified, following a loading phase of treatment.
Realistic SMARTs are simulated, and two methods of analysis, G-estimation and Q-learning, are used to assess potential personalized dosing strategies.
In settings where outcome modelling may be complex due to the highly non-linear pharmacokinetic and pharmacodynamic mechanisms of the therapeutic agent, G-estimation proved the more promising method for estimating an optimal dosing strategy. Used in combination with the simulated SMARTs, we were able to improve simulated patient outcomes and suggest which patient characteristics are needed to best tailor dosing individually. In particular, our simulations suggest that the current dose should be determined by an individual's current coagulation time as measured by the international normalized ratio (INR), their last measured INR, and their last dose. Tailoring treatment based only on current INR and last warfarin dose provided inferior control of INR over the course of the trial.
The ability of the simulated SMARTs to suggest optimal personalized dosing strategies relies on the pharmacokinetic and pharmacodynamic models used to generate the hypothetical patient profiles. This approach is best suited to therapeutic agents whose effects are well studied.
Prior to investing in a complex randomized trial that involves sequential treatment allocations, simulations should be used where possible in order to guide which dosing strategies to evaluate.
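To illustrate one of the two analysis methods named above, here is a minimal single-stage Q-learning step on simulated data. The state/dose model and its coefficients are invented for the example (not the warfarin PK/PD simulator of the paper); in a SMART with five intervention points, the same regress-then-maximize step is applied backwards from the last stage.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
s = rng.normal(size=n)                     # patient state entering the stage
a = rng.uniform(-2, 2, size=n)             # randomized continuous dose
# Outcome: the best dose exactly offsets the state (optimal a = -s).
y = -(s + a) ** 2 + rng.normal(scale=0.3, size=n)

# Q-learning step: fit a quadratic Q(s, a), then maximize over the dose.
design = np.column_stack([np.ones(n), s, s**2, a, a**2, s * a])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

# dQ/da = beta[3] + 2*beta[4]*a + beta[5]*s = 0 gives the tailored dose
# (beta[4] < 0, so the stationary point is a maximum).
a_opt = -(beta[3] + beta[5] * s) / (2 * beta[4])
```

The fitted rule recovers the state-dependent dose, which is the sense in which Q-learning "personalizes" dosing; G-estimation targets the same decision rule through a different, outcome-model-robust route.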
ABSTRACT: In the causal adjustment setting, variable selection techniques based on either the outcome or treatment allocation model can result in the omission of confounders or the inclusion of spurious variables in the propensity score. We propose a variable selection method based on a penalized likelihood which considers the response and treatment assignment models simultaneously. We show that under some conditions our method attains the oracle property. The selected variables are used to form a doubly robust regression estimator of the treatment effect. Simulation results are presented and data from the National Supported Work Demonstration are analyzed.
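The problem the paper addresses can be made concrete with a simplified stand-in: the paper penalizes both models simultaneously, whereas the sketch below takes the union of supports from two separate lasso fits. This conveys the goal (retain confounders and outcome predictors) but is not the proposed joint penalty, and the data-generating model is invented for the example.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

rng = np.random.default_rng(4)
n, p = 500, 10
X = rng.normal(size=(n, p))
# Column 0 is a confounder, column 1 affects the outcome only,
# column 2 affects treatment assignment only; the rest are noise.
treat = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + X[:, 2]))))
y = 2 * treat + X[:, 0] + X[:, 1] + rng.normal(size=n)

# Separate lasso fits for the outcome and treatment models.
out_fit = LassoCV(cv=5).fit(np.column_stack([treat, X]), y)
out_sel = set(np.flatnonzero(out_fit.coef_[1:]))
trt_fit = LogisticRegressionCV(cv=5, penalty="l1", solver="liblinear").fit(X, treat)
trt_sel = set(np.flatnonzero(trt_fit.coef_[0]))

# Union of supports: candidate adjustment set for the propensity score.
selected = sorted(out_sel | trt_sel)
```

Note one shortcoming the paper's simultaneous penalty is designed to address: the union also keeps instrument-like variables (column 2 here), which inflate propensity-score variance without reducing bias.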
ABSTRACT: The paper investigates the link between area-based socio-economic deprivation and the incidence of child pedestrian casualties. The analysis is conducted using data for small spatial zones within major British cities over the period 2001–2007. Spatial longitudinal generalized linear mixed models, estimated using frequentist and Bayesian approaches, are used to address issues of confounding, spatial dependence and the transmission of deprivation effects across zones (i.e. interference). The results show a consistent, strong deprivation effect across model specifications: the incidence of child pedestrian casualties in the most deprived zones is typically greater than 10 times that in the least deprived zones. Modelling interference through a spatially auto-regressive covariate uncovers a substantially larger effect.
Journal of the Royal Statistical Society Series A (Statistics in Society) 10/2013; 176(4). DOI:10.1111/j.1467-985X.2012.01071.x · 1.64 Impact Factor
ABSTRACT: Purpose:
To explore compliance with occlusion treatment of amblyopia in the Monitored and Randomized Occlusion Treatment of Amblyopia Studies (MOTAS and ROTAS), using objective monitoring.
Both studies had a three-phase protocol: initial assessment, refractive adaptation, and occlusion. In the occlusion phase, participants were instructed to dose for 6 hours/day (MOTAS) or randomized to 6 or 12 hours/day (ROTAS). Dose was monitored continuously using an occlusion dose monitor (ODM).
One hundred and fifty-two patients (71 male, 81 female; 122 Caucasian, 30 non-Caucasian) of mean ± SD age 68 ± 18 months participated. Amblyopia was defined as an interocular acuity difference of at least 0.1 logMAR and was associated with anisometropia in 50, strabismus in 44, and both (mixed) in 58. Median duration of occlusion was 99 days (interquartile range 72 days). Mean compliance was 44%, mean proportion of days with no patch worn was 42%. Compliance was lower (39%) on weekends compared with weekdays (46%, P = 0.04), as was the likelihood of dosing at all (52% vs. 60%, P = 0.028). Compliance was lower when attendance was less frequent (P < 0.001) and with prolonged treatment duration (P < 0.001). Age, sex, amblyopia type, and severity were not associated with compliance. Mixture modeling suggested three subpopulations of patch day doses: less than 30 minutes; doses that achieve 30% to 80% compliance; and doses that achieve around 100% compliance.
This study shows that compliance with patching treatment averages less than 50% and is influenced by several factors. A greater understanding of these influences should improve treatment outcome. (ClinicalTrials.gov number, NCT00274664).
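The mixture-modelling step mentioned in the results can be sketched with a Gaussian mixture on simulated daily patch doses. The three subpopulations and their parameters are illustrative inventions, not the study's measurements.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Simulated daily patch doses in minutes, with three hypothetical
# subpopulations echoing those described in the abstract.
rng = np.random.default_rng(6)
doses_min = np.concatenate([
    rng.normal(10, 5, 300),    # near-zero dosing (< 30 min)
    rng.normal(200, 40, 300),  # partial compliance
    rng.normal(360, 20, 300),  # near-full compliance with a 6 h prescription
]).clip(0)

# Fit a three-component Gaussian mixture and inspect the component means.
gmm = GaussianMixture(n_components=3, random_state=0).fit(doses_min[:, None])
means = np.sort(gmm.means_.ravel())
```

The recovered component means separate the three dosing patterns, which is the kind of structure the study's mixture modelling suggested in the real dose data.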