Journal of Applied Statistics

Published by Taylor & Francis (Routledge)
Online ISSN: 1360-0532
Print ISSN: 0266-4763
An effective one-pot synthesis of dialkoxyphosphoryl-2-oxo-2H-pyran derivatives by three-component reaction of alkyl bromides and dialkyl acetylenedicarboxylates in the presence of trialkyl phosphite is described. The reactions were performed under solvent-free, neutral conditions at 50 °C and provided good yields of products.
KM survival curves according to race for PrCA in nine different counties, Louisiana. 
Prostate cancer is the most common cancer diagnosed in American men and the second leading cause of death from malignancies. There are large geographical variations and racial disparities in the survival rate of prostate cancer. Much work on spatial survival models is based on the proportional hazards model, but few studies have focused on the accelerated failure time model. In this paper, we investigate the prostate cancer data of Louisiana from the SEER program; the violation of the proportional hazards assumption suggests that a spatial survival model based on the accelerated failure time model is more appropriate for this data set. To account for possible extra-variation, we consider spatially referenced independent or dependent spatial structures. The deviance information criterion (DIC) is used to select the best-fitting model within the Bayesian framework. The results from our study indicate that age, race, stage and geographical distribution are significant in evaluating prostate cancer survival.
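The KM (Kaplan-Meier) curves mentioned in the caption above can be illustrated with a minimal sketch of the product-limit estimator. This is not the authors' code; the function name `kaplan_meier` and the toy data in the usage note are hypothetical and serve only to show how censored follow-up times enter the estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Deaths and censored subjects at t both drop out of the risk set.
        at_risk -= exits[t]
    return curve
```

For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` returns one step per observed event time; computing one such curve per race group and county would give curves of the kind the caption describes.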
G-protein coupled receptors (GPCRs) are proteins of the plasma membrane, which are characterized by seven membrane-spanning segments (TMs). GPCRs play an important role in almost all of our physiological and pathophysiological conditions by interacting with a large variety of ligands and stimulating different G-proteins and signaling cascades. By playing a key role in the function of our body and being involved in the pathophysiology of many disorders, GPCRs are very important therapeutic targets. Determination of the structure and function of GPCRs could advance the design of novel receptor-specific drugs against various diseases. A powerful method to obtain structural and functional information for GPCRs is the substituted-cysteine accessibility method (SCAM). SCAM is used to systematically map the TM residues of GPCRs and determine their functional role. SCAM can also be used to determine differences in the structures of the TMs in different functional states of GPCRs.
Positive predictive value and 1 minus negative predictive value of CA-IX 
Positive predictive value and 1 minus negative predictive value of HC2 
Observed data log-likelihood versus number of iterations in MCEM with different M 
In this article we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo EM (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix based standard error estimates with the bootstrap standard error estimates, both obtained using the fast MCEM algorithm, through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach yields standard error estimates similar to those of the bootstrap methods under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group (GOG) study of significant cervical lesion (S-CL) diagnosis in women with atypical glandular cells of undetermined significance (AGC) to compare the diagnostic accuracy of a histology-based evaluation, a CA-IX biomarker-based test and a human papillomavirus (HPV) DNA test.
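The figure captions above refer to the positive predictive value (PPV) and 1 minus the negative predictive value (NPV). As a minimal sketch, these follow from sensitivity, specificity, and prevalence by Bayes' rule. The function name `predictive_values` and the numbers in the usage note are illustrative assumptions, not estimates from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, 1 - NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)   # P(disease | test positive)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | test negative)
    return ppv, 1.0 - npv
```

With a hypothetical sensitivity of 0.9, specificity of 0.8, and prevalence of 0.1, `predictive_values(0.9, 0.8, 0.1)` gives a PPV of one third. Because prevalence is modeled as a function of covariates in the abstract, both quantities would vary with those covariates.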
Weights of rats at the end of weeks 20 and 24.
Objectives: Obesity is a worldwide problem, leading to cardiomyopathy. Oxidative stress and inflammation have been reported to play significant roles in developing obesity cardiomyopathy. N-acetylcysteine is a glutathione prodrug that protects the liver against steatosis by constraining the production of reactive oxygen species. Etodolac is a nonsteroidal anti-inflammatory drug which has been demonstrated to protect the liver against fibrosis. The aim of the present study was to evaluate and compare the effects of N-acetylcysteine and etodolac on impaired cardiac functions due to high-fat-diet (HFD) induced myocardial steatosis in rats. Material and methods: Thirty-two male Sprague-Dawley rats were randomly divided into four groups. The control group was maintained on standard-rat-basic-diet (SD) for 20 weeks, while HFD was given to three study groups for 20 weeks. Then N-acetylcysteine was given to one of the study groups (HFD+NAC), and etodolac to another group (HFD+ETD), as a supplement for 4 weeks while all groups were continued on SD. At the end of the study periods, hearts were examined by the Langendorff technique and rat livers were evaluated histologically. Results: HFD and HFD+ETD groups presented with significantly higher steatosis and fibrosis in the liver compared to other groups. HFD+NAC preserved diastolic functions. In addition, the HFD+NAC and HFD+ETD groups had significantly better systolic functions than the HFD group. Conclusions: Obesity is associated with diastolic dysfunction rather than systolic dysfunction. NAC may protect the heart against diastolic dysfunction due to obesity. NAC and etodolac treatment improve systolic function, even in the absence of systolic dysfunction.
Biomarkers have the potential to improve our understanding of disease diagnosis and prognosis. Biomarker levels that fall below the assay detection limits (DLs), however, compromise the application of biomarkers in research and practice. Most existing methods to handle non-detects focus on a scenario in which the response variable is subject to a detection limit; only a few methods consider explanatory variables when dealing with DLs. We propose a Bayesian approach for generalized linear models with explanatory variables subject to lower, upper, or interval DLs. In simulation studies, we compared the proposed Bayesian approach to four commonly used methods in a logistic regression model with explanatory variable measurements subject to a DL. We also applied the Bayesian approach and the four other methods in a real study, in which a panel of cytokine biomarkers was studied for their association with acute lung injury (ALI). We found that IL8 was associated with a moderate increase in risk for ALI in the model based on the proposed Bayesian approach.
Immune decline with ageing accounts for the increased risk of infections, inflammatory chronic disease, autoimmunity and cancer in humans. Both innate and adaptive immune functions are compromised in aged people and, therefore, attempts to correct these dysfunctions represent a major goal of modern medicine. In this review, special emphasis will be placed on aged innate immunity, with special reference to polymorphonuclear cell, monocyte/macrophage, dendritic cell and natural killer cell functions. As potential modifiers of the impaired innate immunity, some principal nutraceuticals will be illustrated, such as micronutrients, pre-probiotics and polyphenols. In the elderly, clinical trials with the above products are scanty; however, some encouraging effects on the recovery of innate immune cells have been reported. In addition, our own results obtained with symbiotics and polyphenols extracted from red wine or fermented grape marc suggest the potential ability of these substances to modulate the innate immune response in ageing, thus reducing the inflammaging which characterizes immune senescence.
Botulinum neurotoxins (BoNTs) are endopeptidases that target motor neurons and block acetylcholine neurotransmitter release. This action results in the muscle paralysis that defines the disease botulism. To date, there are no FDA-approved therapeutics to treat BoNT-mediated paralysis after intoxication of the motor neuron. Importantly, the rationale for pursuing treatments to counter these toxins is driven by their potential misuse. Current drug discovery efforts have mainly focused on small molecules, peptides, and peptidomimetics that can directly and competitively inhibit BoNT light chain proteolytic activity. Although this is a rational approach, direct inhibition of the Zn2+ metalloprotease activity has been elusive as demonstrated by the dearth of candidates undergoing clinical evaluation. Therefore, broadening the scope of viable targets beyond that of active site protease inhibitors represents an additional strategy that could move the field closer to the clinic. Here we review the rationale, and discuss the outcomes of earlier approaches and highlight potential new targets for BoNT inhibition. These include BoNT uptake and processing inhibitors, enzymatic inhibitors, and modulators of neuronal processes associated with toxin clearance, neurotransmitter potentiation, and other pathways geared towards neuronal recovery and repair.
Markov regression models are useful tools for estimating the impact of risk factors on rates of transition between multiple disease states. Alzheimer's disease (AD) is an example of a multi-state disease process in which great interest lies in identifying risk factors for transition. In this context, non-homogeneous models are required because transition rates change as subjects age. In this report we propose a non-homogeneous Markov regression model that allows for reversible and recurrent disease states, transitions among multiple states between observations, and unequally spaced observation times. We conducted simulation studies to demonstrate performance of estimators for covariate effects from this model and compare performance with alternative models when the underlying non-homogeneous process was correctly specified and under model misspecification. In simulation studies, we found that covariate effects were biased if non-homogeneity of the disease process was not accounted for. However, estimates from non-homogeneous models were robust to misspecification of the form of the non-homogeneity. We used our model to estimate risk factors for transition to mild cognitive impairment (MCI) and AD in a longitudinal study of subjects included in the National Alzheimer's Coordinating Center's Uniform Data Set. Using our model, we found that subjects with MCI affecting multiple cognitive domains were significantly less likely to revert to normal cognition.
This article provides a unified methodology of meta-analysis that synthesizes medical evidence by using both available individual patient data (IPD) and published summary statistics within the framework of the likelihood principle. The most up-to-date scientific evidence on medicine is crucial information not only to consumers but also to decision makers, and can only be obtained when existing evidence from the literature and the most recent individual patient data are optimally synthesized. We propose a general linear mixed effects model to conduct meta-analyses when individual patient data are only available for some of the studies and summary statistics have to be used for the rest of the studies. Our approach includes, as special cases, both the traditional meta-analyses in which only summary statistics are available for all studies and the other extreme case in which individual patient data are available for all studies. We implement the proposed model with statistical procedures from standard computing packages. We provide measures of heterogeneity based on the proposed model. Finally, we demonstrate the proposed methodology through a real-life example studying cerebrospinal fluid biomarkers to identify individuals at high risk of developing Alzheimer's disease when they are still cognitively normal.
Determining the effectiveness of different treatments from observational data, which are characterized by imbalance between groups due to lack of randomization, is challenging. Propensity matching is often used to rectify imbalances among prognostic variables. However, there are no guidelines on how to appropriately analyze group-matched data when the outcome is a zero-inflated count. In addition, there is debate over whether to account for correlation of responses induced by matching, and/or whether to adjust for variables used in generating the propensity score in the final analysis. The aim of this research is to compare covariate unadjusted and adjusted zero-inflated Poisson models that do and do not account for the correlation. A simulation study is conducted, demonstrating that it is necessary to adjust for potential residual confounding, but that accounting for correlation is less important. The methods are applied to a biomedical research data set.
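For readers unfamiliar with the zero-inflated Poisson (ZIP) model named above, the following is a minimal sketch of its probability mass function in the standard textbook parameterization: a point mass at zero with probability `pi`, mixed with a Poisson distribution with mean `lam`. The function name and parameter names are assumptions for illustration; the covariate-adjusted and correlation-aware models compared in the abstract are not reproduced here.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) under a zero-inflated Poisson model.

    lam : mean of the Poisson component (lam > 0)
    pi  : probability of a structural (excess) zero, 0 <= pi < 1
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeros arise either structurally or from the Poisson component.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Under this parameterization the mean is `(1 - pi) * lam`, so excess zeros pull the expected count below the Poisson mean; in a regression setting both `lam` and `pi` would typically depend on covariates through link functions.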
Melioidosis is a serious emerging endemic infectious disease caused by Burkholderia pseudomallei, a gram-negative pathogen. Septicemic melioidosis has a mortality rate of 50% even with treatment. Like other gram-negative bacteria, B. pseudomallei is resistant to a number of antibiotics, and multi-drug resistant B. pseudomallei is beginning to be encountered in hospitals. There is a clear medical need to develop new treatment options to manage this disease. We used Burkholderia thailandensis (a BSL-2 class organism) to infect Caenorhabditis elegans and set up a surrogate whole animal infection model of melioidosis that we could run in a 384-well microtitre plate, establishing a whole animal HTS assay. We have optimized and validated this assay in a fluorescence-based format that can be run on our automated screening platforms. This assay has now been used to screen over 300,000 compounds from our small molecule library, and we are in the process of characterizing the hits obtained and selecting compounds for further studies. We have thus established a biologically relevant assay technology platform to screen for antibacterial compounds and used this platform to identify new compounds that may find application in treating melioidosis infections.
Common structural architecture of HIV co-receptors. (A) Structure comparison between CXCR4 (cyan, 3ODU) and CCR5 (purple, 4MBS). Disulphide bonds are shown as sticks in yellow. (B) Top view of the extracellular side of CXCR4 and CCR5. (C) Bottom view of the intracellular side of CXCR4 and CCR5.
Comparison of the ligand-binding pockets between CXCR4-IT1t and CCR5-maraviroc. (A) Key residues in the binding pockets of CXCR4-IT1t and CCR5-maraviroc. Receptors CXCR4 (cyan, 3ODU) and CCR5 (purple, 4MBS) are shown in cartoon representation. Ligands IT1t (white) and maraviroc (red) are shown in sphere representation. (B and C) Top views of the ligand-binding pockets in CXCR4 [(B), cyan] and CCR5 [(C), purple], showing a more open ligand-binding pocket in CCR5.
Chemokine receptors are G protein-coupled receptors that contain seven trans-membrane domains. CXCR4 and CCR5, the major co-receptors for HIV-1 entry into host cells, are also implicated in cancer and inflammation. They have been attractive targets for the pharmaceutical industry based on their roles in HIV disease. Homology modeling, molecular docking, molecular dynamics, Molecular Mechanics/Generalized Born Surface Area and many other computational methods are applied to elucidate the structure, function and binding sites of GPCRs. Moreover, the high-resolution crystal structures of CXCR4 and CCR5 have provided extremely valuable information on structure and receptor activation mechanisms, enabling structure-based drug discovery for the treatment of HIV-1 infection. We also describe recent progress on small-molecule antagonists of CXCR4 and CCR5 and the interactions between GPCRs and their ligands predicted by molecular docking and molecular dynamics methods. Future research questions and further investigations are outlined to highlight work that may be relevant to the advancement of therapies targeting these important HIV-related receptors.
Growing experimental evidence suggests that dimerization and oligomerization are important for G Protein-Coupled Receptor (GPCR) function. Detailed structural information on dimeric/oligomeric GPCRs would be very important for understanding their function. Although it is encouraging that several experimental GPCR structures in oligomeric forms have recently appeared, experimental determination of GPCR structures in oligomeric forms is still a big challenge, especially in mimicking the membrane environment. Therefore, development of computational approaches to predict dimerization of GPCRs will be highly valuable. In this review, we summarize computational approaches that have been developed and used for modeling of GPCR dimerization. In addition, we introduce a novel two-dimensional Brownian Dynamics based protein docking approach, which we have recently adapted, for GPCR dimer prediction.
In disease screening and diagnosis, multiple markers are often measured and combined in order to improve the accuracy of diagnosis. McIntosh and Pepe (2002, Biometrics 58, 657-664) showed that the risk score, defined as the probability of disease conditional on multiple markers, is the optimal function for classification based on the Neyman-Pearson Lemma. They proposed a two-step procedure to approximate the risk score. However, the resulting ROC curve is only defined in a subrange (L, h) of the false-positive rates in (0,1), and determination of the lower limit L needs extra prior information. In practice, most diagnostic tests are not perfect and it is rare that a single marker is uniformly better than the other tests. Using simulation, I show that the multivariate adaptive regression spline (MARS) is a useful tool to approximate the risk score when combining multiple markers, especially when the ROC curves from multiple tests cross. The resulting ROC curve is defined over the whole range (0,1), and the method is easy to implement and has an intuitive interpretation. Sample code for the application is shown in the appendix.
We study the genotype calling algorithms for the high-throughput single-nucleotide polymorphism (SNP) arrays. Building upon the novel SNP-RMA preprocessing approach and the state-of-the-art CRLMM approach for genotype calling, we propose a simple modification to better model and combine the information across multiple SNPs with empirical Bayes modeling, which could often significantly improve the genotype calling of CRLMM. Through applications to the HapMap Trio data set and a non-HapMap test set of high quality SNP chips, we illustrate the competitive performance of the proposed method.
Ecological Momentary Assessment is an emerging method of data collection in behavioral research that may be used to capture the times of repeated behavioral events on electronic devices, and information on subjects' psychological states through the electronic administration of questionnaires at times selected from a probability-based design as well as the event times. A method for fitting a mixed Poisson point process model is proposed for the impact of partially-observed, time-varying covariates on the timing of repeated behavioral events. A random frailty is included in the point-process intensity to describe variation among subjects in baseline rates of event occurrence. Covariate coefficients are estimated using estimating equations constructed by replacing the integrated intensity in the Poisson score equations with a design-unbiased estimator. An estimator is also proposed for the variance of the random frailties. Our estimators are robust in the sense that no model assumptions are made regarding the distribution of the time-varying covariates or the distribution of the random effects. However, subject effects are estimated under gamma frailties using an approximate hierarchical likelihood. The proposed approach is illustrated using smoking data.
Oxidative stress in the obese-insulin resistant condition has been shown to affect cognitive as well as brain mitochondrial functions. Garlic extract has exerted a potent antioxidant effect. However, the effects of garlic extract on the brain of obese-insulin resistant rats have never been investigated. We hypothesized that garlic extract improves cognitive function and brain mitochondrial function in obese-insulin resistant rats induced by long-term high-fat diet (HFD) consumption. Male Wistar rats were fed either normal diet or HFD for 16 weeks (n = 24/group). At week 12, rats in each dietary group received either vehicle or garlic extract (250 and 500 mg·kg(-1)·day(-1)) for 28 days. Learning and memory behaviors, metabolic parameters, and brain mitochondrial function were determined at the end of treatment. HFD led to increased body weight, visceral fat, plasma insulin, cholesterol, and malondialdehyde (MDA) levels, indicating the development of insulin resistance. Furthermore, HFD rats had cognitive deficit and brain mitochondrial dysfunction. HFD rats treated with both doses of garlic extract had decreased body weight, visceral fat, plasma cholesterol, and MDA levels. Garlic extract also improved cognitive function and brain mitochondrial function, which were impaired in obese-insulin resistant rats caused by HFD consumption.
Activation of the innate immune system can enhance resistance to a variety of bacterial and viral infections. In situations where the etiological agent of disease is unknown, such as a bioterror attack, stimulation of innate immunity may be particularly useful as induced immune responses are often capable of providing protection against a broad range of pathogens. In particular, the threat of an intentional release of a highly virulent bacterial pathogen that is either intrinsically resistant to antibiotics, or has been weaponized via the introduction of antibiotic resistance, makes immunopotentiation an attractive complementary or alternative strategy to enhance resistance to bacterial biothreat agents. Francisella tularensis, Yersinia pestis, Bacillus anthracis, and Burkholderia mallei or pseudomallei can all be easily disseminated via the respiratory route and infections can result in high mortality rates. Therefore, there has been a marked increase in research on immunotherapeutics against these Tier 1 select agents over the last 10 years that will be covered in this review. In addition, immunopotentiation against non-Tier 1 select agents such as Brucella spp., and Coxiella burnetii has also been studied and will be reviewed here. In particular, we will focus on cellular targets, such as toll-like receptors (TLRs), carbohydrate receptors and cytokine receptors, which have been exploited by immunomodulatory regimens that confer broad-spectrum protection against virulent bacterial pathogens.
Study design for this 2-phase 7-day winter military training program (4-day, garrison-based, military training task (MTT) and a 3-day ski march (SKI)). 
Physiological consequences of winter military operations are not well described. This study examined Norwegian soldiers (n = 21 males) participating in a physically demanding winter training program to evaluate whether short-term military training alters energy and whole-body protein balance, muscle damage, soreness, and performance. Energy expenditure (D2(18)O) and intake were measured daily, and postabsorptive whole-body protein turnover ([(15)N]-glycine), muscle damage, soreness, and performance (vertical jump) were assessed at baseline, following a 4-day, military task training phase (MTT) and after a 3-day, 54-km ski march (SKI). Energy intake (kcal·day(-1)) increased (P < 0.01) from (mean ± SD (95% confidence interval)) 3098 ± 236 (2985, 3212) during MTT to 3461 ± 586 (3178, 3743) during SKI, while protein (g·kg(-1)·day(-1)) intake remained constant (MTT, 1.59 ± 0.33 (1.51, 1.66); and SKI, 1.71 ± 0.55 (1.58, 1.85)). Energy expenditure increased (P < 0.05) during SKI (6851 ± 562 (6580, 7122)) compared with MTT (5480 ± 389 (5293, 5668)) and exceeded energy intake. Protein flux, synthesis, and breakdown were all increased (P < 0.05) 24%, 18%, and 27%, respectively, during SKI compared with baseline and MTT. Whole-body protein balance was lower (P < 0.05) during SKI (-1.41 ± 1.11 (-1.98, -0.84) g·kg(-1)·10 h) than MTT and baseline. Muscle damage and soreness increased and performance decreased progressively (P < 0.05). The physiological consequences observed during short-term winter military training provide the basis for future studies to evaluate nutritional strategies that attenuate protein loss and sustain performance during severe energy deficits.
Two types of confidence intervals (CIs) and confidence bands (CBs) for the receiver operating characteristic (ROC) curve are studied: pointwise CIs and simultaneous CBs. An optimized version of the pointwise CI with the shortest width is developed. A new ellipse-envelope simultaneous CB for the ROC curve is suggested as an adaptation of the Working-Hotelling-type CB implemented in a paper by Ma and Hall (1993). Statistical simulations show that our ellipse-envelope CB covers the true ROC curve with a probability close to nominal, while the coverage probability of the Ma and Hall CB is significantly smaller. Simulations also show that the coverage probability of our CI for the area under the ROC curve is close to nominal, while the coverage probability of the CI suggested by Hanley and McNeil (1982) uniformly overestimates the nominal value. Two examples illustrate our simultaneous ROC bands: radiation dose estimation from time to vomiting and discrimination of breast cancer from benign abnormalities using electrical impedance measurements.
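As background for the comparison above, the following is a minimal sketch of the empirical area under the ROC curve and the widely cited Hanley and McNeil (1982) approximation to its standard error. It illustrates only the comparator method, not the new CI or the ellipse-envelope band proposed in the abstract; the function names are hypothetical.

```python
import math


def empirical_auc(diseased, healthy):
    """Mann-Whitney estimate of the AUC: P(X > Y) + 0.5 * P(X = Y)."""
    wins = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def hanley_mcneil_se(auc, n_diseased, n_healthy):
    """Hanley-McNeil approximate standard error of an AUC estimate."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_diseased - 1) * (q1 - auc ** 2)
        + (n_healthy - 1) * (q2 - auc ** 2)
    ) / (n_diseased * n_healthy)
    return math.sqrt(variance)
```

A Wald-type interval would then be `auc ± z * se` for a normal quantile `z`; the abstract reports that the coverage of the Hanley and McNeil interval departs from the nominal level in simulations.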
Gene copy number (GCN) changes are common characteristics of many genetic diseases. Comparative genomic hybridization (CGH) is a new technology widely used today to screen for GCN changes in mutant cells genome-wide with high resolution. Statistical methods for analyzing such CGH data have been evolving. Existing methods are either frequentist or fully Bayesian. The former often have a computational advantage, while the latter can incorporate prior information into the model but could be misleading when one does not have sound prior information. In an attempt to take full advantage of both approaches, we develop a Bayesian-frequentist hybrid approach, in which a subset of the model parameters is inferred by the Bayesian method, while the remaining parameters are inferred by the frequentist method. This new hybrid approach provides advantages over the Bayesian or frequentist method used alone. This is especially the case when sound prior information is available on part of the parameters and the sample size is relatively small. Spatial dependence and false discovery rate are also discussed, and the parameter estimation is efficient. As an illustration, we used the proposed hybrid approach to analyze a real CGH data set.
This paper presents a continuous-time Bayesian model for analyzing durations of behavior displays in social interactions. Duration data of social interactions are often complex because of repeated behaviors (events) at individual or group (e.g., dyad) level, multiple behaviors (multistates), and several choices of exit from a current event (competing risks). A multilevel, multistate model is proposed to adequately characterize the behavioral processes. The model incorporates dyad-specific and transition-specific random effects to account for heterogeneity among dyads and interdependence among competing risks. The proposed method is applied to child-parent observational data derived from the School Transitions Project to assess the relation of emotional expression in child-parent interaction to risk for early and persisting child conduct problems.
A virologic marker, the number of HIV RNA copies or viral load, is currently used to evaluate antiretroviral (ARV) therapies in AIDS clinical trials. This marker can be used to assess the antiviral potency of therapies, but may be easily affected by clinical factors such as drug exposures and drug resistance as well as baseline characteristics during the long-term treatment evaluation process. HIV dynamic studies have significantly contributed to the understanding of HIV pathogenesis and ARV treatment strategies. Viral dynamic models can be formulated through differential equations, but there has been only limited development of statistical methodologies for estimating such models or assessing their agreement with observed data. This paper develops a mechanism-based nonlinear differential equation model for characterizing long-term viral dynamics with ARV therapy. In this model we incorporate not only clinical factors (drug exposures and susceptibility) but also baseline covariates (baseline viral load, CD4 count, weight or age) into a function of treatment efficacy. A Bayesian nonlinear mixed-effects modeling approach is investigated with application to an AIDS clinical trial study. The effects of confounding interaction of clinical factors with covariate-based models are compared using the deviance information criterion (DIC), a Bayesian version of the classical deviance for model assessment, designed for complex hierarchical model settings. Relationships between baseline covariates combined with confounding clinical factors and drug efficacy are explored. In addition, we compared models incorporating each of four baseline covariates through DIC and some interesting findings are presented.
Our results suggest that modeling HIV dynamics and virologic responses with consideration of time-varying clinical factors as well as baseline characteristics may play an important role in understanding HIV pathogenesis and in designing new treatment strategies for long-term care of AIDS patients.
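The deviance information criterion used for model comparison above can be sketched in a few lines, assuming posterior deviance values are already available from an MCMC run. This follows the standard definition (mean posterior deviance plus the effective number of parameters); the function name `dic` and its arguments are hypothetical, and the viral dynamic model itself is not reproduced.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance information criterion from MCMC output.

    deviance_samples           : D(theta) evaluated at each posterior draw
    deviance_at_posterior_mean : D evaluated at the posterior mean of theta
    Returns (DIC, pD), where pD is the effective number of parameters.
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d
```

Among candidate models fitted to the same data, the one with the smaller DIC is preferred; this is how models incorporating different baseline covariates would be ranked.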
Time-series plots of (a) five distinct alcohol bans and (b) daily homicides in Cali, Colombia from January 1999 to August 2008. Also shown is an estimate of the mean homicide rate obtained by applying a Gaussian kernel smoother (dotted curve). 
A Poisson regression model with an offset assumes a constant baseline rate after accounting for measured covariates, which may lead to biased estimates of coefficients in an inhomogeneous Poisson process. To correctly estimate the effect of time-dependent covariates, we propose a Poisson change-point regression model with an offset that allows a time-varying baseline rate. When the nonconstant pattern of a log baseline rate is modeled with a nonparametric step function, the resulting semi-parametric model involves a model component of varying dimension and thus requires a sophisticated varying-dimensional inference to obtain correct estimates of model parameters of fixed dimension. To fit the proposed varying-dimensional model, we devise a state-of-the-art MCMC-type algorithm based on partial collapse. The proposed model and methods are used to investigate an association between daily homicide rates in Cali, Colombia and policies that restrict the hours during which the legal sale of alcoholic beverages is permitted. While simultaneously identifying the latent changes in the baseline homicide rate which correspond to the incidence of sociopolitical events, we explore the effect of policies governing the sale of alcohol on homicide rates and seek a policy that balances the economic and cultural dependencies on alcohol sales against the health of the public.
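To make the model structure concrete, the following is a minimal sketch of the Poisson log-likelihood with an offset and a step-function log baseline rate, evaluated for a fixed set of change points. It assumes a single covariate with coefficient `beta` and daily observations indexed 0, 1, 2, and so on; all names are hypothetical. The partially collapsed MCMC algorithm that the abstract uses to infer the number and location of change points is not shown.

```python
import math
from bisect import bisect_right


def log_baseline(t, change_points, levels):
    """Step-function log baseline rate.

    change_points : sorted times at which a new segment begins
    levels        : one log-rate per segment, len(change_points) + 1 values
    """
    return levels[bisect_right(change_points, t)]


def poisson_loglik(counts, exposure, x, beta, change_points, levels):
    """Log-likelihood of counts under log(mu_t) = log(exposure_t) + baseline(t) + beta * x_t."""
    total = 0.0
    for t, (y, e, xt) in enumerate(zip(counts, exposure, x)):
        log_mu = math.log(e) + log_baseline(t, change_points, levels) + beta * xt
        # Poisson log-density: y*log(mu) - mu - log(y!)
        total += y * log_mu - math.exp(log_mu) - math.lgamma(y + 1)
    return total
```

With an empty `change_points` list and a single level, this reduces to the ordinary Poisson regression with an offset that the abstract describes as potentially biased when the baseline rate is not constant.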
Background: Breast cancer (BC), the most frequent malignancy in women worldwide, is currently diagnosed in about 1.4 million female patients annually. Approximately 10-20% of BC is represented by triple negative breast cancer (TNBC), which is aggressive and has a poor prognosis; patients cannot benefit from targeted treatment based on hormonal or HER2 receptors. For this reason, the search for markers that can predict the efficacy of chemotherapy in TNBC is a priority. Methods and results: This review focuses on BCL2 protein as a prognostic marker in TNBC and its potential as a predictor of sensitivity to chemotherapy. Conclusion: BCL2 protein expression is a positive prognostic factor in BC. Better survival of patients with BCL2 positivity (BCL2+) has been explained by the correlation with estrogen receptor positive (ER+) status. BCL2+ is, however, not simply a surrogate marker for ER+. Moreover, BCL2 protein expression is also a positive prognostic marker in the TNBC subgroup. We and others have shown that low BCL2 expression was associated with good outcome of TNBC patients treated with both adjuvant and neoadjuvant anthracycline-based chemotherapy. On the other hand, recent studies have shown that a subset of TNBC patients may benefit from the classical adjuvant CMF (cyclophosphamide, methotrexate, 5-fluorouracil) regimen. Given the heterogeneity of TNBC, there is an urgent need to find and validate predictors of sensitivity to these regimens so that they can be used in clinical practice. BCL2 enrichment has been described in the mesenchymal stem-like (MSL) TNBC subgroup.
Investigation of monthly Chinese births in Singapore, Malaysia, Hong Kong and Taiwan shows a very similar seasonal pattern. The strong influence of Chinese culture appears to be the cause of the seasonality and similarity. Economic development has not altered this seasonal pattern significantly. The statistical methods presented in this paper to analyze Chinese births are readily applicable to many other areas.
We describe here the state of the art of certain aspects concerning potential small molecule therapy directed toward botulism, by inhibition of the zinc-protease containing light chain (LC) of botulinum neurotoxin BoNT/A from the anaerobic bacillus Clostridium botulinum. Botulinum neurotoxins (BoNTs) are comprised of eight serologically-distinct proteins (A - H), several of which are further divided, such as BoNT/A which has five subtypes. The BoNTs are the most toxic substances known to mankind, causing a form of flaccid paralysis that can be rapid and is often lethal. BoNT/A is comprised of a ~100 kDa heavy chain (HC) attached via a single disulfide Cys-Cys bond to a ~50 kDa LC. The HC mediates transport to and uptake by presynaptic glutamatergic neurons, where the LC cleaves the protein SNAP-25 and thus prevents vesicular trafficking and release of acetylcholine. The Zn-endoprotease activity of the LC of BoNT/A is a target for the development of small molecule inhibitors of BoNT/A-mediated toxicity. A variety of BoNT/A LC inhibitors have been described to date and we focus here primarily on the Zn-binding 8-hydroxyquinoline structural type as well as some of the previously-described hydroxamic acids.
Botulinum neurotoxins (BoNTs) are a class of bacterial neurotoxins that are the most potent toxic compounds reported to date. Exposure to relatively low concentrations of the toxin protein can result in major muscle paralysis, which may result in death in severe cases. In addition to their role in natural human disease, BoNTs are currently under close scrutiny because of their potential to be used as biowarfare agents. Clinical treatment options for botulism are currently limited, and finite stockpiles of antitoxin exist. In light of current bioterrorist threats, researchers have focused on identifying new molecules that can be applied to either sensitive toxin detection or improved clinical treatment. High-throughput screening (HTS) is a laboratory technique commonly employed to screen large libraries of diverse compounds based on specific compound binding capabilities or function. Here we review existing HTS platforms that have been applied to identify novel BoNT diagnostic or therapeutic agents. HTS platforms for screening antibodies, peptides, small molecules, and aptamers are described, as well as the screening results and current progress of the identified compounds.
Delivering therapeutic cargos to specific cell types in vivo poses many technical challenges. There is currently a plethora of drug leads and therapies against numerous diseases, ranging from small molecule compounds to nucleic acids to peptides to proteins with varying binding or enzymatic functions. Many of these candidate therapies have documented potential for mitigating or reversing disease symptoms, if only a means for gaining access to the intracellular target were available. Recent advances in our understanding of the biology of cellular uptake and transport processes and the mode of action of bacterial protein toxins have accelerated the development of toxin-based cargo-delivery vehicle platforms. This review provides an updated survey of the status of available platforms for targeted delivery of therapeutic cargos, outlining various strategies that have been used to deliver different types of cargo into cells. Particular emphasis is placed on the application of toxin-based approaches, examining critical issues that have hampered realization of post-intoxication antitoxins against botulism.
This study examined differences in dietary intake on weekdays versus weekends in Canada (n = 34 402) and found that energy intake was 62 ± 23 kcal higher, and dietary quality was slightly lower, on weekends (p < 0.05). After energy adjustment, Canadians consumed 66% more alcohol and 10% more cholesterol, and had significantly lower intakes of carbohydrates, protein, and most micronutrients (2.0%-6.9% lower), on weekends. Findings suggest that Canadians consume a slightly less favourable nutrient profile and have poorer dietary quality on weekends.
Using several variables known to be related to prostate cancer, a multivariate classification method is developed to predict the onset of clinical prostate cancer. A multivariate mixed-effects model is used to describe longitudinal changes in prostate specific antigen (PSA), a free testosterone index (FTI), and body mass index (BMI) before any clinical evidence of prostate cancer. The patterns of change in these three variables are allowed to vary depending on whether the subject develops prostate cancer or not and the severity of the prostate cancer at diagnosis. An application of Bayes' theorem provides posterior probabilities that we use to predict whether an individual will develop prostate cancer and, if so, whether it is a high-risk or a low-risk cancer. The classification rule is applied sequentially one multivariate observation at a time until the subject is classified as a cancer case or until the last observation has been used. We perform the analyses using each of the three variables individually, combined together in pairs, and all three variables together in one analysis. We compare the classification results among the various analyses and a simulation study demonstrates how the sensitivity of prediction changes with respect to the number and type of variables used in the prediction process.
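The Bayes' theorem step in the abstract above can be sketched in a few lines of Python. The class labels, prior probabilities, and likelihood values below are hypothetical placeholders; in the paper the likelihoods would come from the fitted multivariate mixed-effects model, which this sketch does not implement.

```python
def posterior_probabilities(priors, likelihoods):
    """Posterior class probabilities via Bayes' theorem.

    priors:      dict mapping class label -> prior probability
    likelihoods: dict mapping class label -> density of the observed
                 data under that class (same keys as `priors`)
    """
    # Joint weight of each class: prior times likelihood.
    joint = {label: priors[label] * likelihoods[label] for label in priors}
    total = sum(joint.values())
    if total <= 0:
        raise ValueError("all joint probabilities are zero")
    # Normalize so that the posteriors sum to one.
    return {label: weight / total for label, weight in joint.items()}
```

In a sequential rule of the kind described, the posterior would be recomputed after each new multivariate observation and compared with a classification threshold.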
Previous research on prostate cancer survival trends in the United States National Cancer Institute's Surveillance Epidemiology and End Results database has indicated a potential change-point in the age of diagnosis of prostate cancer around age 50. Identifying a change-point value in prostate cancer survival and cure could have important policy and health care management implications. Statistical analysis of this data has to address two complicating features: (1) change-point models are not smooth functions and so present computational and theoretical difficulties; and (2) models for prostate cancer survival need to account for the fact that many men diagnosed with prostate cancer can be effectively cured of their disease with early treatment. We develop a cure survival model that allows for change-point effects in covariates to investigate a potential change-point in the age of diagnosis of prostate cancer. Our results do not indicate that age under 50 is associated with increased hazard of death from prostate cancer.
Kir3 (or GIRK) channels have been known for nearly three decades to be activated by direct interactions with the βγ subunits of heterotrimeric G (Gαβγ) proteins in a membrane-delimited manner. Gα also interacts with GIRK channels and since PTX-sensitive Gα subunits show higher affinity of interaction they confer signaling specificity to G Protein-Coupled Receptors (GPCRs) that normally couple to these G protein subunits. In heterologous systems, overexpression of non PTX-sensitive Gα subunits scavenges the available Gβγ and biases GIRK activation through GPCRs that couple to these Gα subunits. Moreover, all Kir channels rely on their direct interactions with the phospholipid PIP2 to maintain their activity. Thus, signals that activate phospholipase C (e.g. through Gq signaling) to hydrolyze PIP2 result in inhibition of Kir channel activity. In this review, we illustrate with experiments performed in Xenopus oocytes that Kir channels can be used efficiently as reporters of GPCR function through Gi, Gs or Gq signaling. The membrane-delimited nature of this expression system makes it highly efficient for constructing dose-response curves yielding highly reproducible apparent affinities of different ligands for each GPCR tested.
In a smoothing spline model with unknown change-points, the choice of the smoothing parameter strongly influences the estimation of the change-point locations, and the function at the change-points. In a tumor biology example, where change-points in blood flow in response to treatment were of interest, choosing the smoothing parameter based on minimizing generalized cross validation, GCV, gave unsatisfactory estimates of the change-points. We propose a new method, aGCV, that re-weights the residual sum of squares and generalized degrees of freedom terms from GCV. The weight is chosen to maximize the decrease in the generalized degrees of freedom as a function of the weight value, while simultaneously minimizing aGCV as a function of the smoothing parameter and the change-points. Compared to GCV, simulation studies suggest that the aGCV method yields improved estimates of the change-point and the value of the function at the change-point.
In this paper, we study multi-class differential gene expression detection for microarray data. We propose a likelihood-based approach to estimating an empirical null distribution that incorporates gene interactions and provides more accurate false positive control than the commonly used permutation or theoretical null distribution based approaches. We propose to rank important genes by p-values or local false discovery rate based on the estimated empirical null distribution. Through simulations and an application to a lung transplant microarray data set, we illustrate the competitive performance of the proposed method.
Current extremely large scale genetic data present significant challenges for cluster analysis. Most existing clustering methods are built on Euclidean distance and geared toward analyzing continuous responses. They work well for clustering data such as microarray gene expression data, but often perform poorly on data such as large scale single nucleotide polymorphism data. In this paper, we study the penalized latent class model for clustering extremely large scale discrete data. The penalized latent class model takes into account the discrete nature of the response using appropriate generalized linear models and adopts the lasso penalized likelihood approach for simultaneous model estimation and selection of important covariates. We develop very efficient numerical algorithms for model estimation based on the iterative coordinate descent approach and further develop the Expectation-Maximization algorithm to incorporate and model missing values. We use simulation studies and applications to the international HapMap single nucleotide polymorphism data to illustrate the competitive performance of the penalized latent class model.
Diagram of the Subsemble procedure using linear regression to combine the subset-specific fits. The full data set, consisting of n observations, is partitioned into J disjoint subsets. The same underlying algorithm ψ̂ is applied to each subset, resulting in J subset-specific fits ψ̂_1, ψ̂_2, ..., ψ̂_J. V-fold cross-vali-
Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive datasets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be a beneficial tool for small to moderate sized datasets, and often has better prediction performance than the underlying algorithm fit just once on the full dataset. We also describe how to include Subsemble as a candidate in a SuperLearner library, providing a practical way to evaluate the performance of Subsemble relative to the underlying algorithm fit just once on the full dataset.
In this paper, we derive sequential conditional probability ratio tests to compare diagnostic tests without distributional assumptions on test results. The test statistics in our method are nonparametric weighted areas under the receiver-operating characteristic curves. With the new method, a decision to stop the diagnostic trial early is unlikely to be reversed should the trial continue to the planned end. The conservatism of this approach, reflected in more conservative stopping boundaries during the course of the trial, is especially appealing for diagnostic trials since the end point is not death. In addition, the maximum sample size of our method is not greater than that of a fixed-sample test with a similar power function. Simulation studies are performed to evaluate the properties of the proposed sequential procedure. We illustrate the method using data from a thoracic aorta imaging study.
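For readers unfamiliar with nonparametric areas under the ROC curve, the sketch below computes the ordinary (unweighted) empirical AUC as the Mann-Whitney proportion. This is a simpler relative of the weighted areas used in the abstract above, not the authors' test statistic; the function name and the convention that higher scores indicate disease are assumptions.

```python
def empirical_auc(diseased_scores, healthy_scores):
    """Unweighted empirical AUC (Mann-Whitney form).

    Estimates P(score of a diseased subject > score of a healthy subject),
    counting ties as one half. No distributional assumption is needed.
    """
    if not diseased_scores or not healthy_scores:
        raise ValueError("both groups must be non-empty")
    total = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                total += 1.0   # correctly ordered pair
            elif d == h:
                total += 0.5   # tied pair counts half
    return total / (len(diseased_scores) * len(healthy_scores))
```

Two diagnostic tests measured on the same subjects could then be compared through the difference of their empirical AUCs, which is the kind of quantity a sequential procedure would monitor.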
In this work, we develop a modeling and estimation approach for the analysis of cross-sectional clustered data with multimodal conditional distributions where the main interest is in the analysis of subpopulations. We propose to model such data in a hierarchical model with conditional distributions viewed as finite mixtures of normal components. With a large number of observations in the lowest level clusters, a two-stage estimation approach is used. In the first stage, the normal mixture parameters in each lowest level cluster are estimated using robust methods. Robust alternatives to maximum likelihood estimation are used to provide stable results even for data whose conditional distributions have components that may not quite meet normality assumptions. Then the lowest level cluster-specific means and standard deviations are modeled in a mixed effects model in the second stage. A small simulation study was conducted to compare the performance of finite normal mixture population parameter estimates based on robust and maximum likelihood estimation in stage 1. The proposed modeling approach is illustrated through the analysis of mouse tendon fibril diameter data. The analysis results address genotype differences between corresponding components in the mixtures and demonstrate the advantages of robust estimation in stage 1.
The National Cancer Institute (NCI) suggests a sudden reduction in prostate cancer mortality rates, likely due to highly successful treatments and screening methods for early diagnosis. We are interested in understanding the impact of medical breakthroughs, treatments, or interventions, on the survival experience for a population. For this purpose, estimating the underlying hazard function, with possible time change points, would be of substantial interest, as it will provide a general picture of the survival trend and when this trend is disrupted. Increasing attention has been given to testing the assumption of a constant failure rate against a failure rate that changes at a single point in time. We expand the set of alternatives to allow for the consideration of multiple change-points, and propose a model selection algorithm using sequential testing for the piecewise constant hazard model. These methods are data driven and allow us to estimate not only the number of change points in the hazard function but where those changes occur. Such an analysis allows for better understanding of how changing medical practice affects the survival experience for a patient population. We test for change points in prostate cancer mortality rates using the NCI Surveillance, Epidemiology, and End Results dataset.
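To make the piecewise constant hazard model in the abstract above concrete, the following Python sketch computes the maximum likelihood estimate of the hazard in each interval when the change-points are taken as given: the number of events in the interval divided by the total follow-up time spent in it. The sequential testing procedure that selects the number and location of change-points is not reproduced; the function and argument names are hypothetical.

```python
def piecewise_hazard_mle(times, events, cuts):
    """MLE of a piecewise constant hazard with known change-points.

    times:  follow-up time for each subject (positive numbers)
    events: 1 if the subject died at `times[i]`, 0 if censored
    cuts:   change-point times; interval k is (edge_k, edge_{k+1}]
    Returns one hazard estimate per interval (NaN if no exposure).
    """
    edges = [0.0] + sorted(cuts) + [float("inf")]
    n_intervals = len(edges) - 1
    deaths = [0] * n_intervals
    exposure = [0.0] * n_intervals
    for t, d in zip(times, events):
        for k in range(n_intervals):
            lo, hi = edges[k], edges[k + 1]
            if t > lo:
                # Time this subject was at risk inside interval k.
                exposure[k] += min(t, hi) - lo
            if d and lo < t <= hi:
                # The event is credited to the interval containing t.
                deaths[k] += 1
    return [dk / ek if ek > 0 else float("nan")
            for dk, ek in zip(deaths, exposure)]
```

With no change-points (`cuts=[]`) this reduces to the familiar constant-hazard estimate, events divided by total follow-up time, which is the null model in tests for a single change-point.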
Data Envelopment Analysis (DEA) is the most commonly used approach for evaluating healthcare efficiency (Hollingsworth, 2008), but a long-standing concern is that DEA assumes that data are measured without error. This assumption is unlikely to hold, and DEA and other efficiency analysis techniques may yield biased efficiency estimates when it is violated (Gajewski, Lee, Bott, Piamjariyakul and Taunton, 2009; Ruggiero, 2004). We propose to address measurement error systematically using a Bayesian method (Bayesian DEA). We will apply Bayesian DEA to data from the National Database of Nursing Quality Indicators® (NDNQI®) to estimate nursing units' efficiency. Several external reliability studies inform the posterior distribution of the measurement error on the DEA variables. We will discuss the case of generalizing the approach to situations where an external reliability study is not feasible.
Longitudinal imaging studies have moved to the forefront of medical research due to their ability to characterize spatio-temporal features of biological structures across the lifespan. Valid inference in longitudinal imaging requires enough flexibility of the covariance model to allow reasonable fidelity to the true pattern. On the other hand, the existence of computable estimates demands a parsimonious parameterization of the covariance structure. Separable (Kronecker product) covariance models provide one such parameterization in which the spatial and temporal covariances are modeled separately. However, evaluating the validity of this parameterization in high dimensions remains a challenge. Here we provide a scientifically informed approach to assessing the adequacy of separable (Kronecker product) covariance models when the number of observations is large relative to the number of independent sampling units (sample size). We address both the general case, in which unstructured matrices are considered for each covariance model, and the structured case, which assumes a particular structure for each model. For the structured case, we focus on the situation where the within-subject correlation is believed to decrease exponentially in time and space, as is common in longitudinal imaging studies. However, the provided framework applies equally to all covariance patterns used within the more general multivariate repeated measures context. Our approach provides useful guidance for high dimension, low sample size data that preclude using standard likelihood-based tests. Longitudinal medical imaging data of caudate morphology in schizophrenia illustrate the approach's appeal.
In the early 1920s the antirachitic effect of cod liver oil and of food irradiated with ultraviolet light was recognized. The antirachitic substance was identified and called "vitamin D". Since then the key role of vitamin D in calcium and bone homeostasis has been investigated. Moreover, it has been recognized that vitamin D is able to modulate a variety of processes and regulatory systems such as host defense, inflammation, immunity, and repair. According to recent studies, vitamin D deficiency is likely to be an important etiological factor in the pathogenesis of many chronic diseases, and it has also been associated with a higher mortality rate for respiratory disease. In this regard, both observational studies aimed at verifying an association between low vitamin D levels and the incidence of respiratory tract infections (RTIs) and clinical trials on the effect of vitamin D as a supplementary treatment in patients with RTIs have been presented in the emerging clinical literature. Conflicting results have been reported in several randomized, double-blind, placebo-controlled trials of vitamin D treatment in tuberculosis. Some studies suggest a beneficial effect of vitamin D, but it has not been reproduced in larger studies so far. In conclusion, although basic science research suggests that vitamin D may play an important role in modulating immune functions, there is no strong evidence that correction of vitamin D depletion is useful in the prevention or treatment of infections. Further and larger studies may clarify the role of vitamin D in infection.
Sketch plot for estimating f1(λ), f2(λ) and f3(λ).
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated by the assumption that the test statistics are independent, which is often not true in reality. Simulations indicate that most existing estimators can perform poorly when the test statistics are dependent, mainly because of the increased variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 - λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.
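For context, the "expected proportion at 1 - λ" mentioned in the abstract above corresponds to the widely used estimator of Storey, which divides the fraction of p-values exceeding λ by 1 - λ. The Python sketch below implements that baseline only; it is not the linear-fit estimator proposed in the paper, and the function name, the default λ = 0.5, and the truncation at 1 are assumptions of this sketch.

```python
def storey_pi0(pvalues, lam=0.5):
    """Baseline estimate of the proportion of true nulls.

    Under the null, p-values are uniform on [0, 1], so a fraction
    (1 - lam) of true-null p-values is expected to fall in (lam, 1].
    Dividing the observed fraction above `lam` by (1 - lam) therefore
    estimates pi0, assuming few non-null p-values exceed `lam`.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n_above = sum(1 for p in pvalues if p > lam)
    # Truncate at 1 because a proportion cannot exceed one.
    return min(1.0, n_above / (m * (1.0 - lam)))
```

Because the estimate depends on a single count above one threshold, it can vary considerably from sample to sample when test statistics are dependent, which is the weakness the abstract sets out to address.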
We wish to model pulse wave velocity (PWV) as a function of longitudinal measurements of pulse pressure (PP) at the same and prior visits at which the PWV is measured. A number of approaches are compared. First, we use the PP at the same visit as the PWV in a linear regression model. In addition, we use the average of all available PP values as the explanatory variable in a linear regression model. Next, a two-stage process is applied. The longitudinal PP is modeled using a linear mixed-effects model. This modeled PP is used in the regression model to describe PWV. Another approach to using the longitudinal PP data is to obtain a measure of cumulative burden, the area under the PP curve (AUC). This AUC is used as an explanatory variable to model PWV. Finally, a joint Bayesian model is constructed similar to the two-stage model.
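The cumulative-burden measure in the abstract above can be illustrated with a short Python sketch. The abstract does not say how the area under the PP curve is computed; the trapezoidal rule used here is an assumption, as are the function and argument names.

```python
def trapezoid_auc(visit_times, values):
    """Area under a longitudinal curve by the trapezoidal rule.

    visit_times: time of each visit (any order; sorted internally)
    values:      measurement (e.g. pulse pressure) at each visit
    """
    if len(visit_times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(values) < 2:
        raise ValueError("need at least two visits to form an area")
    pairs = sorted(zip(visit_times, values))
    area = 0.0
    for (t0, y0), (t1, y1) in zip(pairs, pairs[1:]):
        # Trapezoid between consecutive visits: mean height times width.
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area
```

The resulting area, one number per subject, would then enter the regression for PWV as a single explanatory variable; dividing it by the length of follow-up gives a time-weighted average if subjects are followed for different durations.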
Kendall's τ is a non-parametric measure of correlation based on ranks and is used in a wide range of research disciplines. Although methods are available for making inference about Kendall's τ, none has been extended to modeling multiple Kendall's τs arising in longitudinal data analysis. Compounding this problem is the pervasive issue of missing data in such study designs. In this paper, we develop a novel approach to provide inference about Kendall's τ within a longitudinal study setting under both complete and missing data. The proposed approach is illustrated with simulated data and applied to an HIV prevention study.
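As a reminder of the quantity studied in the abstract above, the following Python sketch computes the sample Kendall's τ (the tau-a form, without a tie correction) for one pair of variables with complete data. It does not implement the longitudinal or missing-data methodology of the paper; the function name is hypothetical.

```python
def kendall_tau_a(x, y):
    """Sample Kendall's tau-a for paired observations.

    tau = (concordant pairs - discordant pairs) / (n choose 2).
    Pairs tied on x or y count as neither concordant nor discordant.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Positive product: the pair is ordered the same way on x and y.
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

In a longitudinal setting one such τ could be computed at each assessment time, and the inferential question becomes how to model these correlated estimates jointly.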
We briefly review and discuss design issues for population growth and decline models. We then use a flexible growth and decline model as an illustrative example and apply optimal design theory to find optimal sampling times for estimating model parameters, specific parameters and interesting functions of the model parameters for the model with two real applications. Robustness properties of the optimal designs are investigated when nominal values or the model is mis-specified, and also under a different optimality criterion. To facilitate use of optimal design ideas in practice, we also introduce a website for generating a variety of optimal designs for popular models from different disciplines.
This paper investigates a new test for normality that is easy for biomedical researchers to understand and easy to implement in all dimensions. In terms of power comparison against a broad range of alternatives, the new test outperforms the best known competitors in the literature as demonstrated by simulation results. In addition, the proposed test is illustrated using data from real biomedical studies.
Top-cited authors
Tony Lindeberg
  • KTH Royal Institute of Technology
Gauss M. Cordeiro
  • Federal University of Pernambuco
Haitham M. Yousof
  • Benha University
Ahmed Z. Afify
  • Benha University
Edwin Ortega
  • University of São Paulo