Hua Yun Chen

University of Illinois at Chicago, Chicago, IL, United States

Publications (20) · 50.73 Total impact

  • ABSTRACT: Vitamin D deficiency is more common among African Americans (AAs) than among European Americans (EAs), and epidemiologic evidence links vitamin D status to many health outcomes. Two genome-wide association studies (GWAS) in European populations identified vitamin D pathway gene single-nucleotide polymorphisms (SNPs) associated with serum vitamin D [25(OH)D] levels, but few of these SNPs have been replicated in AAs. Here, we investigated the associations of 39 SNPs in vitamin D pathway genes, including 19 GWAS-identified SNPs, with serum 25(OH)D concentrations in 652 AAs and 405 EAs. Linear and logistic regression analyses were performed adjusting for relevant environmental and biological factors. The pattern of SNP associations was distinct between AAs and EAs. In AAs, six GWAS-identified SNPs in GC, CYP2R1, and DHCR7/NADSYN1 were replicated, while nine GWAS SNPs in GC and CYP2R1 were replicated in EAs. A CYP2R1 SNP, rs12794714, exhibited the strongest signal of association in AAs. In EAs, however, a different CYP2R1 SNP, rs1993116, was the most strongly associated. Our models, which take into account genetic and environmental variables, accounted for 20 and 28% of the variance in serum vitamin D levels in AAs and EAs, respectively.
    Human Genetics 08/2014
  • Hua Yun Chen, Muredach P Reilly, Mingyao Li
    ABSTRACT: We propose a semiparametric odds ratio model that extends Umbach and Weinberg's approach to exploiting a gene-environment association model for efficiency gains in case-control designs to both discrete and continuous data. We directly model the gene-environment association in the control population to avoid estimating the intercept in the disease risk model, which is inherently difficult because of the scarcity of information on the parameter under such sampling designs. We propose a novel permutation-based approach to eliminate the high-dimensional nuisance parameters in the matched case-control design. The proposed approach reduces to the conditional logistic regression when the model for the gene-environment association is unrestricted. Simulation studies demonstrate good performance of the proposed approach. We apply the proposed approach to a study of gene-environment interaction on coronary artery disease. Copyright © 2013 John Wiley & Sons, Ltd.
    Statistics in Medicine 01/2013; · 2.04 Impact Factor
  • Hua Yun Chen, Rick Kittles, Wei Zhang
    ABSTRACT: In genetic association studies with densely typed genetic markers, it is often of substantial interest to examine not only the primary phenotype but also the secondary traits for their association with the genetic markers. For more efficient sample ascertainment of the primary phenotype, a case-control design or its variants, such as the extreme-value sampling design for a quantitative trait, are often adopted. The secondary trait analysis without correcting for the sample ascertainment may yield a biased association estimator. We propose a new method aiming at correcting the potential bias due to the inadequate adjustment of the sample ascertainment. The method yields explicit correction formulas that can be used to both screen the genetic markers and rapidly evaluate the sensitivity of the results to the assumed baseline case-prevalence rate in the population. Simulation studies demonstrate good performance of the proposed approach in comparison with the more computationally intensive approaches, such as the compensator approaches and the maximum prospective likelihood approach. We illustrate the application of the approach by analysis of the genetic association of prostate specific antigen in a case-control study of prostate cancer in the African American population. Copyright © 2012 John Wiley & Sons, Ltd.
    Statistics in Medicine 09/2012; · 2.04 Impact Factor
  • Hua Yun Chen, Mingyao Li
    ABSTRACT: Extreme-value sampling design that samples subjects with extremely large or small quantitative trait values is commonly used in genetic association studies. Samples in such designs are often treated as "cases" and "controls" and analyzed using logistic regression. Such a case-control analysis ignores the potential dose-response relationship between the quantitative trait and the underlying trait locus and thus may lead to loss of power in detecting genetic association. An alternative approach to analyzing such data is to model the dose-response relationship by a linear regression model. However, parameter estimation from this model can be biased, which may lead to inflated type I errors. We propose a robust and efficient approach that takes into consideration both the biased sampling design and the potential dose-response relationship. Extensive simulations demonstrate that the proposed method is more powerful than the traditional logistic regression analysis and is more robust than the linear regression analysis. We applied our method to the analysis of a candidate gene association study on high-density lipoprotein cholesterol (HDL-C) which includes study subjects with extremely high or low HDL-C levels. Using our method, we identified several SNPs showing stronger evidence of association with HDL-C than the traditional case-control logistic regression analysis. Our results suggest that it is important to appropriately model the quantitative traits and to adjust for the biased sampling when a dose-response relationship exists in extreme-value sampling designs.
    Genetic Epidemiology 12/2011; 35(8):823-30. · 4.02 Impact Factor
  • Hua Yun Chen, Jinbo Chen
    ABSTRACT: For analysis of case-control genetic association studies, it has recently been shown that gene-environment independence in the population can be leveraged to increase efficiency for estimating gene-environment interaction effects in comparison with the standard prospective analysis. However, for the special case in which data on the binary phenotype and genetic and environmental risk factors can be summarized in a 2 × 2 × 2 table, the authors show here that there is no efficiency gain for estimating interaction effects, nor is there an efficiency gain for estimating the genetic and environmental main effects. This contrasts with the well-known result that assuming rare phenotype prevalence and gene-environment independence in the control population for the same data can lead to efficiency gain. This discrepancy is counterintuitive, since the 2 likelihoods are also approximately equal when the phenotype is rare. An explanation for the paradox based on a theoretical analysis is provided. Implications of these results for data analyses are also examined, and practical guidance on analyzing such case-control studies is offered.
    American Journal of Epidemiology 08/2011; 174(6):736-43. · 5.59 Impact Factor
  • Hua Yun Chen
    ABSTRACT: We derive new representations of the efficient score for coarse data problems based on Neumann series expansion. The representations can be applied to both ignorable and nonignorable coarse data. An approximation to the new representation may be used for computing locally efficient scores in such problems. We show that many of the successive approximation approaches to the computation of the locally efficient score proposed in the literature for coarse data problems can be derived as special cases of the representations. In addition, the representations lead to new algorithms for computing the locally efficient scores for the coarse data problems.
    Annals of the Institute of Statistical Mathematics 06/2011; 63(3):497-509. · 0.74 Impact Factor
  • Hua Yun Chen, Hui Xie, Yi Qian
    ABSTRACT: Multiple imputation is a practically useful approach to handling incompletely observed data in statistical analysis. Parameter estimation and inference based on imputed full data have been made easy by Rubin's rule for result combination. However, creating proper imputation that accommodates flexible models for statistical analysis in practice can be very challenging. We propose an imputation framework that uses conditional semiparametric odds ratio models to impute the missing values. The proposed imputation framework is more flexible and robust than the imputation approach based on the normal model. It is a compatible framework in comparison to the approach based on fully conditionally specified models. The proposed algorithms for multiple imputation through the Markov chain Monte Carlo sampling approach can be straightforwardly carried out. Simulation studies demonstrate that the proposed approach performs better than existing, commonly used imputation approaches. The proposed approach is applied to imputing missing values in bone fracture data.
    Biometrics 01/2011; 67(3):799-809. · 1.41 Impact Factor
  • Hua Yun Chen
    ABSTRACT: A conditionally specified joint model is convenient to use in fields such as spatial data modeling, Gibbs sampling, and missing data imputation. One potential problem with such an approach is that the conditionally specified models may be incompatible, which can lead to serious problems in applications. We propose an odds ratio representation of a joint density to study the issue and derive conditions under which conditionally specified distributions are compatible and yield a joint distribution. Our conditions are the simplest to verify compared with those proposed in the literature. The proposal also explicitly constructs joint densities that are fully compatible with the conditionally specified densities when the conditional densities are compatible, and partially compatible with the conditional densities when they are incompatible. The construction result is then applied to checking the compatibility of the conditionally specified models. Ways to modify the conditionally specified models based on the construction of the joint models are also discussed when the conditionally specified models are incompatible.
    Statistics & Probability Letters 04/2010; 80(7-8):670-677. · 0.53 Impact Factor
  • Hua Yun Chen
    ABSTRACT: The inverse of the nonparametric information operator is key for finding doubly robust estimators and the semiparametric efficient estimator in missing data problems. It is known that no closed-form expression for the inverse of the nonparametric information operator exists when missing data form nonmonotone patterns. The Neumann series is usually used for approximating the inverse. However, the Neumann series approximation is only known to converge in L2 norm, which is not sufficient for establishing statistical properties of the estimators yielded from the approximation. In this work, we show that L∞ convergence of the Neumann series approximations to the inverse of the nonparametric information operator and to the efficient scores in missing data problems can be obtained under very simple conditions. This paves the way to a study of the asymptotic properties of the doubly robust estimators and the locally semiparametric efficient estimator in those difficult situations.
    Statistics & Probability Letters 01/2010; 80(9-10):864-873. · 0.53 Impact Factor
  • Hua Yun Chen
    ABSTRACT: Theory on semiparametric efficient estimation in missing data problems has been systematically developed by Robins and his coauthors. Except in relatively simple problems, semiparametric efficient scores cannot be expressed in closed forms. Instead, the efficient scores are often expressed as solutions to integral equations. Neumann series was proposed in the form of successive approximation to the efficient scores in those situations. Statistical properties of the estimator based on the Neumann series approximation are difficult to obtain and as a result, have not been clearly studied. In this paper, we reformulate the successive approximation in a simple iterative form and study the statistical properties of the estimator based on the reformulation. We show that a doubly-robust locally-efficient estimator can be obtained following the algorithm in robustifying the likelihood score. The results can be applied to, among others, the parametric regression, the marginal regression, and the Cox regression when data are subject to missing values and the missing data are missing at random. A simulation study is conducted to evaluate the performance of the approach and a real data example is analyzed to demonstrate the use of the approach.
    Scandinavian Journal of Statistics 12/2009; 36(4):713-734. · 1.17 Impact Factor
  • Hua Yun Chen, Shasha Gao
    ABSTRACT: We study the problem of estimation and inference on the average treatment effect in a smoking cessation trial where an outcome and some auxiliary information were measured longitudinally, and both were subject to missing values. Dynamic generalized linear mixed effects models linking the outcome, the auxiliary information, and the covariates are proposed. The maximum likelihood approach is applied to the estimation and inference on the model parameters. The average treatment effect is estimated by the G-computation approach, and the sensitivity of the treatment effect estimate to the nonignorable missing data mechanisms is investigated through the local sensitivity analysis approach. The proposed approach can handle missing data that form arbitrary missing patterns over time. We applied the proposed method to the analysis of the smoking cessation trial.
    Statistics in Medicine 05/2009; 28(19):2451-72. · 2.04 Impact Factor
  • ABSTRACT: In this randomized controlled trial, 169 persons with multiple sclerosis were randomly assigned to an immediate intervention group or a delayed control group using a crossover design. The outcome measures (Fatigue Impact Scale and SF-36 Health Survey) were measured four times before and after courses. This study investigated whether the immediate benefits of a 6-week, community-based, energy conservation course for persons with multiple sclerosis were maintained at 1-year follow-up. We performed intent-to-treat and compliers-only analyses using mixed effects analysis of variance models. Results showed that the beneficial effects were maintained 1-year postcourse compared with immediate postcourse. In addition, there were significant improvements in all three subscales of the Fatigue Impact Scale and in four subscales of SF-36 Health Survey 1-year postcourse compared with precourse. Together, these results provide strong evidence that the beneficial effects of the energy conservation course taught by occupational therapists were maintained up to 1-year postcourse.
    International Journal of Rehabilitation Research 01/2008; 30(4):305-13. · 1.06 Impact Factor
  • ABSTRACT: This study investigated gp120-binding antibody and neutralizing activity, at the gingival- and cervical-mucosal levels, in response to a bivalent gp120 candidate vaccine. Women who met the study's inclusion criteria for documented high-risk behaviors participated in a nested substudy of the multicenter phase 3 trial of human immunodeficiency virus (HIV)-vaccine efficacy, VAX004. Gingival, cervicovaginal lavage, and plasma specimens were collected at 6-month intervals for 3 years. Binding-antibody and neutralizing-activity assays quantified the presence of anti-HIV activity in mucosal specimens. Vaccine recipients were more likely than placebo recipients to have IgG binding antibodies in all 3 compartments tested and to have only IgA binding antibody in plasma (P<.0001). The relationship between vaccine and cervicovaginal IgG achieved significance (odds ratio [OR], 6.6 [P=.01]) but was weakened by the presence of cervicovaginal leukocytes. There was no relationship between immunization and the presence of neutralizing activity, in either bivariate or multivariate modeling (OR, 6.0 [P=.29]). Vaccination is associated with the presence of both gp120-binding IgG in all compartments and plasma IgA but not with neutralizing activity. There is a role for the measurement of mucosal immunity in response to candidate vaccines and, in particular, for a determination of HIV-specific neutralizing antibodies.
    The Journal of Infectious Diseases 12/2007; 196(11):1637-44. · 5.85 Impact Factor
  • ABSTRACT: Innate immune factors in mucosal secretions may influence human immunodeficiency virus type 1 (HIV-1) transmission. This study examined the levels of three such factors, genital tract lactoferrin [Lf], secretory leukocyte protease inhibitor [SLPI], and RANTES, in women at risk for acquiring HIV infection, as well as cofactors that may be associated with their presence. Women at high risk for HIV infection meeting established criteria (n = 62) and low-risk controls (n = 33) underwent cervicovaginal lavage (CVL), and the CVL fluid samples were assayed for Lf and SLPI. Subsets of 26 and 10 samples, respectively, were assayed for RANTES. Coexisting sexually transmitted infections and vaginoses were also assessed, and detailed behavioral information was collected. Lf levels were higher in high-risk (mean, 204 ng/ml) versus low-risk (mean, 160 ng/ml, P = 0.007) women, but SLPI levels did not differ, and RANTES levels were higher in only the highest-risk subset. Lf was positively associated only with the presence of leukocytes in the CVL fluid (P < 0.0001). SLPI levels were lower in women with bacterial vaginosis [BV] than in those without BV (P = 0.04). Treatment of BV reduced RANTES levels (P = 0.05). The influence, if any, of these three cofactors on HIV transmission in women cannot be determined from this study. The higher Lf concentrations observed in high-risk women were strongly associated with the presence of leukocytes, suggesting a leukocyte source and consistent with greater genital tract inflammation in the high-risk group. Reduced SLPI levels during BV infection are consistent with an increased risk of HIV infection, which has been associated with BV. However, the increased RANTES levels in a higher-risk subset of high-risk women were reduced after BV treatment.
    Clinical and Vaccine Immunology 09/2007; 14(9):1102-7. · 2.60 Impact Factor
  • Hua Yun Chen
    ABSTRACT: We propose a semiparametric odds ratio model to measure the association between two variables taking discrete values, continuous values, or a mixture of both. Methods for estimation and inference with varying degrees of robustness to model assumptions are studied. Semiparametric efficient estimation and inference procedures are also considered. The estimation methods are compared in a simulation study and applied to the study of associations among genital tract bacterial counts in HIV infected women.
    Biometrics 07/2007; 63(2):413-21. · 1.41 Impact Factor
  • ABSTRACT: To assess the short-term efficacy and effectiveness of a six-week energy conservation course on fatigue impact, quality of life and self-efficacy for persons with multiple sclerosis (MS). In this randomized controlled trial, we randomly assigned 169 persons with MS to an immediate intervention group or a delayed control group using a crossover design. The outcome measures (Fatigue Impact Scale, SF-36 Health Survey and Self-Efficacy for Performing Energy Conservation Strategies) were measured before and after courses and no-intervention control periods. We performed intent-to-treat and compliers-only analyses using mixed effects analysis of variance models. Taking the energy conservation course had significant effects on reducing the physical and social subscales of Fatigue Impact Scale and on increasing the Vitality subscale of the SF-36 scores compared with not taking the course. Additional subscales were significant depending on methods of analyses. Self-Efficacy for Performing Energy Conservation Strategies Assessment increased significantly (P < 0.05) postcourse compared to precourse. Results support the efficacy and effectiveness of the energy conservation course to decrease fatigue impact, and to increase self-efficacy and some aspects of quality of life.
    Multiple Sclerosis 10/2005; 11(5):592-601. · 4.47 Impact Factor
  • ABSTRACT: To determine whether there were any differences in the outcomes of individuals with multiple sclerosis who attended all six sessions of an energy conservation education programme compared with people who missed a session and received a self-study module. Secondary analysis of data from two naturally occurring groups emerging from a randomized control trial--compliers who received the intervention as intended (group 1) and noncompliers who received a modified intervention with self-study modules (group 2). Community settings in Chicago, Illinois and Minneapolis, Minnesota, USA. Ninety-two community-dwelling people with multiple sclerosis who were participating in an energy conservation education programme. Energy conservation education groups based on the 'Managing Fatigue' programme, which were facilitated by an occupational therapist. Self-study modules were sent to participants who missed a session. Fatigue Impact Scale (FIS), Self-Efficacy for Performing Energy Conservation Strategies Assessment, Energy Conservation Strategies Survey (ECSS), six subscales from the Medical Outcomes Study Short-Form Health Survey (SF-36). When comparing individuals who attended all six sessions with individuals who missed one or more sessions and received a self-study module, no significant differences were found after adjusting for multiple comparisons. Participants who used the self-study modules because they missed sessions of the Managing Fatigue programme experienced benefits from the course similar to those experienced by participants who fully complied with the intervention as intended. A new prospective study to validate the findings of this secondary analysis is required.
    Clinical Rehabilitation 09/2005; 19(5):475-81. · 2.19 Impact Factor
  • ABSTRACT: The Community Health Worker "promotor de salud" (CHW) model is evaluated as a tool for reducing eye injuries in Latino farm workers. In 2001, 786 workers on 34 farms were divided into three intervention blocks: (A) CHWs provided protective eyewear and training to farm workers; (B) CHWs provided eyewear but no training to farm workers; (C) eyewear was distributed to farm workers with no CHW present and no training. Pre- and post-intervention questionnaires demonstrated greater self-reported use of eyewear in all blocks after the intervention (P < 0.0001), with Block A showing the greatest change compared to B (P < 0.0001) and C (P = 0.03); this was supported by field observations. Block A showed the greatest improvement in knowledge on questions related to training content. CHWs were an effective tool to train farm workers in eye health and safety, improving the use of personal protective equipment and knowledge.
    American Journal of Industrial Medicine 12/2004; 46(6):607-13. · 1.97 Impact Factor
  • ABSTRACT: In this paper, we propose an approach to modeling functional magnetic resonance imaging (fMRI) data that combines hierarchical polynomial models, Bayes estimation, and clustering. A cubic polynomial is used to fit the voxel time courses of event-related design experiments. The coefficients of the polynomials are estimated by Bayes estimation, in a two-level hierarchical model, which allows us to borrow strength from all voxels. The voxel-specific Bayes polynomial coefficients are then transformed to the times and magnitudes of the minimum and maximum points on the hemodynamic response curve, which are in turn used to classify the voxels as being activated or not. The procedure is demonstrated on real data from an event-related design experiment of visually guided saccades and shown to be an effective alternative to existing methods.
    NeuroImage 07/2004; 22(2):804-14. · 6.25 Impact Factor
  • Hua Yun Chen
    ABSTRACT: Two likelihood representations corresponding to the prospective and retrospective analyses of the case-control design are derived for general outcome-dependent samples with arbitrary discrete or continuous outcomes and possibly non-multiplicative models. Parameter identification in the general outcome-dependent design is reduced to the simple problem of parameter identification in the general odds ratio function. Both likelihoods are shown to generate the same profile likelihood for the common parameter of interest. Maximum likelihood estimators based on either likelihood are semiparametric efficient for the identifiable parameters. Copyright 2003 Royal Statistical Society.
    Journal of the Royal Statistical Society Series B (Statistical Methodology) 01/2003; 65(2):575-584. · 4.81 Impact Factor

Publication Stats

226 Citations
50.73 Total Impact Points

Institutions

  • 2003–2013
    • University of Illinois at Chicago
        • Division of Epidemiology and Biostatistics
        • Department of Occupational Therapy
      Chicago, IL, United States
  • 2008
    • University of Minnesota Duluth
      Duluth, Minnesota, United States