Holger Schwender

Heinrich-Heine-Universität Düsseldorf, Düsseldorf, North Rhine-Westphalia, Germany

Publications (69) · 178.35 Total impact

  • ABSTRACT: DNA copy number variants play an important part in the development of common birth defects such as oral clefts. Individual patients with multiple birth defects (including oral clefts) have been shown to carry small and large chromosomal deletions. We investigated the role of polymorphic copy number deletions by comparing transmission rates of deletions from parents to offspring in case-parent trios of European ancestry ascertained through a cleft proband with trios ascertained through a normal offspring. DNA copy numbers in trios were called using the joint hidden Markov model in the freely available PennCNV software. All statistical analyses were performed using Bioconductor tools in the open source environment R. We identified a 67 kb region in the gene MGAM on chromosome 7q34, and a 206 kb region overlapping genes ADAM3A and ADAM5 on chromosome 8p11, where deletions are more frequently transmitted to cleft offspring than control offspring. These genes or nearby regulatory elements may be involved in the etiology of oral clefts. Birth Defects Research (Part A), 2015. © 2015 Wiley Periodicals, Inc.
    Birth Defects Research Part A Clinical and Molecular Teratology 03/2015; 103(4). DOI:10.1002/bdra.23362 · 2.21 Impact Factor
  • ABSTRACT: The porphyrias are a group of inherited metabolic diseases resulting from enzymatic deficiencies of specific haem biosynthetic enzymes. They can be classified as primarily acute and non-acute types. Clinically, the acute hepatic porphyrias (AHPs) are characterised by acute neurovisceral attacks. Patients with AHP may be at increased risk for development of hepatocellular carcinoma (HCC). However, systematic studies on the occurrence of other malignancies in patients with the AHPs have not been performed to date. Here, we studied the development of HCC and distinct malignant tumours in patients with the AHPs registered in a single European porphyria specialist centre. A questionnaire was designed and sent to all individuals (n = 122) diagnosed between 1970 and 2012 for whom a valid address was available (n = 82), requesting information on their personal and family history of cancer. Statistical analysis was performed to calculate incidence, prevalence and relative risk of HCC. To calculate confidence intervals, a Poisson distribution was assumed. Forty-nine patients (59.8%) returned a completed questionnaire. Overall, HCC was diagnosed in one female (2.1%), and the remaining patients reported on six distinct malignancies. We were able to confirm that HCC is an important complication in AHP. The patients in our cohort had an approximately 35-fold increased risk of developing HCC, similar to observations in other European countries. In addition, we detected colon, breast, uterine and thyroid cancer as well as lymphoma and a liver metastasis in patients with AHP. However, considering the small number of tumours and patients studied here, the data should be interpreted with caution, and further studies on cancer occurrence in AHP patients will require a multicentre setting.
  • Wolfgang Kaisers, Heiner Schaal, Holger Schwender
    ABSTRACT: The open source environment R is the most widely used software to statistically explore biological data sets including sequence alignments. BAM is the de facto standard file format for sequence alignment. With rbamtools, we now provide a full spectrum of accessibility to BAM for R users, such as reading, writing, extraction of subsets and plotting of alignment depth, where the script syntax closely follows the SAM/BAM format. Additionally, rbamtools enables fast accumulative tabulation of splicing events over multiple BAM files. rbamtools is available on CRAN and on R-Forge. Contact: kaisers@med.uni-duesseldorf.de. Supplementary information: Supplementary material is available at Bioinformatics online. © The Author(s) 2015. Published by Oxford University Press.
    Bioinformatics 01/2015; DOI:10.1093/bioinformatics/btu846 · 4.62 Impact Factor
  • ABSTRACT: Case-parent trio studies considering genotype data from children affected by a disease and their parents are frequently used to detect single nucleotide polymorphisms (SNPs) associated with disease. The most popular statistical tests for this study design are transmission/disequilibrium tests (TDTs). Several types of these tests have been developed, for example, procedures based on alleles or genotypes. Therefore, it is of great interest to examine which of these tests have the highest statistical power to detect SNPs associated with disease. Comparisons of the allelic and the genotypic TDT for individual SNPs have so far been conducted based on simulation studies, since the test statistic of the genotypic TDT was determined numerically. Recently, however, it has been shown that this test statistic can be presented in closed form. In this article, we employ this analytic solution to derive equations for calculating the statistical power and the required sample size for different types of the genotypic TDT. The power of this test is then compared with that of the corresponding score test assuming the same mode of inheritance as well as the allelic TDT based on a multiplicative mode of inheritance, which is equivalent to the score test assuming an additive mode of inheritance. This is, thus, the first time the power of these tests is compared based on equations, yielding instant results and omitting the need for time-consuming simulation studies. This comparison reveals that these tests have almost the same power, with the score test being slightly more powerful.
    Biometrical Journal 11/2014; 56(6). DOI:10.1002/bimj.201300148 · 1.24 Impact Factor
  • ABSTRACT: Background: Non-syndromic cleft lip with or without cleft palate (NSCL/P) is a common disorder with complex etiology. The Bone Morphogenetic Protein 4 gene (BMP4) has been considered a prime candidate gene with evidence accumulated from animal experimental studies, human linkage studies, as well as candidate gene association studies. The aim of the current study is to test for linkage and association between BMP4 and NSCL/P that could be missed in genome-wide association studies (GWAS) when genotypic (G) main effects alone were considered. Methodology/Principal Findings: We performed the analysis considering G and interactions with multiple maternal environmental exposures using additive conditional logistic regression models in 895 Asian and 681 European complete NSCL/P trios. Single nucleotide polymorphisms (SNPs) that passed the quality control criteria among 122 genotyped and 25 imputed single nucleotide variants in and around the gene were used in analysis. Selected maternal environmental exposures during 3 months prior to and through the first trimester of pregnancy included any personal tobacco smoking, any environmental tobacco smoke in home, work place or any nearby places, any alcohol consumption and any use of multivitamin supplements. A novel significant association held for rs7156227 among Asian NSCL/P and non-syndromic cleft lip and palate (NSCLP) trios after Bonferroni correction which was not seen when G main effects alone were considered in either allelic or genotypic transmission disequilibrium tests. Odds ratios for carrying one copy of the minor allele without maternal exposure to any of the four environmental exposures were 0.58 (95% CI = 0.44, 0.75) and 0.54 (95% CI = 0.40, 0.73) for Asian NSCL/P and NSCLP trios, respectively. The Bonferroni P values corrected for the total number of 117 tested SNPs were 0.0051 (asymptotic P = 4.39×10^-5) and 0.0065 (asymptotic P = 5.54×10^-5), respectively. In European trios, no significant association was seen for any SNPs after Bonferroni corrections for the total number of 120 tested SNPs. Conclusions/Significance: Our findings add evidence from GWAS to support the role of BMP4 in susceptibility to NSCL/P originally identified in linkage and candidate gene association studies.
    PLoS ONE 10/2014; 9(10):e109038. DOI:10.1371/journal.pone.0109038 · 3.53 Impact Factor
  • ABSTRACT: Case-parent trio studies are commonly employed in genetics to detect variants underlying common complex disease risk. Both commercial and freely available software suites for genetic data analysis usually contain methods for case-parent trio designs. A user might, however, experience limitations with these packages, which can include missing functionality to extend the software if a desired analysis has not been implemented, and the inability to programmatically capture all the software versions used for low-level processing and high-level inference of genomic data, a critical consideration in particular for high-throughput experiments. Here, we present a software vignette (i.e., a manual with step by step instructions and examples to demonstrate software functionality) for reproducible genome-wide analyses of case-parent trio data using the open source Bioconductor package trio. The workflow for the practitioner uses data from previous genetic trio studies to illustrate functions for marginal association tests, assessment of parent-of-origin effects, power and sample size calculations, and functions to detect gene–gene and gene–environment interactions associated with disease.
    Genetic Epidemiology 07/2014; DOI:10.1002/gepi.21836 · 2.95 Impact Factor
  • Wolfgang Kaisers, Holger Schwender, Heiner Schaal
    ABSTRACT: Batch effects are artificial sources of variation due to experimental design. They are a widespread phenomenon in high-throughput data, which can be minimized but not always avoided. Therefore, mechanisms for detecting batch effects are needed, which requires comparison of multiple samples. Due to large data volumes (on the order of 1e12 bytes), this can be technically challenging. We describe the application of hierarchical clustering (HC) on DNA k-mer profiles of multiple fastq files, which creates tree structures based on similarity of sequence content. Ideally, HC tree adjacency reflects experimental treatment groups, but the algorithm may also agglomerate according to sample preparation groups, which then indicates the presence of batch effects. This introduces a new perspective in quality control. In order to perform fast analysis on large data sets, we implemented a new algorithm. The algorithm and user interface are implemented in a new R package (seqTools). The package is now available on R-Forge for Linux and Windows. Using this implementation, we compared the transcriptome sequence content of 83 human tissue samples on 13 different flowcells. We pairwise compared flowcells where samples from the same tissue type (fibroblasts or Jurkat cells) were present and <10% of the Phred scores were <10. Flowcell-based clustering could be identified in 24 of 41 compared flowcells. The effect predominantly appears on Phred-quality-deficient flowcells but can also be found between high-quality samples. The identified batch effects appear amplified in aligned reads and attenuated in unmapped (i.e. discarded during alignment) reads. Filtering reads for high quality (Phred >30) does not remove the identified batch effects.
  • Wolfgang Kaisers, Holger Schwender, Heiner Schaal
    ABSTRACT: Batch effects are artificial sources of variation due to experimental design. They are a widespread phenomenon in high-throughput data, which can be minimized but not always avoided. Therefore, mechanisms for detecting batch effects are needed, which requires comparison of multiple samples. Due to large data volumes (on the order of 1e12 bytes), this can be technically challenging. We describe the application of hierarchical clustering (HC) on DNA k-mer profiles of multiple fastq files, which creates tree structures based on similarity of sequence content. Ideally, HC tree adjacency reflects experimental treatment groups, but the algorithm may also agglomerate according to sample preparation groups, which then indicates the presence of batch effects. This introduces a new perspective in quality control. In order to perform fast analysis on large data sets, we implemented a new algorithm. The algorithm and user interface are implemented in a new R package (seqTools). The package is now available on R-Forge for Linux and Windows. Using this implementation, we compared the transcriptome sequence content of 61 human tissue samples on 8 different flowcells. We pairwise compared flowcells where at least 4 samples from the same tissue type (fibroblasts or Jurkat cells) were present and <10% of the Phred scores were <10. Strong flowcell-based separation was identified in 6 (21%), and detectable flowcell-based clustering in 17 (60.7%), of 28 flowcell comparisons. The identified batch effects appear amplified in aligned reads and attenuated in unmapped (i.e. discarded during alignment) reads. Filtering reads for high quality (Phred >30) does not remove the identified batch effects.
  • ABSTRACT: Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, nonsyndromic oral cleft to frequencies in European ancestry children from randomly sampled trios. We identified a genome-wide significant 62 kilo base (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip palate (CLP) cases than seen among cleft palate (CP) and cleft lip (CL) cases. This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.
    BMC Genetics 02/2014; 15(1):24. DOI:10.1186/1471-2156-15-24 · 2.36 Impact Factor
  • ABSTRACT: Nonsyndromic cleft palate (CP) is one of the most common human birth defects and both genetic and environmental risk factors contribute to its etiology. We conducted a genome-wide association study (GWAS) using 550 CP case-parent trios ascertained in an international consortium. Stratified analysis among trios with different ancestries was performed to test for GxE interactions with common maternal exposures using conditional logistic regression models. While no single nucleotide polymorphism (SNP) achieved genome-wide significance when considered alone, markers in SLC2A9 and the neighboring WDR1 on chromosome 4p16.1 gave suggestive evidence of gene-environment interaction with environmental tobacco smoke (ETS) among 259 Asian trios when the models included a term for GxE interaction. Multiple SNPs in these two genes were associated with increased risk of nonsyndromic CP if the mother was exposed to ETS during the peri-conceptual period (3 months prior to conception through the first trimester). When maternal ETS was considered, 15 of 135 SNPs mapping to SLC2A9 and 9 of 59 SNPs in WDR1 gave P values approaching genome-wide significance (10^-6 < P < 10^-4) in a test for GxETS interaction. SNPs rs3733585 and rs12508991 in SLC2A9 yielded P = 2.26×10^-7 in a test for GxETS interaction. SNPs rs6820756 and rs7699512 in WDR1 also yielded P = 1.79×10^-7 and P = 1.98×10^-7 in a 1 df test for GxE interaction. Although further replication studies are critical to confirming these findings, these results illustrate how genetic associations for nonsyndromic CP can be missed if potential GxE interaction is not taken into account, and this study suggests SLC2A9 and WDR1 should be considered as candidate genes for CP.
    PLoS ONE 02/2014; 9(2):e88088. DOI:10.1371/journal.pone.0088088 · 3.53 Impact Factor
  • Journal of Heuristics 02/2014; 21(1):1-24. DOI:10.1007/s10732-014-9269-7 · 1.36 Impact Factor
  • ABSTRACT: To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often unable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions compared with Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and a real data analysis.
    Statistical Applications in Genetics and Molecular Biology 01/2014; 13(1):1-22. DOI:10.1515/sagmb-2013-0028 · 1.52 Impact Factor
  • ABSTRACT: Genome-wide association studies (GWAs) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic arrays interrogate hundreds of thousands or even millions of loci simultaneously, many causal yet undetected loci are believed to exist because the conditional power to achieve a genome-wide significance level can be low, in particular for markers with small effect sizes and low minor allele frequencies and in studies with modest sample size. However, the correlation between neighboring markers in the human genome due to linkage disequilibrium (LD) resulting in correlated marker test statistics can be incorporated into multi-marker hypothesis tests, thereby increasing power to detect association. Herein, we establish a theoretical benchmark by quantifying the maximum power achievable for multi-marker tests of association in case-control studies, achievable only when the causal marker is known. Using the fact that genotype correlations within an LD block translate into an asymptotically multivariate normal distribution for score test statistics, we develop a set of weights for the markers that maximize the non-centrality parameter, and assess the relative loss of power for other approaches. We find that the method of Conneely and Boehnke (2007) based on the maximum absolute test statistic observed in an LD block is a practical and powerful method in a variety of settings. We also explore the effect on the power that prior biological or functional knowledge used to narrow down the locus of the causal marker can have, and conclude that this prior knowledge has to be very strong and specific for the power to approach the maximum achievable level, or even beat the power observed for methods such as the one proposed by Conneely and Boehnke (2007).
    Frontiers in Genetics 12/2013; 4:252. DOI:10.3389/fgene.2013.00252
  • ABSTRACT: The spatial organisation of the chromosomes in the nucleus is influenced by chromatin regions binding to the nucleic lamina, i.e., the inner part of the nucleic envelope. To investigate the architecture of chromosomes in the interphase nucleus, it is thus of high interest to detect such chromatin segments. This goal can be achieved by considering the fibrous protein Lamin B as a surrogate, since regions of high abundance of Lamin B can indicate chromatin segments attached to the nucleic lamina. We analyse ChIP-Seq (Chromatin-Immunoprecipitation Sequencing) data from an experiment that is designed to record Lamin B abundance. We introduce a Bayesian segmentation procedure in which a Markov Chain Monte Carlo (MCMC) algorithm is used for inference about the desired segmentation. The procedure is based on a Bayesian hierarchical model. Inference allows the distinction between regions of high versus low levels of Lamin B, and therefore, gives an insight into the binding of the chromatin to the nucleic envelope. An implementation of this approach is available in the statistical software environment R.
    Biochimica et Biophysica Acta 09/2013; 1844(1). DOI:10.1016/j.bbapap.2013.09.001 · 4.66 Impact Factor
  • Holger Schwender, Sylvia Rabstein, Katja Ickstadt
    CHANCE 08/2013; 19(3):3-8. DOI:10.1080/09332480.2006.10722794
  • Katja Ickstadt, Tina Mueller, Holger Schwender
    CHANCE 08/2013; 19(3):21-26. DOI:10.1080/09332480.2006.10722798
  • Holger Schwender, Anton Belousov
    CHANCE 08/2013; 19(3):15-20. DOI:10.1080/09332480.2006.10722797
  • ABSTRACT: Statistical approaches to evaluate interactions between single nucleotide polymorphisms (SNPs) and SNP-environment interactions are of great importance in genetic association studies, as susceptibility to complex disease might be related to the interaction of multiple SNPs and/or environmental factors. With these methods under active development, algorithms to simulate genomic data sets are needed to ensure proper type I error control of newly proposed methods and to compare power with existing methods. In this paper we propose an efficient method for a haplotype-based simulation of case-parent trios when the disease risk is thought to depend on possibly higher-order epistatic interactions or gene-environment interactions with binary exposures.
    Human Heredity 03/2013; 75(1):12-22. DOI:10.1159/000348789 · 1.64 Impact Factor
  • ABSTRACT: A collection of 1,108 case-parent trios ascertained through an isolated, nonsyndromic cleft lip with or without cleft palate (CL/P) was used to replicate the findings from a genome-wide association study (GWAS) conducted by Beaty et al. (Nat Genet 42:525-529, 2010), where four different genes/regions were identified as influencing risk to CL/P. Tagging SNPs for 33 different genes were genotyped (1,269 SNPs). All four of the genes originally identified as showing genome-wide significance (IRF6, ABCA4 and MAF, plus the 8q24 region) were confirmed in this independent sample of trios (who were primarily of European and Southeast Asian ancestry). In addition, eight genes classified as 'second tier' hits in the original study (PAX7, THADA, COL8A1/FILIP1L, DCAF4L2, GADD45G, NTN1, RBFOX3 and FOXE1) showed evidence of linkage and association in this replication sample. Meta-analysis between the original GWAS trios and these replication trios showed PAX7, COL8A1/FILIP1L and NTN1 achieved genome-wide significance. Tests for gene-environment interaction between these 33 genes and maternal smoking found evidence for interaction with two additional genes: GRID2 and ELAVL2 among European mothers (who had a higher rate of smoking than Asian mothers). Formal tests for gene-gene interaction (epistasis) failed to show evidence of statistical interaction in any simple fashion. This study confirms that many different genes influence risk to CL/P.
    Human Genetics 03/2013; 132(7). DOI:10.1007/s00439-013-1283-6 · 4.52 Impact Factor

Publication Stats

654 Citations
178.35 Total Impact Points

Institutions

  • 2012–2015
    • Heinrich-Heine-Universität Düsseldorf
      • Institute of Virology
      Düsseldorf, North Rhine-Westphalia, Germany
  • 2011–2012
    • Johns Hopkins Bloomberg School of Public Health
      • Department of Biostatistics
      Baltimore, Maryland, United States
  • 2003–2012
    • Technische Universität Dortmund
      • Faculty of Statistics
      Dortmund, North Rhine-Westphalia, Germany
  • 2010–2011
    • Johns Hopkins University
      • Department of Biostatistics
      • Department of Medicine
      Baltimore, Maryland, United States