-
Susan Moir,
Joshua D Milner,
E Michael Gertz,
Jacek Puchalka,
Ulrich Baumann,
Eva-Doreen Pfister,
Daniel Kotlarz,
Petra Schwille,
Alejandro A Schäffer,
Jana Diestelhorst, [......],
Christoph Klein,
Gulbu Uzel,
Dietmar Pfeifer,
Thomas Weidemann,
Axel Schambach,
Peter N Robinson,
Christian J Braun,
Roland Jacobs,
Jochen Hecht,
Hans Kreipe
-
Daniel Kotlarz,
Natalia Zietara,
Gulbu Uzel,
Thomas Weidemann,
Christian J Braun,
Jana Diestelhorst,
Peter M Krawitz,
Peter N Robinson,
Jochen Hecht,
Jacek Puchalka, [......],
Eva-Doreen Pfister,
Eric P Hanson,
Axel Schambach,
Roland Jacobs,
Hans Kreipe,
Susan Moir,
Joshua D Milner,
Petra Schwille,
Stefan Mundlos,
Christoph Klein
ABSTRACT: Primary immunodeficiencies (PIDs) represent exquisite models for studying mechanisms of human host defense. In this study, we report on two unrelated kindreds, with two patients each, who had cryptosporidial infections associated with chronic cholangitis and liver disease. Using exome and candidate gene sequencing, we identified two distinct homozygous loss-of-function mutations in the interleukin-21 receptor gene (IL21R; c.G602T, p.Arg201Leu and c.240_245delCTGCCA, p.C81_H82del). The IL-21R(Arg201Leu) mutation causes aberrant trafficking of the IL-21R to the plasma membrane, abrogates IL-21 ligand binding, and leads to defective phosphorylation of signal transducer and activator of transcription 1 (STAT1), STAT3, and STAT5. We observed impaired IL-21-induced proliferation and immunoglobulin class-switching in B cells, cytokine production in T cells, and NK cell cytotoxicity. Our study indicates that human IL-21R deficiency causes an immunodeficiency and highlights the need for early diagnosis and allogeneic hematopoietic stem cell transplantation in affected children.
Journal of Experimental Medicine 02/2013; · 13.85 Impact Factor
-
Hengameh Abdollahpour,
Giridharan Appaswamy,
Daniel Kotlarz,
Jana Diestelhorst,
Rita Beier,
Alejandro A Schäffer,
E Michael Gertz,
Axel Schambach,
Hans H Kreipe,
Dietmar Pfeifer,
Karin R Engelhardt,
Nima Rezaei,
Bodo Grimbacher,
Sabine Lohrmann,
Roya Sherkat,
Christoph Klein
ABSTRACT: We describe a novel clinical phenotype associating T- and B-cell lymphopenia, intermittent neutropenia, and atrial septal defects in 3 members of a consanguineous kindred. Their clinical histories included recurrent bacterial infections, viral infections, mucocutaneous candidiasis, cutaneous warts, and skin abscesses. Homozygosity mapping and candidate gene sequencing revealed a homozygous premature termination mutation in the gene STK4 (serine threonine kinase 4, formerly having the symbol MST1). STK4 is the human ortholog of Drosophila Hippo, the central constituent of a highly conserved pathway controlling cell growth and apoptosis. STK4-deficient lymphocytes and neutrophils exhibit enhanced loss of mitochondrial membrane potential and increased susceptibility to apoptosis. STK4 deficiency is a novel human primary immunodeficiency syndrome.
Blood 01/2012; 119(15):3450-7. · 9.90 Impact Factor
-
ABSTRACT: The regions encoding the coordinately regulated Th2 cytokines IL5, IL4 and IL13 are located on chromosomes 5 of man and 11 of mouse. They have been intensively studied because these interleukins have protective roles in helminth infections, but may lead to detrimental effects such as allergy, asthma, and fibrosis in lung and liver. We added to previous studies by comparing sequences of syntenic regions on chromosome 3 of the rabbit (Oryctolagus cuniculus) genome OryCun 2.0 assembly from a tuberculosis-susceptible strain, with the corresponding region of ENCODE ENm002 from a normal rabbit as well as with 9 other mammalian species. We searched for rabbit transcription factor binding sites in putative promoter and other non-coding regions of IL5, RAD50, IL13 and IL4. Although we identified several differences between the two donor rabbits in coding and non-coding regions of potential functional significance, confirmation awaits additional sequencing of other rabbits.
Immunology and Immunogenetics Insights 01/2011; 3:59-82.
-
Cristina Woellner,
E Michael Gertz,
Alejandro A Schäffer,
Macarena Lagos,
Mario Perro,
Erik-Oliver Glocker,
Maria C Pietrogrande,
Fausto Cossu,
José L Franco,
Nuria Matamoros, [......],
Peter D Arkwright,
Jukka S Moilanen,
Dorothee Viemann,
Sujoy Khan,
László Maródi,
Andrew J Cant,
Alexandra F Freeman,
Jennifer M Puck,
Steven M Holland,
Bodo Grimbacher
ABSTRACT: The hyper-IgE syndrome (HIES) is a primary immunodeficiency characterized by infections of the lung and skin, elevated serum IgE, and involvement of the soft and bony tissues. Recently, HIES has been associated with heterozygous dominant-negative mutations in the signal transducer and activator of transcription 3 (STAT3) and severe reductions of T(H)17 cells.
We sought to determine whether there is a correlation between the genotype and the phenotype of patients with HIES and to establish diagnostic criteria to distinguish between STAT3-mutated and STAT3 wild-type patients.
We collected clinical data, determined T(H)17 cell numbers, and sequenced STAT3 in 100 patients with a strong clinical suspicion of HIES and serum IgE >1000 IU/mL. We explored diagnostic criteria by using a machine-learning approach to identify which features best predict a STAT3 mutation.
In 64 patients, we identified 31 different STAT3 mutations, 18 of which were novel. These included mutations at splice sites and outside the previously implicated DNA-binding and Src homology 2 domains. A combination of 5 clinical features predicted STAT3 mutations with 85% accuracy. T(H)17 cells were profoundly reduced in patients harboring STAT3 mutations, whereas 10 of 13 patients without mutations had low (<1%) T(H)17 cells but were distinct by markedly reduced IFN-gamma-producing CD4(+)T cells.
We propose the following diagnostic guidelines for STAT3-deficient HIES. Possible: IgE >1000 IU/mL plus a weighted score of clinical features >30 based on recurrent pneumonia, newborn rash, pathologic bone fractures, characteristic face, and high palate. Probable: These characteristics plus lack of T(H)17 cells or a family history for definitive HIES. Definitive: These characteristics plus a dominant-negative heterozygous mutation in STAT3.
The Journal of allergy and clinical immunology 02/2010; 125(2):424-432.e8. · 9.17 Impact Factor
-
ABSTRACT: While attempting to reanalyze published data from Agilent 4 x 44 human expression chips, we found that some of the 60-mer oligonucleotide features could not be interpreted as representing single human genes. For example, some of the oligonucleotides align with the transcripts of more than one gene. We decided to check the annotations for all autosomes and the X chromosome systematically using bioinformatics methods.
Out of 42683 reporters, we found that 25505 (60%) passed all our tests and are considered "fully valid". 9964 (23%) reporters did not have a meaningful identifier, mapped to the wrong chromosome, or did not pass basic alignment tests, preventing us from correlating the expression values of these reporters with a unique annotated human gene. The remaining 7214 (17%) reporters could be associated with either a unique gene or a unique intergenic location, but could not be mapped to a transcript in RefSeq. The 7214 reporters are further partitioned into three different levels of validity.
Expression array studies should evaluate the annotations of reporters and remove those reporters that have suspect annotations. This evaluation can be done systematically and semi-automatically, but one must recognize that data sources are frequently updated, leading to slightly changing validation results over time.
BMC Genomics 11/2009; 10:566. · 4.07 Impact Factor
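The reporter counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the category labels are shorthand chosen here for illustration and are not the authors' terminology.

```python
# Reporter counts as quoted in the abstract; the dictionary keys are
# illustrative shorthand, not names used by the authors.
TOTAL_REPORTERS = 42683
counts = {
    "fully_valid": 25505,
    "not_usable": 9964,
    "no_refseq_transcript": 7214,
}

def percentages(counts, total):
    """Return each category's share of the total, rounded to whole percent."""
    return {name: round(100 * n / total) for name, n in counts.items()}

shares = percentages(counts, TOTAL_REPORTERS)

if __name__ == "__main__":
    # The three categories partition the full reporter set.
    assert sum(counts.values()) == TOTAL_REPORTERS
    for name, pct in shares.items():
        print(f"{name}: {counts[name]} reporters ({pct}%)")
```

The three counts add up to the stated total, and the rounded shares match the percentages given in the abstract.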
-
Erik-Oliver Glocker,
Daniel Kotlarz,
Kaan Boztug,
E Michael Gertz,
Alejandro A Schäffer,
Fatih Noyan,
Mario Perro,
Jana Diestelhorst,
Anna Allroth,
Dhaarini Murugan, [......],
Ulrich Baumann,
Ulrich Salzer,
Sibylle Koletzko,
Neil Shah,
Anthony W Segal,
Axel Sauerbrey,
Stephan Buderus,
Scott B Snapper,
Bodo Grimbacher,
Christoph Klein
ABSTRACT: The molecular cause of inflammatory bowel disease is largely unknown.
We performed genetic-linkage analysis and candidate-gene sequencing on samples from two unrelated consanguineous families with children who were affected by early-onset inflammatory bowel disease. We screened six additional patients with early-onset colitis for mutations in two candidate genes and carried out functional assays in patients' peripheral-blood mononuclear cells. We performed an allogeneic hematopoietic stem-cell transplantation in one patient.
In four of nine patients with early-onset colitis, we identified three distinct homozygous mutations in genes IL10RA and IL10RB, encoding the IL10R1 and IL10R2 proteins, respectively, which form a heterotetramer to make up the interleukin-10 receptor. The mutations abrogate interleukin-10-induced signaling, as shown by deficient STAT3 (signal transducer and activator of transcription 3) phosphorylation on stimulation with interleukin-10. Consistent with this observation was the increased secretion of tumor necrosis factor alpha and other proinflammatory cytokines from peripheral-blood mononuclear cells from patients who were deficient in IL10R subunit proteins, suggesting that interleukin-10-dependent "negative feedback" regulation is disrupted in these cells. The allogeneic stem-cell transplantation performed in one patient was successful.
Mutations in genes encoding the IL10R subunit proteins were found in patients with early-onset enterocolitis, involving hyperinflammatory immune responses in the intestine. Allogeneic stem-cell transplantation resulted in disease remission in one patient.
New England Journal of Medicine 11/2009; 361(21):2033-45. · 53.30 Impact Factor
-
ABSTRACT: Position specific score matrices (PSSMs) are derived from multiple sequence alignments to aid in the recognition of distant protein sequence relationships. The PSI-BLAST protein database search program derives the column scores of its PSSMs with the aid of pseudocounts, added to the observed amino acid counts in a multiple alignment column. In the absence of theory, the number of pseudocounts used has been a completely empirical parameter. This article argues that the minimum description length principle can motivate the choice of this parameter. Specifically, for realistic alignments, the principle supports the practice of using a number of pseudocounts essentially independent of alignment size. However, it also implies that more highly conserved columns should use fewer pseudocounts, increasing the inter-column contrast of the implied PSSMs. A new method for calculating pseudocounts that significantly improves PSI-BLAST's retrieval accuracy is now employed by default.
Nucleic Acids Research 01/2009; 37(3):815-24. · 8.03 Impact Factor
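As a rough illustration of how pseudocounts enter a PSSM column score, the sketch below mixes observed residue counts with pseudocounts and takes a log-odds ratio against background frequencies. It is a simplification under stated assumptions: the function name, the parameter name `n_pseudo`, the toy four-letter alphabet, and the use of background frequencies as the pseudocount distribution are all chosen here for brevity. PSI-BLAST's actual pseudocount calculation, including the method the article describes, is more involved.

```python
import math

def column_scores(observed_counts, background, n_pseudo):
    """Log-odds scores (base 2) for one alignment column.

    observed_counts: dict residue -> observed count in the column
    background:      dict residue -> background probability (sums to 1)
    n_pseudo:        number of pseudocounts (hypothetical parameter name)

    Assumption: pseudocounts are spread according to the background
    frequencies, which is simpler than what PSI-BLAST does.
    """
    n_obs = sum(observed_counts.values())
    scores = {}
    for residue, p in background.items():
        count = observed_counts.get(residue, 0)
        # Estimated frequency: observed counts plus pseudocounts, renormalized.
        q = (count + n_pseudo * p) / (n_obs + n_pseudo)
        scores[residue] = math.log2(q / p)
    return scores

# Toy four-letter alphabet with uniform background, for illustration only.
background = {"A": 0.25, "C": 0.25, "D": 0.25, "E": 0.25}
column = {"A": 8, "C": 2}  # ten observations, strongly favoring A

few = column_scores(column, background, n_pseudo=1)
many = column_scores(column, background, n_pseudo=20)
```

With fewer pseudocounts the estimated frequencies stay closer to the observed ones, so the conserved residue scores higher and unobserved residues score lower. This is one way to picture the increased contrast the abstract associates with using fewer pseudocounts in highly conserved columns.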
-
Ulrich Salzer,
Chiara Bacchelli,
Sylvie Buckridge,
Qiang Pan-Hammarström,
Stephanie Jennings,
Vassilis Lougaris,
Astrid Bergbreiter,
Tina Hagena,
Jennifer Birmelin,
Alessandro Plebani, [......],
Are M Holm,
Jose L Franco,
Ilka Schulze,
Pascal Schneider,
E Michael Gertz,
Alejandro A Schäffer,
Lennart Hammarström,
Adrian J Thrasher,
H Bobby Gaspar,
Bodo Grimbacher
ABSTRACT: TNFRSF13B encodes transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI), a B cell-specific tumor necrosis factor (TNF) receptor superfamily member. Both biallelic and monoallelic TNFRSF13B mutations were identified in patients with common variable immunodeficiency disorders. The genetic complexity and variable clinical presentation of TACI deficiency prompted us to evaluate the genetic, immunologic, and clinical condition in 50 individuals with TNFRSF13B alterations, following screening of 564 unrelated patients with hypogammaglobulinemia. We identified 13 new sequence variants. The most frequent TNFRSF13B variants (C104R and A181E; n=39; 6.9%) were also present in a heterozygous state in 2% of 675 controls. All patients with biallelic mutations had hypogammaglobulinemia and nearly all showed impaired binding to a proliferation-inducing ligand (APRIL). However, the majority (n=41; 82%) of the patients carried monoallelic changes in TNFRSF13B. Presence of a heterozygous mutation was associated with antibody deficiency (P < .001, relative risk 3.6). Heterozygosity for the most common mutation, C104R, was associated with disease (P < .001, relative risk 4.2). Furthermore, heterozygosity for C104R was associated with low numbers of IgD(-)CD27(+) B cells (P = .019), benign lymphoproliferation (P < .001), and autoimmune complications (P = .001). These associations indicate that C104R heterozygosity increases the risk for common variable immunodeficiency disorders and influences clinical presentation.
Blood 12/2008; 113(9):1967-76. · 9.90 Impact Factor
-
ABSTRACT: The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments.
We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program.
The scripts for performing evaluations are available upon request from the authors.
Bioinformatics 08/2008; 24(13):i15-23. · 5.47 Impact Factor
-
Proceedings 16th International Conference on Intelligent Systems for Molecular Biology (ISMB), Toronto, Canada, July 19-23, 2008; 01/2008
-
ABSTRACT: The DUST module has been used within BLAST for many years to mask low-complexity sequences. In this paper, we present a new implementation of the DUST module that uses the same function to assign a complexity score to a sequence, but uses a different rule by which high-scoring sequences are masked. The new rule masks every nucleotide masked by the old rule and occasionally masks more. The new masking rule corrects two related deficiencies with the old rule. First, the new rule is symmetric with respect to reversing the sequence. Second, the new rule is not context sensitive; the decision to mask a subsequence does not depend on what sequences flank it. The new implementation is at least four times faster than the old on the human genome. We show that both the percentage of additional bases masked and the effect on MegaBLAST outputs are very small.
Journal of Computational Biology 07/2006; 13(5):1028-40. · 1.55 Impact Factor
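The complexity score mentioned in this abstract can be illustrated with a short sketch. It assumes the triplet-count form in which DUST scoring is usually described: with c_t the number of occurrences of triplet t among the l overlapping triplets of a sequence, the score is the sum of c_t(c_t - 1)/2 over all triplets, divided by (l - 1). The function name is hypothetical, and the masking rule itself (which windows are masked, and at what threshold) is not reproduced here.

```python
from collections import Counter

def dust_score(seq):
    """Triplet-based complexity score; higher means lower complexity.

    Assumed formula: sum over triplets t of c_t * (c_t - 1) / 2,
    divided by (l - 1), where l is the number of overlapping triplets.
    """
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if len(triplets) < 2:
        return 0.0  # too short for the score to be defined; treated as 0 here
    counts = Counter(triplets)
    repeated_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return repeated_pairs / (len(triplets) - 1)

low_complexity = dust_score("AAAAAAAA")  # a single triplet repeated six times
mixed = dust_score("ACGTACGT")           # mostly distinct triplets
```

Reversing a sequence reverses each triplet but leaves the multiset of triplet counts unchanged, so this score is identical for a sequence and its reverse. The symmetry discussed in the abstract concerns the masking rule built on top of the score.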
-
ABSTRACT: Protein sequence database search programs may be evaluated both for their retrieval accuracy--the ability to separate meaningful from chance similarities--and for the accuracy of their statistical assessments of reported alignments. However, methods for improving statistical accuracy can degrade retrieval accuracy by discarding compositional evidence of sequence relatedness. This evidence may be preserved by combining essentially independent measures of alignment and compositional similarity into a unified measure of sequence similarity. A version of the BLAST protein database search program, modified to employ this new measure, outperforms the baseline program in both retrieval and statistical accuracy on ASTRAL, a SCOP-based test set.
Nucleic Acids Research 02/2006; 34(20):5966-73. · 8.03 Impact Factor
-
ABSTRACT: TBLASTN is a mode of operation for BLAST that aligns protein sequences to a nucleotide database translated in all six frames. We present the first description of the modern implementation of TBLASTN, focusing on new techniques that were used to implement composition-based statistics for translated nucleotide searches. Composition-based statistics use the composition of the sequences being aligned to generate more accurate E-values, which allows for a more accurate distinction between true and false matches. Until recently, composition-based statistics were available only for protein-protein searches. They are now available as a command line option for recent versions of TBLASTN and as an option for TBLASTN on the NCBI BLAST web server.
We evaluate the statistical and retrieval accuracy of the E-values reported by a baseline version of TBLASTN and by two variants that use different types of composition-based statistics. To test the statistical accuracy of TBLASTN, we ran 1000 searches using scrambled proteins from the mouse genome and a database of human chromosomes. To test retrieval accuracy, we modernize and adapt to translated searches a test set previously used to evaluate the retrieval accuracy of protein-protein searches. We show that composition-based statistics greatly improve the statistical accuracy of TBLASTN, at a small cost to the retrieval accuracy.
TBLASTN is widely used, as it is common to wish to compare proteins to chromosomes or to libraries of mRNAs. Composition-based statistics improve the statistical accuracy, and therefore the reliability, of TBLASTN results. The algorithms used by TBLASTN are not widely known, and some of the most important are reported here. The data used to test TBLASTN are available for download and may be useful in other studies of translated search algorithms.
BMC Biology 02/2006; 4:41. · 5.75 Impact Factor
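To make "translated in all six frames" concrete, the sketch below translates a DNA string in the three forward frames and the three frames of its reverse complement, using the standard genetic code. It only illustrates the concept: the function names and the "+1" to "-3" frame labels are conventions chosen here, and nothing in it reflects how TBLASTN itself is implemented.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Complement each base, then reverse the string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons only; unrecognized codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translations(dna):
    """Return a dict mapping frame label to its protein translation."""
    dna = dna.upper()
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames

frames = six_frame_translations("ATGGCC")
```

For the toy input `ATGGCC`, frame +1 reads the codons ATG and GCC, while the reverse-strand frames read from `GGCCAT`. A protein query is then compared against each of the six translations.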
-
ABSTRACT: Matches to repetitive sequences are usually undesirable in the output of DNA database searches. Repetitive sequences need not be matched to a query, if they can be masked in the database. RepeatMasker/Maskeraid (RM), currently the most widely used software for DNA sequence masking, is slow and requires a library of repetitive template sequences, such as a manually curated RepBase library, that may not exist for newly sequenced genomes.
We have developed a software tool called WindowMasker (WM) that identifies and masks highly repetitive DNA sequences in a genome, using only the sequence of the genome itself. WM is orders of magnitude faster than RM because WM uses a few linear-time scans of the genome sequence, rather than local alignment methods that compare each library sequence with each piece of the genome. We validate WM by comparing BLAST outputs from large sets of queries applied to two versions of the same genome, one masked by WM, and the other masked by RM. Even for genomes such as the human genome, where a good RepBase library is available, searching the database as masked with WM yields more matches that are apparently non-repetitive and fewer matches to repetitive sequences. We show that these results hold for transcribed regions as well. WM also performs well on genomes for which much of the sequence was in draft form at the time of the analysis.
WM is included in the NCBI C++ toolkit. The source code for the entire toolkit is available at ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/. Once the toolkit source is unpacked, the instructions for building the WindowMasker application in the UNIX environment can be found in the file src/app/winmasker/README.build.
Supplementary data are available at ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/windowmasker/windowmasker_suppl.pdf
Bioinformatics 02/2006; 22(2):134-41. · 5.47 Impact Factor
-
ABSTRACT: Almost all protein database search methods use amino acid substitution matrices for scoring, optimizing, and assessing the statistical significance of sequence alignments. Much care and effort has therefore gone into constructing substitution matrices, and the quality of search results can depend strongly upon the choice of the proper matrix. A long-standing problem has been the comparison of sequences with biased amino acid compositions, for which standard substitution matrices are not optimal. To address this problem, we have recently developed a general procedure for transforming a standard matrix into one appropriate for the comparison of two sequences with arbitrary, and possibly differing, compositions. Such adjusted matrices yield, on average, improved alignments and alignment scores when applied to the comparison of proteins with markedly biased compositions. Here we review the application of compositionally adjusted matrices and consider whether they may also be applied fruitfully to general purpose protein sequence database searches, in which related sequence pairs do not necessarily have strong compositional biases. Although it is not advisable to apply compositional adjustment indiscriminately, we describe several simple criteria under which invoking such adjustment is on average beneficial. In a typical database search, at least one of these criteria is satisfied by over half the related sequence pairs. Compositional substitution matrix adjustment is now available in NCBI's protein-protein version of BLAST.
FEBS Journal 11/2005; 272(20):5101-9. · 3.79 Impact Factor