ABSTRACT: While individual non-B DNA structures have been shown to impact gene expression, their broad regulatory role remains elusive. We utilized genomic variants and expression quantitative trait loci (eQTL) data to analyze genome-wide variation propensities of potential non-B DNA regions and their relation to gene expression. Independent of genomic location, these regions were enriched in nucleotide variants. Our results are consistent with previously observed mutagenic properties of these regions and counter a previous study concluding that G-quadruplex regions have a reduced frequency of variants. While such mutagenicity might undermine functionality of these elements, we identified in potential non-B DNA regions a signature of negative selection. Yet, we found a depletion of eQTL-associated variants in potential non-B DNA regions, opposite to what might be expected from their proposed regulatory role. However, we also observed that genes downstream of potential non-B DNA regions showed higher expression vari
Nucleic Acids Research 10/2014; · 8.81 Impact Factor
ABSTRACT: The analysis of individuals with severe congenital neutropenia (SCN) may shed light on the delicate balance of factors controlling the differentiation, maintenance and decay of neutrophils. We identify 9 distinct homozygous mutations in the JAGN1 gene encoding Jagunal homolog 1 in 14 individuals with SCN. JAGN1-mutant granulocytes are characterized by ultrastructural defects, a paucity of granules, aberrant N-glycosylation of multiple proteins and increased incidence of apoptosis. JAGN1 participates in the secretory pathway and is required for granulocyte colony-stimulating factor receptor-mediated signaling. JAGN1 emerges as a factor that is necessary in the differentiation and survival of neutrophils.
ABSTRACT: Severe congenital neutropenia (SCN) is characterized by low numbers of peripheral neutrophil granulocytes and a predisposition to life-threatening bacterial infections. We describe a novel genetic SCN type in two unrelated families associated with recessively inherited loss-of-function mutations in CSF3R, encoding the granulocyte colony-stimulating factor (G-CSF) receptor. Family A, with three affected children, carried a homozygous missense mutation (NM_000760.3:c.922C>T, NP_000751.1:p.Arg308Cys), which resulted in perturbed N-glycosylation and aberrant localization to the cell surface. Family B, with one affected infant, carried compound heterozygous deletions provoking frameshifts and premature stop codons (NM_000760.3:c.948_963del, NP_000751.1:p.Gly316fsTer322 and NM_000760.3:c.1245del, NP_000751.1:p.Gly415fsTer432). Despite peripheral SCN, all patients had morphological evidence of full myeloid cell maturation in bone marrow. None of the patients responded to treatment with recombinant human granulocyte colony-stimulating factor. Our study highlights the genetic and morphological SCN variability and provides evidence both for functional importance and redundancy of G-CSF receptor-mediated signaling in human granulopoiesis.
ABSTRACT: We report on the analyses of genes encoding immunoglobulin heavy and light chains in the rabbit 6.51× whole genome assembly. This OryCun2.0 assembly confirms previous mapping of the duplicated IGK1 and IGK2 loci to chromosome 2 and the IGL lambda light chain locus to chromosome 21. The most frequently rearranged and expressed IGHV1 that is closest to IG DH and IGHJ genes encodes rabbit VHa allotypes. The partially inbred Thorbecke strain rabbit used for whole-genome sequencing was homozygous at the IGK but heterozygous with the IGHV1a1 allele in one of 79 IGHV-containing unplaced scaffolds and IGHV1a2, IGHM, IGHG, and IGHE sequences in another. Some IGKV, IGLV, and IGHA genes are also in other unplaced scaffolds. By fluorescence in situ hybridization, we assigned the previously unmapped IGH locus to the q-telomeric region of rabbit chromosome 20. An approximately 3-Mb segment of human chromosome 14 including IGH genes predicted to map to this telomeric region based on synteny analysis could not be located on assembled chromosome 20. Unplaced scaffold chrUn0053 contains some of the genes that comparative mapping predicts to be missing. We identified discrepancies between previous targeted studies and the OryCun2.0 assembly and some new BAC clones with IGH sequences that can guide other studies to further sequence and improve the OryCun2.0 assembly. Complete knowledge of gene sequences encoding variable regions of rabbit heavy, kappa, and lambda chains will lead to better understanding of how and why rabbits produce antibodies of high specificity and affinity through gene conversion and somatic hypermutation.
ABSTRACT: Primary immunodeficiencies (PIDs) represent exquisite models for studying mechanisms of human host defense. In this study, we report on two unrelated kindreds, with two patients each, who had cryptosporidial infections associated with chronic cholangitis and liver disease. Using exome and candidate gene sequencing, we identified two distinct homozygous loss-of-function mutations in the interleukin-21 receptor gene (IL21R; c.G602T, p.Arg201Leu and c.240_245delCTGCCA, p.C81_H82del). The IL-21R(Arg201Leu) mutation causes aberrant trafficking of the IL-21R to the plasma membrane, abrogates IL-21 ligand binding, and leads to defective phosphorylation of signal transducer and activator of transcription 1 (STAT1), STAT3, and STAT5. We observed impaired IL-21-induced proliferation and immunoglobulin class-switching in B cells, cytokine production in T cells, and NK cell cytotoxicity. Our study indicates that human IL-21R deficiency causes an immunodeficiency and highlights the need for early diagnosis and allogeneic hematopoietic stem cell transplantation in affected children.
Journal of Experimental Medicine 02/2013; · 13.21 Impact Factor
ABSTRACT: We describe a pedigree of 71 individuals from the Republic of Cameroon in which at least 33 individuals have a clinical diagnosis of persistent stuttering. The high concentration of stuttering individuals suggests that the pedigree either contains a single highly penetrant gene variant or that assortative mating led to multiple stuttering-associated variants being transmitted in different parts of the pedigree. No single locus displayed significant linkage to stuttering in initial genome-wide scans with microsatellite and SNP markers. By dividing the pedigree into five subpedigrees, we found evidence for linkage to previously reported loci on 3q and 15q, and to novel loci on 2p, 3p, 14q, and a different region of 15q. Using the two-locus mode of Superlink, we showed that combining the recessive locus on 2p and a single-locus additive representation of the 15q loci is sufficient to achieve a two-locus score over 6 on the entire pedigree. For this 2p + 15q analysis, we show LOD scores ranging from 4.69 to 6.57, and the scores are sensitive to which marker is chosen for 15q. Our findings provide strong evidence for linkage at several loci.
ABSTRACT: Genetic causes of childhood-onset hypogammaglobulinemia are currently not well understood. Most cases are sporadic, but autosomal dominant (AD) and autosomal recessive (AR) inheritance have been described. We performed genetic linkage analysis in AR families with hypogammaglobulinemia. Four AR families with childhood-onset humoral immune deficiency (ID) and autoimmunity (AI) shared evidence for a linkage interval on chromosome 4q. Sequencing of candidate genes revealed that patients in each family carried a distinct homozygous mutation in LRBA. All mutations segregated with the disease: homozygous individuals showed symptoms, whereas heterozygous individuals were healthy. Mutations caused loss of protein expression, defective B cell activation, increased susceptibility to apoptosis and reduced autophagy. Phosphorylation of the pro-apoptotic protein BAD was reduced, but transient expression of full-length LRBA in LRBA-deficient B cells restored BAD phosphorylation. We conclude that mutations in LRBA cause an ID characterized by defects in B cell activation, autophagy and susceptibility to apoptosis that is associated with a phenotype of hypogammaglobulinemia and AI.
2012 Clinical Immunology Society Annual Meeting; 05/2012
ABSTRACT: Most autosomal genetic causes of childhood-onset hypogammaglobulinemia are currently not well understood. Most affected individuals are simplex cases, but both autosomal-dominant and autosomal-recessive inheritance have been described. We performed genetic linkage analysis in consanguineous families affected by hypogammaglobulinemia. Four consanguineous families with childhood-onset humoral immune deficiency and features of autoimmunity shared genotype evidence for a linkage interval on chromosome 4q. Sequencing of positional candidate genes revealed that in each family, affected individuals had a distinct homozygous mutation in LRBA (lipopolysaccharide responsive beige-like anchor protein). All LRBA mutations segregated with the disease because homozygous individuals showed hypogammaglobulinemia and autoimmunity, whereas heterozygous individuals were healthy. These mutations were absent in healthy controls. Individuals with homozygous LRBA mutations had no LRBA, had disturbed B cell development, defective in vitro B cell activation, plasmablast formation, and immunoglobulin secretion, and had low proliferative responses. We conclude that mutations in LRBA cause an immune deficiency characterized by defects in B cell activation and autophagy and by susceptibility to apoptosis, all of which are associated with a clinical phenotype of hypogammaglobulinemia and autoimmunity.
The American Journal of Human Genetics 05/2012; 90(6):986-1001. · 11.20 Impact Factor
ABSTRACT: We describe a novel clinical phenotype associating T- and B-cell lymphopenia, intermittent neutropenia, and atrial septal defects in 3 members of a consanguineous kindred. Their clinical histories included recurrent bacterial infections, viral infections, mucocutaneous candidiasis, cutaneous warts, and skin abscesses. Homozygosity mapping and candidate gene sequencing revealed a homozygous premature termination mutation in the gene STK4 (serine threonine kinase 4, formerly having the symbol MST1). STK4 is the human ortholog of Drosophila Hippo, the central constituent of a highly conserved pathway controlling cell growth and apoptosis. STK4-deficient lymphocytes and neutrophils exhibit enhanced loss of mitochondrial membrane potential and increased susceptibility to apoptosis. STK4 deficiency is a novel human primary immunodeficiency syndrome.
ABSTRACT: The regions encoding the coordinately regulated Th2 cytokines IL5, IL4 and IL13 are located on chromosome 5 in humans and chromosome 11 in mice. They have been intensively studied because these interleukins have protective roles in helminth infections but may also lead to detrimental effects such as allergy, asthma, and fibrosis in lung and liver. We added to previous studies by comparing sequences of syntenic regions on chromosome 3 of the rabbit (Oryctolagus cuniculus) genome OryCun2.0 assembly, from a tuberculosis-susceptible strain, with the corresponding region of ENCODE ENm002 from a normal rabbit as well as with 9 other mammalian species. We searched for rabbit transcription factor binding sites in putative promoter and other non-coding regions of IL5, RAD50, IL13 and IL4. Although we identified several differences of potential functional significance between the two donor rabbits in coding and non-coding regions, confirmation awaits additional sequencing of other rabbits.
Immunology and Immunogenetics Insights 01/2011; 3:59-82.
ABSTRACT: This paper concerns the generation of support vector machine classifiers for solving the pattern recognition problem in machine learning. A method is proposed based on interior-point methods for convex quadratic programming. This interior-point method uses a linear preconditioned conjugate gradient method with a novel preconditioner to compute each iteration from the previous. An implementation is developed by adapting the object-oriented package OOQP to the problem structure. Numerical results are provided, and computational experience is discussed.
Keywords: Machine learning; Support vector machines; Quadratic programming; Interior-point methods; Krylov-space methods; Matrix-free preconditioning
Computational Optimization and Applications 11/2010; 47(3):431-453. · 1.28 Impact Factor
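The abstract above mentions a preconditioned conjugate gradient (PCG) method used inside each interior-point iteration. As a rough illustration only, the following sketch shows a generic matrix-free PCG solver for a symmetric positive definite system. It uses a simple Jacobi (diagonal) preconditioner as a stand-in: the paper's own preconditioner and the OOQP interfaces are not described in the abstract and are not reproduced here. The names `pcg`, `matvec`, and `precond` are hypothetical.

```python
def pcg(matvec, b, precond, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A.

    matvec(v) returns A v, so A itself is never stored (matrix-free).
    precond(r) returns an approximation of A^{-1} r.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the start x = 0
    z = precond(r)
    p = list(z)                      # first search direction
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break                    # residual norm is small enough
        Ap = matvec(p)
        step = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        z = precond(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Toy 2x2 system (illustrative data, not from the paper).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def matvec(v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def jacobi(r):
    # Assumed stand-in preconditioner: divide by the diagonal of A.
    return [ri / A[i][i] for i, ri in enumerate(r)]


x = pcg(matvec, b, jacobi)
```

For this toy system the exact solution is x = (1/11, 7/11). A better preconditioner reduces the number of iterations without changing the solution, which is why the choice of preconditioner matters for large problems.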
ABSTRACT: Background: The hyper-IgE syndrome (HIES) is a primary immunodeficiency characterized by infections of the lung and skin, elevated serum IgE, and involvement of the soft and bony tissues. Recently, HIES has been associated with heterozygous dominant-negative mutations in the signal transducer and activator of transcription 3 (STAT3) and severe reductions of T(H)17 cells.
Objective: To determine whether there is a correlation between the genotype and the phenotype of patients with HIES and to establish diagnostic criteria to distinguish between STAT3-mutated and STAT3 wild-type patients.
Methods: We collected clinical data, determined T(H)17 cell numbers, and sequenced STAT3 in 100 patients with a strong clinical suspicion of HIES and serum IgE >1000 IU/mL. We explored diagnostic criteria by using a machine-learning approach to identify which features best predict a STAT3 mutation.
Results: In 64 patients, we identified 31 different STAT3 mutations, 18 of which were novel. These included mutations at splice sites and outside the previously implicated DNA-binding and Src homology 2 domains. A combination of 5 clinical features predicted STAT3 mutations with 85% accuracy. T(H)17 cells were profoundly reduced in patients harboring STAT3 mutations, whereas 10 of 13 patients without mutations had low (<1%) T(H)17 cells but were distinguished by markedly reduced IFN-gamma-producing CD4(+) T cells.
Conclusions: We propose the following diagnostic guidelines for STAT3-deficient HIES. Possible: IgE >1000 IU/mL plus a weighted score of clinical features >30 based on recurrent pneumonia, newborn rash, pathologic bone fractures, characteristic face, and high palate. Probable: These characteristics plus lack of T(H)17 cells or a family history for definitive HIES. Definitive: These characteristics plus a dominant-negative heterozygous mutation in STAT3.
The Journal of allergy and clinical immunology 02/2010; 125(2):424-432.e8. · 12.05 Impact Factor
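The three-tier guideline stated at the end of the abstract above can be read as a small decision rule. The sketch below is one possible reading, for illustration only and not for clinical use. It assumes that "these characteristics" refers to the "Possible" criteria (IgE >1000 IU/mL and a weighted score >30). The abstract does not give the feature weights, so the weighted score is taken as an input; the function and parameter names are hypothetical.

```python
def hies_tier(ige_iu_ml, weighted_score, lacks_th17=False,
              family_history=False, stat3_dn_mutation=False):
    """Return 'definitive', 'probable', 'possible', or None.

    weighted_score is the already-computed weighted clinical score;
    the weights themselves are not stated in the abstract.
    """
    # Base criteria shared by all three tiers (assumed reading).
    if not (ige_iu_ml > 1000 and weighted_score > 30):
        return None
    # Definitive: base criteria plus a dominant-negative heterozygous STAT3 mutation.
    if stat3_dn_mutation:
        return "definitive"
    # Probable: base criteria plus lack of T(H)17 cells or a family history.
    if lacks_th17 or family_history:
        return "probable"
    return "possible"


tier = hies_tier(ige_iu_ml=2500, weighted_score=42, lacks_th17=True)
```

Here `tier` is `"probable"`: the base criteria are met and T(H)17 cells are lacking, but no STAT3 mutation is recorded.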
ABSTRACT: Background: The molecular cause of inflammatory bowel disease is largely unknown.
Methods: We performed genetic-linkage analysis and candidate-gene sequencing on samples from two unrelated consanguineous families with children who were affected by early-onset inflammatory bowel disease. We screened six additional patients with early-onset colitis for mutations in two candidate genes and carried out functional assays in patients' peripheral-blood mononuclear cells. We performed an allogeneic hematopoietic stem-cell transplantation in one patient.
Results: In four of nine patients with early-onset colitis, we identified three distinct homozygous mutations in genes IL10RA and IL10RB, encoding the IL10R1 and IL10R2 proteins, respectively, which form a heterotetramer to make up the interleukin-10 receptor. The mutations abrogate interleukin-10-induced signaling, as shown by deficient STAT3 (signal transducer and activator of transcription 3) phosphorylation on stimulation with interleukin-10. Consistent with this observation was the increased secretion of tumor necrosis factor alpha and other proinflammatory cytokines from peripheral-blood mononuclear cells from patients who were deficient in IL10R subunit proteins, suggesting that interleukin-10-dependent "negative feedback" regulation is disrupted in these cells. The allogeneic stem-cell transplantation performed in one patient was successful.
Conclusions: Mutations in genes encoding the IL10R subunit proteins were found in patients with early-onset enterocolitis, involving hyperinflammatory immune responses in the intestine. Allogeneic stem-cell transplantation resulted in disease remission in one patient.
New England Journal of Medicine 11/2009; 361(21):2033-45. · 54.42 Impact Factor
ABSTRACT: Background: While attempting to reanalyze published data from Agilent 4 x 44 human expression chips, we found that some of the 60-mer oligonucleotide features could not be interpreted as representing single human genes. For example, some of the oligonucleotides align with the transcripts of more than one gene. We decided to check the annotations for all autosomes and the X chromosome systematically using bioinformatics methods.
Results: Out of 42683 reporters, we found that 25505 (60%) passed all our tests and are considered "fully valid". 9964 (23%) reporters did not have a meaningful identifier, mapped to the wrong chromosome, or did not pass basic alignment tests, preventing us from correlating the expression values of these reporters with a unique annotated human gene. The remaining 7214 (17%) reporters could be associated with either a unique gene or a unique intergenic location, but could not be mapped to a transcript in RefSeq. The 7214 reporters are further partitioned into three different levels of validity.
Conclusions: Expression array studies should evaluate the annotations of reporters and remove those reporters that have suspect annotations. This evaluation can be done systematically and semi-automatically, but one must recognize that data sources are frequently updated, leading to slightly changing validation results over time.
ABSTRACT: Position specific score matrices (PSSMs) are derived from multiple sequence alignments to aid in the recognition of distant protein sequence relationships. The PSI-BLAST protein database search program derives the column scores of its PSSMs with the aid of pseudocounts, added to the observed amino acid counts in a multiple alignment column. In the absence of theory, the number of pseudocounts used has been a completely empirical parameter. This article argues that the minimum description length principle can motivate the choice of this parameter. Specifically, for realistic alignments, the principle supports the practice of using a number of pseudocounts essentially independent of alignment size. However, it also implies that more highly conserved columns should use fewer pseudocounts, increasing the inter-column contrast of the implied PSSMs. A new method for calculating pseudocounts that significantly improves PSI-BLAST's retrieval accuracy is now employed by default.
Nucleic Acids Research 01/2009; 37(3):815-24. · 8.81 Impact Factor
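To make the role of pseudocounts in the abstract above more concrete, here is a minimal sketch of how pseudocounts can be mixed with observed counts before computing log-odds column scores. It is a simplification under stated assumptions: the pseudocount distribution is taken to be the background frequencies (PSI-BLAST derives it differently), the alphabet is a toy four-letter one, and `column_scores` and `beta` are hypothetical names. It does not implement the paper's method for choosing the number of pseudocounts.

```python
import math


def column_scores(counts, background, beta):
    """Log-odds scores (in bits) for one alignment column.

    counts:     observed residue counts in the column
    background: background residue frequencies (also used here as the
                pseudocount distribution, which is an assumption)
    beta:       number of pseudocounts added to the column (must be > 0)
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    alpha = sum(counts.values())     # number of observations in the column
    scores = {}
    for residue, p in background.items():
        # Estimated frequency: observed counts plus beta pseudocounts.
        q = (counts.get(residue, 0) + beta * p) / (alpha + beta)
        scores[residue] = math.log2(q / p)
    return scores


# Toy example: a fully conserved column of 8 residues.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
conserved = {"A": 8}
few = column_scores(conserved, background, beta=2)
many = column_scores(conserved, background, beta=10)
```

With fewer pseudocounts, the conserved residue scores higher and the unobserved residues score lower, so the column stands out more sharply. This is the contrast effect the abstract refers to for highly conserved columns.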
ABSTRACT: TNFRSF13B encodes transmembrane activator and calcium modulator and cyclophilin ligand interactor (TACI), a B cell-specific tumor necrosis factor (TNF) receptor superfamily member. Both biallelic and monoallelic TNFRSF13B mutations were identified in patients with common variable immunodeficiency disorders. The genetic complexity and variable clinical presentation of TACI deficiency prompted us to evaluate the genetic, immunologic, and clinical condition in 50 individuals with TNFRSF13B alterations, following screening of 564 unrelated patients with hypogammaglobulinemia. We identified 13 new sequence variants. The most frequent TNFRSF13B variants (C104R and A181E; n=39; 6.9%) were also present in a heterozygous state in 2% of 675 controls. All patients with biallelic mutations had hypogammaglobulinemia and nearly all showed impaired binding to a proliferation-inducing ligand (APRIL). However, the majority (n=41; 82%) of the patients carried monoallelic changes in TNFRSF13B. Presence of a heterozygous mutation was associated with antibody deficiency (P< .001, relative risk 3.6). Heterozygosity for the most common mutation, C104R, was associated with disease (P< .001, relative risk 4.2). Furthermore, heterozygosity for C104R was associated with low numbers of IgD(-)CD27(+) B cells (P= .019), benign lymphoproliferation (P< .001), and autoimmune complications (P= .001). These associations indicate that C104R heterozygosity increases the risk for common variable immunodeficiency disorders and influences clinical presentation.
ABSTRACT: Motivation: The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments.
Results: We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program.
Availability: The scripts for performing evaluations are available upon request from the authors.
ABSTRACT: The DUST module has been used within BLAST for many years to mask low-complexity sequences. In this paper, we present a new implementation of the DUST module that uses the same function to assign a complexity score to a sequence, but uses a different rule by which high-scoring sequences are masked. The new rule masks every nucleotide masked by the old rule and occasionally masks more. The new masking rule corrects two related deficiencies of the old rule. First, the new rule is symmetric with respect to reversing the sequence. Second, the new rule is not context sensitive; the decision to mask a subsequence does not depend on what sequences flank it. The new implementation is at least four times faster than the old on the human genome. We show that both the percentage of additional bases masked and the effect on MegaBLAST outputs are very small.
Journal of Computational Biology 07/2006; 13(5):1028-40. · 1.56 Impact Factor
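The abstract above refers to a function that assigns a complexity score to a sequence without spelling it out. As a hedged illustration, the sketch below computes a triplet-based score of the kind commonly associated with DUST: it counts overlapping nucleotide triplets and rewards repeated ones. The exact formula, the windowing, and the masking threshold used by the actual implementation are not given in the abstract, so the details here are assumptions, and `triplet_score` is a hypothetical name. The masking rule itself, which is the subject of the paper, is not implemented.

```python
from collections import Counter


def triplet_score(seq):
    """Score a sequence by how often its overlapping triplets repeat.

    Higher scores indicate lower complexity. Assumed form: the sum over
    triplets t of c_t * (c_t - 1) / 2, divided by (number of triplets - 1).
    """
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if len(triplets) < 2:
        return 0.0                   # too short to contain a repeat
    counts = Counter(triplets)
    repeats = sum(c * (c - 1) / 2 for c in counts.values())
    return repeats / (len(triplets) - 1)


low_complexity = triplet_score("AAAAAAAAAA")
mixed = triplet_score("ACGTACGT")
mixed_reversed = triplet_score("ACGTACGT"[::-1])
```

A homopolymer run scores far higher than a mixed sequence. Under this formula the score itself does not change when the sequence is reversed, since reversal maps triplets one-to-one; the asymmetry described in the abstract concerns the old masking rule, not the score.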