Carlo Sidore

Vanderbilt University, Nashville, TN, USA

Publications (6) · 88.83 Total impact

  • Article: Genotype calling and haplotyping in parent-offspring trios.
    ABSTRACT: Emerging sequencing technologies allow common and rare variants to be systematically assayed across the human genome in many individuals. In order to improve variant detection and genotype calling, raw sequence data are typically examined across many individuals. Here, we describe a method for genotype calling in settings where sequence data are available for unrelated individuals and parent-offspring trios and show that modeling trio information can greatly increase the accuracy of inferred genotypes and haplotypes, especially on low to modest depth sequencing data. Our method considers both linkage disequilibrium (LD) patterns and the constraints imposed by family structure when assigning individual genotypes and haplotypes. Using simulations, we show trios provide higher genotype calling accuracy across the frequency spectrum, both overall and at hard-to-call heterozygous sites. In addition, trios provide greatly improved phasing accuracy - improving the accuracy of downstream analyses (such as genotype imputation) that rely on phased haplotypes. To further evaluate our approach, we analyzed data on the first 508 individuals sequenced by the SardiNIA sequencing project. Our results show that our method reduces the genotyping error rate by 50% compared to analysis using existing methods that ignore family structure. We anticipate our method will facilitate genotype calling and haplotype inference for many ongoing sequencing projects.
    Genome Research 10/2012; · 13.61 Impact Factor
  • Article: A likelihood-based framework for variant calling and de novo mutation detection in families.
    ABSTRACT: Family samples, which can be enriched for rare causal variants by focusing on families with multiple extreme individuals and which facilitate detection of de novo mutation events, provide an attractive resource for next-generation sequencing studies. Here, we describe, implement, and evaluate a likelihood-based framework for analysis of next generation sequence data in family samples. Our framework is able to identify variant sites accurately and to assign individual genotypes, and can handle de novo mutation events, increasing the sensitivity and specificity of variant calling and de novo mutation detection. Through simulations we show explicit modeling of family relationships is especially useful for analyses of low-frequency variants and that genotype accuracy increases with the number of individuals sequenced per family. Compared with the standard approach of ignoring relatedness, our methods identify and accurately genotype more variants, and have high specificity for detecting de novo mutation events. The improvement in accuracy using our methods over the standard approach is particularly pronounced for low-frequency variants. Furthermore the family-aware calling framework dramatically reduces Mendelian inconsistencies and is beneficial for family-based analysis. We hope our framework and software will facilitate continuing efforts to identify genetic factors underlying human diseases.
    PLoS Genetics 10/2012; 8(10):e1002944. · 8.69 Impact Factor
  • Article: A genome-wide association scan on the levels of markers of inflammation in Sardinians reveals associations that underpin its complex regulation.
    ABSTRACT: Identifying the genes that influence levels of pro-inflammatory molecules can help to elucidate the mechanisms underlying this process. We first conducted a two-stage genome-wide association scan (GWAS) for the key inflammatory biomarkers Interleukin-6 (IL-6), the general measure of inflammation erythrocyte sedimentation rate (ESR), monocyte chemotactic protein-1 (MCP-1), and high-sensitivity C-reactive protein (hsCRP) in a large cohort of individuals from the founder population of Sardinia. By analysing 731,213 autosomal or X chromosome SNPs and an additional ∼1.9 million imputed variants in 4,694 individuals, we identified several SNPs associated with the selected quantitative trait loci (QTLs) and replicated all the top signals in an independent sample of 1,392 individuals from the same population. Next, to increase power to detect and resolve associations, we further genotyped the whole cohort (6,145 individuals) for 293,875 variants included on the ImmunoChip and MetaboChip custom arrays. Overall, our combined approach led to the identification of 9 genome-wide significant novel independent signals (5 of which were identified only with the custom arrays) and provided confirmatory evidence for an additional 7. Novel signals include: for IL-6, in the ABO gene (rs657152, p = 2.13×10^-29); for ESR, at the HBB (rs4910472, p = 2.31×10^-11) and UNC119B/SPPL3 (rs11829037, p = 8.91×10^-10) loci; for MCP-1, near its receptor CCR2 (rs17141006, p = 7.53×10^-13) and in CADM3 (rs3026968, p = 7.63×10^-13); for hsCRP, within the CRP gene (rs3093077, p = 5.73×10^-21), near DARC (rs3845624, p = 1.43×10^-10), UNC119B/SPPL3 (rs11829037, p = 1.50×10^-14), and ICOSLG/AIRE (rs113459440, p = 1.54×10^-08) loci. Confirmatory evidence was found for IL-6 in the IL-6R gene (rs4129267); for ESR at CR1 (rs12567990) and TMEM57 (rs10903129); for MCP-1 at DARC (rs12075); and for hsCRP at CRP (rs1205), HNF1A (rs225918), and APOC-I (rs4420638). Our results improve the current knowledge of genetic variants underlying inflammation and provide novel clues for the understanding of the molecular mechanisms regulating this complex process.
    PLoS Genetics 01/2012; 8(1):e1002480. · 8.69 Impact Factor
  • Article: Fine mapping of five loci associated with low-density lipoprotein cholesterol detects variants that double the explained heritability.
    ABSTRACT: Complex trait genome-wide association studies (GWAS) provide an efficient strategy for evaluating large numbers of common variants in large numbers of individuals and for identifying trait-associated variants. Nevertheless, GWAS often leave much of the trait heritability unexplained. We hypothesized that some of this unexplained heritability might be due to common and rare variants that reside in GWAS identified loci but lack appropriate proxies in modern genotyping arrays. To assess this hypothesis, we re-examined 7 genes (APOE, APOC1, APOC2, SORT1, LDLR, APOB, and PCSK9) in 5 loci associated with low-density lipoprotein cholesterol (LDL-C) in multiple GWAS. For each gene, we first catalogued genetic variation by re-sequencing 256 Sardinian individuals with extreme LDL-C values. Next, we genotyped variants identified by us and by the 1000 Genomes Project (totaling 3,277 SNPs) in 5,524 volunteers. We found that in one locus (PCSK9) the GWAS signal could be explained by a previously described low-frequency variant and that in three loci (PCSK9, APOE, and LDLR) there were additional variants independently associated with LDL-C, including a novel and rare LDLR variant that seems specific to Sardinians. Overall, this more detailed assessment of SNP variation in these loci increased estimates of the heritability of LDL-C accounted for by these genes from 3.1% to 6.5%. All association signals and the heritability estimates were successfully confirmed in a sample of ∼10,000 Finnish and Norwegian individuals. Our results thus suggest that focusing on variants accessible via GWAS can lead to clear underestimates of the trait heritability explained by a set of loci. Further, our results suggest that, as prelude to large-scale sequencing efforts, targeted re-sequencing efforts paired with large-scale genotyping will increase estimates of complex trait heritability explained by known loci.
    PLoS Genetics 07/2011; 7(7):e1002198. · 8.69 Impact Factor
  • Article: Low-coverage sequencing: implications for design of complex trait association studies.
    ABSTRACT: New sequencing technologies allow genomic variation to be surveyed in much greater detail than previously possible. While detailed analysis of a single individual typically requires deep sequencing, when many individuals are sequenced it is possible to combine shallow sequence data across individuals to generate accurate calls in shared stretches of chromosome. Here, we show that, as progressively larger numbers of individuals are sequenced, increasingly accurate genotype calls can be generated for a given sequence depth. We evaluate the implications of low-coverage sequencing for complex trait association studies. We systematically compare study designs based on genotyping of tagSNPs, sequencing of many individuals at depths ranging between 2× and 30×, and imputation of variants discovered by sequencing a subset of individuals into the remainder of the sample. We show that sequencing many individuals at low depth is an attractive strategy for studies of complex trait genetics. For example, for disease-associated variants with frequency >0.2%, sequencing 3000 individuals at 4× depth provides similar power to deep sequencing of >2000 individuals at 30× depth but requires only ~20% of the sequencing effort. We also show low-coverage sequencing can be used to build a reference panel that can drive imputation into additional samples to increase power further. We provide guidance for investigators wishing to combine results from sequenced, genotyped, and imputed samples.
    Genome Research 04/2011; 21(6):940-51. · 13.61 Impact Factor
  • Article: Variants within the immunoregulatory CBLB gene are associated with multiple sclerosis.
    ABSTRACT: A genome-wide association scan of approximately 6.6 million genotyped or imputed variants in 882 Sardinian individuals with multiple sclerosis (cases) and 872 controls suggested association of CBLB gene variants with disease, which was confirmed in 1,775 cases and 2,005 controls (rs9657904, overall P = 1.60 × 10^-10, OR = 1.40). CBLB encodes a negative regulator of adaptive immune responses, and mice lacking the ortholog are prone to experimental autoimmune encephalomyelitis, the animal model of multiple sclerosis.
    Nature Genetics 06/2010; 42(6):495-7. · 35.53 Impact Factor

Institutions

  • 2012
    • Vanderbilt University
      Nashville, TN, USA
  • 2011
    • National Research Council
      • Institute of Neurogenetics and Neuropharmacology IRGB
      Roma, Latium, Italy
    • University of North Carolina at Chapel Hill
      • Department of Genetics
      Chapel Hill, NC, USA