The signatures of autozygosity among patients with colorectal cancer.
ABSTRACT Previous studies have shown that among populations with a high rate of consanguinity, there is a significant increase in the prevalence of cancer. Single nucleotide polymorphism (SNP) array data (Affymetrix, 50K XbaI) analysis revealed long regions of homozygosity in genomic DNAs taken from tumor and matched normal tissues of colorectal cancer (CRC) patients. The presence of these regions in the genome may indicate levels of consanguinity in the individual's family lineage. We refer to these autozygous regions as identity-by-descent (IBD) segments. In this study, we compared IBD segments in 74 mostly Caucasian CRC patients (mean age of 66 years) to two control data sets: (a) 146 Caucasian individuals (mean age of 80 years) who participated in an age-related macular degeneration (AMD) study and (b) 118 cancer-free Caucasian individuals from the Framingham Heart Study (mean age of 67 years). Our results show that the percentage of CRC patients with IBD segments (≥4 Mb length and 50 SNPs probed) in the genome is at least twice as high as in the AMD or Framingham control groups. Also, the average length of these IBD regions in the CRC patients is more than twice that in the two control data sets. Compared with control groups, IBD segments are found to be more common among individuals of Jewish background. We believe that these IBD segments within CRC patients are likely to harbor important CRC-related genes with low-penetrance SNPs and/or mutations, and, indeed, two recently identified CRC predisposition SNPs in the 8q24 region were confirmed to be homozygous in one particular patient carrying an IBD segment covering the region.
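The segment criterion quoted in the abstract (at least 4 Mb long, 50 SNPs probed) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the genotype encoding ("AA"/"BB"/"AB", `None` for a no-call), the reading of "50 SNPs" as "at least 50", and the rule that any single heterozygous call ends a run are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

MIN_LENGTH_BP = 4_000_000  # >= 4 Mb, threshold taken from the abstract
MIN_SNPS = 50              # read here as "at least 50 SNPs" (assumption)


def find_homozygous_segments(
    positions: Sequence[int],
    genotypes: Sequence[Optional[str]],
    min_length_bp: int = MIN_LENGTH_BP,
    min_snps: int = MIN_SNPS,
) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for qualifying runs on one chromosome.

    Hypothetical encoding: "AA"/"BB" are homozygous, "AB" is heterozygous,
    None is a no-call. No-calls are skipped; a heterozygous call ends the run.
    """
    segments: List[Tuple[int, int, int]] = []
    run: List[int] = []  # positions of homozygous SNPs in the current run

    def close_run() -> None:
        # Keep the run only if it meets both the SNP-count and length thresholds.
        if len(run) >= min_snps and run[-1] - run[0] >= min_length_bp:
            segments.append((run[0], run[-1], len(run)))
        run.clear()

    for pos, gt in zip(positions, genotypes):
        if gt is None:
            continue  # no-call: neither extends nor breaks the run
        if gt[0] == gt[1]:
            run.append(pos)
        else:
            close_run()
    close_run()
    return segments


# Toy data: one SNP every 100 kb; a single heterozygous call splits the chromosome.
positions = [i * 100_000 for i in range(100)]
genotypes = ["AA"] * 60 + ["AB"] + ["BB"] * 39
print(find_homozygous_segments(positions, genotypes))  # [(0, 5900000, 60)]
```

In the toy data only the first run qualifies; the second has 39 SNPs and falls below the count threshold. A real analysis would usually tolerate occasional genotyping errors inside a run, which this sketch deliberately omits.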
Article: Mass homozygotes accumulation in the NCI-60 cancer cell lines as compared to HapMap Trios, and relation to fragile site location.
ABSTRACT: Runs of homozygosity (ROH) are extended stretches of homozygous genotypes spanning a long genomic distance. In oncology, such a stretch is known as loss of heterozygosity (LOH) if it is identified exclusively in cancer cells rather than in matched control cells. Studies have identified several genomic regions which show consistent ROH in different kinds of carcinoma. To query whether this consistency can be observed on a broader spectrum, both in more cancer types and in wider genomic regions, we investigated ROH patterns in the National Cancer Institute 60 cancer cell line panel (NCI-60) and HapMap Caucasian healthy trio families. Using results from Affymetrix 500 K SNP arrays, we report a genome-wide significant association of ROH regions between the NCI-60 and HapMap samples, with a much higher level of ROH (11 fold) in the cancer cell lines. Analysis shows that the more severe ROH found in cancer cells appears to be an extension of ROH already existing in the healthy state. In the HapMap trios, the adult subgroup had a slightly but significantly higher level (1.02 fold) of ROH than did the young subgroup. For several ROH regions we observed the co-occurrence of fragile sites (FRAs). However, FRAs at the genome-wide level do not show a clear relationship with ROH regions.
PLoS ONE 01/2012; 7(2):e31628. · 4.09 Impact Factor
Article: Integrative genomic analysis reveals extended germline homozygosity with lung cancer risk in the PLCO cohort.
ABSTRACT: Susceptibility to common cancers is multigenic, resulting from low-to-high penetrance predisposition factors and environmental exposure. Genomic studies suggest germline homozygosity as a novel low-penetrance factor contributing to common cancers. We hypothesized that long homozygous regions (tracts-of-homozygosity [TOH]) harbor tobacco-dependent and independent lung-cancer predisposition (or protection) genes. We performed in silico genome-wide SNP-array-based analysis of lung-cancer patients of European ancestry from the PLCO screening-trial cohort to identify TOH regions amongst 788 cancer cases and 830 ancestry-matched controls. Association analysis was then performed between presence of lung cancer and common (c)TOHs (operationally defined as 10 or more subjects sharing ≥100 identical homozygous calls), aTOHs (allelically-matched groups within a cTOH), demographics and tobacco exposure. Finally, integration of significant c/aTOH with transcriptome was performed to functionally map lung-cancer risk genes. After controlling for demographics and smoking, we identified 7 cTOHs and 5 aTOHs associated with lung cancer (adjusted p<0.01). Three cTOHs were over-represented in cases over controls (OR = 1.75-2.06, p = 0.007-0.001), whereas 4 were under-represented (OR = 0.28-0.69, p = 0.006-0.001). Interaction between smoking status and cTOH3/aTOH2 (2p16.3-2p16.1) was observed (adjusted p<0.03). The remaining significant aTOHs have ORs 0.23-0.50 (p = 0.004-0.006) and 2.95-3.97 (p = 0.008-0.001). Integrating significant cTOH/aTOHs with publicly available lung-cancer transcriptome datasets, followed by filtering based on lung cancer and its relevant pathways, revealed 9 putative predisposing genes (p<0.0001). In conclusion, differentially distributed cTOH/aTOH genomic variants between cases and controls harbor sets of plausible differentially expressed genes accounting for the complexity of lung-cancer predisposition.
PLoS ONE 01/2012; 7(2):e31975. · 4.09 Impact Factor
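The odds ratios reported in this abstract compare how often a given cTOH appears in cases versus controls. As a rough illustration of the arithmetic only, the Python sketch below computes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval from a 2x2 table. The carrier counts are hypothetical, only the group sizes (788 cases, 830 controls) come from the abstract, and the study's estimates were adjusted for demographics and smoking, which this sketch does not attempt.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(
    case_carriers: int,
    case_total: int,
    control_carriers: int,
    control_total: int,
    z: float = 1.96,  # normal quantile for an approximate 95% interval
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio and Woolf confidence interval for a 2x2 table."""
    a = case_carriers                     # cases carrying the region
    b = case_total - case_carriers        # cases not carrying it
    c = control_carriers                  # controls carrying the region
    d = control_total - control_carriers  # controls not carrying it
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts (60 and 35); group sizes are from the abstract.
or_, lo, hi = odds_ratio_with_ci(60, 788, 35, 830)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

An odds ratio above 1 corresponds to a region over-represented in cases, and one below 1 to a region under-represented in cases, which is how the abstract separates its two groups of cTOHs.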
ABSTRACT: Identification of disease variants via homozygosity mapping and investigation of the effects of genome-wide homozygosity regions on traits of biomedical importance have been widely applied recently. Nonetheless, the existing methods and algorithms to identify long tracts of homozygosity (TOH) are not able to provide efficient and rigorous regions for further downstream association investigation. We expanded current methods to identify TOHs by defining "surrogate-TOH", a region covering a cluster of TOHs with specific characteristics. Our defined surrogate-TOH includes cTOH, viz. a common TOH region where at least ten TOHs are present; gTOH, whereby a group of highly overlapping TOHs share proximal boundaries; and aTOH, which are allelically-matched TOHs. Searching for gTOH and aTOH was based on a repeated binary spectral clustering algorithm, where a hierarchy of clusters is created and represented by a TOH cluster tree. Based on the proposed method of identifying different species of surrogate-TOH, our cgaTOH software was developed. The software provides an intuitive and interactive visualization tool for better investigation of the high-throughput output with special interactive navigation rings, which will find applicability in both conventional association studies and more sophisticated downstream analyses. The NCBI genome map viewer is incorporated into the system. Moreover, we discuss the choice of appropriate empirical ranges for critical parameters by applying the method to disease models. This method identifies various patterned clusters of SNPs demonstrating extended homozygosity, so that one can observe different aspects of the multi-faceted characteristics of TOHs.
PLoS ONE 01/2013; 8(3):e57772. · 4.09 Impact Factor
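The cTOH definition above (a region where at least ten TOHs are present) can be pictured as an interval-coverage problem. The Python sketch below uses a simple sweep over interval endpoints; it is not the cgaTOH implementation, and the function name, the closed-interval convention, and the assumption that each subject contributes at most one TOH per locus (so interval count equals subject count) are illustrative choices only.

```python
from typing import List, Sequence, Tuple


def common_regions(
    tohs: Sequence[Tuple[int, int]],
    min_count: int = 10,  # "at least ten TOHs", from the abstract
) -> List[Tuple[int, int]]:
    """Return maximal regions covered by at least `min_count` TOH intervals.

    Each TOH is a closed (start_bp, end_bp) interval on one chromosome.
    """
    events = []
    for start, end in tohs:
        events.append((start, 1))   # a TOH opens
        events.append((end, -1))    # a TOH closes
    # At equal positions, process openings before closings so that
    # intervals touching at a single base count as overlapping.
    events.sort(key=lambda e: (e[0], -e[1]))

    regions: List[Tuple[int, int]] = []
    depth = 0            # number of TOHs covering the current position
    region_start = None  # start of the region currently at/above threshold
    for pos, delta in events:
        depth += delta
        if delta == 1 and depth == min_count and region_start is None:
            region_start = pos
        elif delta == -1 and depth == min_count - 1 and region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    return regions


# Toy data: nine identical TOHs plus one partially overlapping TOH.
toy = [(100, 500)] * 9 + [(300, 800), (900, 1000)]
print(common_regions(toy))  # [(300, 500)]
```

The gTOH and aTOH groupings described in the abstract rely on spectral clustering and allele matching, which need genotype-level data and are outside the scope of this sketch.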