Indranil Mukhopadhyay

Indian Statistical Institute, Baranagore, Bengal, India

Publications (21) · 51.15 Total impact

  • ABSTRACT: Deregulation of miRNA expression may contribute to tumorigenesis and other patho-physiology associated with cancer. Using TLDA, expression of 762 miRNAs was checked in 18 pairs of gingivo buccal cancer-adjacent control tissues. Expression of significantly deregulated miRNAs was further validated in cancer and examined in two types of precancer (leukoplakia and lichen planus) tissues by primer-specific TaqMan assays. Biological implications of these miRNAs were assessed bioinformatically. Expression of hsa-miR-1293, hsa-miR-31, hsa-miR-31* and hsa-miR-7 was significantly up-regulated and that of hsa-miR-206, hsa-miR-204 and hsa-miR-133a was significantly down-regulated in all cancer samples. Expression of only hsa-miR-31 was significantly up-regulated in leukoplakia but none in lichen planus samples. Analysis of expression heterogeneity divided 18 cancer samples into clusters of 13 and 5 samples and revealed that expression of 30 miRNAs (including the above-mentioned 7 miRNAs) was significantly deregulated in the cluster of 13 samples. From database mining and pathway analysis it was observed that these miRNAs can significantly target many of the genes present in different cancer related pathways such as "proteoglycans in cancer", PI3K-AKT etc. which play important roles in expression of different molecular features of cancer. Expression of hsa-miR-31 was significantly up-regulated in both cancer and leukoplakia tissues and, thus, may be one of the molecular markers of leukoplakia which may progress to gingivo-buccal cancer.
    Copyright: © 2014 De Sarkar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files. Funding: All funding came from Department of Biotechnology, Government of India, and Indian Statistical Institute to Prof. Bidyut Roy. The funder has no role in study design, data collection, analysis, decision to publish and manuscript preparation. Competing Interests: The authors have declared that no competing interests exist.
    PLoS ONE 08/2014; · 3.53 Impact Factor
  • ABSTRACT: MicroRNAs have been implicated in cancer but studies on their role in precancer, such as leukoplakia, are limited. Sequence variations at eight miRNA and four miRNA processing genes were studied in 452 healthy controls and 299 leukoplakia patients to estimate risk of disease.
    Journal of biomedical science. 05/2014; 21(1):48.
  • ABSTRACT: Human papillomavirus (HPV) types 16/18 are reportedly most common in cervical cancer (CaCx) with geographical variation of genotypes. HPV16 predominates both in squamous cell carcinoma (SCC) and adenocarcinoma in India, contrary to reported global predominance of HPV18 in the latter. Our study was aimed to determine the occurrence of HPV16/18 among histopathological types of cervical intra-epithelial neoplasia (CIN) and invasive CaCx from North Bengal, India and to identify any major deviation from the known Indian scenario of distribution of HPV16/18 genotypes in cases of SCC and adenocarcinoma. This was a retrospective, cross-sectional, case-only type of study, in which 40 cases were histopathologically diagnosed as CIN/CaCx, on which polymerase chain reaction (PCR), deoxyribonucleic acid (DNA)-sequencing and bioinformatics by the Basic Local Alignment Search Tool (BLAST) were performed for HPV-genotyping. The distribution of HPV genotypes among cases of SCC and adenocarcinoma was compared by Fisher's exact test. HPV was detected in 97.5% (39/40) cases. HPV16-infected cases (32/39; 82.05%) predominated over HPV18-infected ones (7/39; 17.95%). However, HPV18-only infection was significantly (P = 0.0045, one-sided Fisher's exact test) more among adenocarcinoma (3/4; 75%) than SCC (2/26; 7.69%) contrary to HPV16-only infection (SCC = 24/26, 92.31%; adenocarcinoma = 1/4; 25%) whereas both CIN3 cases were HPV16-positive. Predominance of HPV18 over HPV16 in cases of adenocarcinoma in this region was contrasting to that of earlier Indian studies suggesting research on HPV18 related cervical carcinogenesis. PCR and DNA-sequencing could prove to be highly effective tools in HPV detection and genotyping. The study reported HPV16/18 infection in almost 98% of the cases, the knowledge about which might prove useful in future population based studies on HPV genotyping and designing of appropriate HPV-vaccines for this region.
    Journal of mid-life health. 01/2014; 5(1):14-22.
  • ABSTRACT: Alcohol induced liver disease or alcoholic liver disease (ALD), a complex trait, encompasses a gamut of pathophysiological alterations in the liver due to continuous exposure to a toxic amount of alcohol (more than 80g per day). Of all chronic heavy drinkers, only 15-20% develop hepatitis or cirrhosis concomitantly or in succession. Several studies revealed that inter-individual as well as inter-ethnic genetic variation is one of the major factors that predispose to ALD. The role of genetic factors in ALD has long been sought for in ethnically distinct population groups. ALD is fast emerging as an important cause of chronic liver disease in India; even in populations such as "Bengalis" who were "culturally immune" earlier. While the genetic involvement in the pathogenesis of ALD is being sought for in different races, the complex pathophysiology of ALD as well as the knowledge of population level diversity of the relevant alcohol metabolizing and inflammatory pathways mandates the need for well designed studies of genetic factors in ethnically distinct population groups. An array of cytokines plays a critical role as mediators of injury, inflammation, fibrosis and cirrhosis in ALD. We, therefore, studied the association of polymorphisms in five relevant cytokine genes with "clinically significant" ALD in an ethnic "Bengali" population in Eastern India. Compared with "alcoholic" controls without liver disease (n=110), TNFα -238AA genotype, IL1β -511CC genotype, TGFβ1 -509CC genotype and IL10 -592AA genotype were significantly overrepresented in ALD patients (n=181; OR=2.4 and 95% CI 1.2-5.5, P(genotype)=0.042, P(allelic)=0.008; OR=2.7 and 95% CI 1.2-5.9, P(genotype)=0.018, P(allelic)=0.023; OR=4.7 and 95% CI 1.7-13.1, P(genotype)=0.003, P(allelic)=0.014; and OR=2.2 and 95% CI 1.1-4.8, P(genotype)=0.04, P(allelic)=0.039 respectively). Moreover, a cumulative genetic risk analysis revealed a significant trend for developing ALD with an increase in the number of risk alleles on IL10 and TGFβ1 loci among alcoholics. The risk genotype of IL1β and TGFβ1 also influences the total bilirubin, albumin and alanine aminotransferase levels among alcoholic "Bengalis". The present study is the first case-control study from Eastern India that comprehensively identified polymorphic markers in TNFα, IL10, IL1β and TGFβ1 genes to be associated with ALD in the Bengali population, accentuating the significance of genetic factors in clinical expressions of ALD.
    Gene 08/2012; 509(1):178-88. · 2.20 Impact Factor
  • ABSTRACT: This study was undertaken to decipher the interdependent roles of (i) methylation within E2 binding site I and II (E2BS-I/II) and replication origin (nt 7862) in the long control region (LCR), (ii) expression of viral oncogene E7, (iii) expression of the transcript (E7-E1∧E4) that encodes E2 repressor protein and (iv) viral load, in human papillomavirus 16 (HPV16) related cervical cancer (CaCx) pathogenesis. The results revealed over-representation (p<0.001) of methylation at nucleotide 58 of E2BS-I among E2-intact CaCx cases compared to E2-disrupted cases. Bisulphite sequencing of LCR revealed overrepresentation of methylation at nucleotide 58 or other CpGs in E2BS-I/II, among E2-intact cases than E2-disrupted cases and lack of methylation at replication origin in case of both. The viral transcript (E7-E1∧E4) that produces the repressor E2 was analyzed by APOT (amplification of papillomavirus oncogenic transcript)-coupled-quantitative-RT-PCR (of E7 and E4 genes) to distinguish episomal (pure or concomitant with integrated) from purely integrated viral genomes based on the ratio, E7 C(T)/E4 C(T). Relative quantification based on comparative C(T) (threshold cycle) method revealed 75.087-fold higher E7 mRNA expression in episomal cases over purely integrated cases. Viral load and E2 gene copy numbers were negatively correlated with E7 C(T) (p = 0.007) and E2 C(T) (p<0.0001), respectively, each normalized with ACTB C(T), among episomal cases only. The k-means clustering analysis considering E7 C(T) from APOT-coupled-quantitative-RT-PCR assay, in conjunction with viral load, revealed immense heterogeneity among the HPV16 positive CaCx cases portraying integrated viral genomes. The findings provide novel insights into HPV16 related CaCx pathogenesis and highlight that CaCx cases that harbour episomal HPV16 genomes with intact E2 are likely to be distinct biologically, from the purely integrated viral genomes in terms of host genes and/or pathways involved in cervical carcinogenesis.
    PLoS ONE 01/2012; 7(9):e44678. · 3.53 Impact Factor
  • ABSTRACT: Genetic variations in toll-like receptors and cytokine genes of the innate immune pathways have been implicated in controlling parasite growth and the pathogenesis of Plasmodium falciparum mediated malaria. We previously published genetic association of TLR4 non-synonymous and TNF-α promoter polymorphisms with P. falciparum blood infection level and here we extend the study considerably by (i) investigating genetic dependence of parasite-load on interleukin-12B polymorphisms, (ii) reconstructing gene-gene interactions among candidate TLRs and cytokine loci, (iii) exploring genetic and functional impact of epistatic models and (iv) providing mechanistic insights into functionality of disease-associated regulatory polymorphisms. Our data revealed that carriage of AA (P = 0.0001) and AC (P = 0.01) genotypes of IL12B 3'UTR polymorphism was associated with a significant increase of mean log-parasitemia relative to rare homozygous genotype CC. Presence of IL12B+1188 polymorphism in five of six multifactor models reinforced its strong genetic impact on malaria phenotype. Elevation of genetic risk in two-component models compared to the corresponding single locus and reduction of IL12B (2.2 fold) and lymphotoxin-α (1.7 fold) expressions in patients' peripheral-blood-mononuclear-cells (PBMCs) under TLR4Thr399Ile risk genotype background substantiated the role of Multifactor Dimensionality Reduction derived models. Marked reduction of promoter activity of TNF-α risk haplotype (C-C-G-G) compared to wild-type haplotype (T-C-G-G) with (84%) and without (78%) LPS stimulation and the loss of binding of transcription factors detected in-silico supported a causal role of TNF-1031. Significantly lower expression of IL12B+1188 AA (5 fold) and AC (9 fold) genotypes compared to CC and under-representation (P = 0.0048) of allele A in transcripts of patients' PBMCs suggested an Allele-Expression-Imbalance (AEI). Allele (A+1188C) dependent differential stability (2 fold) of IL12B-transcripts upon actinomycin-D treatment and observed structural modulation (P = 0.013) of RNA-ensemble were the plausible explanations for AEI. In conclusion, our data provide functional support to the hypothesis that de-regulated receptor-cytokine axis of innate immune pathway influences blood infection level in P. falciparum malaria.
    PLoS ONE 01/2012; 7(10):e46441. · 3.53 Impact Factor
  • Indranil Mukhopadhyay, Sujayam Saha, Saurabh Ghosh
    ABSTRACT: Clinical binary end-point traits are often governed by quantitative precursors. Hence it may be a prudent strategy to analyze a clinical end-point trait by considering a multivariate phenotype vector, possibly including both quantitative and qualitative phenotypes. A major statistical challenge lies in integrating the constituent phenotypes into a reduced univariate phenotype for association analyses. We assess the performances of certain reduced phenotypes using analysis of variance and a model-free quantile-based approach. We find that analysis of variance is more powerful than the quantile-based approach in detecting association, particularly for rare variants. We also find that using a principal component of the quantitative phenotypes and the residual of a logistic regression of the binary phenotype on the quantitative phenotypes may be an optimal method for integrating a binary phenotype with quantitative phenotypes to define a reduced univariate phenotype.
    BMC proceedings 11/2011; 5 Suppl 9:S73.
  • ABSTRACT: Genome-wide association studies have helped us identify thousands of common variants associated with several widespread complex diseases. However, for most traits, these variants account for only a small fraction of phenotypic variance or heritability. Next-generation sequencing technologies are being used to identify additional rare variants hypothesized to have higher effect sizes than the already identified common variants, and to contribute significantly to the fraction of heritability that is still unexplained. Several pooling strategies have been proposed to test the joint association of multiple rare variants, because testing them individually may not be optimal. Within a gene or genomic region, if there are both rare and common variants, testing their joint association may be desirable to determine their synergistic effects. We propose new methods to test the joint association of several rare and common variants with binary and quantitative traits. Our association test for quantitative traits is based on genotypic and phenotypic measures of similarity between pairs of individuals. For the binary trait or case-control samples, we recently proposed an association test based on the genotypic similarity between individuals. Here, we develop a modified version of this test for rare variants. Our tests can be used for samples taken from multiple subpopulations. The power of our test statistics for case-control samples and quantitative traits was evaluated using the GAW17 simulated data sets. Type I error rates for the proposed tests are well controlled. Our tests are able to identify some of the important causal genes in the GAW17 simulated data sets.
    BMC proceedings 11/2011; 5 Suppl 9:S89.
  • ABSTRACT: The determination and ordering of the influencers of customer satisfaction are of paramount interest in various service industries. The theory of logistic regression may be exploited to relate customer satisfaction usually measured in an ordinal scale with possible covariates, measured in metrical, ordinal or binary scales. However, some of the confounders are themselves determined by other covariates under study. This necessitates the use of a simultaneous equation with ordinal endogenous variables. We propose one such approach and demonstrate its efficacy with a real life example.
    Total Quality Management and Business Excellence 01/2011; 22(1):117-130. · 0.59 Impact Factor
  • ABSTRACT: We tested the hypothesis that cervical cancers (CaCx) harbor high HPV16 viral load compared to controls and this is influenced by E2 status and age of subjects. Viral load (natural log transformed values) per 100ng genomic DNA was estimated (152 cases and 87 controls) by Taqman assay. Median viral load was significantly higher (Mann-Whitney U test) among cases (17.21) compared to controls (9.86), irrespective of E2 status or upon considering E2 status as a covariate in logistic regression model (p<0.001). Viral load of E2 intact cases (17.80) was significantly higher (p<0.001) compared to those with disrupted E2 (9.78). At equivalent probability of being a case, viral load was higher among individuals (i) of lower age, irrespective of E2 status, and (ii) with intact E2 but of similar age as those with disrupted E2. Thus viral load in association with E2 status and/or age might be of causal relevance in CaCx pathogenesis.
    Virology 06/2010; 402(1):197-202. · 3.35 Impact Factor
  • ABSTRACT: Recent studies suggest that glaucoma is a neurodegenerative disease in which secondary degenerative losses occur after primary insult by raised intraocular pressure (IOP) or by other associated factors. It has been reported that polymorphisms in the IL1A and IL1B genes are associated with Primary Open Angle Glaucoma (POAG). The purpose of our study was to investigate the role of these polymorphisms in eastern Indian POAG patients. The study involved 315 unrelated POAG patients, consisting of 116 High Tension Glaucoma (HTG) patients with IOP > 21 mmHg and 199 non-HTG patients (presenting IOP < 20 mmHg), and 301 healthy controls from eastern India. Genotypes were determined by polymerase chain reaction and restriction digestion for three single nucleotide polymorphisms (SNPs): IL1A (-889C/T; rs1800587), IL1B (-511C/T; rs16944) and IL1B (3953C/T; rs1143634). Haplotype frequency was determined by Haploview 4.1 software. The association of individual SNPs and major haplotypes was evaluated using chi-square statistics. The p-value was corrected for multiple tests by Bonferroni method. No significant difference was observed in the allele and genotype frequencies for IL1A and IL1B SNPs between total pool of POAG patients and controls. However, on segregating the patient pool to HTG and non-HTG groups, weak association was observed for IL1A polymorphism (-889C/T) where -889C allele was found to portray risk (OR = 1.380; 95% CI = 1.041-1.830; p = 0.025) for non-HTG patients. Similarly, 3953T allele of IL1B polymorphism (+3953C/T) was observed to confer risk to HTG group (OR = 1.561; 95% CI = 1.022-2.385; p = 0.039). On haplotype analysis it was observed that TTC was significantly underrepresented in non-HTG patients (OR = 0.538; 95% CI = 0.356-0.815; p = 0.003) while TCT haplotype was overrepresented in HTG patients (OR = 1.784; 95% CI = 1.084-2.937; p = 0.022) compared to control pool. However, after correction for multiple tests by Bonferroni method, only the association of the TTC haplotype with non-HTG cases was sustained (p_corrected = 0.015); this haplotype is expected to confer protection. The study suggests that the genomic region containing the IL1 gene cluster influences the POAG pathogenesis mostly in non-HTG patients in eastern India. A similar study in additional and larger cohorts of patients in other population groups is necessary to further substantiate the observation.
    BMC Medical Genetics 01/2010; 11:99. · 2.54 Impact Factor
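The Bonferroni step reported in the abstract above can be illustrated with a minimal sketch. The abstract does not state how many tests were corrected for; the value of five used here is an assumption, chosen only because it is the factor implied by the reported raw and corrected p-values for the TTC haplotype (0.003 and 0.015). The function name `bonferroni` is hypothetical.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Assumption: five tests, inferred from 0.015 / 0.003 in the abstract.
N_TESTS = 5

# Raw haplotype p-values as reported in the abstract.
raw_p = {"TTC (non-HTG)": 0.003, "TCT (HTG)": 0.022}

for label, p in raw_p.items():
    adjusted = bonferroni(p, N_TESTS)
    verdict = "significant" if adjusted < 0.05 else "not significant"
    print(f"{label}: raw p = {p}, adjusted p = {adjusted:.3f} ({verdict} at 0.05)")
```

Under that assumed test count, only the TTC haplotype stays below 0.05 after adjustment, which is consistent with what the abstract reports.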
  • ABSTRACT: In a genetic association study, it is often desirable to perform an overall test of whether any or all single-nucleotide polymorphisms (SNPs) in a gene are associated with a phenotype. Several such tests exist, but most of them are powerful only under very specific assumptions about the genetic effects of the individual SNPs. In addition, some of the existing tests assume that the direction of the effect of each SNP is known, which is a highly unlikely scenario. Here, we propose a new kernel-based association test of joint association of several SNPs. Our test is non-parametric and robust, and does not make any assumption about the directions of individual SNP effects. It can be used to test multiple correlated SNPs within a gene and can also be used to test independent SNPs or genes in a biological pathway. Our test uses an analysis of variance paradigm to compare variation between cases and controls to the variation within the groups. The variation is measured using kernel functions for each marker, and then a composite statistic is constructed to combine the markers into a single test. We present simulation results comparing our statistic to the U-statistic-based method by Schaid et al. ([2005] Am. J. Hum. Genet. 76:780-793) and another statistic by Wessel and Schork ([2006] Am. J. Hum. Genet. 79:792-806). We consider a variety of different disease models and assumptions about how many SNPs within the gene are actually associated with disease. Our results indicate that our statistic has higher power than other statistics under most realistic conditions.
    Genetic Epidemiology 09/2009; 34(3):213-21. · 4.02 Impact Factor
  • ABSTRACT: Animal and human studies of addiction indicate that the D2 dopamine receptor (DRD2) plays a critical role in the mechanism of drug reward. D2 receptor density in the brains of alcoholics has been shown to be reduced relative to controls. Previous studies of DRD2 in association with alcohol dependence using variation in the TaqI A locus were highly controversial. Recently, a synonymous mutation, C957T, in the coding region of the human DRD2 gene has been identified which appears to have functional effects including alteration in receptor availability. In order to determine if susceptibility to alcohol dependence (AD) within multiplex alcohol dependence families would be altered by the C957T in the coding region of the D2 gene, within-family association was studied in members of Caucasian multiplex alcohol dependence families. Members of control families with no personal alcohol or substance dependence history were included for case/control comparisons. Analyses performed to detect within-family association showed evidence favoring an association for the C957T polymorphism (P = 0.038). Linkage analyses of polymorphisms in this region showed that only the C957T locus remained of interest (P = 0.015). Evidence for the C957T T allele having a role in AD susceptibility at the population level using a case/control comparison was statistically marginal (P = 0.062), but was consistent with the family data results. These results support a role for DRD2 as a susceptibility gene for alcohol dependence within multiplex families at high risk for developing alcohol dependence.
    American Journal of Medical Genetics Part B Neuropsychiatric Genetics 07/2008; 147B(4):517-26. · 3.23 Impact Factor
  • ABSTRACT: DNA sequences are known to scale like fractals. We study coding DNA and find, however, that these are not characterised by a single scaling index. The behaviour changes, somewhat abruptly, at about a few hundred bases. We study the Discrete Fourier Transform and find that the index-change appears in the plot of power spectrum vs. frequency. Interestingly, an enhancement of the spectral density is observed in the region of the index change. We study the peak heights at a frequency of (1/3) as a function of the length scale and find, once again, changes in the scaling index near these critical lengths.
    EPL (Europhysics Letters) 01/2007; 62(2):271. · 2.26 Impact Factor
  • Indranil Mukhopadhyay, Sudipta Chatterjee, Aditya Chatterjee
    ABSTRACT: The plant 'Heat Rate' (HR) is a measure of overall efficiency of a thermal power generating system. It depends on a large number of factors, some of which are non-measurable, while data relating to others are seldom available and recorded. However, coal quality (expressed in terms of 'effective heat value' (EHV) as kcal/kg) transpires to be one of the important factors that influences HR values and data on EHV are available in any thermal power generating system. In the present work, we propose a prediction interval of the HR values on the basis of only EHV, keeping in mind that coal quality is one of the important (but not the only) factors that have a pronounced effect on the combustion process and hence on HR. The underlying theory borrows the idea of providing simultaneous confidence interval (SCI) to the coefficients of a p-th order (p ≥ 1) autoregressive model (AR(p)). The theory has been substantiated with the help of real life data from a power utility (after suitable base and scale transformation of the data to maintain the confidentiality of the classified document). Scope for formulating strategies to enhance the economy of a thermal power generating system has also been explored.
    Journal of Applied Statistics 01/2007; 34(3):249-259. · 0.45 Impact Factor
  • ABSTRACT: MOTIVATION: Microarray technology has been widely applied in biological and clinical studies for simultaneous monitoring of gene expression in thousands of genes. Gene clustering analysis is found useful for discovering groups of correlated genes potentially co-regulated or associated to the disease or conditions under investigation. Many clustering methods including hierarchical clustering, K-means, PAM, SOM, mixture model-based clustering and tight clustering have been widely used in the literature. Yet no comprehensive comparative study has been performed to evaluate the effectiveness of these methods. RESULTS: In this paper, six gene clustering methods are evaluated by simulated data from a hierarchical log-normal model with various degrees of perturbation as well as four real datasets. A weighted Rand index is proposed for measuring similarity of two clustering results with possible scattered genes (i.e. a set of noise genes not being clustered). Performance of the methods in the real data is assessed by a predictive accuracy analysis through verified gene annotations. Our results show that tight clustering and model-based clustering consistently outperform other clustering methods both in simulated and real data while hierarchical clustering and SOM perform among the worst. Our analysis provides deep insight to the complicated gene clustering problem of expression profile and serves as a practical guideline for routine microarray cluster analysis.
    Bioinformatics 11/2006; 22(19):2405-12. · 5.47 Impact Factor
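As background for the similarity measure mentioned in the abstract above, the following is a minimal sketch of the classic, unweighted Rand index. It is not the weighted variant the paper proposes, whose definition is not given in the abstract, and it makes no provision for scattered (unclustered) genes. The function name and the toy labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classic Rand index: fraction of item pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two items in
    the same cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    if len(labels_a) < 2:
        raise ValueError("need at least two items")
    agreements = 0
    n_pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        agreements += same_in_a == same_in_b
        n_pairs += 1
    return agreements / n_pairs


# Hypothetical toy example: two clusterings of six genes.
clustering_1 = [0, 0, 0, 1, 1, 1]
clustering_2 = [0, 0, 1, 1, 2, 2]
print(round(rand_index(clustering_1, clustering_2), 4))
```

For these toy labels the two clusterings agree on 10 of the 15 gene pairs. The pairwise loop is quadratic in the number of genes, which is fine for illustration but would be slow for a full microarray.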
  • Indranil Mukhopadhyay, Anup Som, Satyabrata Sahoo
    ABSTRACT: This article deals with the relationship between vocabulary (total number of distinct oligomers or "words") and text-length (total number of oligomers or "words") for a coding DNA sequence (CDS). For natural human languages, Heaps established a mathematical formula known as Heaps' law, which relates vocabulary to text-length. Our analysis shows that Heaps' law fails to model this relationship for CDSs. Here we develop a mathematical model to establish the relationship between the number of type of words (vocabulary) and the number of words sampled (text-length) for CDSs, when non-overlapping nucleotide strings with the same length are treated as words. We use tangent-hyperbolic function, which captures the saturation property of vocabulary. Based on the parameters of the model, we formulate a mathematical equation, known as "equation of word organization", whose parameters essentially indicate that nucleotide organization of coding sequences are different from one another. We also compare the word organization of CDSs with the random word distribution and conclude that a CDS is neither similar to a natural human language nor to a random one. Moreover, these sequences have their unique nucleotide organization and it is completely structured for specific biological functioning.
    Theory in Biosciences 09/2006; 125(1):1-17. · 0.93 Impact Factor
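The vocabulary versus text-length relationship described in the abstract above can be sketched as follows. The code counts distinct non-overlapping words of a fixed length as more of a sequence is read, and includes the standard Heaps' law form V = K·n^β for comparison. The paper's tanh-based model is not reproduced because the abstract does not give its functional form. The function names, the word length of 3 and the random toy sequence are all assumptions made for illustration.

```python
import random


def vocabulary_growth(sequence, word_length):
    """Track vocabulary against text-length for non-overlapping words.

    Returns a list of (text_length, vocabulary) pairs, where text_length is
    the number of words read so far and vocabulary is the number of distinct
    words among them.
    """
    if word_length < 1:
        raise ValueError("word_length must be at least 1")
    seen = set()
    curve = []
    for start in range(0, len(sequence) - word_length + 1, word_length):
        seen.add(sequence[start:start + word_length])
        curve.append((len(curve) + 1, len(seen)))
    return curve


def heaps_law(text_length, k, beta):
    """Heaps' law prediction V = k * n**beta (k and beta are fitted constants)."""
    return k * text_length ** beta


# Hypothetical input: a random 3000-base sequence standing in for a real CDS.
random.seed(0)
toy_sequence = "".join(random.choice("ACGT") for _ in range(3000))
curve = vocabulary_growth(toy_sequence, word_length=3)
final_length, final_vocabulary = curve[-1]
# With 3-base words the vocabulary can never exceed 4**3 = 64 distinct words.
print(final_length, final_vocabulary)
```

Because a fixed word length caps the vocabulary at 4 to the power of that length, the curve must level off, whereas a power law keeps growing. This is one way to see why a saturating function is a more natural fit here than Heaps' law.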
  • ABSTRACT: Using affected sibling pairs, the mean allele sharing statistic tests for linkage by testing if the mean proportion of alleles that are identical-by-descent (IBD) is equal to a half. The behavior of some versions of the mean allele sharing test statistic depends on whether or not families that are uninformative for their IBD status are included; the SIBPAL version provides less significant values when all families (informative and uninformative) are used than when only informative families are used. Here, we investigate this behavior both analytically and by simulation. Our investigation shows that the main issue is the choice of the variance estimator in the denominator of the statistic. The choice of the denominator is very important and is still not totally resolved. Our mathematical explanation supported by our simulation study might aid in the search for an optimum solution.
    Statistical Applications in Genetics and Molecular Biology 02/2006; 5:Article13. · 1.52 Impact Factor
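To make the role of the denominator in the abstract above concrete, here is a minimal sketch of a mean allele sharing Z statistic with two possible variance choices: the null variance of 1/8 for a sib pair's IBD proportion, and the sample variance. It assumes the IBD count of every pair is known exactly, which real data rarely allow, and it is not claimed to match the SIBPAL implementation or the estimators studied in the paper. The function name and toy data are hypothetical.

```python
import math


def mean_sharing_z(ibd_counts, use_null_variance=True):
    """Z statistic for H0: mean proportion of alleles shared IBD equals 1/2.

    ibd_counts holds the number of alleles (0, 1 or 2) shared IBD by each
    affected sib pair, assumed here to be known without error.
    """
    n = len(ibd_counts)
    if n < 2:
        raise ValueError("need at least two sib pairs")
    proportions = [count / 2 for count in ibd_counts]
    mean = sum(proportions) / n
    if use_null_variance:
        # Under no linkage, sibs share 0, 1, 2 alleles with probability
        # 1/4, 1/2, 1/4, so the proportion has variance 1/8.
        variance = 1 / 8
    else:
        # Sample variance of the observed proportions.
        variance = sum((p - mean) ** 2 for p in proportions) / (n - 1)
    return (mean - 0.5) / math.sqrt(variance / n)


# Hypothetical toy data: eight affected sib pairs.
toy_counts = [2, 2, 2, 2, 1, 1, 1, 1]
z_null = mean_sharing_z(toy_counts, use_null_variance=True)
z_sample = mean_sharing_z(toy_counts, use_null_variance=False)
print(round(z_null, 3), round(z_sample, 3))
```

The same numerator gives visibly different Z values under the two denominators, which illustrates why the choice of variance estimator matters for the reported significance.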
  • ABSTRACT: Using the Genetic Analysis Workshop 14 (GAW14) simulated dataset, we compare microsatellite and single-nucleotide polymorphism (SNP) markers in terms of two measures of information content, the traditional entropy-based information content measure, and a new "relative information" measure. Both attempt to measure the amount of information contained in the markers about the identity-by-descent (IBD) sharing among relatives. The performance of the two information measures is compared based on their variability and ability to predict change in the LOD score (Delta LOD) as map density increases for SNP markers. Although in a linked region, LOD scores are correlated with measures of information, we observe that none of the measures predict the LOD score itself very well. In an unlinked region, the LOD score is not related to either measure of information. The information content of microsatellite markers with 7.5-cM spacing is slightly higher than that of SNP markers with 3-cM spacing. At these map densities, microsatellites are found to be uniformly more informative than SNPs irrespective of their level of heterozygosity. For SNPs, we found that as the level of heterozygosity increases, the information content increases. As reported in all other previous studies, we also found that high-density SNPs have higher information content compared to low-density microsatellites. Performance of the two information measures considered here is similar, but the relative information measure predicts Delta LOD as marker density increases better than the traditional entropy-based information measure.
    BMC Genetics 01/2006; 6 Suppl 1:S27. · 2.81 Impact Factor
  • Indranil Mukhopadhyay, Eleanor Feingold, Daniel E Weeks
    The American Journal of Human Genetics 11/2004; 75(4):716-8; author reply 723-7. · 11.20 Impact Factor

Publication Stats

219 Citations
51.15 Total Impact Points

Institutions

  • 2009–2014
    • Indian Statistical Institute
      • Human Genetics Unit (HGU)
      Baranagore, Bengal, India
  • 2011
    • Genome Institute of Singapore
      Tumasik, Singapore
  • 2006–2008
    • University of Pittsburgh
      • Department of Psychiatry
      • Department of Human Genetics
      Pittsburgh, PA, United States
  • 2007
    • University of Burdwan
      • Department of Statistics
      Burdwān, Bengal, India