Article

Comparison of threshold selection methods for microarray gene co-expression matrices.

Department of Animal Science, University of Tennessee, Knoxville, Tennessee, USA.
BMC Research Notes 12/2009; 2:240. DOI:10.1186/1756-0500-2-240 pp.240
Source: PubMed

ABSTRACT Network and clustering analyses of microarray co-expression correlation data often require application of a threshold to discard small correlations, thus reducing computational demands and decreasing the number of uninformative correlations. This study investigated threshold selection in the context of combinatorial network analysis of transcriptome data.
Six conceptually diverse methods - based on number of maximal cliques, correlation of control spots with expressed genes, top 1% of correlations, spectral graph clustering, Bonferroni correction of p-values, and statistical power - were used to estimate a correlation threshold for three time-series microarray datasets. The validity of the thresholds was tested by comparison to thresholds derived from Gene Ontology information. Stability and reliability of the best methods were evaluated with block bootstrapping. Two threshold methods, number of maximal cliques and spectral graph clustering, used information in the correlation matrix structure and performed well in terms of stability. Comparison to Gene Ontology found that thresholds based on the number of maximal cliques extracted from a co-expression matrix were the most biologically valid. Approaches to improve both methods were suggested.
Threshold selection approaches based on network structure of gene relationships gave thresholds with greater relevance to curated biological relationships than approaches based on statistical pair-wise relationships.
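To make the contrast between pair-wise and structure-based thresholds concrete, the following is a minimal Python sketch under stated assumptions, not the procedure used in the study. All function names, the toy expression profiles, the Fisher z approximation for the Bonferroni-adjusted critical correlation, and the minimum clique size of 3 are illustrative choices that do not come from the abstract. The Fisher z approximation also assumes independent samples, which time-series data may violate.

```python
import math
from itertools import combinations
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def abs_correlations(profiles):
    """Map each gene pair to |r| across the shared time points."""
    return {
        (g, h): abs(pearson(profiles[g], profiles[h]))
        for g, h in combinations(sorted(profiles), 2)
    }


def top_fraction_threshold(corr, fraction=0.01):
    """Smallest |r| among the top `fraction` of gene pairs (pair-wise rule)."""
    ranked = sorted(corr.values(), reverse=True)
    keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[keep - 1]


def bonferroni_threshold(n_samples, n_pairs, alpha=0.05):
    """Critical |r| at level alpha / n_pairs, two-sided.

    Assumption: uses the Fisher z approximation, which needs n_samples > 3.
    """
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_pairs))
    return math.tanh(z / math.sqrt(n_samples - 3))


def count_maximal_cliques(corr, threshold, min_size=3):
    """Count maximal cliques of at least `min_size` genes in the graph
    that keeps an edge wherever |r| >= threshold (structure-based rule)."""
    adj = {}
    for (g, h), r in corr.items():
        adj.setdefault(g, set())
        adj.setdefault(h, set())
        if r >= threshold:
            adj[g].add(h)
            adj[h].add(g)

    def expand(clique, candidates, excluded):
        # Basic Bron-Kerbosch recursion without pivoting.
        if not candidates and not excluded:
            return 1 if len(clique) >= min_size else 0
        total = 0
        for v in list(candidates):
            total += expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}
        return total

    return expand(set(), set(adj), set())


# Hypothetical toy profiles: genes a, b, c co-vary; d and e do not.
profiles = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12.5],
    "c": [1.1, 2.1, 2.9, 4.2, 5.0, 6.1],
    "d": [6, 1, 4, 2, 5, 3],
    "e": [3, 5, 1, 6, 2, 4],
}
corr = abs_correlations(profiles)
clique_counts = {t: count_maximal_cliques(corr, t) for t in (0.5, 0.7, 0.9)}
```

In this sketch the two pair-wise rules depend only on the distribution of correlations or on the sample size, whereas the clique count changes with the graph that a candidate threshold induces. A structure-based method could scan candidate thresholds and inspect how `clique_counts` behaves, but the selection criterion itself is not specified in the abstract and is left open here.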

  • Article: Uncovering the overlapping community structure of complex networks in nature and society.
    ABSTRACT: Many complex systems in nature and society can be described in terms of networks capturing the intricate web of connections among the units they are made of. A key question is how to interpret the global organization of such networks as the coexistence of their structural subunits (communities) associated with more highly interconnected parts. Identifying these a priori unknown building blocks (such as functionally related proteins, industrial sectors and groups of people) is crucial to the understanding of the structural and functional properties of networks. The existing deterministic methods used for large networks find separated communities, whereas most of the actual networks are made of highly overlapping cohesive groups of nodes. Here we introduce an approach to analysing the main statistical features of the interwoven sets of overlapping communities that makes a step towards uncovering the modular structure of complex systems. After defining a set of new characteristic quantities for the statistics of communities, we apply an efficient technique for exploring overlapping communities on a large scale. We find that overlaps are significant, and the distributions we introduce reveal universal features of networks. Our studies of collaboration, word-association and protein interaction graphs show that the web of communities has non-trivial correlations and specific scaling properties.
    Nature 07/2005; 435(7043):814-8. · 36.28 Impact Factor
  • Article: Threshold selection in gene co-expression networks using spectral graph theory techniques.
    BMC Bioinformatics. 01/2009; 10:4.
  • Article: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.
    ABSTRACT: We sought to create a comprehensive catalog of yeast genes whose transcript levels vary periodically within the cell cycle. To this end, we used DNA microarrays and samples from yeast cultures synchronized by three independent methods: alpha factor arrest, elutriation, and arrest of a cdc15 temperature-sensitive mutant. Using periodicity and correlation algorithms, we identified 800 genes that meet an objective minimum criterion for cell cycle regulation. In separate experiments, designed to examine the effects of inducing either the G1 cyclin Cln3p or the B-type cyclin Clb2p, we found that the mRNA levels of more than half of these 800 genes respond to one or both of these cyclins. Furthermore, we analyzed our set of cell cycle-regulated genes for known and new promoter elements and show that several known elements (or variations thereof) contain information predictive of cell cycle regulation. A full description and complete data sets are available at http://cellcycle-www.stanford.edu
    Molecular Biology of the Cell 01/1999; 9(12):3273-97. · 4.94 Impact Factor


Keywords

block bootstrapping
clustering analyses
co-expression matrix
correlation matrix structure
correlation threshold
curated biological relationships
Gene Ontology
Gene Ontology information
gene relationships
maximal cliques
microarray co-expression correlation data
network structure
small correlations
spectral graph clustering
statistical pair-wise relationships
threshold selection
Threshold selection approaches
time-series microarray datasets
transcriptome data
uninformative correlations