Conference Paper

Uncovering potential biomarkers in ovarian carcinoma via biclustering of DNA microarray data

University of Minnesota Duluth, Duluth, Minnesota, United States
DOI: 10.1109/GENSIPS.2006.353179
Conference: Genomic Signal Processing and Statistics, 2006. GENSIPS '06. IEEE International Workshop on


The NIH/NCI estimates that one in 57 women will develop ovarian cancer during her lifetime. Ovarian cancer is 90 percent curable when detected early. Unfortunately, many cases are not diagnosed until advanced stages because most women do not develop noticeable symptoms. This paper presents an exhaustive identification of all potential biomarkers for the diagnosis of early-stage and/or recurrent ovarian cancer using a unique and comprehensive set of gene expression data. The data set was generated by Gene Logic Inc. from normal and cancerous ovarian tissues as well as non-ovarian tissues collected at the University of Minnesota by Skubitz et al. In particular, the paper shows that a modified biclustering technique, combined with sensitivity analysis of gene expression levels, identifies all potential biomarkers found by prior studies as well as several more promising candidates that had been missed in the literature. Furthermore, unlike most prior studies, this work screens all candidate biomarkers using two additional techniques: immunohistochemical analysis and reverse transcriptase polymerase chain reaction.

  • ABSTRACT: In this paper, we propose group-biomarkers as an alternative to the traditional single biomarkers used to date for the detection of ovarian cancer. A group-biomarker is a set of genes that are used simultaneously for the diagnosis of early-stage and/or recurrent cancer. We describe a procedure for identifying such group-biomarkers from a data set of gene expression levels corresponding to normal and diseased ovarian tissue as well as tissue from other organs. The procedure starts with a list of potential single biomarkers. It then uses an order-preserving biclustering step to identify other genes that are co-regulated with the candidate single biomarkers across the normal and diseased ovarian tissue and tissue from other organs. We present a statistical analysis demonstrating that group-biomarkers have much better detection performance than single biomarkers, as exhibited by receiver operating characteristic curves.
    Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on; 05/2007
  • ABSTRACT: Inspired by extensive empirical study of real-world networks such as the Internet, the Web, and various social and biological networks, researchers have in recent years developed several random graph models to help us understand the most fundamental properties of these systems. Simple characteristics observed in many real-world networks are (1) a high clustering coefficient, i.e., if vertex $u$ is connected to vertex $v$, and vertex $v$ is connected to vertex $w$, then $u$ is likely connected to $w$; (2) a power law degree distribution, i.e., the number of nodes of degree $d$ is proportional to $d^{-\alpha}$, where $\alpha$ is some constant; and (3) a small mean geodesic distance, i.e., the shortest-path distance between two randomly chosen nodes is small. Most known random graph models with the properties above are generated by using some global graph properties (e.g., in the preferential attachment model it is assumed that each node knows the degree of every other node). However, in large real-world networks the nodes usually do not possess any global information about the network. In this paper, we present new random graph models that imitate a simple growth behavior observed in many dynamically evolving real-world networks. More precisely, our graphs model the observation that when a new node joins the network, it is likely to connect to existing members that already share an edge in the corresponding graph. Although we do not necessarily require any global knowledge about the graph, we are able to show that, in addition to a large clustering coefficient, a power law degree distribution is implicitly obtained. Thus, our results help explain why the behavior described above seems to be one of the essential factors for the shape of many real-world networks. Additionally, by choosing suitable parameters we can control the power law exponent in our graphs.
  • ABSTRACT: A biclustering algorithm named ROBA has been used in a number of recent works. We present a time- and space-efficient implementation of ROBA that reduces the time and space complexity by an order of L, where L is the number of distinct values present in the data. Our implementation runs almost 11 times faster than the existing implementation on the Yeast gene expression dataset. We also improve ROBA and then use it to present an iterative algorithm that can find all perfect biclusters with constant values, constant values on rows, and constant values on columns. Though our algorithm may take exponential time in the worst case, we use some subtle observations to reduce computational time and space. Experimental results show that our algorithm runs in reasonable time on the Yeast gene expression dataset and finds almost 10 times more biclusters than ROBA.
    01/2010; DOI:10.1109/ICBBE.2010.5518207
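The three kinds of perfect bicluster named in the last abstract above (constant values, constant values on rows, constant values on columns) can be illustrated with a minimal Python sketch. This is not ROBA or the authors' algorithm: it only checks whether a given selection of rows and columns forms such a bicluster, and the helper names (`submatrix`, `is_constant`, `is_constant_on_rows`, `is_constant_on_columns`) and the toy matrix are hypothetical, chosen for illustration.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def submatrix(data: Matrix, rows: Sequence[int], cols: Sequence[int]) -> List[List[float]]:
    """Return the block of `data` selected by the given row and column indices."""
    return [[data[r][c] for c in cols] for r in rows]


def is_constant(block: Matrix) -> bool:
    """Perfect constant bicluster: every cell holds the same value."""
    return len({value for row in block for value in row}) <= 1


def is_constant_on_rows(block: Matrix) -> bool:
    """Each row is constant, but different rows may hold different values."""
    return all(len(set(row)) <= 1 for row in block)


def is_constant_on_columns(block: Matrix) -> bool:
    """Each column is constant, but different columns may hold different values."""
    return all(len(set(col)) <= 1 for col in zip(*block))


# Toy expression matrix (rows = genes, columns = conditions); values are made up.
data = [
    [1, 1, 5, 1],
    [1, 1, 7, 1],
    [3, 3, 0, 3],
    [4, 6, 5, 8],
    [4, 6, 9, 8],
]

constant_block = submatrix(data, rows=[0, 1], cols=[0, 1, 3])
row_block = submatrix(data, rows=[0, 1, 2], cols=[0, 1, 3])
column_block = submatrix(data, rows=[3, 4], cols=[0, 1, 3])

print(is_constant(constant_block))           # all cells equal 1
print(is_constant_on_rows(row_block))        # rows are [1,1,1], [1,1,1], [3,3,3]
print(is_constant_on_columns(column_block))  # both rows are [4, 6, 8]
```

A bicluster with constant values is also constant on rows and on columns, so the first check is the strictest of the three. Enumerating all such biclusters, which is the hard part the abstract addresses, is outside the scope of this sketch.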