M.D. Wang

Emory University, Atlanta, GA, USA

Publications (52); 13.83 Total impact

  • Conference Proceeding: Automatic batch-invariant color segmentation of histological cancer images
    ABSTRACT: We propose an automatic color segmentation system that (1) incorporates domain knowledge to guide histological image segmentation and (2) normalizes images to reduce sensitivity to batch effects. Color segmentation is an important, yet difficult, component of image-based diagnostic systems. User-interactive guidance by domain experts (i.e., pathologists) often leads to the best color segmentation or “ground truth” regardless of stain color variations in different batches. However, such guidance limits the objectivity, reproducibility and speed of diagnostic systems. Our system uses knowledge from pre-segmented reference images to normalize and classify pixels in patient images. The system then refines the segmentation by re-classifying pixels in the original color space. We test our system on four batches of H&E-stained images and obtain an average segmentation accuracy of 85%, compared with 39% for a system with no normalization.
    Biomedical Imaging: From Nano to Macro, 2011 IEEE International Symposium on; 05/2011
  • Conference Proceeding: WebPK, a web-based tool for custom pharmacokinetic simulation
    J. Srimani, R.A. Moffitt, M.D. Wang
    ABSTRACT: Drug bioavailability is a major failing point of new pharmaceuticals; i.e., drugs fail to reach their target or fail to stay there long enough for therapeutic effect. Compounding this issue, significant variability exists between patients in how they metabolize and distribute a drug. We present WebPK, a web-based tool for simulation of custom pharmacokinetic models. Model parameters can be entered manually or uploaded as a file. Simulation computations are performed on the server side, and thus require minimal client resources, which makes WebPK suitable for mobile devices. Time series biodistribution data are returned to the user in graphical and numerical form for quick interpretation or archiving. Results generated from WebPK are consistent with previously published pharmacokinetic models. This work is expected to provide physicians with access to easy simulation of patient pharmacokinetic profiles, which will allow for the prescription of more efficient and personalized drug regimens. URL: http://webpk.bme.gatech.edu.
    Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE; 10/2010
  • Conference Proceeding: Automatic tip selection for microtubule dynamics quantification
    ABSTRACT: Microtubule (MT) dynamics quantification includes modeling of elongation, rapid shortening, and pauses. It indicates the effect of the cancer treatment drug paclitaxel because the drug causes MTs to bundle, which will in turn inhibit successful mitosis of cancerous cells. Thus, automatic MT dynamics analysis has been researched intensely because it allows for faster evaluation of potential cancer treatments and better understanding of drug effects on a cell. However, most current literature still uses manual initialization. In this work, we propose an automatic initialization algorithm that selects isolated and active tips for tracking. We use a Gaussian match filter to enhance the MT structures, and a novel technique called Pixel Nucleus Analysis (PNA) for isolated MT tip detection. To find dynamic tips, we apply a masked FFT in the temporal domain followed by K-means clustering. To evaluate the selected tips, we use a low-level tip linking algorithm, and show the results of applying the algorithm to a model image and five MCF-7 breast cancer cell line images captured using fluorescent confocal microscopy. Finally, we compare tip selection criteria with existing automatic selection algorithms. We conclude that the proposed analysis is an effective technique based on three criteria: outer region selection, separation, and MT dynamics.
    Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE; 10/2010
  • Conference Proceeding: Automated classification of renal cell carcinoma subtypes using bag-of-features
    ABSTRACT: Color variation in medical images degrades the classification performance of computer aided diagnosis systems. Traditionally, color segmentation algorithms mitigate this variability and improve performance. However, consistent and robust segmentation remains an open research problem. In this study, we avoid the tenuous phase of color segmentation by adapting a bag-of-features approach using scale invariant features for classification of renal cell carcinoma subtypes. Previous work shows that features from each subtype match those from expertly chosen template images. In this paper, we show that the performance of this match-based methodology greatly depends on the quality of the template images. To avoid this uncertainty, we propose a bag-of-features approach that does not require expert knowledge and instead learns a “vocabulary” of morphological characteristics from training data. We build a support vector machine using feature histograms and evaluate this method using 40 iterations of 3-fold cross validation. We achieve classification accuracy above 90% for a heterogeneous dataset labeled by an expert pathologist, showing its potential for future clinical applications.
    Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE; 10/2010
  • Article: k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction
    ABSTRACT: In the clinical application of genomic data analysis and modeling, a number of factors contribute to the performance of disease classification and clinical outcome prediction. This study focuses on the k-nearest neighbor (KNN) modeling strategy and its clinical use. Although KNN is simple and clinically appealing, large performance variations were found among experienced data analysis teams in the MicroArray Quality Control Phase II (MAQC-II) project. For clinical end points and controls from breast cancer, neuroblastoma and multiple myeloma, we systematically generated 463,320 KNN models by varying feature ranking method, number of features, distance metric, number of neighbors, vote weighting and decision threshold. We identified factors that contribute to the MAQC-II project performance variation, and validated a KNN data analysis protocol using a newly generated clinical data set with 478 neuroblastoma patients. We interpreted the biological and practical significance of the derived KNN models, and compared their performance with existing clinical factors.
    The Pharmacogenomics Journal 08/2010; 10(4):292-309. · 4.54 Impact Factor
  • Conference Proceeding: Deblurring molecular images using desorption electrospray ionization mass spectrometry
    ABSTRACT: Traditional imaging techniques for studying the spatial distribution of biological molecules such as proteins, metabolites, and lipids, require the a priori selection of a handful of target molecules. Imaging mass spectrometry provides a means to analyze thousands of molecules at a time within a tissue sample, adding spatial detail to proteomic, metabolomic, and lipidomic studies. Compared to traditional microscopic images, mass spectrometric images have reduced spatial resolution and require a destructive acquisition process. In order to increase spatial detail, we propose a constrained acquisition path and signal degradation model enabling the use of a general image deblurring algorithm. Our analysis shows the potential of this approach and supports prior observations that the effect of the sprayer focuses on a central region much smaller than the extent of the spray.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Quality control of highly multiplexed proteomic immunostaining with quantum dots: Correcting for crosstalk
    ABSTRACT: The process of developing molecular assays for disease diagnosis and prognosis requires cross-disciplinary research which monitors quality and reproducibility at all levels. This paper discusses challenges in the quality control of highly multiplexed Quantum Dot (QD) staining and provides a method for improving accuracy of QD quantification in two phases. Phase one is the estimation of unintended crosstalk between multiplexed QD-antibody reporters, and phase two is digital correction of this crosstalk. Results show that crosstalk varies among tissues and reagents, and in some cases it can be on the same order of magnitude as the original intended signal. In cases where target protein expression is assumed to be independent, crosstalk can be empirically estimated from imaging data and corrected for. This work is expected to improve the overall reproducibility and quantification of multiplexed QD staining.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Automated classification of renal cell carcinoma subtypes using scale invariant feature transform
    ABSTRACT: The task of analyzing tissue biopsies performed by a pathologist is challenging and time consuming. It suffers from intra- and inter-user variability. Computer assisted diagnosis (CAD) helps to reduce such variations and speed up the diagnostic process. In this paper, we propose an automatic computer assisted diagnostic system for renal cell carcinoma subtype classification using scale invariant features. We capture the morphological distinctness of various subtypes and use it to classify a heterogeneous data set of renal cell carcinoma biopsy images. Our technique does not require color segmentation and minimizes human intervention. We circumvent user subjectivity using automated analysis and cater for intra-class heterogeneities using multiple class templates. We achieve a classification accuracy of 83% using a Bayesian classifier.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Extraction of informative cell features by segmentation of densely clustered tissue images
    S. Kothari, Q. Chaudry, M.D. Wang
    ABSTRACT: This paper presents a fast methodology for the estimation of informative cell features from densely clustered RGB tissue images. The features estimated include nuclei count, nuclei size distribution, nuclei eccentricity (roundness) distribution, nuclei closeness distribution and cluster size distribution. Our methodology is a three-step technique. First, we generate a binary nuclei mask from an RGB tissue image by color segmentation. Second, we segment nuclei clusters present in the binary mask into individual nuclei by concavity detection and ellipse fitting. Finally, we estimate informative features for every nucleus and their distribution for the complete image. The main focus of our work is the development of a fast and accurate nuclei cluster segmentation technique for densely clustered tissue images. We also developed a simple graphical user interface (GUI) for our application which requires minimal user interaction and can efficiently extract features from nuclei clusters, making it feasible for clinical applications (less than 2 minutes for a 1.9 megapixel tissue image).
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: SimpleVisGrid: Grid services for visualization of diverse biomedical knowledge and molecular systems data
    T.H. Stokes, M.D. Wang
    ABSTRACT: Biomedical data visualization is a great challenge due to the scale, complexity, and diversity of systems, system component interactions and experimental data. Standards for interoperable data are a good start to addressing these problems, but standardization of visualization technologies is an emerging topic. SimpleVisGrid builds on Cancer Biomedical Informatics Grid (caBIG) common infrastructure for cancer research, and clearly specifies and extends three standard data formats for inputs and outputs to grid services: comma-separated values (CSV), Portable Network Graphics (PNG), and Scalable Vector Graphics (SVG). Four prototype visualizations are available: 2D array data quality visualization, correlation heatmaps between high-dimensional data and associated meta-data, feature landscapes, and biochemical or semantic network graphs. The services and data model are prepared for submission for caBIG Silver-level compatibility review and for integration into automated research workflows. Making these tools available to caBIG developers and ultimately to biomedical researchers can (1) help with biomedical communication, discovery, and decision-making, (2) encourage more research on standardization of visualization formats, and (3) improve the efficiency of large data transfers across the grid.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Development of an automatic quantification method for cancer tissue microarray study
    ABSTRACT: Clinical histopathology is based on the analysis of immunohistochemistry (IHC) stained tissue images. Selection of antibodies for detecting the presence, type, and grade of cancerous tissue has a great influence on the diagnostic potential of IHC tests. Automated evaluation methods for tissue microarrays applied to many combinations of antibody and tissue type can speed development of new clinical assays. We present an automatic method that successfully quantifies stain intensity, fraction of cells stained and sub-cellular location of staining in tissue microarray images. The method combines an opponent color preprocessor and a novel statistical approach for identifying brown and blue staining, followed by multilevel morphological processing. We verify the capability of our method by comparing the results to manually annotated image databases. We also demonstrate cross-tissue robustness using data from two clinical case studies.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Emerging translational bioinformatics: Knowledge-guided biomarker identification for cancer diagnostics
    ABSTRACT: Advances in high-throughput genomic and proteomic technology have led to a growing interest in cancer biomarkers. These biomarkers can potentially improve the accuracy of cancer subtype prediction and, subsequently, the success of therapy. In this paper, we describe emerging technology for enabling translational bioinformatics by improving biomarker identification. Specifically, we present an application that uses prior knowledge to identify the most biologically relevant gene ranking algorithm. Identification of statistically and biologically relevant biomarkers from high-throughput data can be unreliable due to the nature of the data (e.g., high technical variability, small sample size, and high dimension size). Furthermore, due to the lack of available training samples, data-driven machine learning methods are often insufficient without the support of knowledge-based algorithms. As a case study, we apply these knowledge-driven methods to renal cancer data and identify genes that are potential biomarkers for cancer subtype classification.
    Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE; 10/2009
  • Conference Proceeding: Automated cell counting and cluster segmentation using concavity detection and ellipse fitting techniques
    S. Kothari, Q. Chaudry, M.D. Wang
    ABSTRACT: This paper presents a novel, fast and semi-automatic method for accurate cell cluster segmentation and cell counting of digital tissue image samples. In pathological conditions, complex cell clusters are a prominent feature in tissue samples. Segmentation of these clusters is a major challenge for development of an accurate cell counting methodology. We address the issue of cluster segmentation by following a three step process. The first step involves pre-processing required to obtain the appropriate nuclei cluster boundary image from the RGB tissue samples. The second step involves concavity detection at the edge of a cluster to find the points of overlap between two nuclei. The third step involves segmentation at these concavities by using an ellipse-fitting technique. Once the clusters are segmented, individual nuclei are counted to give the cell count. The method was tested on four different types of cancerous tissue samples and shows promising results with a low percentage error, high true positive rate and low false discovery rate.
    Biomedical Imaging: From Nano to Macro, 2009. ISBI '09. IEEE International Symposium on; 08/2009
  • Conference Proceeding: Improving renal cell carcinoma classification by automatic region of interest selection
    ABSTRACT: In this paper, we present an improved automated system for classification of pathological image data of renal cell carcinoma. The task of analyzing tissue biopsies, generally performed manually by expert pathologists, is extremely challenging due to the variability in the tissue morphology, the preparation of tissue specimen, and the image acquisition process. Due to the complexity of this task and heterogeneity of patient tissue, this process suffers from inter-observer and intra-observer variability. In continuation of our previous work, which proposed a knowledge-based automated system, we observe that real life clinical biopsy images which contain necrotic regions and glands significantly degrade the classification process. Following the pathologist's technique of focusing on a selected region of interest (ROI), we propose a simple ROI selection process which automatically rejects the glands and necrotic regions, thereby improving the classification accuracy. We were able to improve the classification accuracy from 90% to 95% on a significantly heterogeneous image data set using our technique.
    BioInformatics and BioEngineering, 2008. BIBE 2008. 8th IEEE International Conference on; 11/2008
  • Conference Proceeding: A two dimensional simulation of microtubule dynamics
    ABSTRACT: We propose a two dimensional model to simulate microtubule dynamics. Microtubules are polymers that are important in many cell functions including cell division. In particular, chemotherapy targets microtubule dynamics in order to slow cancer cell reproduction. Traditional stochastic or chemical models for microtubule dynamics are one-dimensional, focusing on one variable such as length or concentration. We combine a traditional microtubule instability model and a chemical model and propose a two dimensional space for these models to interact. This gives a more realistic simulation of microtubule dynamics as it allows interaction of different microtubules. It can also simulate microtubule movement under different conditions. Our approach quantifies microtubule images and models microtubule dynamics within a synthesis-analysis framework.
    Technology and Applications in Biomedicine, 2008. ITAB 2008. International Conference on; 06/2008
  • Conference Proceeding: Can we trust biomarkers? visualization and quantification of outlier probes in high density oligonucleotide microarrays
    ABSTRACT: One of the top priorities in translating genomics to disease diagnosis and treatment is the reliability and reproducibility of oligonucleotide microarrays. Previous work such as dChip (Li and Wong, 2001) and caCORRECT (Stokes et al., 2007) has tried to detect outlier probes and remove artifacts. This paper presents a more advanced method that quantifies and visualizes the direct impact of outlier probes on genes of interest (i.e., biomarkers). Thousands of papers on microarrays have been published, and many of these papers have claimed to discover new biomarkers. However, many biomarkers cannot be reproduced. Using our research result, the wide community of microarray users can rescreen hundreds of oligonucleotide microarray data sets and overlay previously published biomarkers to filter out noisy ones. The methods are being incorporated into the next version of caCORRECT, which is available at http://www.caCORRECT.bme.gatech.edu.
    Life Science Systems and Applications Workshop, 2007. LISA 2007. IEEE/NIH; 12/2007
  • Conference Proceeding: Computer Aided Histopathological Classification of Cancer Subtypes
    ABSTRACT: In this paper we present the results of our effort to develop a computer aided diagnosis system for pathological imaging data using renal cell carcinoma as a case study. Traditionally, cancer diagnosis is performed by an expert pathologist studying biopsy tissue under a microscope. Due to the complex nature of the task and the heterogeneity of patient tissue, these methods are not only time consuming but also suffer from subjective variability. To improve the repeatability and accuracy of the diagnosis process, a computational diagnosis system is proposed here. In this paper we report that with our novel knowledge-based methodology, we are able to achieve a high level of classification accuracy (98%) when classifying 64 images (n=64) using a simple Bayesian classifier based on 8 extracted features and complete-leave-one-out cross-validation. This methodology is implemented in MATLAB and is expected to aid pathologists in the clinical setting to diagnose renal cell carcinoma as well as other types of cancer.
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on; 11/2007
  • Conference Proceeding: Multivariate Analysis of Imaging Mass Spectrometry Data
    ABSTRACT: Imaging mass spectrometry can be used to reveal spatial distributions of multiple molecular species in a 2D biological sample. Due to the large amount of data produced by this technology, it is difficult and time-consuming to manually extract meaningful results from imaging mass spectrometry experimentation. We have developed and implemented an original approach to easily and consistently process mass spectrometry imaging data with the goal of automatically identifying interesting regions of molecule expression. Based on multivariate analysis techniques such as principal component analysis, the system allows researchers to conveniently define and visualize spatial regions based on spectral similarity. Features of our system are demonstrated on mouse cerebellum data.
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on; 11/2007
  • Conference Proceeding: Microtubule Dynamics Classification Using a Statistical Model of the Movement of Outer Tips
    ABSTRACT: A new method is proposed for tracking the dynamics of microtubules. It combines a salient point extraction mechanism for segmenting plus-end tips, a robust tracking method capable of locating the trajectories of a large number of feature points, and a classification algorithm capable of determining if the level of activity of a given microtubule video is typical of that of a treated or a control cell. Our method does not rely on the precise tracking of a single microtubule like many previous works, but instead focuses on the generalized movement of ending tips as a whole, which gives a more statistically reliable interpretation of the movement of microtubules. The proposed algorithm is tested using twenty videos of breast cancer microtubules: ten treated with Taxol and ten controls. We are able to correctly classify those test videos 85% of the time, which is comparable in accuracy to other algorithms while using a less complex algorithm.
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on; 11/2007
  • Conference Proceeding: Estimating Classification Error to Identify Biomarkers in Time Series Expression Data
    J.H. Phan, M.D. Wang
    ABSTRACT: One of the primary objectives in the study of human diseases is the development of accurate and early diagnostic tests using molecular profiling technology. These investigations usually focus on feature selection with the goal of building a classifier using only the most clinically relevant features. With time series molecular profiles, each patient's assay contains observations measured at several time points. Using traditional time series classification methods, we can only determine a patient's diagnosis after obtaining all time points, eliminating the possibility of early diagnosis. This problem can be alleviated by dividing the time series into smaller overlapping sub-series. Unfortunately, these sub-series are not independent and identically distributed (iid). Consequently, when we estimate classification error for feature selection using traditional methods, we may encounter estimation bias. In response, we have developed a novel method that ranks time series biomarkers using specialized blocked error estimation methods designed to reduce estimation bias. Our investigation applies special cross validation and bootstrap methods, including h-block, hv-block cross validation, and blocked bootstrap to synthetic and clinical time series data. Results indicate a clear decrease in estimation bias using these methods on synthetic time series data. Similar results for a drug treatment dataset show further evidence that these blocked algorithms can improve biomarker identification.
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on; 11/2007
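
The last abstract above names h-block cross validation without describing it. As a rough illustration only, the following minimal sketch shows the generic h-block splitting scheme as it is commonly described: each observation is tested in turn, and the h observations on either side of it are left out of the training set so that serially dependent neighbours do not influence the error estimate. The function name `h_block_splits` and its parameters are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
from typing import Iterator, List, Tuple


def h_block_splits(n: int, h: int) -> Iterator[Tuple[List[int], int]]:
    """Yield (train_indices, test_index) pairs for h-block cross validation.

    Hypothetical helper for illustration. For each test index i, every
    index j with |j - i| <= h is excluded from the training set.
    """
    if n <= 0 or h < 0:
        raise ValueError("n must be positive and h must be non-negative")
    for i in range(n):
        # Keep only observations more than h steps away from the test point.
        train = [j for j in range(n) if abs(j - i) > h]
        yield train, i


# Usage: six time points, one neighbour removed on each side of the test point.
for train_idx, test_idx in h_block_splits(6, 1):
    print(test_idx, train_idx)
```

With h = 0 this reduces to ordinary leave-one-out cross validation; larger h trades training data for less dependence between the training and test observations.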