Article

Optimization of reference library used in content-based medical image retrieval scheme.

Department of Radiology, University of Pittsburgh, 3362 Fifth Avenue, Pittsburgh, Pennsylvania 15213, USA.
Medical Physics (Impact Factor: 3.01). 12/2007; 34(11):4331-9. DOI: 10.1118/1.2795826
Source: PubMed

ABSTRACT Building an optimal image reference library is a critical step in developing interactive computer-aided detection and diagnosis (I-CAD) systems for medical images that use content-based image retrieval (CBIR) schemes. In this study, the authors conducted two experiments to investigate (1) the relationship between I-CAD performance and the size of the reference library and (2) a new reference selection strategy to optimize the library and improve I-CAD performance. The authors assembled a reference library of 3153 regions of interest (ROIs) depicting either malignant masses (1592) or CAD-cued false-positive regions (1561), and an independent testing data set of 200 masses and 200 false-positive regions. A CBIR scheme using a distance-weighted K-nearest neighbor algorithm was applied to retrieve from the library the references considered similar to the testing sample. The area under the receiver operating characteristic curve (Az) was used as the index of I-CAD performance. In the first experiment, the authors systematically increased the reference library size and tested I-CAD performance. The results indicate that scheme performance improves initially from Az = 0.715 to 0.874 and then plateaus when the library size reaches approximately half of its maximum capacity. In the second experiment, based on the hypothesis that an ROI should be removed if it performs poorly compared to a group of similar ROIs in a large and diverse reference library, the authors applied a new strategy to identify "poorly effective" references. After the 174 ROIs identified in this way were removed from the reference library, I-CAD performance increased significantly to Az = 0.914 (p < 0.01). The study demonstrates that increasing the reference library size and removing poorly effective references can significantly improve I-CAD performance.
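The retrieval-and-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-squared-distance weighting, the default `k`, the Euclidean metric, and the function names `knn_score` and `auc` are all assumptions made for illustration, and the empirical rank-based AUC below is only a stand-in for the Az index, which the abstract does not say how it was estimated.

```python
import math


def knn_score(query, references, k=15, eps=1e-9):
    """Distance-weighted KNN score that `query` depicts a mass.

    references: list of (feature_vector, label) pairs,
                label 1 = mass, label 0 = false-positive region.
    Weights are 1 / d^2 (an assumed choice, not taken from the paper).
    """
    # Retrieve the k references closest to the query in feature space.
    nearest = sorted(
        (math.dist(query, feats), label) for feats, label in references
    )[:k]
    # Closer references get larger weights; eps avoids division by zero.
    weights = [(1.0 / (d * d + eps), label) for d, label in nearest]
    total = sum(w for w, _ in weights)
    positive = sum(w for w, label in weights if label == 1)
    return positive / total


def auc(scores, labels):
    """Empirical area under the ROC curve (Mann-Whitney statistic)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under these assumptions, scoring every testing ROI with `knn_score` against reference libraries of increasing size and passing the scores to `auc` would mimic the shape of the first experiment.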

  • ABSTRACT: Computer-aided diagnosis of masses in mammograms is important to the prevention of breast cancer. Many approaches tackle this problem through content-based image retrieval (CBIR) techniques. However, most of them fall short of scalability in the retrieval stage, and their diagnostic accuracy is therefore restricted. To overcome this drawback, we propose a scalable method for retrieval and diagnosis of mammographic masses. Specifically, for a query mammographic region of interest (ROI), SIFT descriptors are extracted and searched in a vocabulary tree, which stores all the quantized descriptors of previously diagnosed mammographic ROIs. In addition, to fully exert the discriminative power of SIFT descriptors, contextual information in the vocabulary tree is employed to refine the weights of tree nodes. The retrieved ROIs are then used to determine whether the query ROI contains a mass. This method has excellent scalability due to the low spatial-temporal cost of the vocabulary tree. Retrieval precision and diagnostic accuracy are evaluated on 5005 ROIs extracted from the digital database for screening mammography (DDSM), and the results demonstrate the efficacy of our approach.
    2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI 2014); 04/2014
  • ABSTRACT: Development of a fully automated system retrieving visually similar images is a task that could be helpful as the basis of a computer-assisted diagnostic (CADx) tool in mammography. Our study aims at a better understanding of the concept of visual similarity as it pertains to mammographic masses. Such understanding is a necessary step for building effective perceptually-driven image retrieval systems. In our study we deconstruct the concept of visual mass similarity into three components: similarity of size, similarity of shape, and similarity of margin. We present the results of a pilot observer study to determine the importance of each component when human observers assess the overall similarity of two masses. Seven observers of various expertise participated in the study: 1 highly experienced mammographer, 1 expert in visual perception, 3 CAD researchers, and 2 novices. Each observer assessed the similarity between 100 pairs of mammographic regions of interest (ROIs) depicting benign and malignant masses. Visual similarity was assessed in four categories (shape, size, margin, overall) using a web-based interface and a 10-point rating scale. Preliminary analysis of the results suggests the following. First, there is moderate agreement between observers in similarity assessment for all mentioned categories. Second, all components substantially affect the overall similarity rating, with mass margin having the highest significance and mass size having the lowest significance relative to the other factors. These findings varied somewhat based on the observer's expertise. Third, some low-level morphological features extracted from the masses can be used to mimic the overall visual similarity ratings and their specific components.
    Proceedings of SPIE - The International Society for Optical Engineering 01/2008; DOI:10.1117/12.772125 · 0.20 Impact Factor
  • ABSTRACT: Computer-aided diagnosis of masses in mammograms is important to the prevention of breast cancer. Many approaches tackle this problem through content-based image retrieval (CBIR) techniques. However, most of them fall short of scalability in the retrieval stage, and their diagnostic accuracy is therefore restricted. To overcome this drawback, we propose a scalable method for retrieval and diagnosis of mammographic masses. Specifically, for a query mammographic region of interest (ROI), SIFT features are extracted and searched in a vocabulary tree, which stores all the quantized features of previously diagnosed mammographic ROIs. In addition, to fully exert the discriminative power of SIFT features, contextual information in the vocabulary tree is employed to refine the weights of tree nodes. The retrieved ROIs are then used to determine whether the query ROI contains a mass. The presented method has excellent scalability due to the low spatial-temporal cost of the vocabulary tree. Extensive experiments are conducted on a large dataset of 11,553 ROIs extracted from the digital database for screening mammography (DDSM), and the results demonstrate the accuracy and scalability of our approach.
    IEEE transactions on bio-medical engineering 10/2014; 62(2). DOI:10.1109/TBME.2014.2365494 · 2.23 Impact Factor
