INTERNATIONAL JOURNAL OF DESIGN, ANALYSIS AND TOOLS FOR INTEGRATED CIRCUITS AND SYSTEMS, VOL. 1, NO. 1, NOVEMBER 2009
Data Analysis of Medical Images
Yuhao Sun, Gabriela Mogos
Abstract: Medical imaging techniques are increasingly and widely used in the medical community, especially in hospitals and healthcare institutions. Doctors, including clinicians and radiologists, can view the internal structure of a particular organ or tissue of a patient and thereby confirm the corresponding diagnosis and treatment recommendations. Distorted medical images can affect doctors' decisions at the clinical level. This paper presents a study on the impact of visual distortion on medical images.
Index Terms: Medical imaging, Image Quality Assessment, Subjective IQA, Objective IQA, No-Reference IQA
I. INTRODUCTION
Medical imaging techniques are widely adopted in the medical community worldwide to assist doctors in making precise judgments about patients [1]. Typical medical imaging techniques include X-ray Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Ultrasound Imaging, and others [2]. Current medical imaging techniques still have several flaws, one of which is image distortion. Visual distortion in medical images generally manifests as blur, contrast distortion, and noise [1]. The impact of such distortion cannot be easily ignored: distorted images may influence doctors' judgments and lead to misdiagnosis, missed diagnosis, or other inaccurate decisions [2]. It is therefore important to find an efficient mathematical model that can detect potentially distorted images and alert doctors in advance. Many mathematical models of Image Quality Assessment (IQA) have proven effective at detecting image distortions, including the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [3], the Naturalness Image Quality Evaluator (NIQE) [4], the Perception based Image Quality Evaluator (PIQE) [5], and others [6-10]. However, some of them lack evidence that they still work well under medical imaging conditions.
This paper proposes a methodology to assess the impact of visual distortion on medical images. Specifically, we determine which mathematical model can precisely identify potentially distorted medical images, using the two major Image Quality Assessment approaches: Subjective Image Quality Assessment (Subjective IQA) and Objective Image Quality Assessment (Objective IQA) [1].
Manuscript received .
All authors are with the Department of Computer Science and Software
Engineering, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, China.
(emails: yuhao.sun20@imperial.ac.uk, Gabriela.Mogos@xjtlu.edu.cn).
II. METHODOLOGY
A suitable methodology has been designed to achieve the project goals. It contains three main steps: Dataset Construction, Subjective IQA, and Objective IQA.
Firstly, the project dataset of medical images has been constructed. A project-tailored, high-quality dataset is key to obtaining meaningful outcomes. The dataset consists of 50 CT scan images, of which 20 are deemed distortion-free and come from a highly reputed hospital or healthcare institution. The remaining images are produced by computationally distorting copies of those 20 images, using MATLAB alone. The possible distortion types are blur, contrast distortion, and noise. Based on the requirements for image files, image quality, recency, and image annotations, we adopted images from the DeepLesion dataset provided by the NIH (National Institutes of Health) [12]. Sample medical images from DeepLesion are shown in Fig. 1.
To simulate clinical reality as closely as possible, more than 50 doctors were invited to complete a questionnaire so that the distorted images could be allocated to the different distortion types in realistic proportions. In the questionnaire, doctors selected one or more distortion types that may occur during their daily work. The resulting statistics show that "Blurry" occurs most often in medical imaging practice (chosen by 76% of respondents); "Contrast-distorted" is the second most frequent type (43%); the third is "Obscured-distorted" (31%); "Compression" is the least frequent (16%). Additionally, 12% of respondents noted other types of distortion, including image artifacts and other human-intervention situations.
Eventually, three significant distortion types, "Blurry", "Contrast-distorted", and "Obscured-distorted", were selected. The number of distorted images of each type is based on the statistics above, i.e., 76%, 43%, and 31% respectively. However, "Obscured-distorted" was replaced by "Noise-distorted", since obscuration is difficult to simulate purely by computer. The 30 distorted images were therefore allocated according to the ratio 76% : 43% : 31%, which after suitable adjustment and rounding gives 17 : 9 : 4. The final project dataset consists of 20 good-quality (regarded as distortion-free) CT scans, 17 blurry CT scans, 9 contrast-distorted CT scans, and 4 noise-distorted CT scans. A sample group of processed images is shown in Fig. 2 in the Appendix.
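As an illustration, the distortion step could be implemented in MATLAB roughly as follows. This is a minimal sketch: the folder names and the specific parameter values (Gaussian sigma, intensity bounds, noise variance) are assumptions for demonstration, not the exact settings used in the project, and only 17, 9, and 4 of the generated images are retained according to the allocation above.

    % Generate blurred, contrast-distorted, and noisy versions of the clean scans.
    srcFiles = dir(fullfile('clean_scans', '*.png'));     % the 20 distortion-free CT scans
    for k = 1:numel(srcFiles)
        I = imread(fullfile(srcFiles(k).folder, srcFiles(k).name));
        blurred    = imgaussfilt(I, 1.5);              % blur: Gaussian filter (sigma assumed)
        contrasted = imadjust(I, [0.2 0.7], []);       % contrast: remap the intensity range
        noisy      = imnoise(I, 'gaussian', 0, 0.01);  % noise: additive Gaussian (variance assumed)
        imwrite(blurred,    fullfile('blurry',             srcFiles(k).name));
        imwrite(contrasted, fullfile('contrast_distorted', srcFiles(k).name));
        imwrite(noisy,      fullfile('noise_distorted',    srcFiles(k).name));
    end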
The second step of the methodology is to conduct Subjective IQA. Subjective IQA is one of the two major IQA approaches and is regarded as the most trustworthy way to decide the quality of an image, because its respondents are human beings, just as in clinical reality. In the Subjective IQA of our project, we selected three doctors from a local, highly reputed hospital: two clinicians and one radiologist. At least two of the three doctors have more than thirteen years of clinical experience in their specific professions. All three doctors were invited to score every medical image in the dataset constructed earlier (50 CT scans), relying only on their own experience and without any other reference. Throughout the Subjective IQA process, the possible influential factors (IFs) that could bias the results, including System IF, Context IF, and Human IF, were carefully considered. To minimize adverse effects, all respondents completed the Subjective IQA at the same time and in places remarkably similar to their daily working environments. For example, the clinicians conducted the Subjective IQA in departmental clinics, and the radiologist conducted it in the office of the department of radiology, with a high-quality display. Additionally, a "Daily Emotional Self-Declaration" was provided for the doctors, to ensure that their responses were not negatively affected by potential negative emotions.
The three doctors are deemed professionals in their fields, i.e., clinical departments and radiology. Each doctor gave a score to each medical image on a 5-point Likert scale: the closer the score is to five, the better the quality of the image, and vice versa. All doctors were asked to complete the Subjective IQA within a given time of 50 minutes, even though their average completion time turned out to be only about 30 minutes. Setting a fixed time period, i.e., 50 minutes, is one of the requirements for conducting a Subjective IQA.
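The paper does not state how the three ratings per image are aggregated. A minimal sketch, assuming a simple mean opinion score (MOS) over the two clinicians and the radiologist, could look like this in MATLAB (the rating values below are hypothetical examples, not the study data):

    % Hypothetical 5-point Likert ratings for three images; columns are
    % clinician 1, clinician 2, and the radiologist.
    ratings = [5 4 5;    % image 1: judged near distortion-free by all three
               3 3 2;    % image 2: moderate quality
               1 2 1];   % image 3: judged heavily distorted
    mos = mean(ratings, 2);   % mean opinion score per image (higher = better quality)
    disp(mos)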
The third and final step of the methodology is to conduct Objective IQA, the other essential IQA approach. In Objective IQA, mathematical models are the core; they generate numerical quality scores. Depending on the availability of reference images, there are three types of Objective IQA: Full-Reference IQA (FR-IQA), Reduced-Reference IQA (RR-IQA), and No-Reference IQA (NR-IQA) [11]. To match how medical images are usually assessed in hospitals, NR-IQA is adopted in our case, meaning that no reference image is involved. Three NR-IQA mathematical models are used: the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) [3], the Naturalness Image Quality Evaluator (NIQE) [4], and the Perception based Image Quality Evaluator (PIQE) [5]. With the toolbox provided by MATLAB, our Objective IQA is conducted entirely within MATLAB.
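A minimal MATLAB sketch of this step, using the Image Processing Toolbox functions brisque(), niqe(), and piqe(); the dataset folder name is an assumption:

    % Compute the three NR-IQA scores for every image in the project dataset.
    files  = dir(fullfile('project_dataset', '*.png'));   % assumed folder of the 50 CT scans
    scores = zeros(numel(files), 3);                       % columns: BRISQUE, NIQE, PIQE
    for k = 1:numel(files)
        I = imread(fullfile(files(k).folder, files(k).name));
        scores(k, 1) = brisque(I);   % for all three models, a lower score
        scores(k, 2) = niqe(I);      % indicates better perceptual quality
        scores(k, 3) = piqe(I);
    end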
Unlike Subjective IQA, it is not certain in advance whether the results generated by Objective IQA will meet expectations. To ensure that the results of Objective IQA are reasonable and logical, an "Initial Check" is carried out for every mathematical model used. In each model's Initial Check, images are checked in groups: the quality scores of distortion-free images should be better than those of distorted images. For example, an acceptable group of scores could be 40 (a good-quality image), 50 (a blur-distorted image), and 60 (a contrast-distorted image); lower scores stand for better quality, and this principle holds for all three models.
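A possible form of the Initial Check, assuming each distorted score is paired with the score of the distortion-free image it was generated from (the score values below are hypothetical):

    % Each distorted image should score worse (higher) than its clean source.
    cleanScores     = [40.2 35.8 42.1];   % hypothetical model scores of clean images
    distortedScores = [55.3 33.0 61.7];   % hypothetical scores of their distorted versions
    passed    = distortedScores > cleanScores;        % per-pair check (lower = better quality)
    matchRate = 100 * nnz(passed) / numel(passed);    % fraction of results matching expectations
    fprintf('Initial Check pass rate: %.0f%%\n', matchRate);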
III. RESULTS
The results of both Subjective IQA and Objective IQA are numerical, but with different ranges and meanings. In Subjective IQA, the results are integers between 1 and 5, and a larger score stands for better quality. In Objective IQA, the results are decimal values between 0 and 100, and a lower score stands for better quality, following the convention of NR-IQA mathematical models.
As mentioned earlier, an initial check was first conducted for each assessment to confirm the rationality of the results and to catch errors, outliers, and accidental circumstances. A result that passes the initial check shows numerically that the quality of distortion-free images is better than that of distorted images.
In Subjective IQA, the two clinicians and one radiologist scored every single medical image.
In Objective IQA, BRISQUE and NIQE were applied first because of their influence in IQA research. After the initial check, only 30% of BRISQUE results and 28% of NIQE results matched our expectations. Because of these unsatisfactory results, it became necessary to add another mathematical model, PIQE. After the initial check of PIQE's performance, 82% of its results matched our expectations. Consequently, the results generated by PIQE are taken as the final results of Objective IQA, and all data are analyzed in the following section.
IV. COMPARATIVE ANALYSIS
Based on the numerical results of Subjective IQA and Objective IQA generated above, a comparative analysis provides further detail. The analysis results are divided into three categories according to their function: main results, clinical results, and computational results. The main results relate to the project aims and objectives. The clinical results raise observations from the clinical perspective, i.e., relevant to hospitals or healthcare institutions. The computational results discuss several novel findings from the computer science perspective. The following paragraphs present these three categories in turn.
For the main results, the comparison shows that the results of Subjective IQA and Objective IQA are highly similar, with up to 80% similarity. Additionally, the mathematical model Perception based Image Quality Evaluator (PIQE)
has performed well in our experiments on medical images, especially X-ray CT scans.
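The paper does not specify the exact rule behind the 80% similarity figure. One plausible sketch is to rescale the PIQE scores (0-100, lower is better) onto the 5-point Likert scale (higher is better) and count the images on which the two assessments agree within one point; the vectors below are hypothetical examples, not the study data.

    mos        = [5; 4; 2];            % hypothetical mean opinion scores (1-5)
    piqeScores = [12.4; 30.1; 85.6];   % hypothetical PIQE scores (0-100)
    piqeOn5    = 5 - 4 * (piqeScores / 100);        % map 0 -> 5 (best), 100 -> 1 (worst)
    agree      = abs(round(piqeOn5) - mos) <= 1;    % agreement within one Likert point
    similarity = 100 * nnz(agree) / numel(agree);
    fprintf('Subjective/Objective agreement: %.0f%%\n', similarity);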
For the clinical results, firstly, there were no significant differences between the results from the clinicians and the radiologist; they usually reached the same or similar decisions on a given case. Secondly, the doctors were sensitive to changes in the contrast values of images. Thirdly, the doctors were not highly sensitive to changes in blur when the standard deviation of the Gaussian filter stayed within a specific range. This is explainable: human eyes can tolerate slight changes in an image while still perceiving the same information.
For the computational results, there are several novel findings. Firstly, and most significantly, the experiments showed that two well-known NR-IQA mathematical models, BRISQUE and NIQE, did not work well on X-ray CT scans, which was outside our original expectations. In addition, relating to the clinical results above, it was found that blur has no adverse effect on doctors' judgments when the standard deviation of the Gaussian filter stays within the range (0.5, 1.0). Furthermore, remapping the grayscale contrast values with 'low_in' in (0.1, 0.3) and 'high_in' in (0.6, 0.8) may even improve image quality from the doctors' perspective. All of the numerical parameters above correspond to functions in the MATLAB toolboxes, especially imgaussfilt() and imadjust().
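Expressed as the MATLAB calls these parameters refer to (the file name and the particular values chosen within the reported ranges are illustrative assumptions):

    I = imread('ct_scan.png');            % hypothetical grayscale CT slice
    Iblur = imgaussfilt(I, 0.8);          % sigma within (0.5, 1.0): blur not judged harmful
    Iadj  = imadjust(I, [0.2 0.7], []);   % low_in in (0.1, 0.3), high_in in (0.6, 0.8)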
V. CONCLUSIONS
Through a three-step methodology of Dataset Construction, Subjective Image Quality Assessment, and Objective Image Quality Assessment, we have found, somewhat unexpectedly, that the mathematical model Perception based Image Quality Evaluator (PIQE) can be considered an efficient model for medical imaging, especially X-ray CT scans. Additionally, the inadequacy of the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) and the Naturalness Image Quality Evaluator (NIQE) for medical imaging was demonstrated during the research. Lastly, we have summarized other minor results found during the research, from the clinical and computational perspectives respectively.
ACKNOWLEDGEMENT
We wish to thank the Research Institute of Big Data
Analytics (RIBDA), Xi’an Jiaotong-Liverpool University,
Suzhou, China, for supporting our contributions to this paper
through the RIBDA conference subsidy fund.
REFERENCES
[1] L. Lévêque, H. Liu, S. Barakovic, J. B. Husic, M. Martini, M. Outtas, L. Zhang, A. Kumcu, L. Platisa, R. Rodrigues et al., "On the subjective assessment of the perceived quality of medical images and videos," in 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX). IEEE, 2018, pp. 1-6.
[2] P. Suetens, Fundamentals of Medical Imaging. Cambridge University Press, 2017.
[3] A. Mittal, A. K. Moorthy, and A. C. Bovik, "No-reference image quality assessment in the spatial domain," IEEE Transactions on Image Processing, vol. 21, no. 12, pp. 4695-4708, 2012.
[4] A. Mittal, R. Soundararajan, and A. C. Bovik, "Making a 'completely blind' image quality analyzer," IEEE Signal Processing Letters, vol. 20, no. 3, pp. 209-212, 2012.
[5] N. Venkatanath, D. Praneeth, M. C. Bh, S. S. Channappayya, and S. S. Medasani, "Blind image quality evaluation using perception-based features," in 2015 Twenty First National Conference on Communications (NCC). IEEE, 2015, pp. 1-6.
[6] H. R. Sheikh, A. C. Bovik, and L. Cormack, "No-reference quality assessment using natural scene statistics: JPEG2000," IEEE Transactions on Image Processing, vol. 14, no. 11, pp. 1918-1927, 2005.
[7] L. Liang, S. Wang, J. Chen, S. Ma, D. Zhao, and W. Gao, "No-reference perceptual image quality metric using gradient profiles for JPEG2000," Signal Processing: Image Communication, vol. 25, no. 7, pp. 502-516, 2010.
[8] T. Brandão and M. P. Queluz, "No-reference image quality assessment based on DCT domain statistics," Signal Processing, vol. 88, no. 4, pp. 822-833, 2008.
[9] Z. Wang, H. R. Sheikh, and A. C. Bovik, "No-reference perceptual quality assessment of JPEG compressed images," in Proceedings. International Conference on Image Processing, vol. 1. IEEE, 2002.
[10] R. Ferzli and L. J. Karam, "A no-reference objective image sharpness metric based on the notion of just noticeable blur (JNB)," IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, 2009.
[11] Z. Wang and A. C. Bovik, "Modern image quality assessment," Synthesis Lectures on Image, Video, and Multimedia Processing, vol. 2, no. 1, pp. 1-156, 2006.
[12] K. Yan, X. Wang, L. Lu, and R. M. Summers, "DeepLesion: automated mining of large-scale lesion annotations and universal lesion detection with deep learning," Journal of Medical Imaging, vol. 5, no. 3, p. 036501, 2018.
Fig. 1. Sample Medical Images in DeepLesion (with annotation and without)
Fig. 2. Sample of a Group of Processed Images.