SSIM image quality metric for denoised images

Authors:
  • University of the Commonwealth Caribbean, Kingston, Jamaica

PETER NDAJAH, HISAKAZU KIKUCHI, MASAHIRO YUKAWA,
HIDENORI WATANABE and SHOGO MURAMATSU
Department of Electrical and Electronics Engineering,
Niigata University,
JAPAN
email: ndajah@telecom0.eng.niigata-u.ac.jp
September 3, 2010
Abstract
The mean square error (MSE) and its related metrics, such as peak signal to noise ratio (PSNR), root mean square error (RMSE), mean absolute error (MAE), and signal to noise ratio (SNR), have been the basis for mathematically defined image quality measurement for a long time. These methods are all based on the MSE. Denoising quality has also traditionally been measured in terms of the MSE or its derivatives. But none of these metrics takes the structural fidelity of the image into account. Here, we investigate the structural changes that occur during the denoising process. In particular, we ascertain the structural fidelity of TV-denoised images.
Keywords: SSIM, MSE, TV, PSNR, NOISE, DENOISE, METRIC
1 MSE-based Image Quality Measure
The MSE has long been the basis for image quality measurement. Usually, one of the images (the original) is assumed to contain no distortions, while the other image is contaminated by noise or some other kind of error. Suppose $x = \{x_i \mid i = 1, 2, \ldots, N\}$ and $y = \{y_i \mid i = 1, 2, \ldots, N\}$, where $x_i$ and $y_i$ are the $i$th samples in $x$ and $y$ and $N$ is the number of signal samples. Then the MSE between the signals is
$$\mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2$$
where $e_i = x_i - y_i$ is referred to as the error signal. An image is a two-dimensional signal, so with its pixels indexed by $i$ the MSE is given, up to the normalizing factor $1/N$, as
$$d(x, y) = \sum_{i=1}^{N} |e_i|^2$$
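In image processing, the MSE is often used in the form of the peak signal to noise ratio (PSNR) measure
$$\mathrm{PSNR} = 10 \log_{10} \frac{L^2}{\mathrm{MSE}}$$
where $L$ is the dynamic range of the pixel intensities (for example, 255 for 8-bit images).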
ADVANCES in VISUALIZATION, IMAGING and SIMULATION
ISSN: 1792-6130
53
ISBN: 978-960-474-246-2
The PSNR is more useful than the MSE only when images of different dynamic ranges are being compared; otherwise it is equivalent to the MSE [7].
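The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `mse` and `psnr`, the parameter `dynamic_range` (standing in for $L$) and the tiny 2 x 2 test arrays are assumptions made for the example.

```python
import numpy as np


def mse(x, y):
    """Mean square error between two equally sized signals or images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.mean((x - y) ** 2))


def psnr(x, y, dynamic_range=255.0):
    """PSNR in dB; dynamic_range plays the role of L (assumed 255 for 8-bit data)."""
    err = mse(x, y)
    if err == 0.0:
        return float("inf")  # identical images: PSNR is unbounded
    return float(10.0 * np.log10(dynamic_range ** 2 / err))


# Hypothetical 2 x 2 "images" used only to exercise the functions.
reference = np.array([[10, 20], [30, 40]], dtype=np.uint8)
distorted = np.array([[12, 18], [33, 36]], dtype=np.uint8)

print(mse(reference, distorted))   # errors are -2, 2, -3, 4, so MSE = 33 / 4 = 8.25
print(psnr(reference, distorted))
```

Casting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce. For a fixed $L$ the PSNR is a monotonic function of the MSE, which is why the two rank images identically.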
2 Drawbacks of MSE-based Image Quality Measure
In imaging, the true aim of any denoising method is to improve the visual quality and fidelity of a noisy image, but the MSE does not take into account image dependencies such as textures, orderings and patterns, all of which affect perceived image quality. The ordering of image pixels conveys vital information about the structure of a visual scene. Unfortunately, the MSE does not measure this. The correlation between the error signal and the underlying image significantly affects perceptual image distortion, but this is also ignored by the MSE. The MSE also ignores the sign of the error signal added to an image, since each error sample is squared, yet the visual fidelity of the resulting images has been shown to differ drastically. Since all images are treated equally in the formulation of the MSE, image content-dependent variations in image fidelity cannot be accounted for.
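Two of these blind spots are easy to check numerically. The sketch below is illustrative only: the random 8 x 8 array stands in for an image, and all variable names are assumptions made for the example. It shows that flipping the sign of the error signal, or shuffling the pixel order of both images in the same way, leaves the MSE unchanged.

```python
import numpy as np


def mse(a, b):
    """Mean square error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(8, 8)).astype(np.float64)  # stand-in "image"
e = rng.normal(0.0, 5.0, size=x.shape)                     # arbitrary error signal

# Sign of the error: adding or subtracting e gives the same MSE.
mse_plus = mse(x, x + e)
mse_minus = mse(x, x - e)

# Pixel ordering: apply one random permutation to both images.
perm = rng.permutation(x.size)
x_shuffled = x.ravel()[perm]
y_shuffled = (x + e).ravel()[perm]
mse_shuffled = mse(x_shuffled, y_shuffled)

print(mse_plus, mse_minus, mse_shuffled)  # three equal values, up to rounding
```

The shuffled pair no longer resembles an image at all, yet the MSE cannot tell it apart from the original pair, because it only ever looks at one pixel position at a time.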
3 Structural Similarity (SSIM)
The SSIM is a recently proposed image fidelity measure which has proved highly effective in measuring the fidelity of signals. The SSIM approach was originally motivated by the observation that natural images are highly structured signals with strong neighborhood dependencies. These dependencies carry useful information about the structures of the objects in the visual scene.
The human visual system is highly adapted to extract structural information from visual scenes. For this reason, image fidelity measurement should treat the preservation of signal structure as an essential ingredient. A distinction has to be made between non-structural distortions, such as variations in luminance, contrast, gamma distortions and spatial shift (these do not change the structure of the image in any way), and structural distortions, such as additive Gaussian noise, blur and lossy compression (e.g. JPEG). The latter distort the structure of the image significantly.
The human visual system is highly sensitive to structural distortions and easily compensates for non-structural distortions. The main function of the SSIM is to simulate this behavior.
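Let $x = \{x_i \mid i = 1, 2, \ldots, N\}$ and $y = \{y_i \mid i = 1, 2, \ldots, N\}$ be the original and the test image signals respectively. Then the SSIM index is
$$Q = \frac{4 \sigma_{xy} \bar{x} \bar{y}}{(\sigma_x^2 + \sigma_y^2)\,[(\bar{x})^2 + (\bar{y})^2]} \qquad (1)$$
The above equation can be rewritten as
$$Q = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \cdot \frac{2 \bar{x} \bar{y}}{(\bar{x})^2 + (\bar{y})^2} \cdot \frac{2 \sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2} \qquad (2)$$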
The SSIM measures distortions as a combination of three factors: loss of correlation, luminance distortion and contrast distortion. The first component in (2) is the correlation coefficient between $x$ and $y$, which measures the degree of linear correlation between them. Its dynamic range is $[-1, 1]$, and the best value 1 is obtained when $y_i$ is linear with respect to $x_i$ for all $i = 1, 2, \ldots, N$, i.e. $y_i = a x_i + b$ with $a > 0$. The second component has a value range of $[0, 1]$. It measures how close the mean luminance of $x$ is to that of $y$, and equals 1 if and only if $\bar{x} = \bar{y}$. The third component measures the similarity of the contrast between $x$ and $y$. Its range is also $[0, 1]$, where the best value 1 occurs only when $\sigma_x = \sigma_y$.
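The factorization in (2) can be verified numerically. The following sketch computes $Q$ from (1) alongside the three factors of (2) over a whole signal. Several choices here are assumptions rather than details given in the text: the function name `quality_index`, the use of unbiased ($N - 1$) estimators for $\sigma_x^2$, $\sigma_y^2$ and $\sigma_{xy}$, and the computation over the entire signal rather than over local windows.

```python
import numpy as np


def quality_index(x, y):
    """Return (Q, correlation, luminance, contrast) for two signals.

    Q follows equation (1); the other three values are the factors of (2).
    Assumes neither signal is constant and the means are not both zero,
    otherwise a denominator vanishes.
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(ddof=1), y.std(ddof=1)              # sample standard deviations
    sxy = np.sum((x - mx) * (y - my)) / (x.size - 1)   # sample covariance

    correlation = sxy / (sx * sy)
    luminance = 2.0 * mx * my / (mx ** 2 + my ** 2)
    contrast = 2.0 * sx * sy / (sx ** 2 + sy ** 2)
    q = 4.0 * sxy * mx * my / ((sx ** 2 + sy ** 2) * (mx ** 2 + my ** 2))
    return q, correlation, luminance, contrast


# y = 2x + 1 is linear in x with positive slope, so the correlation factor is 1,
# while the luminance and contrast factors fall below 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0
q, corr, lum, con = quality_index(x, y)
print(q, corr * lum * con)  # the two numbers agree
```

In this example the means are 3 and 7 and the variances are 2.5 and 10, so the luminance factor is 42/58 and the contrast factor is 0.8. The index is lowered by the mean and contrast changes alone, even though the two signals are perfectly correlated.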
4 A Comparison of the MSE and the SSIM
Figure 1: Images with different structural distortions but the same MSE values. Panels:
  • Reference Image: MSE = 0, SSIM = 1
  • Contrast Stretch: MSE = 255, SSIM = 0.9172
  • Negative Image: MSE = 255, SSIM = −0.1632
  • Gaussian White Noise: MSE = 255, SSIM = 0.5927
  • Lossy Compression: MSE = 255, SSIM = 0.6947
  • Blurred Image: MSE = 255, SSIM = 0.7722
Figure 1 illustrates the shortcomings of the MSE. In all the distorted images shown, the MSE = 255 even when the visual structures are greatly distorted. The SSIM, on the other hand, reflects the structural changes in the images more faithfully. This is the advantage of the SSIM over the MSE. The human visual system (HVS) is very sensitive to structural changes; therefore, any metric that is to correlate well with the HVS must take into account the structural dependencies of the signal samples in order to provide effective predictions of image quality.
Figure 2: Denoised images showing MSE values and SSIM index values. Panels:
  • (a) Reference Image: MSE = 255, SSIM = 1
  • (b) Denoised Image: λ = 60, τ = 0.01, MSE = 255, SSIM = 0.653430
  • (c) Denoised Image: λ = 12, τ = 0.01, MSE = 255, SSIM = 0.892388
  • (d) Denoised Image: λ = 2, τ = 0.01, MSE = 255, SSIM = 0.748494
  • (e) Denoised Image: λ = 1, τ = 0.01, MSE = 255, SSIM = 0.712412
  • (f) Denoised Image: λ = 0.5, τ = 0.01, MSE = 255, SSIM = 0.685501
Structural changes such as blurring often occur during the denoising of images. Most denoising algorithms do not actually 'remove' the noise; denoising is more a process of noise minimization than of removal. The amount of noise left in the image after the denoising operation depends on the amount of noise originally in the image. The MSE-based metrics may not capture this reality because they are not designed to measure the structural distortions that may occur.
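The effect shown in Figure 1 can be reproduced on synthetic data. The sketch below is not the experiment from the figure. It builds a hypothetical smooth test pattern, then applies a constant brightness shift (a non-structural change) and zero-mean noise (a structural change), both scaled to exactly the same MSE. The structural score used is the global index $Q$ of equation (1) with unbiased estimators, which is an assumption, and all names and constants are chosen for illustration.

```python
import numpy as np


def mse(x, y):
    """Mean square error between two arrays of the same shape."""
    return float(np.mean((x - y) ** 2))


def quality_index(x, y):
    """Global index Q of equation (1), using unbiased (N - 1) estimators."""
    x = np.asarray(x, dtype=np.float64).ravel()
    y = np.asarray(y, dtype=np.float64).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    sxy = np.sum((x - mx) * (y - my)) / (x.size - 1)
    return float(4.0 * sxy * mx * my / ((vx + vy) * (mx ** 2 + my ** 2)))


rng = np.random.default_rng(42)
rows, cols = np.mgrid[0:64, 0:64]
reference = 128.0 + 50.0 * np.sin(rows / 5.0) * np.cos(cols / 7.0)  # synthetic pattern

offset = 16.0
shifted = reference + offset            # brightness shift, MSE = offset ** 2

noise = rng.standard_normal(reference.shape)
noise -= noise.mean()                                # zero-mean noise
noise *= offset / np.sqrt(np.mean(noise ** 2))       # rescale so MSE = offset ** 2
noisy = reference + noise

print(mse(reference, shifted), mse(reference, noisy))
print(quality_index(reference, shifted), quality_index(reference, noisy))
```

Both distortions yield an MSE of 256, but the index for the brightness shift stays close to 1 while the index for the noisy image is visibly lower, mirroring the qualitative behavior reported in Figure 1.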
5 Denoised Image Structural Fidelity
So why use the SSIM index to measure the quality of denoised images? Because the MSE-based metrics do not tell the whole story. The ultimate objective of denoising is to produce an image that is judged to be a good representation of the reference image (known or unknown). The HVS is the ultimate judge of what a good quality image is. This means that the structural fidelity of the denoised image is of utmost importance, because the HVS uses structural fidelity to judge the quality of an image. The MSE-based metrics fail to measure the structural improvement or degradation in an image after denoising. This is because in the MSE-based metrics the signal samples are considered to be independent of each other. As we can see in Figure 2, the denoised images have different SSIM values, in line with their differing visual quality as judged by the HVS, but they have practically the same MSE values.
The total variation denoising algorithm was used to denoise the images because of its effectiveness and also because it has tunable parameters λ and τ that control the effectiveness of the denoising process. We have varied the values of λ and kept τ constant in the experiments.
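The text does not say which total variation solver was used or how exactly λ and τ enter it, so the following is only a sketch under stated assumptions. It performs gradient descent on a smoothed Rudin-Osher-Fatemi energy, $\sum \sqrt{|\nabla u|^2 + \varepsilon^2} + \frac{\lambda}{2} \sum (u - f)^2$, where λ is assumed to weight the data-fidelity term and τ is assumed to be the step size. The function names, the smoothing constant `eps`, the iteration count and the synthetic test image are all hypothetical.

```python
import numpy as np


def gradient(u):
    """Forward differences with zero gradient at the last column and row."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy


def divergence(px, py):
    """Negative adjoint of gradient(); expects px[:, -1] == 0 and py[-1, :] == 0."""
    dx = px.copy()
    dx[:, 1:] -= px[:, :-1]
    dy = py.copy()
    dy[1:, :] -= py[:-1, :]
    return dx + dy


def tv_energy(u, f, lam, eps=0.1):
    """Smoothed total variation of u plus lam / 2 times the squared distance to f."""
    gx, gy = gradient(u)
    tv = np.sum(np.sqrt(gx ** 2 + gy ** 2 + eps ** 2))
    return float(tv + 0.5 * lam * np.sum((u - f) ** 2))


def tv_denoise(f, lam, tau=0.01, n_iter=200, eps=0.1):
    """Gradient descent on tv_energy, starting from the noisy image f."""
    f = np.asarray(f, dtype=np.float64)
    u = f.copy()
    for _ in range(n_iter):
        gx, gy = gradient(u)
        norm = np.sqrt(gx ** 2 + gy ** 2 + eps ** 2)
        # Descent direction: divergence of the normalized gradient minus the fidelity pull.
        u = u + tau * (divergence(gx / norm, gy / norm) - lam * (u - f))
    return u


# Hypothetical test case: a bright square on a dark background plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[16:48, 16:48] = 1.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = tv_denoise(noisy, lam=12.0, tau=0.01)

print(np.mean((noisy - clean) ** 2), np.mean((denoised - clean) ** 2))
```

Under this parameterization a large λ keeps the result close to the noisy input, while a small λ lets the smoothing term dominate and blurs the image. The fixed step τ must be small enough for the iteration to remain stable, roughly below $2 / (8/\varepsilon + \lambda)$ for this discretization.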
6 Conclusion
We used the Lena image as the test image in our experiments. As Figure 2 shows, the changes in the structural similarity indices of the images correlate reasonably well with the judgment of the human visual system. For example, when λ ≤ 2 (panels (d)-(f)), the algorithm causes blurring in the images. The SSIM index reflects this fact, as the SSIM values become progressively smaller with decreasing visual quality of the images. However, the MSE remained the same throughout our experiments. For this reason, it may be useful to use the SSIM as an alternative metric of denoised image quality, since it is a good measure of the structural degradation or improvement in a denoised image.
References
[1] Z. Wang, L. Lu and A.C. Bovik. Video Quality Assessment based on Structural Distortion Analysis. IEEE International Conference on Image Processing, Genoa, Italy, Sept. 11-14, 2005.
[2] Z. Wang, A.C. Bovik and E.P. Simoncelli. Structural Approaches to Image Quality Assessment, in Handbook of Image and Video Processing (Al Bovik, ed.), 2nd Edition, Academic Press, 2005.
[3] Z. Wang, A.C. Bovik and L. Lu. Why is Image Quality Assessment Difficult? IEEE International Conference on Acoustics, Speech and Signal Processing, May 2002.
[4] Z. Wang and A.C. Bovik. A Universal Image Quality Index. IEEE Signal Processing Letters, vol. 9, no. 3, pp. 81-84, March 2002.
[5] Z. Wang, A.C. Bovik, H.R. Sheikh and E.P. Simoncelli. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.
[6] Z. Wang and A.C. Bovik. Mean Squared Error: Love It or Leave It? IEEE Signal Processing Magazine, January 2009.
[7] Z. Wang and A.C. Bovik. A Universal Image Quality Index. IEEE Signal Processing Letters, vol. 9, no. 3, pp. 81-84, 2002.