Chapter

Experimental Comparison of PSNR and SSIM Metrics for Video Quality Estimation

Abstract

The development of digital video technology has changed the approach to video quality estimation. Two broad types of metrics are used to measure the objective quality of processed digital video: purely mathematically defined metrics (DELTA, MSAD, MSE, SNR and PSNR), in which the error is calculated as a difference between the original and processed pixels, and metrics with characteristics similar to those of the Human Visual System (HVS), such as SSIM, NQI and VQM, in which perceptual quality is considered in the overall video quality estimation. This paper presents an overview and an experimental comparison of the PSNR and SSIM metrics for video quality estimation.
http://link.springer.com/chapter/10.1007%2F978-3-642-10781-8_37#
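As a rough illustration of the purely mathematical metrics named in the abstract, the sketch below computes MSE and PSNR for two short lists of pixel values. It is not the authors' implementation: the function names, the sample values, and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math

def mse(reference, processed):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(reference) != len(processed):
        raise ValueError("frames must contain the same number of pixels")
    return sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)

def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit samples."""
    err = mse(reference, processed)
    if err == 0:
        return float("inf")  # identical frames: no error at all
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical pixel values from an original and a processed frame.
original = [52, 55, 61, 59]
processed = [54, 55, 60, 57]

print(f"MSE  = {mse(original, processed):.2f}")
print(f"PSNR = {psnr(original, processed):.2f} dB")
```

For a video sequence, one common choice is to compute the metric per frame and then average over frames; whether the chapter does so is not stated in the abstract.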
... Comparison between the different noise-degraded synthetic datasets using a common metric was necessary to understand the impact of each noise type on the DVC analyses. Literature is abundant with techniques and models recommended for assessing image quality, defined as the magnitude of deviation from the ideal or reference model [48,49]. The most widely adopted image quality assessment metrics are the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR). ...
... The number of Ring Artefacts was uniform and constant on all occasions (equal to 10) found elsewhere [50]. The metric used to compare the image datasets implemented for this study was the structural similarity index (SSIM) [48][49][50][51]. SSIM measures the similarity between an original and a distorted image. ...
... Therefore, judging from a human visual perspective, it is recommended as a comparatively better metric than PSNR and MSE [48]. In addition, SSIM's improved performance compared to MSE and PSNR has been reported [49]. A modified implementation of the usual formulation of the structural similarity index between reference and distorted image was used in this study. ...
Article
Background: The potential effect of image noise artefacts on Digital Volume Correlation (DVC) analysis has not been thoroughly studied or, more particularly, quantified, even though DVC is an emerging technique that has been widely used in life and material science over the last decade. Objective: This paper presents the results of a sensitivity study to shed light on the effect of various noise artefacts on the full-field kinematic fields generated by DVC, under both zero and rigid body motion. Methods: Various noise artefacts were studied, including Gaussian, Salt & Pepper and Speckle noise and embedded Ring Artefacts. A noise-free synthetic microstructure was generated using Discrete Element Modelling (DEM), representing an idealised case and acting as the reference dataset for the DVC analysis. Noise artefacts of various intensities (including selected extreme cases) were added to the reference image datasets using MATLAB (R2022) to form the outline of the parametric study. DVC analyses were subsequently conducted with AVIZO (Thermo Fisher). A subset-based local approach was adopted. A three-dimensional version of the Structural Similarity Index Measure (SSIM) was used to define the similarity between the compared image datasets on each occasion. Sub-pixel rigid body motion was applied to the DEM-generated microstructure, which was subsequently “poisoned” with noise artefacts to evaluate the mean bias and random error of the DVC analysis. Results: When the local approach is implemented, the sensitivity study on zero-motion data revealed that Gaussian, Salt & Pepper and Speckle noise have an insignificant effect on the DVC-computed kinematic field. The presence of such noise artefacts can therefore be neglected when DVC is executed. Ring Artefacts, by contrast, can pose a considerable challenge, so DVC results need to be evaluated cautiously. A linear relationship between SSIM and the correlation index is observed for the same noise artefacts. Gaussian noise has a pronounced effect on the mean bias error for sub-pixel rigid body motion. Conclusions: Generating synthetic image datasets using DEM enabled the investigation of a variety of noise artefacts that potentially affect a DVC analysis. Consequently, any microstructure resembling the material studied can be simulated and used for a DVC sensitivity analysis, supporting the user in appropriately evaluating the computed kinematic field. Even though the study is conducted for a two-phase material, the method elaborated in this paper also applies to heterogeneous multi-phase materials. The conclusions drawn are valid within the environment of the AVIZO DVC extension module. Alternative DVC algorithms, utilising different approaches for the cross-correlation and sub-pixel interpolation methods, need to be investigated.
... While MSE, MAE, PSNR and SSIM are standard metrics for image-to-image translation tasks, we would also like to highlight that MSE and PSNR do not capture blurring (Ndajah et al., 2010). Moreover, PSNR and SSIM are highly sensitive to rotations, spatial shifts and scaling (Wang and Bovik, 2009), as well as Gaussian noise (Kotevski and Mitrevski, 2009). CTA imaging as a modality is highly noisy in contrast to TOF-MRA imaging. ...
Preprint
Cerebrovascular disease often requires multiple imaging modalities for accurate diagnosis, treatment, and monitoring. Computed Tomography Angiography (CTA) and Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) are two common non-invasive angiography techniques, each with distinct strengths in accessibility, safety, and diagnostic accuracy. While CTA is more widely used in acute stroke due to its faster acquisition times and higher diagnostic accuracy, TOF-MRA is preferred for its safety, as it avoids radiation exposure and contrast agent-related health risks. Despite the predominant role of CTA in clinical workflows, there is a scarcity of open-source CTA data, limiting the research and development of AI models for tasks such as large vessel occlusion detection and aneurysm segmentation. This study explores diffusion-based image-to-image translation models to generate synthetic CTA images from TOF-MRA input. We demonstrate the modality conversion from TOF-MRA to CTA and show that diffusion models outperform a traditional U-Net-based approach. Our work compares different state-of-the-art diffusion architectures and samplers, offering recommendations for optimal model performance in this cross-modality translation task.
... These metrics provide a quantitative measure on the performance of the deep learning model which gives a comprehensive assessment of the model's effectiveness. This study aims to go beyond the single parameter evaluation approach and investigates multiple image-matching quality measures for a more comprehensive evaluation [10,11]. ...
... Even when the image is degraded by major types of distortion such as Gaussian blur, motion blur, and noise, the resulting SSIM and PSNR scores can be larger than those of the enhanced image, which contradicts human visual perception. More related works showing the drawbacks of SSIM and PSNR can be found in (73, 107-109). ...
Article
Fundus cameras are widely used by ophthalmologists for monitoring and diagnosing retinal pathologies. Unfortunately, no optical system is perfect, and the visibility of retinal images can be greatly degraded due to the presence of problematic illumination, intraocular scattering, or blurriness caused by sudden movements. To improve image quality, different retinal image restoration/enhancement techniques have been developed, which play an important role in improving the performance of various clinical and computer-assisted applications. This paper gives a comprehensive review of these restoration/enhancement techniques, discusses their underlying mathematical models, and shows how they may be effectively applied in real-life practice to increase the visual quality of retinal images for potential clinical applications including diagnosis and retinal structure recognition. All three main topics of retinal image restoration/enhancement techniques, i.e., illumination correction, dehazing, and deblurring, are addressed. Finally, some considerations about the challenges and future scope of retinal image restoration/enhancement techniques are discussed.
... However, some visually apparent distortions are not captured by the SSIM. Changes in hue and brightness, for example, do not seem to affect SSIM much [49], and Sharif et al. [50] present more examples of unpredictable SSIM behaviour under different distortions. ...
Preprint
With recent advances in video prediction, controllable video generation has been attracting more attention. Generating high fidelity videos according to simple and flexible conditioning is of particular interest. To this end, we propose a controllable video generation model using pixel level renderings of 2D or 3D bounding boxes as conditioning. In addition, we also create a bounding box predictor that, given the initial and ending frames' bounding boxes, can predict up to 15 bounding boxes per frame for all the frames in a 25-frame clip. We perform experiments across 3 well-known AV video datasets: KITTI, Virtual-KITTI 2 and BDD100k.
... By evaluating the ratio of peak signal strength to mean square error, PSNR measures the difference between original and watermarked images. However, PSNR has certain limitations [2]: it does not consider the perceptual impact of noise or distortion introduced by watermarks, and its effectiveness diminishes in scenarios involving additive noise. ...
Article
Digital watermarking is an essential tool for numerous applications, and the quality of watermarked images must be assessed using accurate criteria. Peak Signal-to-Noise Ratio (PSNR), a widely used image assessment metric, has limitations when evaluating images containing noise, such as watermarks. To address this issue, this study investigates a different assessment metric, the Neutrosophic Similarity Measure, and compares its performance with that of PSNR in evaluating watermarked images, in order to ascertain whether the Neutrosophic Similarity Measure has a higher noise tolerance and offers a more accurate evaluation of watermarked images. Experimental evaluation on a dataset of watermarked images shows that the Neutrosophic Similarity Measure outperforms PSNR in capturing the influence of additive watermarks and demonstrates superior noise tolerance. These findings highlight the possibility of adopting new assessment metrics, such as the Neutrosophic Similarity Measure, for assessing watermarked images, thereby making their evaluation more effective.
Conference Paper
Many recently proposed perceptual image quality assessment algorithms are implemented in two stages. In the first stage, image quality is evaluated within local regions. This results in a quality/distortion map over the image space. In the second stage, a spatial pooling algorithm is employed that combines the quality/distortion map into a single quality score. While great effort has been devoted to developing algorithms for the first stage, little has been done to find the best strategies for the second stage (a simple spatial average is often used). In this work, we investigate three spatial pooling methods for the second stage: Minkowski pooling, local quality/distortion-weighted pooling, and information content-weighted pooling. Extensive experiments with the LIVE database show that all three methods may improve the prediction performance of perceptual image quality measures, but the third method demonstrates the best potential to be a general and robust method that leads to consistent improvement over a wide range of image distortion types.
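The second-stage pooling described above can be sketched as follows. This is a minimal illustration, not the formulation evaluated in the paper: the function names and sample maps are hypothetical, the Minkowski form shown is the plain mean of p-th powers (some formulations additionally take the p-th root), and the weights in the weighted variant are simply supplied by the caller rather than derived from local quality or information content.

```python
def mean_pool(quality_map):
    """Simple spatial average of a local quality/distortion map."""
    return sum(quality_map) / len(quality_map)

def minkowski_pool(distortion_map, p=2.0):
    """Mean of p-th powers; p > 1 emphasises the most distorted regions."""
    return sum(abs(d) ** p for d in distortion_map) / len(distortion_map)

def weighted_pool(quality_map, weights):
    """Weighted average; the weights are assumed to be non-negative."""
    if len(quality_map) != len(weights):
        raise ValueError("map and weights must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not all be zero")
    return sum(q * w for q, w in zip(quality_map, weights)) / total

# Hypothetical local distortion values: two mild regions and one severe one.
distortion = [0.1, 0.1, 0.8]
print(mean_pool(distortion))            # every region counts equally
print(minkowski_pool(distortion, p=2))  # dominated by the 0.8 region
print(weighted_pool([0.9, 0.5], [1.0, 3.0]))
```

With p = 1 the Minkowski form reduces to the simple spatial average, which makes the averaging baseline mentioned in the abstract a special case.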
Article
Motion is one of the most important types of information contained in natural video, but direct use of motion information in the design of video quality assessment algorithms has not been deeply investigated. Here we propose to incorporate a recent model of human visual speed perception [Nat. Neurosci. 9, 578 (2006)] and model visual perception in an information communication framework. This allows us to estimate both the motion information content and the perceptual uncertainty in video signals. Improved video quality assessment algorithms are obtained by incorporating the model as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty. Consistent improvement over existing video quality assessment algorithms is observed in our validation with the Video Quality Experts Group Phase I test data set.
Book
55% new material in the latest edition of this "must-have" for students and practitioners of image & video processing! This Handbook is intended to serve as the basic reference point on image and video processing, in the field, in the research laboratory, and in the classroom. Each chapter has been written by carefully selected, distinguished experts specializing in that topic and carefully reviewed by the Editor, Al Bovik, ensuring that the greatest depth of understanding be communicated to the reader. Coverage includes introductory, intermediate and advanced topics, and as such this book serves equally well as a classroom textbook and as a reference resource. - Provides practicing engineers and students with a highly accessible resource for learning and using image/video processing theory and algorithms - Includes a new chapter on image processing education, which should prove invaluable for those developing or modifying their curricula - Covers the various image and video processing standards that exist and are emerging, driving today's explosive industry - Offers an understanding of what images are and how they are modeled, and gives an introduction to how they are perceived - Introduces the necessary, practical background to allow engineering students to acquire and process their own digital image or video data - Culminates with a diverse set of applications chapters, covered in sufficient depth to serve as extensible models to the reader's own potential applications About the Editor... Al Bovik is the Cullen Trust for Higher Education Endowed Professor at The University of Texas at Austin, where he is the Director of the Laboratory for Image and Video Engineering (LIVE). He has published over 400 technical articles in the general area of image and video processing and holds two U.S. patents. Dr. Bovik was Distinguished Lecturer of the IEEE Signal Processing Society (2000), received the IEEE Signal Processing Society Meritorious Service Award (1998) and the IEEE Third Millennium Medal (2000), and was a two-time Honorable Mention winner of the international Pattern Recognition Society Award. He is a Fellow of the IEEE, was Editor-in-Chief of the IEEE Transactions on Image Processing (1996-2002), has served on and continues to serve on many other professional boards and panels, and was the Founding General Chairman of the IEEE International Conference on Image Processing, which was held in Austin, Texas in 1994.
Article
Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
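The abstract does not state the index itself. As a hedged sketch, the code below evaluates the commonly cited SSIM expression over a single global window, using sample means, variances and covariance. The published method applies the index in local windows and averages the resulting map, so this simplified version is only an approximation of the idea; the function name and sample values are assumptions, and the constants 0.01 and 0.03 with a peak of 255 are the usual defaults for 8-bit images.

```python
def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized sequences of pixel values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    # Small constants that stabilise the ratio when the denominators are near zero.
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical reference and distorted pixel values.
reference = [52, 55, 61, 59]
distorted = [54, 55, 60, 57]
print(ssim_global(reference, reference))  # identical inputs give 1 (up to rounding)
print(ssim_global(reference, distorted))  # structural change lowers the score
```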
Article
In this article, we have reviewed the reasons why we (collectively) want to love or leave the venerable (but perhaps hoary) MSE. We have also reviewed emerging alternative signal fidelity measures and discussed their potential application to a wide variety of problems. The message we are trying to send here is not that one should abandon use of the MSE nor to blindly switch to any other particular signal fidelity measure. Rather, we hope to make the point that there are powerful, easy-to-use, and easy-to-understand alternatives that might be deployed depending on the application environment and needs. While we expect (and indeed, hope) that the MSE will continue to be widely used as a signal fidelity measure, it is our greater desire to see more advanced signal fidelity measures being used, especially in applications where perceptual criteria might be relevant. Ideally, the performance of a new signal processing algorithm might be compared to other algorithms using several fidelity criteria. Lastly, we hope that we have given further motivation to the community to consider recent advanced signal fidelity measures as design criteria for optimizing signal processing algorithms and systems. It is in this direction that we believe that the greatest benefit eventually lies.
Article
We propose a new universal objective image quality index, which is easy to calculate and applicable to various image processing applications. Instead of using traditional error summation methods, the proposed index is designed by modeling any image distortion as a combination of three factors: loss of correlation, luminance distortion, and contrast distortion. Although the new index is mathematically defined and no human visual system model is explicitly employed, our experiments on various image distortion types indicate that it performs significantly better than the widely used distortion metric mean squared error. Demonstrative images and an efficient MATLAB implementation of the algorithm are available online at http://anchovy.ece.utexas.edu/~zwang/research/quality_index/demo.html.
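The three factors named in the abstract are usually written as a product. The abstract itself gives no formula, so the notation below is an assumption: $\bar{x}$ and $\bar{y}$ denote the means of the reference and distorted signals, $\sigma_x^2$ and $\sigma_y^2$ their variances, and $\sigma_{xy}$ their covariance.

```latex
Q \;=\;
\underbrace{\frac{\sigma_{xy}}{\sigma_x \sigma_y}}_{\text{loss of correlation}}
\cdot
\underbrace{\frac{2\,\bar{x}\,\bar{y}}{\bar{x}^2 + \bar{y}^2}}_{\text{luminance distortion}}
\cdot
\underbrace{\frac{2\,\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}}_{\text{contrast distortion}}
\;=\;
\frac{4\,\sigma_{xy}\,\bar{x}\,\bar{y}}
     {\left(\sigma_x^2 + \sigma_y^2\right)\left(\bar{x}^2 + \bar{y}^2\right)}
```

In this form the first factor lies in $[-1, 1]$ and the other two lie in $[0, 1]$ for non-negative signals, so $Q$ reaches 1 only when the two signals are identical.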