255 Publications
121,122 Reads
86,754 Citations
Publications (255)
Mesh quality assessment (MQA) models play a critical role in the design, optimization, and evaluation of mesh operation systems in a wide variety of applications. Current MQA models, whether model-based methods using topology-aware features or projection-based approaches working on rendered 2D projections, often fail to capture the intricate intera...
No-reference bitstream-layer point cloud quality assessment (PCQA) can be deployed without full decoding at any network node to achieve real-time quality monitoring. In this work, we focus on the PCQA problem dedicated to Octree-RAHT encoding mode. First, to address the issue that existing PCQA databases have a small scale and limited distortion le...
No-reference bitstream-layer point cloud quality assessment (PCQA) can be deployed without full decoding at any network node to achieve real-time quality monitoring. In this work, we develop the first PCQA model dedicated to Trisoup-Lifting encoded 3D point clouds by analyzing bitstreams without full decoding. Specifically, we investigate the relat...
Despite substantial efforts dedicated to the design of heuristic models for omnidirectional (i.e., 360$^\circ$) image quality assessment (OIQA), a conspicuous gap remains due to the lack of consideration for the diversity of viewing behaviors that leads to the varying perceptual quality of 360$^\circ$ images. Two critical aspects underline this ove...
With the increasing demand for compressing and streaming 3D point clouds under constrained bandwidth, it has become ever more important to accurately and efficiently determine the quality of compressed point clouds, so as to assess and optimize the quality-of-experience (QoE) of end users. Here we make one of the first attempts at developing a bitstrea...
In practical media distribution systems, visual content usually undergoes multiple stages of quality degradation along the delivery chain, but the pristine source content is rarely available at most quality monitoring points along the chain to serve as a reference for quality assessment. As a result, full-reference (FR) and reduced-reference (RR) i...
Banding or false contour is an annoying visual artifact that degrades the perceptual quality of visual content. Since users increasingly expect better visual quality from such content and banding leads to deteriorated quality-of-experience, banding removal, or debanding, has taken on paramount importance. Existing d...
3D point clouds have found a wide variety of applications in multimedia processing, remote sensing, and scientific computing. Although most point cloud processing systems are developed to improve viewer experiences, little work has been dedicated to perceptual quality assessment of 3D point clouds. In this work, we build a new 3D point cloud databa...
Out-of-focus sections of whole slide images are a significant source of false positives and other systematic errors in clinical diagnoses. As a result, focus quality assessment (FQA) methods must be able to quickly and accurately differentiate between focus levels in a scan. Recently, deep learning methods using convolutional neural networks (CNNs)...
The real-world applications of 3D point clouds have been growing rapidly in recent years, but not much effective work has been dedicated to perceptual quality assessment of colored 3D point clouds. In this work, we first build a large 3D point cloud database for subjective and objective quality assessment of point clouds. We construct 20 high quali...
Banding or false contour is an annoying visual artifact whose impact is even more pronounced in ultra high definition, high dynamic range, and wide colour gamut visual content, which is becoming increasingly popular. Since users associate a heightened expectation of quality with such content and banding leads to deteriorated visual quality-of-exper...
The enormous space and diversity of natural images is usually represented by a few small-scale human-rated image quality assessment (IQA) datasets. This casts great challenges to deep neural network (DNN) based blind IQA (BIQA), which requires large-scale training data that is representative of the natural image distribution. It is extremely diffic...
AI technology has made remarkable achievements in computational pathology (CPath), especially with the help of deep neural networks. However, the network performance is highly related to architecture design, which commonly requires human experts with domain knowledge. In this paper, we combat this challenge with the recent advance in neural archite...
Image quality assessment (IQA) models aim to establish a quantitative relationship between visual images and their quality as perceived by human observers. IQA modeling plays a special bridging role between vision science and engineering practice, both as a test-bed for vision theories and computational biovision models and as a powerful tool that...
Image quality assessment (IQA) models aim to establish a quantitative relationship between visual images and their perceptual quality by human observers. IQA modeling plays a special bridging role between vision science and engineering practice, both as a test-bed for vision theories and computational biovision models, and as a powerful tool that c...
Out-of-focus microscopy lens in digital pathology is a critical bottleneck in high-throughput Whole Slide Image (WSI) scanning platforms, for which pixel-level automated Focus Quality Assessment (FQA) methods are highly desirable to help significantly accelerate the clinical workflows. Existing FQA methods include both knowledge-driven and data-dri...
The diversity of video delivery pipelines poses a grand challenge to the evaluation of adaptive bitrate (ABR) streaming algorithms and objective quality-of-experience (QoE) models. Here we introduce the largest subject-rated database of its kind so far, namely WaterlooSQoE-IV, consisting of 1350 adaptive streaming videos created from diverse source...
Many multimedia applications require precise understanding of the rate-distortion characteristics measured by the function relating visual quality to media attributes, which we term the generalized rate-distortion (GRD) function. In this study, we explore the GRD behavior of compressed digital videos in a two-dimensional space of bitrate and...
Macroblocking is a type of widely observed video artifact where severe block-shaped artifacts appear in video frames. Macroblocking may be produced by heavy lossy compression but is visually most annoying when transmission error such as packet loss occurs during network video transmission. Since receivers do not have access to the pristine-quality...
With the wider availability of High Dynamic Range (HDR) Wide Colour Gamut (WCG) content, both consumers and content producers have become more concerned about the preservation of creative intent. While the accurate representation of colour plays a vital role in preserving creative intent, there are relatively fewer objective image and video quality...
Rate-distortion (RD) theory is at the heart of lossy data compression. Here we aim to model the generalized RD (GRD) trade-off between the visual quality of a compressed video and its encoding profiles (e.g., bitrate and spatial resolution). We first define the theoretical functional space $\mathcal{W}$ of the GRD function by analyzing its ma...
We propose a deep bilinear model for blind image quality assessment (BIQA) that works for both synthetically and authentically distorted images. Our model constitutes two streams of deep convolutional neural networks (CNN), specializing in the two distortion scenarios separately. For synthetic distortions, we first pre-train a CNN to classify the d...
Rate-distortion (RD) analysis is at the heart of lossy data compression. Here we extend the idea to generalized RD (GRD) functions of compressed videos that characterize the visual quality of a video and its encoding profile, which includes not only bit rate but also other attributes such as video resolution. We first define the theoretical functio...
We propose a fast multi-exposure image fusion (MEF) method, namely MEF-Net, for static image sequences of arbitrary spatial resolution and exposure number. We first feed a low-resolution version of the input sequence to a fully convolutional network for weight map prediction. We then jointly upsample the weight maps using a guided filter. The final...
The fundamental conflict between the enormous space of adaptive streaming videos and the limited capacity for subjective experiment casts significant challenges to objective Quality-of-Experience (QoE) prediction. Existing objective QoE models exhibit complex functional form, failing to generalize well in diverse streaming environments. In this stu...
High Dynamic Range (HDR) Wide Color Gamut (WCG) Ultra High Definition (4K/UHD) content has become increasingly popular recently. Due to the increased data rate, novel video compression methods have been developed to maintain the quality of the videos being delivered to consumers under bandwidth constraints. This has led to new challenges for the de...
Image quality assessment (IQA) algorithms aim to predict perceived image quality by human observers. Over the last two decades, a large amount of work has been carried out in the field. New algorithms are being developed at a rapid rate in different areas of IQA, but are often tested and compared with limited existing models using out-of-date test...
A common approach to high dynamic range (HDR) imaging is to capture multiple images of different exposures followed by multi-exposure image fusion (MEF) in either radiance or intensity domain. A predominant problem of this approach is the introduction of the ghosting artifacts in dynamic scenes with camera and object motion. While many MEF methods...
In real-world visual content acquisition and distribution systems, a vast majority of visual content undergoes multiple distortions between the source and the end user. However, traditional image quality assessment (IQA) algorithms are usually validated and at times trained on image databases with a single distortion stage. Existing IQA methods for...
4K, ultra high-definition (UHD), and higher resolution video contents have become increasingly popular recently. The largely increased data rate casts great challenges to video compression and communication technologies. Emerging video coding methods are claimed to achieve superior performance for high-resolution video content, but thorough and ind...
We propose a deep bilinear model for blind image quality assessment (BIQA) that handles both synthetic and authentic distortions. Our model consists of two convolutional neural networks (CNN), each of which specializes in one distortion scenario. For synthetic distortions, we pre-train a CNN to classify image distortion type and level, where we enj...
Many multimedia applications require precise understanding of the rate-distortion characteristics measured by the function relating visual quality to media attributes, which we term the generalized rate-distortion (GRD) function. In this study, we explore the GRD behavior of compressed digital videos in a three-dimensional space of bitrate,...
In many science and engineering fields that require computational models to predict certain physical quantities, we are often faced with the selection of the best model under the constraint that only a small sample set can be physically measured. One such example is the prediction of human perception of visual quality, where sample images live in a...
Blind video quality assessment (BVQA) algorithms are traditionally designed with a two-stage approach - a feature extraction stage that computes typically hand-crafted spatial and/or temporal features, and a regression stage working in the feature space that predicts the perceptual quality of the video. Unlike the traditional BVQA methods, we propo...
Dynamic adaptive streaming over HTTP (DASH) provides an interoperable solution to overcome volatile network conditions, but how the human visual quality-of-experience (QoE) changes with time-varying video quality is not well understood. Here, we build a large-scale video database of time-varying quality and design a series of subjective experim...
We propose a multi-exposure image fusion (MEF) algorithm by optimizing a novel objective quality measure, namely the color MEF structural similarity (MEF-SSIM$_c$) index. The design philosophy we introduce here is substantially different from existin...
We propose a Multi-task End-to-end Optimized deep neural Network (MEON) for blind image quality assessment (BIQA). MEON consists of two sub-networks—a distortion identification network and a quality prediction network—sharing the early layers. Unlike traditional methods used for training multi-task networks, our training process is performed in two...
With the remarkable growth of adaptive streaming media applications, especially the wide usage of dynamic adaptive streaming schemes over HTTP (DASH), it becomes ever more important to understand the perceptual quality-of-experience (QoE) of end users, who may be constantly experiencing adaptations (switchings) of video bitrate, spatial resolution,...
In practical media distribution systems, visual content often undergoes multiple stages of quality degradations along the delivery chain between the source and destination. By contrast, current image quality assessment (IQA) models are typically validated on image databases with a single distortion stage. In this work, we construct two large-scale...
Digital images in the real-world are created by a variety of means and have diverse properties. A photographical natural scene image (NSI) may exhibit substantially different characteristics from a computer graphic image (CGI) or a screen content image (SCI). This casts major challenges to objective image quality assessment, for which existing appr...
Objective assessment of image quality is fundamentally important in many image processing tasks. In this work, we focus on learning blind image quality assessment (BIQA) models which predict the quality of a digital image with no access to its original pristine-quality counterpart as reference. One of the biggest challenges in learning BIQA models...
Blind image quality assessment (BIQA) aims to estimate the subjective quality of a query image without access to the reference image. Existing learning based methods typically train a regression function by minimizing the average error between subjective opinion scores and model predictions. However, minimizing average error does not necessarily le...
We propose a simple yet effective structural patch decomposition (SPD) approach for multi-exposure image fusion (MEF) that is robust to ghosting effect. We decompose an image patch into three conceptually independent components: signal strength, signal structure, and mean intensity. Upon fusing these three components separately, we reconstruct a de...
Blind image quality assessment (BIQA) of distorted stereoscopic pairs without referring to the undistorted source is a challenging problem, especially when the distortions in the left- and right-views are asymmetric. Existing studies suggest that simply averaging the quality of the left- and right-views well predicts the quality of symmetrically dis...
Objective quality assessment of stereoscopic 3D video is challenging but highly desirable, especially in the application of stereoscopic video compression and transmission, where useful quality models are missing that can guide the critical decision making steps in the selection of mixed-resolution coding, asymmetric quantization, and pre- and post...
Subjective and objective measurement of the perceptual quality of depth information in symmetrically and asymmetrically distorted stereoscopic images is a fundamentally important issue in stereoscopic 3D imaging that has not been deeply investigated. Here we first carry out a subjective test following the traditional absolute category rating protoc...
The human visual system excels at detecting local blur of visual images, but the underlying mechanism is mysterious. Traditional views of blur such as reduction in local or global high-frequency energy and loss of local phase coherence have fundamental limitations. For example, they cannot well discriminate flat regions from blurred ones. Here we a...
The great content diversity of real-world digital images poses a grand challenge to image quality assessment (IQA) models, which are traditionally designed and validated on a handful of commonly used IQA databases with very limited content variation. To test the generalization capability and to facilitate the wide usage of IQA techniques in realwor...
With the rapid growth of streaming media applications, there has been a strong demand for quality-of-experience (QoE) measurement and QoE-driven video delivery technologies. Most existing methods rely on bitrate and global statistics of stalling events for QoE prediction. This is problematic for two reasons. First, using the same bitrate to encode d...
With the fast advances in video acquisition, computational imaging, and display technologies, there has been a growing interest in high dynamic range (HDR) videos. Tone mapping operators (TMOs) that convert HDR content to low dynamic range (LDR) ones provide a practically useful solution for the visualization of HDR videos on standard LDR displays,...
Network streaming video services have been growing explosively in the past decade, but how to measure and assure the video quality-of-experience (QoE) of end consumers is still an open problem. Poor presentation quality and playback stalling have been identified as the most dominant factors that degrade user QoE. Although both factors have been stu...
We propose a structural similarity (SSIM) motivated two-pass VBR rate control algorithm for High Efficiency Video Coding (HEVC). Given a bit rate budget, the available bits are optimally allocated at group of pictures (GoP), frame, and coding unit (CU) levels by hierarchically constructing a perceptually uniform space with an SSIM-inspired divisive no...
Screen content image (SCI) has recently emerged as an active topic due to the rapidly increasing demand in many graphically rich services such as wireless displays and virtual desktops. Image quality models play an important role in measuring and optimizing user experience of SCI compression and transmission systems, but are currently lacking. SCIs...
The state-of-the-art High Efficiency Video Coding (HEVC) standard adopts a hierarchical coding structure to improve its coding efficiency. This allows for the Quantization Parameter Cascading (QPC) scheme that assigns Quantization Parameters (Qps) to different hierarchical layers in order to further improve the Rate-Distortion (RD) performance. How...
In this paper we aim to tackle the problem of reconstructing a high-resolution image from a single low-resolution input image, known as single image super-resolution. In the literature, sparse representation has been used to address this problem, where it is assumed that both low-resolution and high-resolution images share the same sparse represen...
Background: Dual energy subtraction (DES) radiography is a powerful but underutilized technique which aims to improve the diagnostic value of an X-ray by separating soft tissue from bones, producing two different images. Compared to traditional chest X-rays, DES requires exposure to higher doses of radiation but may achieve higher accuracy. The ob...
Contrast is a fundamental attribute of images that plays an important role in human visual perception of image quality. With numerous approaches proposed to enhance image contrast, much less work has been dedicated to automatic quality assessment of contrast changed images. Existing approaches rely on global statistics to estimate contrast quality....
The popular Structural SIMilarity (SSIM) index has been shown to be a good perceptual criterion for testing and optimizing video encoders such as the MPEG-H/H.265 High Efficiency Video Coding (HEVC). However, it is still unclear how to compare two HEVC encoders with a number of bit rates and SSIM values. In this work, we study the video quality comparis...
How to deliver videos to consumers over the network for optimal quality-of-experience (QoE) has been the central goal of modern video delivery services. Surprisingly, despite the large volume of videos being delivered every day through various systems attempting to improve visual QoE, the actual QoE of end consumers is not properly assessed, n...