Hanhe Lin

Robert Gordon University | RGU · National Subsea Centre

Doctor of Philosophy


Publications: 66
Reads: 14,254
Citations: 609
Introduction
I am currently working at the National Subsea Centre, Robert Gordon University, UK. My research interests include machine learning and computer vision. My current project applies machine learning and deep learning to image/video processing, analysis, and visual quality assessment.
Additional affiliations
October 2016 - August 2021
Universität Konstanz
Position
  • PostDoc Position
July 2012 - September 2016
University of Otago
Position
  • PhD Student


Publications (66)
Article
Full-text available
Convolutional neural networks (CNNs) have significantly advanced computational modelling for saliency prediction. However, accurately simulating the mechanisms of visual attention in the human cortex remains an academic challenge. It is critical to integrate properties of human vision into the design of CNN architectures, leading to perceptually mo...
Conference Paper
Full-text available
Although image quality assessment (IQA) in-the-wild has been researched in computer vision, it is still challenging to precisely estimate perceptual image quality in the presence of real-world complex and composite distortions. In order to improve machine learning solutions for IQA, we consider side information denoting the presence of distortions...
Preprint
Full-text available
Convolutional neural networks (CNNs) have significantly advanced computational modeling for saliency prediction. However, the inherent inductive biases of convolutional architectures cause insufficient long-range contextual encoding capacity, which potentially makes a saliency model less humanlike. Transformers have shown great potential in encodin...
Article
Full-text available
In subjective full-reference image quality assessment, a reference image is distorted at increasing distortion levels. The differences between perceptual image qualities of the reference image and its distorted versions are evaluated, often using degradation category ratings (DCR). However, the DCR has been criticized since differences between rati...
Preprint
Full-text available
In subjective full-reference image quality assessment, differences between perceptual image qualities of the reference image and its distorted versions are evaluated, often using degradation category ratings (DCR). However, the DCR has been criticized since differences between rating categories on this ordinal scale might not be perceptually equidi...
Article
Full-text available
Video quality assessment (VQA) methods focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods under-perform in-the-wild. Deep learning approaches have had limited success due to the small size and diversity of existing VQA datasets, either artificial or authentical...
Article
Full-text available
Like many low- and middle-income countries, Nepal is experiencing a massive motorization, predominantly from increased use of motorcycles which is driving a surge in road-related injuries and fatalities. Motorcycles and their riders have been identified as a focal point for road traffic injury prevention measures. While helmet use is mandatory for...
Conference Paper
Background: Mandatory motorcycle helmet use regulation is essential, but its enforcement is even more important for head injury prevention, especially in a country like Nepal with a high share of motorcycle traffic. We assessed the impact of one-sided motorcycle helmet use regulation in Nepal, where helmet use is mandatory, but only drivers are fine...
Conference Paper
Background: The motorcycle is the main form of transport for many road users in the world, especially in low- and middle-income countries (LMIC), and since motorcycle riders are critically vulnerable in case of a crash, there should be strong enforcement of road safety related rules, such as helmet use. However, insufficient resources in LMIC hinder...
Chapter
Full-text available
We propose to use a quality estimator and evolutionary methods to search the latent space of generative adversarial networks trained on small, difficult datasets, or both. The new method leads to the generation of significantly higher quality images while preserving the original generator’s diversity. Human raters preferred an image from the new ve...
Conference Paper
Full-text available
Saliency has been widely studied in relation to image quality assessment (IQA). The optimal use of saliency in IQA metrics, however, is nontrivial and largely depends on whether saliency can be accurately predicted for images containing various distortions. Although tremendous progress has been made in saliency modelling, very little is known abou...
Preprint
Full-text available
We propose to use a quality estimator and evolutionary methods to search the latent space of generative adversarial networks trained on small, difficult datasets, or both. The new method leads to the generation of significantly higher quality images while preserving the original generator's diversity. Human raters preferred an image from the new ve...
Preprint
Full-text available
Super-resolution aims at increasing the resolution and level of detail within an image. The current state of the art in general single-image super-resolution is held by NESRGAN+, which injects a Gaussian noise after each residual layer at training time. In this paper, we harness evolutionary methods to improve NESRGAN+ by optimizing the noise injec...
Article
Full-text available
Automated detection of motorcycle helmet use through video surveillance can facilitate efficient education and enforcement campaigns that increase road safety. However, existing detection approaches have a number of shortcomings, such as the inabilities to track individual motorcycles through multiple frames, or to distinguish drivers from passenge...
Article
Full-text available
Current benchmarks for optical flow algorithms evaluate the estimation either directly by comparing the predicted flow fields with the ground truth or indirectly by using the predicted flow fields for frame interpolation and then comparing the interpolated frames with the actual frames. In the latter case, objective quality measures such as the mea...
Conference Paper
Full-text available
Super-resolution increases the resolution of an image. Using evolutionary optimization, we optimize the noise injection of a super-resolution method for improving the results. More generally, our approach can be used to optimize any method based on noise injection.
Conference Paper
Full-text available
Professional video editing tools can generate slow-motion video by interpolating frames from video recorded at a standard frame rate. Thereby the perceptual quality of such interpolated slow-motion videos strongly depends on the underlying interpolation techniques. We built a novel benchmark database that is specifically tailored for interpolated s...
Conference Paper
Full-text available
Video streaming under real-time constraints is an increasingly widespread application. Many recent video encoders are unsuitable for this scenario due to theoretical limitations or run time requirements. In this paper, we present a framework for the perceptual evaluation of foveated video coding schemes. Foveation describes the process of adapting...
Article
Full-text available
The Satisfied User Ratio (SUR) curve for a lossy image compression scheme, e.g., JPEG, gives the distribution function of the Just Noticeable Difference (JND), the smallest distortion level that can be perceived by a subject when a reference image is compared to a distorted one. A sequence of JNDs can be defined with a suitable successive choice of...
Article
Full-text available
Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10...
Preprint
Full-text available
Multi-level deep-features have been driving state-of-the-art methods for aesthetics and image quality assessment (IQA). However, most IQA benchmarks are comprised of artificially distorted images, for which features derived from ImageNet under-perform. We propose a new IQA dataset and a weakly supervised feature learning approach to train features...
Preprint
Full-text available
Current benchmarks for optical flow algorithms evaluate the estimation either directly by comparing the predicted flow fields with the ground truth or indirectly by using the predicted flow fields for frame interpolation and then comparing the interpolated frames with the actual frames. In the latter case, objective quality measures such as the mea...
Preprint
Full-text available
Video Quality Assessment (VQA) methods have been designed with a focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods under-perform in-the-wild. Deep learning approaches have had limited success due to the small size and diversity of existing VQA datasets, either...
Article
The continuous motorization of traffic has led to a sustained increase in the global number of road related fatalities and injuries. To counter this, governments are focusing on enforcing safe and law-abiding behavior in traffic. However, especially in developing countries where the motorcycle is the main form of transportation, there is a lack of...
Preprint
Full-text available
Deep learning methods for image quality assessment (IQA) are limited due to the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content, and annotating it accurately. We present a systematic and scalable approach to create KonIQ-10k, the largest IQA dataset to date consisting of 10,0...
Preprint
Full-text available
Subjective perceptual image quality can be assessed in lab studies by human observers. Objective image quality assessment (IQA) refers to algorithms for estimation of the mean subjective quality ratings. Many such methods have been proposed, both for blind IQA in which no original reference image is available as well as for the full-reference case....
Conference Paper
Full-text available
Current benchmarks for optical flow algorithms evaluate the estimation quality by comparing their predicted flow field with the ground truth, and additionally may compare interpolated frames, based on these predictions, with the correct frames from the actual image sequences. For the latter comparisons, objective measures such as mean square errors...
Conference Paper
Full-text available
The Satisfied User Ratio (SUR) curve for a lossy image compression scheme, e.g., JPEG, characterizes the probability distribution of the Just Noticeable Difference (JND) level, the smallest distortion level that can be perceived by a subject. We propose the first deep learning approach to predict such SUR curves. Instead of the direct approach of r...
Conference Paper
Full-text available
Current artificially distorted image quality assessment (IQA) databases are small in size and limited in content. Larger IQA databases that are diverse in content could benefit the development of deep learning for IQA. We create two datasets, the Konstanz Artificially Distorted Image quality Database (KADID-10k) and the Konstanz Artificially Distor...
Preprint
Full-text available
Current benchmarks for optical flow algorithms evaluate the estimation quality by comparing their predicted flow field with the ground truth, and additionally may compare interpolated frames, based on these predictions, with the correct frames from the actual image sequences. For the latter comparisons, objective measures such as mean square errors...
Conference Paper
Full-text available
Image quality has been studied almost exclusively as a global image property. It is common practice for IQA databases and metrics to quantify this abstract concept with a single number per image. We propose an approach to blind IQA based on a convolutional neural network (patchnet) that was trained on a novel set of 32,000 individually annotated pa...
Conference Paper
Full-text available
One of the main challenges in no-reference video quality assessment is temporal variation in a video. Methods typically were designed and tested on videos with artificial distortions, without considering spatial and temporal variations simultaneously. We propose a no-reference spatiotemporal feature combination model which extracts spatiotemporal i...
Conference Paper
Full-text available
We propose a screening approach to find reliable and effectively expert crowd workers in image quality assessment (IQA). Our method measures the users' ability to identify image degradations by using test questions, together with several relaxed reliability checks. We conduct multiple experiments, obtaining reproducible results with a high agreemen...
Article
Full-text available
The main challenge in applying state-of-the-art deep learning methods to predict image quality in-the-wild is the relatively small size of existing quality scored datasets. The reason for the lack of larger datasets is the massive resources required in generating diverse and publishable content. We present a new systematic and scalable approach to...
Conference Paper
Full-text available
Segmenting sensor events for activity recognition has many key challenges due to its unsupervised nature, the real-time requirements necessary for on-line event detection, and the possibility of having to recognise overlapping activities. A further challenge is to achieve robustness of classification due to sub-optimal choice of window size. In thi...
Conference Paper
Full-text available
Detecting abnormal events in video surveillance is a challenging problem due to the large scale, stream fashion video data as well as the real-time constraint. In this paper, we present an online, adaptive, and real-time framework to address this problem. The spatial locations in a frame are partitioned into grids; in each grid the proposed Adaptive...
Conference Paper
This paper presents a novel framework to detect shot boundaries based on the One-Class Support Vector Machine (OCSVM). Instead of comparing the difference between pair-wise consecutive frames at a specific time, we measure the divergence between two OCSVM classifiers, which are learnt from two contextual sets, i.e., immediate past set and immediat...
Book
Full-text available
There is an increasing interest in crowd scene analysis in video surveillance due to the ubiquitously deployed video surveillance systems in public places with high density of objects amid the increasing concern on public security and safety. A comprehensive crowd scene analysis approach is required to not only be able to recognize crowd events and...
Conference Paper
Full-text available
We propose a novel, online adaptive one-class support vector machines algorithm for anomaly detection in crowd scenes. Integrating incremental and decremental one-class support vector machines with a sliding buffer offers an efficient and effective scheme, which not only updates the model in an online fashion with low computational cost, but also d...
Conference Paper
Full-text available
Crowd scene analysis has caught significant attention both in academia and industry as it has a great number of potential applications. In this paper, we propose a novel spatial-temporal pyramid matching scheme for crowd scene analysis. Video segments are represented as concatenated histograms of all cells at all pyramid levels with corresponding w...
Conference Paper
Full-text available
We propose a new video manifold learning method for event recognition and anomaly detection in crowd scenes. A novel feature descriptor is proposed to encode regional optical flow features of video frames, where quantization and binarization of the feature code are employed to improve the differentiation of crowd motion patterns. Based on the new f...
Conference Paper
Full-text available
Using video manifold to analyze video scenes and detect possible anomaly has become a popular research topic in recent years. While a number of attempts have been proposed and reported promising outcomes, there is currently a lack of understanding about the parameter setting for various components in the algorithmic framework. In this paper we look...
