Revolutionizing Image Enhancement:
Kenoobi Image Upscaler - Unleashing the
Power of AI for Unprecedented
High-Quality Image Reconstruction
Prepared by Kenoobi AI
Allan Mukhwana, Evans Omondi, Diana Ambale, Macrine
Akinyi, Kelvin Mukhwana, Richard Omollo, Ben Chege
Abstract:
In this technical report, we introduce Kenoobi Image Upscaler, an AI model
designed to substantially enhance the quality of a wide range of images. The
model combines AI-based color correction, shallow and deep feature extraction,
and high-quality (HQ) image reconstruction. It is trained on a large dataset
of high-resolution images, and its performance is evaluated in a series of
experiments. The results show that Kenoobi Image Upscaler can restore and
enhance the details, colors, and textures of an image, producing noticeably
crisper and more vibrant output.
Introduction:
Image upscaling, also known as image enlargement or image interpolation,
refers to the process of increasing the resolution and size of an image while
maintaining or enhancing its visual quality. It is a challenging task because
traditional upscaling methods often produce unsatisfactory results, leading to
blurry, pixelated, or unrealistic images. However, recent advancements in
artificial intelligence (AI) and deep learning have revolutionized the field of
image upscaling, offering more sophisticated and effective solutions.
Kenoobi Image Upscaler is an AI model designed specifically to enhance image
quality well beyond what traditional interpolation methods achieve. By
leveraging state-of-the-art algorithms, Kenoobi Image Upscaler can
significantly improve the visual appearance of various types of images,
including old photographs, blurry screenshots, and pixelated graphics. The
model's primary objective is to restore and enhance the details, colors, and
textures of the input image, resulting in a high-quality output image that is
both visually appealing and realistic.
One of the key components of Kenoobi Image Upscaler is its ability to perform
shallow and deep feature extraction. Shallow feature extraction involves
capturing low-level details and textures, while deep feature extraction focuses on
extracting high-level semantic features from the image. By combining both
levels of features, Kenoobi Image Upscaler can generate an output image that
exhibits enhanced details, realistic colors, and improved overall visual quality.
The shallow feature extraction technique employs convolutional neural networks
(CNNs) to extract features at different levels of abstraction. These networks
analyze the image at a pixel level, capturing fine details, edges, and textures.
The extracted shallow features are then processed and enhanced using various
filters and algorithms, effectively improving the image's local characteristics.
On the other hand, the deep feature extraction technique utilizes deep neural
networks (DNNs) to capture high-level semantic features from the image. The
DNN architecture is designed to recognize complex patterns, shapes, and global
structures within the image. By leveraging the hierarchical representation
learned by the DNN, Kenoobi Image Upscaler can capture and incorporate the
global context information, resulting in a more accurate and visually pleasing
output.
To further enhance the image quality, Kenoobi Image Upscaler employs
high-quality (HQ) image reconstruction techniques. The model utilizes
generative adversarial networks (GANs), which consist of a generator network
and a discriminator network. The generator network synthesizes a
high-resolution image based on the extracted features, while the discriminator
network assesses the quality and realism of the generated image. Through an
adversarial training process, the generator network learns to create high-quality
and visually convincing output images that deceive the discriminator network.
In summary, Kenoobi Image Upscaler represents a significant advancement in
the field of image upscaling, leveraging AI algorithms, shallow and deep feature
extraction techniques, and high-quality image reconstruction. By capturing and
enhancing the details, colors, and textures of the input image, Kenoobi Image
Upscaler produces high-quality output images that are crisp, vibrant, and
visually stunning. The model's ability to go beyond traditional upscaling
methods makes it a powerful tool for improving the visual quality of various
types of images.
Color:
Color enhancement is a crucial aspect of image upscaling, as it plays a significant
role in creating visually appealing and realistic output images. Kenoobi Image
Upscaler incorporates a sophisticated color correction algorithm that focuses on
enhancing the color accuracy and distribution of the input image.
The color correction algorithm employed by Kenoobi Image Upscaler utilizes a
technique called color transfer. This technique aims to map the color distribution
of the input image to that of a reference image. The reference image is carefully
chosen based on its similarity to the input image in terms of content, style, or
desired color characteristics. By transferring the color distribution from the
reference image to the input image, Kenoobi Image Upscaler can create an
output image with a more natural and realistic color representation.
The process of color transfer involves several key steps:
1. Color Space Conversion: The input image and the reference image are both
transformed into a common working color space. A perceptually motivated space
such as CIELAB (lightness L*, plus the a* and b* chromaticity channels) is
usually preferred over RGB for color transfer, because its channels are less
correlated and can be adjusted more independently. This conversion ensures
that the color information of both images is represented consistently and can
be easily manipulated.
2. Color Distribution Analysis: The color distribution of the reference image is
analyzed to capture its statistical properties, such as color histograms or color
moments. These statistics provide insights into the color distribution patterns of
the reference image, which will later guide the color transfer process.
3. Color Mapping: The color mapping step involves mapping the colors of the
input image to the corresponding colors in the reference image. This mapping is
based on the statistical analysis performed on the reference image's color
distribution. Various techniques can be used for color mapping, such as
histogram matching, statistical modeling, or optimization-based approaches.
4. Color Transfer: Once the color mapping is determined, the color transfer
process takes place. The colors of the input image are adjusted based on the
mapped colors from the reference image. This adjustment aims to align the color
distribution of the input image with that of the reference image, ensuring a more
realistic and visually pleasing color representation.
5. Color Space Conversion (Inverse): After the color transfer is performed, the
modified image is converted back to its original color space. This step restores
the image to its original color representation while incorporating the enhanced
color distribution achieved through the color transfer process.
By applying the color correction algorithm, Kenoobi Image Upscaler can
effectively enhance the color accuracy and distribution of the input image. The
color transfer technique ensures that the output image's color representation
closely matches that of the reference image, resulting in a more natural and
visually appealing color appearance. This enhancement contributes significantly
to the overall quality of the upscaled image, providing a more immersive and
realistic visual experience.
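The report does not specify the exact color-mapping technique, so the
following is a minimal sketch of one common statistical approach: mean and
standard-deviation matching in the Lab color space, in the spirit of
Reinhard-style color transfer. The function name transfer_color and all
numerical details are illustrative assumptions, not the production
implementation.

import numpy as np
from skimage import color


def transfer_color(input_rgb, reference_rgb):
    """Map the color statistics of input_rgb onto those of reference_rgb.

    Both arguments are H x W x 3 uint8 RGB arrays; the result is uint8 RGB.
    Illustrative sketch only.
    """
    # Step 1: convert both images to the Lab color space.
    src = color.rgb2lab(input_rgb)
    ref = color.rgb2lab(reference_rgb)

    out = np.empty_like(src)
    for c in range(3):  # Steps 2-4: per-channel mean/std matching.
        src_mean, src_std = src[..., c].mean(), src[..., c].std() + 1e-8
        ref_mean, ref_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (src[..., c] - src_mean) / src_std * ref_std + ref_mean

    # Step 5: convert back to RGB and clip to the displayable range.
    return (np.clip(color.lab2rgb(out), 0.0, 1.0) * 255).astype(np.uint8)

Matching only the channel means and standard deviations is the simplest of
the mapping options listed above; histogram matching or optimization-based
mappings would follow the same five-step outline.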
Image Shallow Feature Extraction:
Kenoobi Image Upscaler incorporates a powerful shallow feature extraction
algorithm that focuses on capturing the low-level details and textures of the
input image. This algorithm leverages the capabilities of convolutional neural
networks (CNNs), a class of deep learning models well-suited for image analysis
tasks.
The shallow feature extraction process involves the following steps:
1. Convolutional Layers: The CNN architecture consists of multiple
convolutional layers, each comprising a set of learnable filters or kernels. These
filters convolve across the input image, capturing local patterns and features. At
each convolutional layer, the filters detect specific visual attributes, such as
edges, corners, and textures, by applying convolution operations to small
receptive fields.
2. Activation Functions: After each convolutional layer, nonlinear activation
functions are applied to introduce nonlinearity into the model. Common
activation functions used in CNNs include Rectified Linear Unit (ReLU), which
sets negative values to zero and preserves positive values. Activation functions
help in capturing complex and nonlinear relationships within the image, allowing
the model to learn more expressive representations.
3. Pooling Layers: Interspersed among the convolutional layers are pooling
layers, such as max pooling or average pooling. These layers downsample the
feature maps, reducing their spatial dimensions while preserving the most
salient information. Pooling helps in capturing the most discriminative features
and enhancing the model's ability to generalize and detect patterns across
different scales and translations.
4. Filter Processing: After passing through the convolutional and pooling
layers, the extracted features undergo further processing to enhance their
quality and reduce noise. This step typically involves applying a series of filters,
such as Gaussian filters or median filters, to smooth the feature maps and
remove unwanted artifacts or pixel-level noise.
5. Feature Combination: The processed features from different layers are
combined to create a comprehensive representation of the input image. This
fusion process integrates features captured at different levels of abstraction,
capturing both fine-grained details and broader structural information. By
combining features from multiple layers, the shallow feature extraction
algorithm can effectively capture and represent the intricate visual patterns and
textures present in the input image.
The output of the shallow feature extraction algorithm provides a rich
representation of the low-level details and textures present in the input image.
These features serve as a foundation for subsequent stages of the image
upscaling process, enabling Kenoobi Image Upscaler to reconstruct a
high-resolution image with enhanced details and textures.
By utilizing the power of CNNs and incorporating techniques such as activation
functions, pooling layers, and filter processing, Kenoobi Image Upscaler's
shallow feature extraction algorithm captures and amplifies the subtle visual
characteristics of the input image. This process contributes to the overall
improvement of the output image's quality, ensuring that the upscaled image
exhibits enhanced details and textures that were previously obscured or lost.
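As a concrete illustration, the sketch below shows a small PyTorch module in
the spirit of the shallow feature extraction described above: two
convolution/ReLU stages with max pooling, whose outputs are combined into a
single feature map. The layer counts, channel widths, and the use of
concatenation for feature combination are assumptions made for clarity; the
report does not disclose the actual architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ShallowFeatureExtractor(nn.Module):
    def __init__(self, in_channels: int = 3, base_channels: int = 32):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, base_channels, 3, padding=1)
        self.conv2 = nn.Conv2d(base_channels, base_channels * 2, 3, padding=1)
        self.pool = nn.MaxPool2d(2)  # downsample while keeping salient responses

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f1 = F.relu(self.conv1(x))               # fine edges and textures
        f2 = F.relu(self.conv2(self.pool(f1)))   # slightly broader local structure
        # Feature combination: bring f2 back to f1's resolution and concatenate.
        f2_up = F.interpolate(f2, size=f1.shape[-2:], mode="bilinear",
                              align_corners=False)
        return torch.cat([f1, f2_up], dim=1)     # (N, base_channels * 3, H, W)


# Hypothetical usage on a single RGB tensor of shape (1, 3, H, W):
# feats = ShallowFeatureExtractor()(torch.rand(1, 3, 64, 64))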
Image Deep Feature Extraction:
Kenoobi Image Upscaler's deep feature extraction algorithm is a crucial
component that focuses on capturing the high-level features of the input image.
This algorithm utilizes a deep neural network (DNN), which is specifically
designed to extract abstract and semantic representations from complex visual
data.
The deep feature extraction process involves several key steps:
1. Preprocessing: Prior to feeding the input image into the DNN, preprocessing
steps are applied to ensure optimal performance. These steps typically involve
resizing the image to a standard size and normalizing the pixel values;
during training, data augmentation techniques such as random rotations, flips,
or crops are also applied. Augmentation increases the diversity of training
samples and helps the model generalize to different variations of the input
image.
2. Convolutional Layers: The DNN architecture consists of multiple
convolutional layers. Each layer comprises a set of learnable filters or kernels
that convolve across the input image. The filters analyze the image at different
spatial scales, capturing complex visual patterns and structures. As the layers
progress deeper into the network, the filters become increasingly specialized,
detecting higher-level features and more abstract representations.
3. Activation Functions: Nonlinear activation functions, such as Rectified
Linear Unit (ReLU) or Leaky ReLU, are applied after each convolutional layer.
These functions introduce nonlinearity into the model, enabling it to capture
complex relationships and nonlinearities present in the input image. Activation
functions enhance the network's ability to learn discriminative features and
improve its representation power.
4. Pooling Layers: Pooling layers are interspersed among the convolutional
layers. These layers reduce the spatial dimensions of the feature maps while
retaining the most salient information. Common pooling operations include max
pooling or average pooling, which aggregate the local features and downsample
the feature maps. Pooling enhances the model's translation invariance, allowing
it to capture important features regardless of their exact position within the
image.
5. Fully Connected Layers: Towards the end of the DNN architecture, fully
connected layers are employed. These layers connect every neuron in the
previous layer to the subsequent layer, allowing the network to learn high-level
representations and capture global context information. Fully connected layers
consolidate the extracted features from earlier layers and prepare them for the
final stages of image reconstruction.
6. Feature Fusion: The deep features extracted from the convolutional layers
are combined with the shallow features captured by the shallow feature
extraction algorithm. This fusion process integrates both levels of features,
ensuring a comprehensive representation of the input image. It allows the model
to capture both fine-grained details and high-level semantics simultaneously,
resulting in a more holistic understanding of the image content.
By leveraging the power of deep neural networks, Kenoobi Image Upscaler's
deep feature extraction algorithm effectively captures and represents the
high-level features and semantic information present in the input image. The
resulting deep features are essential for subsequent stages of the image
upscaling process, contributing to the generation of a high-resolution image with
enhanced details, textures, and overall visual quality.
Through the combination of convolutional layers, activation functions, pooling
layers, and fully connected layers, the deep feature extraction algorithm enables
Kenoobi Image Upscaler to capture complex visual patterns, recognize intricate
structures, and incorporate global context information. This ultimately
contributes to the generation of visually stunning and realistic output images
that surpass the limitations of traditional upscaling methods.
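The sketch below illustrates, under the same caveat as before, how a deeper
convolutional stack with a small fully connected head for global context might
be written, together with a simple fusion step that concatenates shallow and
deep feature maps. Depth, channel counts, and the fusion rule are illustrative
assumptions only.

import torch
import torch.nn as nn
import torch.nn.functional as F


class DeepFeatureExtractor(nn.Module):
    def __init__(self, in_channels: int = 3, channels: int = 64, depth: int = 6):
        super().__init__()
        layers = [nn.Conv2d(in_channels, channels, 3, padding=1),
                  nn.ReLU(inplace=True)]
        for _ in range(depth - 1):
            layers += [nn.Conv2d(channels, channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)
        # A small fully connected head that summarises global context.
        self.global_fc = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feat = self.body(x)                      # (N, C, H, W) abstract features
        ctx = feat.mean(dim=(2, 3))              # global average pooling -> (N, C)
        ctx = torch.relu(self.global_fc(ctx))    # global context vector
        # Broadcast the global context back over the spatial grid.
        return feat + ctx[:, :, None, None]


def fuse_features(shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
    """Feature fusion: align spatial sizes, then concatenate along channels."""
    deep = F.interpolate(deep, size=shallow.shape[-2:], mode="bilinear",
                         align_corners=False)
    return torch.cat([shallow, deep], dim=1)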
High-Quality (HQ) Image Reconstruction:
Kenoobi Image Upscaler's high-quality image reconstruction algorithm is a
pivotal step in the image upscaling process. This algorithm aims to generate a
high-resolution image that exhibits enhanced details, colors, textures, and
overall visual quality. It achieves this through the utilization of a generative
adversarial network (GAN), which consists of two neural networks: a generator
network and a discriminator network.
The high-quality image reconstruction process encompasses the following steps:
1. Generator Network: The generator network is responsible for creating a
high-resolution image from the extracted features obtained through shallow and
deep feature extraction. This network takes the combined feature representation
as input and generates a plausible and visually appealing high-resolution image
as output. The generator network utilizes techniques such as upsampling,
deconvolutional (transposed-convolution) layers, and skip connections to
preserve and enhance the details present in the input features. (An
illustrative sketch of both the generator and the discriminator is given after
this list.)
2. Discriminator Network: The discriminator network plays a vital role in
evaluating the quality and realism of the generated high-resolution image. It
acts as an adversary to the generator network, attempting to differentiate
between the generated images and real high-resolution images from a training
dataset. The discriminator network is trained to classify images as either real or
fake. Its feedback is used to guide the generator network to generate
high-quality images that closely resemble the real images.
3. Adversarial Training: The generator and discriminator networks are trained
together in an adversarial manner. The generator network aims to generate
high-resolution images that can fool the discriminator network into believing
they are real. Simultaneously, the discriminator network is trained to become
more proficient at distinguishing between real and generated images. This
adversarial training process creates a competitive feedback loop where the
generator network strives to improve its image generation quality while the
discriminator network becomes more discerning.
4. Loss Functions: During the training process, loss functions are utilized to
quantify the discrepancy between the generated images and the real
high-resolution images. The generator network aims to minimize the loss
associated with the discriminator's ability to classify the generated images as
fake. Meanwhile, the discriminator network aims to minimize the loss associated
with incorrectly classifying real images or generated images. This interplay
between the two networks through the optimization of their respective loss
functions drives the improvement of the image generation process.
5. Iterative Training: The training of the generator and discriminator networks
is performed iteratively, with multiple passes over the training dataset. The
networks are updated in alternating fashion, where each network's weights are
adjusted based on the gradients obtained from the loss functions. This iterative
training process allows both networks to progressively improve their
capabilities, resulting in the generation of high-quality, realistic images.
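The sketch referenced in step 1 above gives one possible shape for the two
networks: a generator that refines a fused feature map through a
skip-connected body, upsamples it with a learned PixelShuffle stage, and emits
an RGB image, and a small patch-style discriminator that scores realism. All
layer sizes and the 4x upscaling factor are assumptions for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F


class Generator(nn.Module):
    def __init__(self, in_channels: int = 64, channels: int = 64, scale: int = 4):
        super().__init__()
        self.head = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        # Learned upsampling: expand channels, then rearrange them into space.
        self.upsample = nn.Sequential(
            nn.Conv2d(channels, channels * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale), nn.ReLU(inplace=True))
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.head(features))
        x = x + self.body(x)                      # skip connection preserves detail
        return torch.sigmoid(self.tail(self.upsample(x)))  # RGB output in [0, 1]


class Discriminator(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(channels, channels * 2, 3, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(channels * 2, 1, 3, padding=1))  # patch-level real/fake logits

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Average the patch logits to one realism score per image.
        return self.net(image).mean(dim=(2, 3))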
By leveraging the power of GANs, Kenoobi Image Upscaler's high-quality image
reconstruction algorithm is able to generate high-resolution images that closely
resemble the real images, while also enhancing the details, colors, and textures
present in the input features. The adversarial training process ensures that the
generated images exhibit improved visual quality, surpassing the limitations of
traditional upscaling methods.
Through the collaborative efforts of the generator and discriminator networks,
Kenoobi Image Upscaler can create visually compelling output images that are
both sharp and vibrant. The generator network learns to create high-quality
images that deceive the discriminator network, while the discriminator network
becomes more discerning in its evaluation. This dynamic interplay results in the
production of high-resolution images that capture and enhance the intricate
details, colors, and textures of the input image, providing a visually stunning
and realistic output.
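A single alternating training step of the kind described above might look like
the following sketch, using a standard binary cross-entropy GAN objective. The
helper name gan_step and the choice of loss are assumptions; the report does
not state the exact adversarial formulation used.

import torch
import torch.nn.functional as F


def gan_step(generator, discriminator, g_opt, d_opt, features, real_hr):
    # --- Discriminator update: real images -> 1, generated images -> 0. ---
    fake_hr = generator(features).detach()
    d_real = discriminator(real_hr)
    d_fake = discriminator(fake_hr)
    d_loss = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # --- Generator update: try to make the discriminator label fakes as real. ---
    fake_hr = generator(features)
    g_adv = discriminator(fake_hr)
    g_loss = F.binary_cross_entropy_with_logits(g_adv, torch.ones_like(g_adv))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

    return d_loss.item(), g_loss.item()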
Training Details:
Kenoobi Image Upscaler undergoes a meticulous training process that leverages
a diverse and extensive dataset of high-resolution images. The dataset comprises
a wide range of image types, including natural scenes, portraits, and graphics,
ensuring that the model learns to handle various image characteristics and
scenarios effectively.
The training procedure follows a supervised learning approach, where the model
is provided with pairs of input images and their corresponding high-resolution
counterparts. The pairs serve as training examples to guide the model in
learning the mapping between low-resolution and high-resolution images. By
utilizing a supervised learning framework, the model can learn to generate
high-quality images that closely resemble the ground truth high-resolution
images.
To optimize the model during training, a combination of loss functions is
employed. These loss functions capture different aspects of the image
reconstruction task and guide the training process:
1. Mean Squared Error (MSE) Loss: MSE loss is a common loss function used
in image upscaling tasks. It computes the pixel-wise difference between the
generated high-resolution image and the ground truth high-resolution image.
Minimizing the MSE loss encourages the model to generate output images that
closely match the ground truth pixel values.
2. Perceptual Loss: Perceptual loss is based on the idea that the perception of
image quality is not solely dependent on pixel-level differences but also on
higher-level features and structures. It measures the perceptual similarity
between the generated image and the ground truth image by comparing their
high-level features extracted from intermediate layers of a pre-trained deep
neural network, such as a VGG network. Incorporating perceptual loss
encourages the model to focus on capturing the overall structure, textures, and
details present in the ground truth image.
3. Adversarial Loss: Adversarial loss is a key component of GAN-based
training. It involves training the discriminator network to distinguish between
real high-resolution images and generated high-resolution images, while
simultaneously training the generator network to fool the discriminator. The
adversarial loss guides the generator network to produce more realistic and
visually appealing high-resolution images by learning to generate images that
are indistinguishable from real images according to the discriminator network.
During the training process, the loss functions are combined and optimized
jointly to guide the model towards generating high-quality output images with
enhanced details, colors, and textures.
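A hedged sketch of how the three terms could be combined is shown below:
pixel-wise MSE, a perceptual term computed on frozen VGG-19 features, and the
adversarial term. The loss weights, the VGG variant and layer cut-off, and the
assumption that inputs are already normalized for VGG are illustrative
choices, not values reported by the authors.

import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

# Frozen ImageNet-pretrained VGG-19 front end used only for the perceptual term
# (inputs are assumed to be normalized the way VGG expects).
_vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)


def upscaler_loss(fake_hr, real_hr, d_fake_logits,
                  w_mse=1.0, w_perc=0.1, w_adv=1e-3):
    mse = F.mse_loss(fake_hr, real_hr)                     # pixel fidelity
    perceptual = F.mse_loss(_vgg(fake_hr), _vgg(real_hr))  # feature similarity
    adversarial = F.binary_cross_entropy_with_logits(      # fool the discriminator
        d_fake_logits, torch.ones_like(d_fake_logits))
    return w_mse * mse + w_perc * perceptual + w_adv * adversarial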
To facilitate efficient training, Kenoobi Image Upscaler utilizes GPU
acceleration. GPUs (Graphics Processing Units) offer parallel computing
capabilities that expedite the training process by performing numerous
computations simultaneously. The parallel processing power of GPUs enables
faster forward and backward propagation of the neural network, significantly
reducing the training time.
The training process typically involves multiple epochs, with each epoch
representing a complete pass over the training dataset. The model is updated
iteratively by backpropagating the gradients through the network and adjusting
the network's weights and parameters using optimization algorithms such as
stochastic gradient descent (SGD) or its variants. The iterative nature of
training allows the model to progressively improve its ability to upscale images
and generate high-quality outputs.
By utilizing a large and diverse dataset, employing a supervised learning
approach, incorporating multiple loss functions, and leveraging GPU
acceleration, Kenoobi Image Upscaler is trained to learn the intricate mapping
between low-resolution and high-resolution images, enabling it to generate
visually stunning and realistic outputs that surpass the limitations of traditional
upscaling methods.
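Putting the pieces together, an outer training loop with GPU placement and
per-epoch alternating updates could look like the sketch below. It reuses the
Generator, Discriminator, and gan_step sketches given earlier; the synthetic
random tensors stand in for a real paired low-/high-resolution dataset, and
the raw low-resolution batch stands in for the fused shallow/deep feature
tensor that the full system would feed the generator. Batch size, learning
rate, and epoch count are placeholders.

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator = Generator(in_channels=3).to(device)
discriminator = Discriminator().to(device)
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)  # an SGD variant
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

for epoch in range(10):                  # one complete pass per epoch
    for _ in range(100):                 # placeholder number of batches
        lr_batch = torch.rand(8, 3, 32, 32, device=device)    # stand-in LR inputs
        hr_batch = torch.rand(8, 3, 128, 128, device=device)  # stand-in HR targets
        d_loss, g_loss = gan_step(generator, discriminator,
                                  g_opt, d_opt, lr_batch, hr_batch)
    print(f"epoch {epoch}: d_loss={d_loss:.3f} g_loss={g_loss:.3f}")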
Experiments:
To evaluate the performance of Kenoobi Image Upscaler, we conducted various
experiments on different types of images. The experiments include comparing
the quality of the output images with the input images and comparing the
performance of Kenoobi Image Upscaler with other state-of-the-art image
upscaling models.
We first tested the model on natural images, including landscape, cityscape, and
wildlife photos. The results showed that Kenoobi Image Upscaler was able to
enhance the details, colors, and textures of the images, resulting in a
significantly improved output image quality. The output images were sharper,
with more natural-looking colors and textures.
We then tested the model on portrait images, including photos of people and
animals. The results showed that Kenoobi Image Upscaler was able to enhance
the fine details, such as facial features and hair textures, resulting in a more
realistic and high-quality output image. The model was also able to preserve the
natural skin tones and colors, resulting in a more natural-looking output image.
Next, we tested the model on graphics, including images with text, logos, and
illustrations. The results showed that Kenoobi Image Upscaler was able to
enhance the details and textures of the graphics, resulting in a more vibrant and
sharp output image. The model was also able to preserve the sharp edges and
lines of the graphics, resulting in a high-quality output image.
Finally, we compared the performance of Kenoobi Image Upscaler with other
state-of-the-art image upscaling models, including Waifu2x and ESRGAN. The
results showed that Kenoobi Image Upscaler outperformed the other models in
terms of output image quality and processing time.
Conclusion:
In conclusion, Kenoobi Image Upscaler is an AI model that substantially
enhances the quality of a wide range of images. The model combines AI-based
color correction, shallow and deep feature extraction, and high-quality image
reconstruction. It is trained on a large dataset of high-resolution images,
and its performance is evaluated in a series of experiments. The results show
that Kenoobi Image Upscaler can restore and enhance the details, colors, and
textures of an image, producing noticeably crisper and more vibrant output.
Limitations:
Despite its impressive performance, Kenoobi Image Upscaler has some
limitations. The model requires a significant amount of computational power,
and processing large images can be time-consuming. Additionally, the model's
performance may vary depending on the quality and type of the input image.
Finally, the model may not be able to restore details that are completely lost or
obscured in the input image.