Unleashing Creativity: Kenoobi Artworx AI - A Revolutionary Approach to Generating Personalized Art with AI
Prepared by Kenoobi AI
Allan Mukhwana, Evans Omondi, Diana Ambale, Macrine Akinyi, Kelvin Mukhwana, Richard Omollo, Ben Chege
Abstract:
Kenoobi Artworx AI is an AI model that generates unique art pieces based on user input. It uses AI techniques such as a variational autoencoder and a diffusion noise schedule to create images that are owned by the user and can be used for NFT projects or shared on social media. The app offers different models, styles, and aspect ratios, and regular updates are provided to improve the AI methods. In this technical report, we provide details on the training process, image generation, experiments, limitations, and potential technology misuse.
Introduction:
Kenoobi Artworx AI is an innovative solution that combines AI and art to create unique pieces. The app uses natural language processing (NLP) to interpret a user's description of the artwork they want to see and generates an image to match. The generated art is owned by the user, and they can use it for various purposes, including NFT projects, prints, and social media sharing.
Training Details:
The training process of Kenoobi Artworx AI involved two key components: classifier guidance and forward/reverse diffusion. Classifier guidance played
a crucial role in enabling the AI to understand and learn different styles of art.
During the training phase, a large dataset of art images was used, encompassing
various styles, genres, and artists. Each image in the dataset was manually
classified with its corresponding art style, providing the AI with labeled
examples to learn from. This classification was done by art experts and curators
who possessed in-depth knowledge of different artistic styles. The labeled
dataset served as a valuable resource for the AI model to grasp the distinctive
characteristics and nuances of each art style.
Classifier guidance involved training a separate neural network model, known as
the style classifier. This style classifier was trained to accurately predict the art
style of a given input image. The style classifier learned to recognize and
differentiate between different styles based on features such as color palette,
brushstroke patterns, composition, and subject matter.
The output of the style classifier was then used as a guidance signal during the
training of the main Kenoobi Artworx AI model. By incorporating this classifier
guidance, the AI model was able to learn how to generate art that aligned with
the desired style specified by the user. The guidance signal provided a valuable
reference for the AI to understand and mimic the specific attributes and
characteristics associated with different art styles.
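To make the mechanism concrete, the sketch below shows how a style classifier's gradient can steer a diffusion denoiser at sampling time, following the standard classifier-guidance formulation. The report does not publish Kenoobi Artworx AI's code, so the interfaces (`denoiser`, `style_classifier`) are hypothetical stand-ins.

```python
# Minimal sketch of classifier guidance at sampling time. The `denoiser`
# and `style_classifier` interfaces are assumptions, not the production code.
import torch

def classifier_guided_mean(x_t, t, target_style, denoiser, style_classifier,
                           scale=2.0):
    # Unguided reverse-diffusion mean predicted by the main model.
    mean = denoiser(x_t, t)

    # Gradient of the classifier's log-probability for the requested style,
    # taken with respect to the noisy latent x_t.
    with torch.enable_grad():
        x_in = x_t.detach().requires_grad_(True)
        logits = style_classifier(x_in, t)
        log_prob = torch.log_softmax(logits, dim=-1)
        selected = log_prob[torch.arange(x_t.shape[0]), target_style]
        grad = torch.autograd.grad(selected.sum(), x_in)[0]

    # Nudge the mean in the direction that raises the classifier's
    # confidence in the target style; `scale` controls adherence strength.
    return mean + scale * grad
```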
In addition to classifier guidance, forward and reverse diffusion was employed to enhance the AI's ability to generate high-quality images. This method involved the controlled addition and removal of noise in the model's latent space over multiple iterations.
During the forward diffusion process, noise was injected into the latent space of
the AI model. The noise served as a random perturbation that introduced
variations into the generated images. As the training progressed, the AI
gradually learned to control and manipulate this noise, resulting in images with
diverse and unique visual characteristics.
The reverse diffusion process complemented the forward diffusion by
reconstructing the original image from the noisy representation. This
reconstruction step helped refine the generated images and reduce the noise
introduced during the forward diffusion. Through this iterative process of
diffusion and reconstruction, the AI model gradually improved its ability to
generate high-quality and visually appealing images.
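As a concrete illustration, the following sketch implements a DDPM-style forward (noising) step and reverse (denoising) step. The schedule values and tensor shapes are illustrative assumptions, not the report's actual configuration.

```python
# Sketch of forward and reverse diffusion steps; all hyperparameters
# (T, beta range) are illustrative defaults.
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # per-step noise variances
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)     # cumulative signal retention

def forward_diffuse(x0, t):
    """Jump from a clean latent x0 directly to its noisy version x_t."""
    noise = torch.randn_like(x0)
    a = alpha_bar[t].sqrt().view(-1, 1, 1, 1)
    s = (1 - alpha_bar[t]).sqrt().view(-1, 1, 1, 1)
    return a * x0 + s * noise, noise

def reverse_step(x_t, t, predicted_noise):
    """Estimate x_{t-1} from x_t given the model's noise prediction.

    t is a scalar step index here; the posterior mean follows the DDPM
    formula, with sqrt(beta_t) as a common choice of sampling variance.
    """
    coef = betas[t] / (1 - alpha_bar[t]).sqrt()
    mean = (x_t - coef * predicted_noise) / alphas[t].sqrt()
    if t == 0:
        return mean
    return mean + betas[t].sqrt() * torch.randn_like(x_t)
```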
The training process involved numerous iterations, with the AI model
continuously adjusting its parameters and learning from the labeled dataset and
the noise diffusion techniques. The model was trained using powerful
computational resources to handle the complexity of the training data and
optimize its performance.
By combining the insights gained from classifier guidance and the refinement
achieved through forward and reverse diffusion, Kenoobi Artworx AI was able
to develop an understanding of various art styles and generate high-quality art
that closely resembled the specified style in user input.
Image Generation:
The image generation process of Kenoobi Artworx AI involved several techniques, including a variational autoencoder, a noise schedule, and text conditioning. These techniques played vital roles in enabling the AI to generate unique and visually appealing images based on user input.
The variational autoencoder (VAE) was a key component of the image
generation process. It is a type of neural network architecture that learns how to
encode and decode images. During the training phase, the VAE was trained on a
diverse dataset of art images, allowing it to learn the underlying patterns and
structures present in the data. The VAE's encoder network learned to map the
input images into a low-dimensional latent space representation, capturing the
essential features and characteristics of the image. This latent representation
was then fed into the decoder network, which reconstructed the image based on
the learned representation. By utilizing the VAE, Kenoobi Artworx AI was able
to generate new images by sampling from the latent space and decoding the
samples into visually coherent artworks.
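For readers unfamiliar with the architecture, here is a minimal convolutional VAE in PyTorch illustrating the encode, sample, decode cycle described above. The layer sizes and the 3x64x64 input shape are illustrative assumptions; the production model's architecture is not disclosed in this report.

```python
# Minimal VAE sketch: encoder -> (mu, logvar) -> reparameterized sample
# -> decoder. Sizes assume 3x64x64 inputs and are purely illustrative.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
        )
        self.to_mu = nn.Linear(64 * 16 * 16, latent_dim)
        self.to_logvar = nn.Linear(64 * 16 * 16, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

# Generating new images: sample z from the prior and decode it.
vae = TinyVAE()
z = torch.randn(4, 128)
images = vae.decoder(z)   # four 3x64x64 samples
```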
The noise schedule technique was employed to aid the AI in generating
high-quality images. This technique involved gradually increasing the noise level
during the image generation process. At the initial stages of image generation, a
small amount of noise was introduced into the latent space, which resulted in relatively smooth, less detailed images. As the generation progressed, the
noise level was systematically increased, allowing the AI to explore more diverse
and intricate visual patterns. This incremental noise schedule contributed to the
creation of visually appealing and aesthetically rich art pieces.
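The report does not specify which schedule is used, so the sketch below shows two common choices, a linear beta schedule and a cosine schedule, either of which produces the gradual increase in noise level across diffusion steps described above.

```python
# Two common noise schedules; which one Kenoobi Artworx AI uses is not
# specified in the report, so this is purely illustrative.
import math
import torch

def linear_beta_schedule(T=1000, start=1e-4, end=0.02):
    """Noise variance grows linearly across the T diffusion steps."""
    return torch.linspace(start, end, T)

def cosine_alpha_bar(T=1000, s=0.008):
    """Cosine schedule: cumulative signal retention alpha_bar decays
    smoothly from ~1 to ~0 over the steps."""
    steps = torch.arange(T + 1) / T
    f = torch.cos((steps + s) / (1 + s) * math.pi / 2) ** 2
    return f / f[0]

betas = linear_beta_schedule()
print(betas[0].item(), betas[-1].item())  # noise level rises over the steps
```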
Text conditioning played a crucial role in guiding the AI to generate images
based on user input. Users could describe their desired artwork using natural
language, and the AI model was trained to understand and interpret these
textual descriptions. Through text conditioning, the AI learned to associate
textual input with specific visual features and artistic styles. By conditioning the
image generation process on user-provided text, Kenoobi Artworx AI could
generate images that aligned with the user's intentions and preferences.
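As an illustration of text conditioning, the snippet below uses a frozen CLIP text encoder from Hugging Face `transformers` as a stand-in; the report does not name the text encoder Kenoobi Artworx AI actually uses. The prompt becomes a sequence of embeddings that the image model can attend to at every generation step.

```python
# Text conditioning sketch using a public CLIP text encoder as a stand-in
# for whatever encoder the production system uses.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")

prompt = "a misty harbor at dawn, impressionist oil painting"
tokens = tokenizer(prompt, padding="max_length", truncation=True,
                   return_tensors="pt")
with torch.no_grad():
    # Shape (1, 77, 512): one embedding per prompt token.
    text_embeddings = text_encoder(**tokens).last_hidden_state

# During generation the denoiser would attend to these embeddings at each
# step, e.g. noise_pred = denoiser(x_t, t, context=text_embeddings).
```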
During the image generation process, the AI model combined the learned
representations from the VAE, the noise schedule technique, and the text
conditioning to produce unique and personalized artworks. The model leveraged
the encoded latent representations to generate images that encompassed the
desired artistic style and visual characteristics specified by the user's input. The
gradual increase in noise allowed for exploration of diverse visual patterns,
resulting in captivating and varied art pieces. Text conditioning ensured that the
generated images were coherent with the user's textual descriptions.
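Putting the pieces together, a generation pipeline of the kind described here might look like the following control-flow sketch: encode the prompt, run the reverse diffusion loop in latent space, then decode the final latent with the VAE decoder. All component names (`denoiser`, `vae`, `encode_prompt`) are placeholders for models defined elsewhere; only the control flow is shown.

```python
# End-to-end control-flow sketch; component interfaces are assumptions.
import torch

@torch.no_grad()
def generate(prompt, denoiser, vae, encode_prompt, T=1000,
             shape=(1, 4, 64, 64)):
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)

    context = encode_prompt(prompt)        # text conditioning
    x = torch.randn(shape)                 # start from pure noise

    for t in reversed(range(T)):           # reverse diffusion loop
        eps = denoiser(x, t, context)      # predicted noise at step t
        mean = (x - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        x = mean if t == 0 else mean + betas[t].sqrt() * torch.randn_like(x)

    return vae.decode(x)                   # final latent -> image
```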
Throughout the training and development of Kenoobi Artworx AI, these image
generation techniques were refined and optimized, resulting in an AI model
capable of generating high-quality art pieces that reflected the user's desires and
preferences.
Experiments:
To assess the effectiveness of Kenoobi Artworx AI in generating unique art pieces, a series of experiments was conducted. These experiments aimed to
evaluate the quality of the generated images based on user input, as well as
explore the app's versatility in offering various models, styles, and aspect ratios
to meet users' specific needs.
In the experiments, a diverse group of users, including artists, art enthusiasts, and general users, was invited to interact with the Kenoobi Artworx AI app.
Participants were provided with a user-friendly interface where they could input
their desired art descriptions using natural language. The app then generated
images based on these descriptions.
The generated images were evaluated based on several criteria, including visual
quality, fidelity to the specified style, creativity, and uniqueness. A panel of
experts, comprising artists, curators, and AI specialists, was involved in the
evaluation process. They assessed the images for their artistic merit, technical
proficiency, and adherence to the desired style.
The experiments demonstrated that Kenoobi Artworx AI was able to generate
high-quality images that closely matched the user's input. The AI successfully
captured the desired styles and artistic characteristics specified by the users,
producing visually compelling and unique art pieces. The generated images
exhibited a range of artistic styles, from classical to contemporary, abstract to
realistic, and more, showcasing the app's ability to cater to diverse artistic
preferences.
Furthermore, the experiments highlighted the versatility of Kenoobi Artworx AI
in terms of offering various models, styles, and aspect ratios. Users had the
flexibility to choose different models, each trained on specific artistic genres or
periods, allowing them to explore and experiment with different artistic styles.
The app also provided options for different aspect ratios, enabling users to
generate images suitable for different mediums and platforms, including NFT
projects, printing, and social media sharing.
The feedback received from the participants during the experiments was
positive, indicating their satisfaction with the generated art pieces. Users
expressed appreciation for the app's ability to generate personalized and unique
images based on their descriptions. The experiments provided valuable insights
into the strengths of Kenoobi Artworx AI, highlighting its potential as a
powerful tool for artists, designers, and art enthusiasts alike.
Based on the experimental results and user feedback, Kenoobi Artworx AI
continued to undergo iterative improvements. Regular updates were introduced
to refine the AI model, expand the available artistic styles and models, and
enhance the overall user experience. These updates aimed to address any
limitations identified during the experiments and further optimize the app's
ability to generate high-quality, user-driven art.
In conclusion, the experiments conducted with Kenoobi Artworx AI
demonstrated its effectiveness in generating unique and high-quality art pieces
based on user input. The app's ability to offer various models, styles, and aspect
ratios provided users with the opportunity to explore and create personalized
images that met their specific artistic requirements. The positive feedback
received from participants affirmed the app's potential as a valuable tool in the
creation of digital art in a user-friendly and accessible manner.
Conclusion:
Kenoobi Artworx AI combines AI and art to create unique pieces, using natural language processing (NLP) to turn user descriptions into images. The generated art is owned by the user, who can use it for various purposes, including NFT projects, prints, and social media sharing. The AI methods underlying the app, such as the variational autoencoder and the diffusion noise schedule, enable it to generate high-quality images that meet the user's needs.
Limitations:
While Kenoobi Artworx AI is a powerful tool for generating unique art pieces, it
does have some limitations. The app's effectiveness is dependent on the quality
of the user's input. If the user's input is unclear or too general, the AI may not
generate an image that meets their needs. Additionally, the app's effectiveness is limited by the coverage and diversity of the data used to train the AI: styles that are poorly represented in the training set will be reproduced less faithfully.
Technology Misuse:
Kenoobi Artworx AI is a powerful tool, and like any generative system it can be misused. Users must ensure that they have the rights to the images they
generate and use them in a legal and ethical manner. The app should not be used
to generate images that violate intellectual property laws or promote hate speech
or violence. The misuse of Kenoobi Artworx AI could result in legal action or
reputational damage to the user or the company.