Pankoo Kim’s research while affiliated with Chosun University and other places




Publications (210)


A Study on Webtoon Generation Using CLIP and Diffusion Models
  • Article
  • Full-text available

September 2023 · 93 Reads

Electronics

Hyoungju Kim · Jeongin Kim · [...] · Pankoo Kim

This study focuses on harnessing deep-learning-based text-to-image techniques to support webtoon creators. We converted publicly available datasets (e.g., MSCOCO) into a multimodal webtoon dataset using CartoonGAN. First, the dataset was used to train contrastive language-image pre-training (CLIP), a model composed of multilingual BERT and a Vision Transformer that learned to associate text with images. Second, a pre-trained diffusion model was employed to generate webtoons from text and text-similar image input. The webtoon dataset comprised treatments (i.e., textual descriptions) paired with their corresponding webtoon illustrations. Through contrastive learning, CLIP extracted features from the different data modalities and pulled similar data closer together within a shared feature space while pushing dissimilar data apart, thereby learning the relationships between the modalities. To generate webtoons, the CLIP features of the desired webtoon’s text were provided, together with those of the most text-similar image, to the pre-trained diffusion model. Experiments were conducted using both single- and continuous-text inputs, and the results showed an inception score of 7.14 when using continuous-text inputs. The text-to-image technology developed here could streamline the webtoon creation process for artists by enabling the efficient generation of webtoons from the provided text. However, the model currently cannot generate webtoons from multiple sentences or images while maintaining a consistent artistic style. Therefore, further research is needed to develop a text-to-image model capable of handling multi-sentence and multilingual input while ensuring a coherent artistic style across the generated webtoon images.
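The contrastive alignment described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the commonly used symmetric cross-entropy form of the CLIP objective, and the function names, the temperature value of 0.07, and the toy two-dimensional features are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(x - top) for x in row))
    return [x - log_sum for x in row]

def clip_contrastive_loss(text_feats, image_feats, temperature=0.07):
    """Symmetric contrastive loss; text i and image i are assumed to be a matching pair."""
    texts = [l2_normalize(t) for t in text_feats]
    images = [l2_normalize(im) for im in image_feats]
    n = len(texts)
    # logits[i][j] = cosine similarity of text i and image j, sharpened by the temperature
    logits = [[sum(a * b for a, b in zip(t, im)) / temperature for im in images]
              for t in texts]
    # Text-to-image direction: each row should put its mass on the diagonal entry.
    loss_t2i = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Image-to-text direction: same idea applied to the columns.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_i2t = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return (loss_t2i + loss_i2t) / 2

# Toy features (hypothetical): matching pairs point the same way, swapped pairs do not.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls each matching text-image pair together and pushes non-matching pairs apart, which is why the aligned toy batch scores a much lower loss than the swapped one.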


A Study on the Generation of Webtoons through Fine-Tuning of Diffusion Models

August 2023 · 13 Reads

Korean Institute of Smart Media

This study proposes a method to assist webtoon artists by using a pretrained text-to-image model to generate webtoon images from text. The proposed approach fine-tunes a pretrained Stable Diffusion model on a dataset converted into the desired webtoon style. Fine-tuning with the LoRA technique completes in approximately 4.5 hours for 30,000 steps. The generated images render shapes and backgrounds according to the input text, producing webtoon-like images. Furthermore, quantitative evaluation using the Inception score shows that the proposed method outperforms DCGAN-based text-to-image models. If webtoon artists adopt the proposed text-to-image model, it is expected to significantly reduce the time required for the creative process.
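The short training time reported for LoRA follows from how few parameters it trains. The sketch below shows the general LoRA idea (a frozen weight matrix plus a scaled low-rank update), not the configuration used in this study: the rank, the scaling factor `alpha`, the 768 x 768 layer size, and all function names are illustrative assumptions.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def lora_effective_weight(w_frozen, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * B @ A, where only A and B would be trained.

    w_frozen: d_out x d_in pretrained weights (kept fixed)
    lora_a:   rank x d_in matrix
    lora_b:   d_out x rank matrix (conventionally initialized to zeros)
    """
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_frozen, delta)]

def parameter_counts(d_out, d_in, rank):
    """Compare full fine-tuning with LoRA in number of trainable parameters."""
    return d_out * d_in, rank * (d_in + d_out)

# Hypothetical 2 x 2 layer with a rank-1 adapter.
w = [[1.0, 2.0], [3.0, 4.0]]
a = [[0.5, -0.5]]          # rank x d_in
b_zero = [[0.0], [0.0]]    # d_out x rank, zero-initialized
b_trained = [[1.0], [2.0]]

at_start = lora_effective_weight(w, a, b_zero, alpha=1.0, rank=1)
after_update = lora_effective_weight(w, a, b_trained, alpha=1.0, rank=1)
full, low_rank = parameter_counts(768, 768, rank=4)
```

With a zero-initialized `B`, the adapted layer starts out identical to the pretrained one, and for the assumed 768 x 768 layer a rank-4 adapter trains 6,144 values instead of 589,824.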




A Study on Generating Webtoons Using Multilingual Text-to-Image Models

June 2023 · 111 Reads · 2 Citations

Applied Sciences

Text-to-image technology enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology extracts features from the input text, combines them with noise, and feeds the result to a GAN, which generates images similar to the original images through competition between the generator and discriminator. Although images have been extensively generated from English text, text-to-image technology for other languages, such as Korean, is still at a developmental stage. Webtoons are a digital comic format for viewing comics online. The webtoon creation process involves story planning, content/sketching, coloring, and background drawing, all of which require human intervention, making it time-consuming and expensive. Therefore, this study proposes a multilingual text-to-image model capable of generating webtoon images when presented with multilingual input text. The proposed model employs multilingual BERT to extract feature vectors for multiple languages and trains a DCGAN in conjunction with the images. The experimental results demonstrate that the model can generate images similar to the original images when presented with multilingual input text after training. The evaluation metrics further support these findings, as the generated images achieved an Inception score of 4.99 and an FID score of 22.21.
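For readers unfamiliar with the Inception score quoted above, the following sketch computes it from per-image class probabilities using the standard definition, the exponential of the mean KL divergence between each image's predicted class distribution and the marginal distribution. In practice the probabilities come from a pretrained Inception classifier; here they are hypothetical toy values, and the function name is illustrative.

```python
import math

def inception_score(class_probs):
    """exp(mean over images of KL(p(y|x) || p(y))) from rows of class probabilities."""
    n = len(class_probs)
    k = len(class_probs[0])
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = [sum(row[c] for row in class_probs) / n for c in range(k)]
    total_kl = 0.0
    for row in class_probs:
        # Terms with zero probability contribute nothing to the KL divergence.
        total_kl += sum(p * math.log(p / marginal[c])
                        for c, p in enumerate(row) if p > 0)
    return math.exp(total_kl / n)

# Confident and diverse predictions give the maximum score (the number of classes).
confident_diverse = inception_score([[1.0, 0.0], [0.0, 1.0]])
# Completely uncertain predictions give the minimum score of 1.
uncertain = inception_score([[0.5, 0.5], [0.5, 0.5]])
```

A higher score therefore indicates images that are individually recognizable and collectively varied, which is the sense in which the scores in these abstracts are compared.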


A Study on Generating Webtoons using Multilingual Text-to-Image Models

April 2023 · 152 Reads · 1 Citation

Recent advances in deep learning technology have led to increased interest in text-to-image technology, which enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology extracts features from the input text, combines them with noise, and feeds the result to a GAN that generates images similar to the original images through competition between the generator and discriminator. Although generating images from English text is a mature area of research, text-to-image technology for other languages, such as Korean, is still in its early stages of development. A webtoon is a digital comic format that allows comics to be viewed online. The creation process for webtoons is divided into story planning, content/sketching, coloring, and background drawing. Since each stage of webtoon production requires human intervention, it is both time-consuming and expensive. As a result, deep learning technologies such as automatic coloring and automatic line drawing are being used to reduce human involvement. However, there is a shortage of technology that can assist authors with story creation in webtoon production. Therefore, this study proposes a multilingual text-to-image model capable of generating webtoon images when presented with multilingual input text. The proposed model employs Multilingual BERT to extract feature vectors for multiple languages and trains a DCGAN in conjunction with the images. The experimental results demonstrate that the model can generate images that are similar to the original images when presented with multilingual input text after training.


Figures: Process of the proposed business-location recommendation system; Architecture of (a) ResNet, (b) Inception v3 module, (c) EfficientNet B0, and (d) MBConv.

Deep Learning-Based Business Recommendation System in Intelligent Vehicles

April 2023 · 78 Reads

Mobile Information Systems

Advances in intelligent vehicle technologies, unlike those in conventional automobiles, are facilitating the growth of information technology (IT) platforms. In-vehicle infotainment (IVI) is becoming an appealing element of intelligent vehicles because it offers users various experiences; however, it requires personalized services to provide more sophisticated user experiences. Suppose that passengers search for businesses offering products or services they found interesting in videos played via the IVI while the vehicle is driving autonomously. In that case, using images that express the user’s preference as the search query could be more effective than using text such as product names. Accordingly, this study proposes a recommendation system that informs users of businesses near an intelligent vehicle when a passenger inputs an image of a product or service into an IVI system. The proposed system trains deep-learning-based image classification models on users’ interest images to classify their category, measures the similarity between that category and business categories using Word2vec, and finally provides the locations of highly similar businesses via the IVI, using a smartphone. The experimental results indicated that the EfficientNet B0 model classified the category of users’ interest images with 85% accuracy, while the Word2vec similarity between image and business categories was particularly high for business categories similar to the actual image category.
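The similarity step in this pipeline can be sketched as a cosine-similarity ranking over word vectors. The example assumes cosine similarity is the measure applied to the Word2vec embeddings, which the abstract does not state; the three-dimensional vectors, the category names, and the function names are hypothetical stand-ins for real embeddings and a real classifier output.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_business_categories(image_category_vec, business_vecs):
    """Sort business categories from most to least similar to the image category."""
    scored = [(name, cosine_similarity(image_category_vec, vec))
              for name, vec in business_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy embeddings (hypothetical); a real system would look these up in a Word2vec model.
toy_business_vecs = {
    "cafe": [0.9, 0.1, 0.0],
    "bookstore": [0.1, 0.9, 0.2],
    "gas_station": [0.0, 0.2, 0.9],
}
# Embedding of the category predicted for the passenger's image, e.g. "coffee".
predicted_category_vec = [0.8, 0.2, 0.1]

ranking = rank_business_categories(predicted_category_vec, toy_business_vecs)
top_category = ranking[0][0]
```

The top-ranked categories would then be used to look up nearby businesses, which is the part of the system delivered through the IVI and smartphone.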


Restoration of Dimensions for Ancient Drawing Recognition

September 2021 · 66 Reads · 8 Citations

Electronics

This study aims to investigate and determine the actual size of the “cheok” scale (the traditional weights and measures of Korea) to aid in data construction for the recognition of ancient drawings in the field of artificial intelligence. The cheok scale can be divided into Yeongjocheok, Jucheok, Pobaekcheok, and Joryegicheok. This study calculated the actual dimensions used in the drawings of Tonga and Eonjo contained in Jaseungcha Dohae by Gyunam Ha BaeckWon, which helped us analyze the scale used in the southern region of Korea in the 1800s. The scales of 1/15 cheok and 1/10 cheok were used in the Tonga and Eonjo sections of Jaseungcha Dohae, and the actual dimensions in the drawings were converted to the scale used at the time. As a result of the conversion, the dimensions corresponded to 30.658 cm per cheok for the drawings of Tonga and ~31.84 cm per cheok for Eonjo. In this manner, the actual dimensions used in the southern region of Korea around the year 1800 were restored. Through this study, reference values for the recognition of Korean machinery drawings from around 1800 were derived.


Gradient descent for quadratic functions using geometric mean and the Kai Fang method

September 2021 · 16 Reads

Concurrency and Computation: Practice and Experience

The geometric mean is typically used to average inflation rates and population fluctuations. It is also used in the description and analysis of singularities and geometric distance spaces. Gradient descent is an integral part of artificial intelligence. In this study, we transform the gradient calculation of conventional quadratic gradient descent algorithms into a root extraction calculation using geometric means. To eliminate the computational complexity of differential operations in gradient calculation and to calculate roots easily using only fundamental arithmetic operations, we introduce the Kai Fang method, the traditional East Asian root extraction method. To do this, we propose a new quadratic gradient descent method based on geometric means, and we apply the Kai Fang method with geometric means to create an improved quadratic gradient descent method. The proposed method shows improved computational ease over conventional methods.
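To make the idea of extracting roots with only fundamental arithmetic concrete, the sketch below computes a square root digit by digit and uses it for the geometric mean of two integers. It is the standard schoolbook digit-by-digit procedure, offered on the assumption that it resembles the kind of traditional root extraction the abstract refers to; it is not taken from the paper, it does not reproduce the proposed gradient descent method, and the function names are hypothetical.

```python
def digit_by_digit_sqrt(n, decimals=0):
    """Return floor(sqrt(n) * 10**decimals) using only integer arithmetic."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each extra decimal digit of the root needs two extra digits in the radicand.
    n *= 100 ** decimals
    # Split the radicand into pairs of digits, most significant pair first.
    pairs = []
    while n > 0:
        pairs.append(n % 100)
        n //= 100
    pairs.reverse()
    root = 0
    remainder = 0
    for pair in pairs:
        remainder = remainder * 100 + pair
        # Find the largest digit d such that (20 * root + d) * d fits in the remainder.
        digit = 0
        while (20 * root + digit + 1) * (digit + 1) <= remainder:
            digit += 1
        remainder -= (20 * root + digit) * digit
        root = root * 10 + digit
    return root

def geometric_mean(a, b, decimals=3):
    """Geometric mean sqrt(a * b) of two non-negative integers, truncated to `decimals` places."""
    return digit_by_digit_sqrt(a * b, decimals) / 10 ** decimals

root_of_144 = digit_by_digit_sqrt(144)
root_of_2_scaled = digit_by_digit_sqrt(2, decimals=3)
mean_of_4_and_9 = geometric_mean(4, 9)
```

Each step needs only multiplication, subtraction, and comparison, with no derivative or floating-point root function involved until the final scaling.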




Citations (69)


... Digital science and technologies such as the internet of things (IoT), blockchain, and artificial intelligence are the core driving forces of this revolution. These disruptive technologies have become the prelude to the new era of the fourth industrial revolution (Al-Qerem et al., 2020; Gupta et al., 2023b; Mamta et al., 2021). The development of science and technology has never been isolated or closed, so the integration and innovation of different technologies will create enormous productivity and promote human civilization to a new and higher level (Chander et al., 2022; Gupta et al., 2023a; Tiwari & Garg, 2022). ...

Reference:

A Trusted Authentication Scheme Using Semantic LSTM and Blockchain in IoT Access Control System
A Deep CNN-based Framework for Distributed Denial of Services (DDoS) Attack Detection in Internet of Things (IoT)
  • Citing Conference Paper
  • August 2023

... Another approach involved FastGAN [208], enhanced with a condition vector for generating high-quality images from small datasets, such as manga faces drawn from Osamu Tezuka's works [209], which showed improved FID scores compared to the original FastGAN. In [210], a multilingual text-to-image model was developed for creating webtoons, demonstrating the adaptability of GANs to multilingual contexts. However, despite various training experiments, the webtoon fine-tuned GAN underperformed compared to fine-tuning Korean MSCOCO, with the latter still producing abstract images that noticeably differed from authentic comics. ...

A Study on Generating Webtoons Using Multilingual Text-to-Image Models

Applied Sciences

... This framework also enables the analysis of various construction strategies and resource allocations. Meanwhile, Hong et al. [48] constructed a power grid domain ontology based on an electric power business platform, facilitating the utilization of ontology reasoning to inquire into various contextual outcomes of power grid construction. ...

The Method of Power Domain Ontology Construction and Reasoning based on Power Business Platform
  • Citing Article
  • June 2020

... Therefore, although this method may be effective in certain datasets that have cyclic patterns, it is not suitable for datasets that have multiple types of power consumption patterns. Experiments are conducted on compensation methods based on ARIMA and long short-term memory (LSTM) estimations, which are compensation methods based on time-series estimation, in addition to these two conventional methods [29]. In this research, we study a hybrid method that combines the advantages of the linear interpolation method and those of the LSTM estimation-based compensation method; subsequently, we perform a comparative analysis. ...

Method of estimation of missing data in AMI system
  • Citing Conference Paper
  • September 2020

... Lee and Kim introduced a new ensemble learning technique with multiple stacking [15]. Joo et al. (2021) proposed an efficient healthcare service based on Stacking Ensemble [15]. Federmann et al. investigated using machine learning algorithms to improve phrase selection in hybrid machine translation [16]. ...

Efficient healthcare service based on Stacking Ensemble
  • Citing Conference Paper
  • December 2020

... The problems with murals are mainly cracks, falling off, fading, and herpes [12,13]. With the passage of time, the damage from the strokes will be more serious, so the protection of the murals is imminent. ...

Restoration of Dimensions for Ancient Drawing Recognition

Electronics

... Jeongin Kim et al. (2021) [21] introduced a Word2VnCR algorithm to replace an OOV word with a semantically related term when an error occurs in morpheme analysis. With the help of this approach, candidate words to be exchanged with the OOV word having the same meaning as OOV are extracted and their semantic similarity to the OOV word's nearby terms can be determined. ...

Replacing Out-of-Vocabulary Words with an Appropriate Synonym Based on Word2VnCR

Mobile Information Systems

... However, this digital transformation has raised serious concerns about the protection of sensitive privacy information embedded within these texts [2]. Sensitive privacy information, such as personal identifiable information (PII), financial data, and medical records, is highly susceptible to privacy breaches due to unauthorized access [3]. The leakage of sensitive information leads to financial fraud, identity theft, and business losses [4]. ...

Data Independent Acquisition Based Bi-Directional Deep Networks for Biometric ECG Authentication

Applied Sciences

... Recent advancements in AI and deep-learning algorithms have significantly contributed to the analysis and classification of qualitative content, such as text and images [19][20][21]. These techniques offer new possibilities for analyzing the content generated by tourists, thus providing insights into their characteristics and tourism activities. ...

Enhancing Personalized Ads Using Interest Category Classification of SNS Users Based on Deep Neural Networks

Sensors

... However, this scheme is vulnerable to conspiracy attacks among specific authentication nodes. Li et al. [9] proposed a group bulk authentication and membership management scheme with a session unlinkability feature, but the scheme requires vehicle users to generate a large number of random identity tags locally, which cannot meet the storage requirements of resource-constrained vehicle devices. In addition, bulk authentication schemes with conditional privacy protection [10,11] and malicious node anonymity revocation [12] have been proposed, but these schemes lack hierarchical membership relationship management and do not have perfect forward security. ...

Practical Homomorphic Authentication in Cloud-Assisted VANETs with Blockchain-Based Healthcare Monitoring for Pandemic Control

Electronics