
Minho Lee - Kyungpook National University
Publications: 366
Reads: 57,869
Citations: 4,353
Publications (366)
Neuro-symbolic neural networks have been extensively studied to integrate symbolic operations with neural networks, thereby improving systematic generalization. Specifically, the Tensor Product Representation (TPR) framework enables neural networks to perform differentiable symbolic operations by encoding the symbolic structure of data within vector sp...
Block diagrams play an essential role in visualizing the relationships between components or systems. Generating summaries of block diagrams is important for document understanding or question answering (QA) tasks by providing concise overviews of complex systems. However, it is a challenging task as it requires compressing complex relationships int...
In recent research, Tensor Product Representation (TPR) is applied for the systematic generalization task of deep neural networks by learning the compositional structure of data. However, such prior works show limited performance in discovering and representing the symbolic structure from unseen test data because their decomposition to the structur...
Catastrophic forgetting, which means a rapid forgetting of learned representations while learning new data/samples, is one of the main problems of deep neural networks. In this paper, we propose a novel incremental learning framework that can address the forgetting problem by learning new incoming data in an online manner. We develop a new incremen...
Air conditioner consumers expect their products to work well without any problem. Consumers encounter problems if the amount of refrigerant in the air conditioner is insufficient. Therefore, we propose a novel deep learning approach that predicts the amount of refrigerant in advance. Our approach differs from others, as it is not limited to specifi...
Certain datasets contain a limited number of samples with highly varied styles and complex structures. This study presents a novel adversarial Lagrangian integrated contrastive embedding (ALICE) method for small-sized datasets. First, the accuracy improvement and training convergence of the proposed pre-trained adversarial transfer are shown on va...
In this study, we propose a virtual assistant system that is applied to real life using signal processing and deep learning. First, the overall structure of the proposed system that integrates and controls various modules is introduced, after which we present a multi-modal fusion module that provides services to users. It integrates a natural langu...
Block diagrams are very popular for representing a workflow or process of a model. Understanding block diagrams by generating summaries can be extremely useful in document summarization. It can also assist people in inferring key insights from block diagrams without requiring a lot of perceptual and cognitive effort. In this paper, we propose a nov...
Temporal action proposal generation aims to generate temporal boundaries containing action instances. In real-time applications such as surveillance cameras, autonomous driving, and traffic monitoring, the online localization and recognition of human activities occurring in short temporal intervals are important areas of research. Existing approach...
Prediction of clinical outcomes using the patient’s medical data enhances clinical decision making and improves prognostic accuracy. Deep learning (DL) for medical decision support systems has particularly shown expert-level accuracy in predicting clinical outcomes. However, most of these machine learning and Artificial Intelligence (AI) models lack...
Despite recent progress in memory augmented neural network (MANN) research, associative memory networks with a single external memory still show limited performance on complex relational reasoning tasks. In particular, content-based addressable memory networks often fail to encode input data into a rich enough representation for relational reasoning...
Recognition of ancient Korean-Chinese cursive characters (Hanja) is a challenging problem, mainly because of the large number of classes, damaged cursive characters, various handwriting styles, and similar, easily confused characters. The task also suffers from a lack of training data and from class imbalance. To address these problems, we propose a unified Regula...
In energy disaggregation (ED), an accurate estimation of each appliance power consumption over time is crucial for practical applications. Most optimization-based ED approaches are single-objective and the disaggregated appliance power consumption profiles provided by them do not match the actual operational characteristics of the devices. In this...
There has been an increased interest in high-level image-to-image translation to achieve semantic matching. Through a powerful translation model, we can efficiently synthesize high-quality images with diverse appearances while retaining semantic matching. In this paper, we address an imbalanced learning problem using a cross-species image-to-image...
This paper presents Scene2Wav, a novel deep convolutional model proposed to handle the task of music generation from emotionally annotated video. This is important because when paired with the appropriate audio, the resulting music video is able to enhance the emotional effect it has on viewers. The challenge lies in transforming the video to audio...
In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. This novel model improves robustness in incomplete data, when compared to existing systems, by designing a novel encoding scheme in BERT, a powerful language representation model solely based on attention mechanisms. Incomp...
Text classification using deep learning techniques has become a research challenge in natural language processing. Most of the existing deep learning models for text classification face difficulties when the length of the input text increases. Most models work well on shorter text inputs; however, their performance degrades with the increase in t...
A differentiable neural computer (DNC) is a memory augmented neural network devised to solve a wide range of algorithmic and question answering tasks and it showed promising performance in a variety of domains. However, its single memory-based operations are not enough to store and retrieve diverse informative representations existing in many tasks...
Recent video captioning models aim at describing all events in a long video. However, their event descriptions do not fully exploit the contextual information included in a video because they lack the ability to remember information changes over time. To address this problem, we propose a novel context-aware video captioning model that generates na...
Residential consumers desire to minimize electricity bills while maximizing comfort by appropriate appliance scheduling. The conflicting nature of the objectives facilitates a multi-objective formulation that can provide a set of trade-off schedules enabling better decision making. In literature, user preference or comfort regarding each device at...
Thin-film solar cells are predominately designed similar to a stacked structure. Optimizing the layer thicknesses in this stack structure is crucial to extract the best efficiency of the solar cell. The commonplace method used in optimization simulations, such as for optimizing the optical spacer layers' thicknesses, is the parameter sweep. Our sim...
Generating music with emotion similar to that of an input video is a very relevant issue nowadays. Video content creators and automatic movie directors benefit from maintaining their viewers engaged, which can be facilitated by producing novel material eliciting stronger emotions in them. Moreover, there's currently a demand for more empathetic com...
Generating music with emotion similar to that of an input video is a very relevant issue nowadays. Video content creators and automatic movie directors benefit from maintaining their viewers engaged, which can be facilitated by producing novel material eliciting stronger emotions in them. Moreover, there is currently a demand for more empathetic co...
Brain hemorrhage segmentation in Computed Tomography (CT) scan images is challenging due to low image contrast and large variations in the appearance of hemorrhages. Unlike previous approaches that estimate the binary masks of hemorrhages directly, we introduce an affinity graph, which is a graph representation of adjacent pixel connectivity to a...
In order to tackle the problem of abstractive summarization of long multi-sentence texts, it is critical to construct an efficient model, which can learn and represent multiple compositionalities better. In this paper, we introduce a temporal hierarchical pointer generator network that can represent multiple compositionalities in order to handle lo...
The aim of this study is to recognize human emotions from electroencephalographic (EEG) signals using deep neural networks. Large training data is an important prerequisite for successful implementation of deep neural networks. In this view, we propose an independent component analysis (ICA) - evolution based data augmentation method. This method p...
We propose using a Genetic Algorithm (GA), inspired by Darwin's theory of evolution, to search for the optimal layer thicknesses in organic solar cells with regard to maximizing the short-circuit current density. The conventional method used in optimization simulations, such as for optimizing the optical spacer layers' thicknesses, is the...
Multimodal emotion understanding enables AI systems to interpret human emotions. With the accelerating surge in video, emotion understanding remains challenging due to the inherent data ambiguity and diversity of video content. Although deep learning models have made considerable progress in big data feature learning, they are viewed as deterministic models used in...
Recently, many researchers have attempted to apply deep neural networks to detect Atrial Fibrillation (AF). In this paper, we propose an approach for prediction of AF instead of detection using Deep Convolutional Neural Networks (DCNN). This is done by classifying electrocardiogram (ECG) before AF into normal and abnormal states, which is hard for...
For artificial intelligence (AI) to effectively mimic humans, understanding humans, and more specifically human emotion, is important. Sentiment analysis aims to automatically uncover the underlying sentiment or emotions that humans hold towards an entity. There is high ambiguity of emotion in text data. In this paper, we consider the sentence-leve...
Image transformation between multiple domains has become a challenging problem in deep generative networks. This is because, in real-world applications, finding paired images in different domains is an expensive and impractical task. This paper proposes a new model named joint moment-matching autoencoders (JMA). This model learns to perform cross-do...
This study analyzes the characteristics of unsupervised feature learning using a convolutional neural network (CNN) to investigate its efficiency for multi-task classification and compare it to supervised learning features. We keep the conventional CNN structure and introduce modifications into the convolutional auto-encoder design to accommodate a...
Inspired by the recent advances in generative models, we introduce a human action generation model in order to generate a consecutive sequence of human motions to formulate novel actions. We propose a framework of an autoencoder and a generative adversarial network (GAN) to produce multiple and consecutive human actions conditioned on the initial s...
The Coupled Generative Adversarial Network (CoGAN) was recently introduced to model the joint distribution of a multimodal dataset. The CoGAN model lacks the capability to handle noisy data, and it is computationally expensive and inefficient for practical applications such as cross-domain image transformation. In this paper, we propose a...
In humans, perception and action (PA) possess cyclically causal relations. In this paper, we propose a new PA-based cyclic learning framework to autonomously enhance the depth-estimation accuracy of a humanoid robot and perform given behavioral tasks. The proposed method integrates the concepts of sensory invariance-driven action and object-size in...
In deep generative networks, one of the major challenges is to generate non-blurry, clearer images. Unlike the generative adversarial networks, generative models such as variational autoencoders, generative moment matching networks etc. use pixel-wise loss which leads to the generation of blurry images. In this paper, we propose an improved generat...
Deep learning based vision understanding algorithms have recently approached human-level performance in object recognition and image captioning. These performance evaluations are, however, limited to static data and these algorithms are also limited. A few limitations of these methods include their inability to selectively encode human behavior, move...
Understanding human intention by observing a series of human actions has been a challenging task. In order to do so, we need to analyze longer sequences of human actions related to intentions and extract the context from the dynamic features. The multiple timescales recurrent neural network (MTRNN) model, which is believed to be a kind of solu...
In this paper, we propose a sensitive convolutional neural network that incorporates a sensitivity term into the cost function of a Convolutional Neural Network (CNN) to emphasize the slight variations and high-frequency components in highly blurred input image samples. The proposed cost function in the CNN has a sensitivity part in which the conventiona...
Studies have shown that some verbal metaphors require mental images that are akin to perceptual experiences, suggesting that metaphor processing requires interaction among different modalities. We explore here brain activation patterns during visual metaphor comprehension. We found significant activation in the temporal lobe (BA 22) during visual metap...
In smart cities, an intelligent traffic surveillance system plays a crucial role in reducing traffic jams and air pollution, thus improving the quality of life. An intelligent traffic surveillance should be able to detect and track multiple vehicles in real-time using only limited resources. Conventional tracking methods usually run at a high video...
We propose a face recognition and notification system that can transform visual face information into tactile signals in order to help visually impaired people. The proposed system consists of a glasses-type camera, a mobile computer, and an electronic cane. The glasses-type camera captures the frontal view of the user and sends this image to mobil...
This pupillometry study examined the relationship between intelligence and creative cognition from the resource allocation perspective. It was hypothesized that, during a creative metaphor task, individuals with higher intelligence scores would have different resource allocation patterns than individuals with lower intelligence scores. The study al...
To develop an advanced human-robot interaction system, it is important to first understand how human beings learn to perceive, think, and act in an ever-changing world. In this paper, we propose an intention understanding system that uses an Object Augmented-Supervised Multiple Timescale Recurrent Neural Network (OA-SMTRNN) and demonstrate the effe...
In this paper, we propose a new Convolutional Neural Network (CNN) with biologically inspired retinal structure and ON/OFF Rectified Linear Unit (ON/OFF ReLU). Retinal structure enhances input images by center surround difference of green-red and blue-yellow components, which in turn creates positive as well as negative features like ON/OFF visual...
Deep learning has received significant attention recently as a promising solution to many problems in the area of artificial intelligence. Among several deep learning architectures, convolutional neural networks (CNNs) demonstrate superior performance when compared to other machine learning methods in the applications of object detection and recogn...
Missing feature is a common problem in real-world data classification. Therefore, a robust classification method is required when classifying data with missing features. In this study, we propose an iterative algorithm composed of a generative model that works in conjunction with a discriminative model in a cycle. The Gaussian mixture model (GMM) a...
Scenes can convey emotion like music. If that’s so, it might be possible that, given an image, one can generate music that elicits a similar emotional reaction from users. The challenge lies in how to do that. In this paper, we use the Hue, Saturation and Lightness features from a number of image samples extracted from video excerpts and the tempo, loudness...
In this paper, we analyze the efficiency of unsupervised learning features in multi-task classification, where unsupervised learning is used to initialize a Convolutional Neural Network (CNN) that is then trained by supervised learning for multi-task classification. The proposed method is based on Convolution Auto Encoder (CAE), which mainta...
Behavior-directed intentions can be revealed by certain biological signals that precede behaviors. This study used eye movement data to infer human behavioral intentions. Participants were asked to view pictures while operating under different intentions, which necessitated cognitive search and affective appraisal. Intentions regarding the pictures...
In this work, we introduce temporal hierarchies to the sequence to sequence (seq2seq) model to tackle the problem of abstractive summarization of scientific articles. The proposed Multiple Timescale model of the Gated Recurrent Unit (MTGRU) is implemented in the encoder-decoder setting to better deal with the presence of multiple compositionalities...
This paper proposes a system to detect the driver's cognitive state using information from inside and outside the vehicle. The proposed system can measure the driver's eye gaze. This is done using the concepts of information delivery and a mutual information measure. For this study, we set up two web cameras in vehicles to obtain visual information of the driver and front...
In daily life, humans experience stress very often. Stress is an important factor in a healthy life and is closely related to quality of life. Too much stress is known to cause hormone imbalance in the body, and it can be observed in brain and bio signals. Based on this, the relationship between brain signals and stress is explored, and brain signa...
Lane detection is a widely researched topic. Although simple road detection is easily achieved by previous methods, lane detection becomes very difficult in several complex cases involving noisy edges. To address this, we use a convolutional neural network (CNN) for image enhancement. The CNN is a deep learning method that has been very successfully appl...
Monitoring the concentration level of a learner is important to maximize the learning effect, giving proper feedback on tasks and to understand the performance of learners in tasks. In this paper, we propose a personal concentration level monitoring system when a user performs an online task on a computer by analyzing his/her pupillary response and...
The four-volume set LNCS 9947, LNCS 9948, LNCS 9949, and LNCS 9950 constitutes the proceedings of the 23rd International Conference on Neural Information Processing, ICONIP 2016, held in Kyoto, Japan, in October 2016. The 296 full papers presented were carefully reviewed and selected from 431 submissions. The 4 volumes are organized in topical secti...
This paper proposes a human implicit intent recognition system based on electroencephalography (EEG) signals, for developing an advanced interactive web service engine. We focus on identifying brain state transitions between intentions, and classifying a user's implicit intentions while viewing an image on a web page, based on an EEG experiment. We...
A commonly used method to determine the intelligence of an individual is a group test. It checks accuracy and response time while participants solve a series of problems. However, it takes a long time and is often inaccurate if the difficulty level of the problems is high or the number of problems is too small. Therefore, there is an urgent need to find an object...
In Advanced Driving Assistance Systems (ADASs), one of the main applications for traffic safety is to notify the driver of important traffic information, such as the presence of a pedestrian or information regarding traffic signals. In a particular driving scenario, the amount of information related to the situation available to the driver can be...
In this paper, we explore how a humanoid robot with two cameras can learn to improve depth perception by itself. We propose an approach that can autonomously improve depth estimation of the humanoid robot. This approach can tune parameters that are required for the binocular vision system of the humanoid robot and improve depth perception automatical...
This paper proposes a modification of convolutional neural network (CNN) with biologically inspired structure, retinal structure and ON/OFF rectified linear unit (ON/OFF ReLU). Retinal structure enhances input images by center surround difference of green and red, blue and yellow components and creates positive results and negative results like ON/...
Concentration is an important part of our lives, especially during learning or thinking. Visually or auditorily evoked concentration affects information processing in the human brain. To understand the concentration process of humans, the underlying neural mechanism needs to be explored. An EEG device is a promising tool for understanding the underlying neural mech...
We present our ongoing work on a creativity assistance tool called I-get. The tool is based on the hypothesis that perceptual similarity between a pair of images, at a subconscious level, plays a key role in generating creative conceptual associations and metaphorical interpretations. The tool "I-get" is designed to assist users to create novel i...
A picture is worth a thousand words. With changing lifestyles and technological advancement, visual or pictorial communication is preferred. We present a system to generate a pictogram for simple Korean sentences. The final pictogram integrates information about the object (about which something is said), the background (the environment) and the emotio...