Duc Dung Nguyen

  • PhD
  • Senior Lecturer at Ho Chi Minh City University of Technology

About

58
Publications
11,155
Reads
322
Citations
Current institution
Ho Chi Minh City University of Technology
Current position
  • Senior Lecturer
Additional affiliations
March 2014 - present
Sungkyunkwan University
Position
  • PostDoc Position
Education
March 2007
Sungkyunkwan University
Field of study
  • Computer Vision
September 2002 - March 2007
Ho Chi Minh City University of Technology
Field of study
  • Computer Science and Engineering

Publications

Publications (58)
Preprint
Full-text available
Accurate prediction of 3D semantic occupancy from 2D visual images is vital in enabling autonomous agents to comprehend their surroundings for planning and navigation. State-of-the-art methods typically employ fully supervised approaches, necessitating a huge labeled dataset acquired through expensive LiDAR sensors and meticulous voxel-wise labelin...
Chapter
Full-text available
Multi-Task Learning (MTL) has proven its effectiveness for decades. By combining certain related tasks, neural networks will likely perform better due to their inductive biases obtained from concrete tasks. Thus, many AI systems (such as GPT) have been developed based on MTL as the de facto solution. MTL has been applied early in the field of autom...
Chapter
Self-supervised monocular depth estimation (MDE) has recently gained significant attention, demonstrating remarkable results, particularly in daytime scenarios. However, MDE in nighttime images remains challenging due to the sensitivity of photometric loss to noise and undiffused light illumination. In this paper, we propose a simple but highly eff...
Chapter
Lung tumor segmentation in computed tomography (CT) images is a critical task in medical image analysis. It aids in the early detection and diagnosis of lung cancer, which is one of the primary causes of cancer deaths around the world. However, because of the variable sizes, uncertain shapes of lung nodules, and complex internal lung structure, lun...
Chapter
Despite the remarkable result of Neural Scene Flow Fields [10] in novel space-time view synthesis of dynamic scenes, the model has limited ability when a few input views are provided. To enable the few-shots novel space-time view synthesis of dynamic scenes, we propose a new approach that extends the model architecture to use shared priors learned...
Chapter
The need to understand people, especially their behaviors and feelings, is growing significantly in today’s quickly-moving world. Despite the remarkable progress of science and technology in general and artificial intelligence in particular, facial emotion recognition remains challenging. This paper proposes a unique method for enhancing the accura...
Chapter
The goal of video frame synthesis is to generate new frames from given frames. In other words, it aims to predict a single or several future frames (video prediction - VP) or in-between frames (video frame interpolation - VFI). This is one of the challenging problems in the computer vision field. Many recent VP and VFI methods employ optical flow estimation in the predict...
Article
Full-text available
Deep learning has been introduced to single-image super-resolution (SISR) in the last decade. These techniques have taken over the benchmarks of SISR tasks. Nevertheless, most architectural designs necessitate substantial computational resources, leading to a prolonged inference time on embedded systems or rendering them infeasible for deployment....
Preprint
Full-text available
Deep learning has been introduced to single-image super-resolution (SISR) in the last decade. These techniques have taken over the benchmarks of SISR tasks. Nevertheless, most architectural designs necessitate substantial computational resources, leading to a prolonged inference time on embedded systems or rendering them infeasible for deployment....
Article
Full-text available
With the successful development of deep learning, single image super-resolution (SISR) has advanced significantly in recent years. However, in practice, excessive convolutions limit super-resolution applications on platforms with limited resources like mobile devices or embedded systems. Besides, existing lightweight models have a problem with smal...
Article
Full-text available
Many alternative approaches for 3D object detection using a singular camera have been studied instead of leveraging high-precision 3D LiDAR sensors incurring a prohibitive cost. Recently, we proposed a novel approach for 3D object detection by employing a ground plane model that utilizes geometric constraints named GAC3D to improve the results of t...
Conference Paper
Full-text available
Automatic speech recognition (ASR) is one of the emerging tasks in human-computer interaction. Many studies focus on building network architectures to deal with this task. While data augmentation has been explored in depth in computer vision, it lags far behind in the field of speech. Large data collection is not trivial, and i...
Chapter
This paper proposes a method using the Instance-based transfer learning approach to build a Vietnamese speech synthesis system based on a target voice 45 times smaller than source voice. By using the correlation features between data objects, we can reuse the entire model or part of the previously trained weights to retrain with the new data set. O...
Article
Full-text available
Monocular 3D object detection has recently become prevalent in autonomous driving and navigation applications due to its cost-efficiency and ease of integration into existing vehicles. The most challenging task in monocular vision is to reliably estimate an object’s location because of the lack of depth information in RGB images. Many methods tackle this ill-...
Chapter
The common integral transforms present the speech signal to another space with a set of orthogonal basis vectors. The speech, in terms of nature, is a periodic signal so the basis vectors are periodic too, particularly the sinusoidal wave. In reality, after impacted by many outside agents, the speech signal is not always periodic. This leads to the...
Chapter
This work represents an alternative model for speech synthesis, which addresses some major disadvantages of current end-to-end models. Current state-of-the-art models still have some troubles while dealing with long sentences and the size of the dataset. Our proposed Adaptive Alignment Tacotron (AAT) model, however, has achieved impressive results...
Conference Paper
This paper proposes an alternative approach for synthesizing Vietnamese speech. In order to improve the naturalness and intelligibility of the speech synthesizer, we propose the Vietnamese phoneme model and integrate into an end-to-end model. We employ the pipeline of Tacotron 2 model with attention scheme to generate high-quality speeches. As we r...
Chapter
While detecting objects becomes easier with deep models, estimating pose remains a challenging problem in modern vision research. In this work, we propose a method that enables detecting objects and estimating their pose simultaneously in a single model, without intermediate stages. Unlike some other approaches, we make the first attempt to hierarch...
Chapter
In this paper, we propose a new model for building a conversational dialogue system which provides natural, realistic and flexible interaction between human and machine based on large movie subtitles dataset. Our models are a generative model that is autonomously generated word-by-word, opening up the possibility of working on many different langua...
Conference Paper
Early detection and prediction of cardiac anomalies play an important role in the diagnosis and treatment of cardiovascular diseases. In medicine, electrocardiography provides valuable information for the doctors since they can accurately determine what is happening concerning the heart activities. Nevertheless, electrocardiography classification...
Preprint
Early detection and prediction of cardiac anomalies play an important role in the diagnosis and treatment of cardiovascular diseases (CVD). In medicine, electrocardiography (ECG or EKG) is valuable information for every doctor since they can accurately determine what is happening to heart activities. Nevertheless, ECG classification is a non-trivia...
Conference Paper
Full-text available
High Performance Computing (HPC) is playing an important role in a variety of domains with the demand of high-level computational capacity. Besides, HPC provides services for a huge range of different users as well as multiple environments. Hence, the performance of the network is also one of the important criteria. The advent of InfiniBand (IB) ai...
Article
Full-text available
Collecting survey and feedback for analyzing useful information plays an important role in many fields such as business, market, manager, etc. In education, this analysis is the key in improving the teaching quality and the management process. We are interested in comments on students which are collected in the surveys. To valuate the progress of s...
Article
In this study, an advanced variational model is presented for problem modelling in computer vision and image processing. The proposed model allows for the definition of multiple constraints in data fidelity, which has not been considered in previous state-of-the-art methods. With this definition, the model is more robust and flexible with regard to...
Article
This paper introduces improvements to estimate 3D object pose from point clouds. We use point-pair feature for matching instead of traditional approaches using local feature descriptors. In order to obtain high accuracy estimation, a discriminative descriptor is introduced for point-pair features. The object model is a set of point pair descriptors...
Article
Stereo correspondence is challenging under realistic conditions due to uncontrolled factors that affect input images, including illumination inconsistencies and radiometric variations. Many local and global models have been suggested to address these problems; however, their performance is often degraded due to the assumption of color consistency b...
Article
This paper presents the design and implementation of a pipelined architecture and a method for real-time human detection using depth image from a Time-of-Flight (ToF) camera. In the proposed method, we use Euclidean Distance Transform (EDT) in order to extract human body location, and we then use the 1D, 2D scanning window in order to extract human...
Patent
Full-text available
An apparatus and method can effectively detect both hands and hand shape of a user from images input through cameras. A skin image detecting skin regions from one of the input images and a stereoscopic distance image are used. For hand detection, background and noise are eliminated from a combined image of the skin image and the distance image and...
Data
A new approach for vehicle detection and distance estimation based on stereo vision and evolutionary algorithm (SEA) is described in this paper. First, we reuse our recent work on FPGA implementation of census-based correlations for stereo matching. Next, the SEA uses the gray scale left image and disparity information obtained from the FPGA system...
Conference Paper
In this paper, we propose an alternative block-matching approach to the correspondence problem in stereo matching. In block-matching algorithms, a local window is used to measure the similarity (or dissimilarity) between pixels of a stereo pair. Although some area-based stereo matching methods have been developed and work well in many kinds o...
Article
The evolutionary algorithm (EA) is an effective method for solving various problems because it can search through very large search spaces and come to nearly optimal solutions quickly. However, existing EA-based methods for vehicle detection cannot achieve high performance because their fitness functions depend on sensitive information, such as edg...
Article
In this study, the authors propose an adaptive scheme to improve motion estimation of a variational model based on image features and flow quality measurements. Using image features, the authors introduce adaptive functions and inject them into the energy function to fine-tune the estimation process. They propose a hybrid scheme to deal with large...
Conference Paper
Noise removal in image processing is required in a variety of fields such as object tracking, stereo vision and medical image reconstruction. To obtain accurate results, various video pre-processing is required. We propose a hardware architecture using FPGA to improve the processing speed with the Total Variation algorithm for noise removing images...
Conference Paper
High dynamic range conditions are major obstacles to the implementation of practical stereovision systems in real scenes. We address this problem by introducing an adaptive local ternary-derivative pattern (ALTDP) which is a fusion of the local ternary pattern (LTP) and local derivative pattern (LDP). We make three main contributions in this study...
Conference Paper
A new approach for vehicle detection and distance estimation based on stereo vision and evolutionary algorithm (SEA) is described in this paper. First, we reuse our recent work on FPGA implementation of census-based correlations for stereo matching. Next, the SEA uses the gray scale left image and disparity information obtained from the FPGA system...
Conference Paper
This paper describes pipelined hardware architecture for real-time laser point position detection system by using FPGA. This system processes 640×480 resolution images that were received by the CCD camera. We detected the position of the laser point through the following 5 steps. First, the screen area is detected by Homography transform by using c...
Conference Paper
Full-text available
The gateway plays an important role in networks with different communication protocols because it enables devices in these networks to communicate with each other. Designing a gateway for multiple-devices with one physical device helps us to reduce costs and take full advantage of the higher-speed communication protocol. The lack of realtime capaci...
Conference Paper
In this work, we introduce an alternative measurement to evaluate quality of flow field. We then present an alternative filter called OA-filter (Occlusion-Aware filter) which uses this measurement. It is integrated inside the solver to push the performance of the model in terms of accuracy and sharpness. The experiments show that the estimation res...
Conference Paper
We introduce an alternative method to improve optical flow estimation using image data for control functions. Based on the nature of object motion, we tune the energy minimization process with an image-adaptive scheme embedded inside the energy function. We propose a hybrid scheme to improve the quality of the flow field and we use it along with the...
Conference Paper
Vehicle detection and distance estimation system has become important due to their assistance in reducing vehicle accidents. Therefore, an efficient vehicle detection and distance estimation algorithm using a knowledge-based method and image segmentation technique has been developed. The proposed algorithm can detect and estimate the distance of th...
Conference Paper
This paper presents a variational model to compute the optical flow using image-driven functions. The intensity, gradient and smoothness have different influences on each image area. Thus, we propose the control functions that take the image as the input to tune the estimation process. We use the second moment matrix to characterize distinct image...
Conference Paper
Full-text available
Optical flow is a motion field estimation method that has a wide range of applications. In this paper, we present a fully pipelined hardware architecture for high-speed optical flow estimation based on a full-search block matching algorithm. A census transform is applied to the corresponding pixels in the current and previous frame. The similarity...
Conference Paper
We present a method to detect human fingertips from images captured by a stereo camera. The system makes use of the disparity information from a stereo camera to find candidates, and defines an evaluation process to detect two hands. The finger detector then processes each hand image to extract finger images. Finally, we perform geometric calculati...
Conference Paper
In this paper, we describe a method to detect human fingers from images captured by a stereo camera. The images captured by the stereo camera are preprocessed by a skin detection module. Two hands are extracted from the skin regions using disparity information. We then apply grayscale morphology and use BLOB analysis to detect fingers and their dir...
Conference Paper
Extracting the positions of hands is an important step in human computer interaction and robot vision applications. Posture and gesture can be extracted from hand positions and the appropriate task can be performed. In this paper, we propose an approach to extract hand images using skin color and stereo information. Our method does not require clea...
Conference Paper
In this paper, we propose a simple binary shape matching algorithm and its hardware implementation. A shape matching algorithm is a method to measure the similarity of objects in an image. This technology is generally used in image retrieval, inspection, and object detection. The proposed method uses a transformation matrix that handles translation...
