Hideaki Uchiyama
Toshiba Corporation
Publications: 75
Reads: 24,241
Citations: 977
Publications (75)
We unveil how generalizable AI can be used to improve multi-view 3D pedestrian detection in unlabeled target scenes. One way to increase generalization to new scenes is to automatically label target data, which can then be used for training a detector model. In this context, we investigate two approaches for automatically labeling target data: pseu...
Pedestrian detection is a critical problem in many areas, such as smart cities, surveillance, monitoring, autonomous driving, and robotics. AI-based methods have made tremendous progress in the field in the last few years, but good performance is limited to data that match the training datasets. We present a multi-camera 3D pedestrian detection met...
This paper presents a neural network based method for 3D plant growth prediction based on sequential images-to-images translation. Specifically, we extend an existing image-to-image translation technique based on U-Net to images-to-images translation by incorporating convLSTM into skip connections in U-Net. With this architecture, we can achieve sequ...
3D pedestrian tracking using multiple cameras is still a challenging task with many applications such as surveillance, behavioral analysis, statistical analysis, and more. Many of the existing tracking solutions involve training the algorithms on the target environment, which requires extensive time and effort. We propose an online 3D pedestrian tr...
IPIN 2019 Competition, sixth in a series of IPIN competitions, was held at the CNR Research Area of Pisa (IT), integrated into the program of the IPIN 2019 Conference. It included two on-site real-time Tracks and three off-site Tracks. The four Tracks presented in this paper were set in the same environment, made of two buildings close together for...
Recovering the 3D shape of a person from its 2D appearance is ill-posed due to ambiguities. Nevertheless, with the help of convolutional neural networks (CNN) and prior knowledge on the 3D human body, it is possible to overcome such ambiguities to recover detailed 3D shapes of human bodies from single images. Current solutions, however, fail to rec...
The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics both for real-time (on-site) and post-processi...
Improvements on mobile devices allowed tracking applications to be executed on such platforms. However, there still remain several challenges in the field of mobile tracking, such as the extraction of high-level semantic information from point clouds. This task is more challenging when using monocular visual SLAM systems that output noisy sparse da...
Urban environments are challenging areas for handheld device pose estimation (i.e., 3D position and 3D orientation) over large displacements. This is even more challenging with the low-cost sensors and computational resources that are available in pedestrian mobile devices (i.e., monocular camera and Inertial Measurement Unit). To address these c...
This paper presents a framework for incremental 3D cuboid modeling combined with RGB-D SLAM. While performing RGB-D SLAM, planes are incrementally reconstructed from point clouds. Then, cuboids are detected in the planes by analyzing the positional relationships between the planes: orthogonality, convexity, and proximity. Finally, the position, pose...
Demand for indoor navigation systems has been rapidly increasing with regard to location-based services. As a cost-effective choice, inertial measurement unit (IMU)-based pedestrian dead reckoning (PDR) systems have been developed for years because they do not require external devices to be installed in the environment. In this paper, we propose a...
This paper presents a framework of incremental 3D cuboid modeling by using the mapping results of an RGB-D camera based simultaneous localization and mapping (SLAM) system. This framework is useful in accurately creating cuboid CAD models from a point cloud in an online manner. While performing the RGB-D SLAM, planes are incrementally reconstructed...
We propose a texture synthesis method to enhance the trackability of a target planar object by embedding natural features into the object in the object design process. To transform an input object into an easy-to-track object in the design process, we extend an inpainting method for naturally embedding the features into the texture. First, a featur...
Stereo matching has been solved as a supervised learning task with convolutional neural networks (CNNs). However, CNN-based approaches generally require a large amount of memory. In addition, it is still challenging to find correct correspondences between images in ill-posed regions that are dim or affected by sensor noise. To solve these problems, we propose Sparse Cost Volume...
This paper proposes a new approach to visualizing spatial variation of plant status in a tomato greenhouse based on farm work information operated by laborers. Farm work information consists of a farm laborer’s position and action. A farm laborer’s position is estimated based on radio wave strength measured by using a smartphone carried by the farm...
Falls are one of the leading causes of injury for elderly individuals. Systems that automatically detect falls can significantly reduce the delay of assistance. Most commercialized fall detection systems are based on wearable devices, which elderly individuals tend to forget to wear. Using surveillance cameras to detect falls based on computer...
Deep neural network-based (DNN-based) background subtraction has demonstrated excellent performance for moving object detection. The DNN-based background subtraction automatically learns the background features from training images and outperforms conventional background modeling based on handcraft features. However, previous works fail to detail w...
Reconstruction-based change detection methods are robust to camera motion. These methods learn to reconstruct input images based on background images. Foreground regions are detected based on the magnitude of the difference between an input image and a reconstructed input image. For learning, only background images are used. Therefore, foreground...
A thermal camera captures the temperature distribution of a scene as a thermal image. In thermal images, facial appearances of different people under different lighting conditions are similar. This is because facial temperature distribution is generally constant and not affected by lighting condition. This similarity in face appearances is advantag...
In recent years, RGB-D sensors have become easily accessible to general users. They provide both a color image and a depth image of the scene and, besides being used for object modeling, they can also offer important cues for object detection and tracking in real-time. In this context, the work presented in this paper investigates the use of...
This paper presents a method named Depth-Assisted Rectification of Contours (DARC) for detection and pose estimation of texture-less planar objects using RGB-D cameras. It consists in matching contours extracted from the current image to previously acquired template contours. In order to achieve invariance to rotation, scale and perspective distort...
This paper introduces recent progress on techniques of object detection and pose tracking with a monocular camera for augmented reality applications. To visually merge a virtual object onto a real scene with geometrical consistency, a camera pose with respect to the scene needs to be computed. For this issue, many approaches have been proposed in...
This paper presents a folded surface detection and tracking method for augmented maps. First, we model a folded surface as two connected planes. Therefore, in order to detect a folded surface, the plane detection method is iteratively applied to the 2D correspondences between an input image and a reference plane. In order to compute the exact foldi...
Because they allow virtual content to be visualized intuitively by superimposing it on real scenes, augmented reality (AR) technologies have been widely used in various fields such as entertainment, advertising, education, tourism, and industrial/medical applications. In particular, AR applications in education have provided g...
This paper presents an approach for detecting and tracking various types of planar objects with geometrical features. We combine traditional keypoint detectors with Locally Likely Arrangement Hashing (LLAH) [21] for geometrical feature based keypoint matching. Because the stability of keypoint extraction affects the accuracy of the keypoint matchin...
This paper presents a novel musical performance system named onNote that directly utilizes printed music scores as a musical instrument. This system can make users believe that sound is indeed embedded on the music notes in the scores. The users can play music simply by placing, moving and touching the scores under a desk lamp equipped with a camer...
We extend planar fiducial markers using random dots [8] to nonrigidly deformable markers. Because the recognition and tracking of random dot markers are based on keypoint matching, we can estimate the deformation of the markers with nonrigid surface detection from keypoint correspondences. First, the initial pose of the markers is computed from a h...
We present an augmented reality (AR) system that is based on a tabletop system and has a hemisphere omnidirectional camera. We perpendicularly set the camera on a tabletop display and showed an omnidirectional image on the display screen. When users present objects to the camera, these objects are captured and recognized by using a specific object...
Intuitive music playing using various digital musical instruments with specific tangible interfaces has become one of the ways to enjoy the music experience [Jorda et al. 2007]. When we start studying music, we need to learn how to read musical scores to understand the different instrumental parts and also understand the melody, rhythm, fingering a...
We propose a camera-tracking method by on-line learning of keypoint arrangements in augmented reality applications. As target objects, we deal with intersection maps from GIS and text documents, which are not dealt with by the popular SIFT and SURF descriptors. For keypoint matching by keypoint arrangement, we use locally likely arrangement hashing...
This paper presents a novel approach for detecting and tracking markers with randomly scattered dots for augmented reality applications. Compared with traditional markers with square pattern, our random dot markers have several significant advantages for flexible marker design, robustness against occlusion and user interaction. The retrieval and tr...
This paper presents folded surface detection and tracking for augmented maps. For the detection, plane detection is iteratively applied to 2D correspondences between an input image and a reference plane because the folded surface is composed of multiple planes. In order to compute the exact folding line from the detected planes, the intersection li...
This paper presents an Augmented Reality (AR) system for physical text documents that enables users to click a document. In the system, we track the relative pose between a camera and a document to overlay virtual contents on the document continuously. In addition, we compute the trajectory of a fingertip based on skin color detection for click...
We propose a novel augmented reality (AR) setup with an omni directional camera on a table top display. The table acts as a mirror on which real playing cards appear augmented with virtual elements. The omni directional camera captures and recognizes its surrounding based on a feature based image retrieval approach which achieves fast and scalable...
We propose a system that registers and retrieves text documents to annotate them on-line. The user registers a text document captured from a nearly top view and adds virtual annotations. When the user thereafter captures the document again, the system retrieves and displays the appropriate annotations, in real-time and at the correct location. Regi...
We propose a novel geovisualization framework based on geographic data matching between maps with intersections and their corresponding intersection database on GIS. First, users select several regions to view. The GIS generates maps with intersections and registers these intersections and their contents in our system. The users then view the map c...
We propose a technique for text document tracking over a large range of viewpoints. Since the popular SIFT or SURF descriptors typically fail on such documents, our method considers instead local arrangement of keypoints. We extend locally likely arrangement hashing (LLAH), which is limited to fronto-parallel images: we handle a large range of vie...
This paper presents a system for overlaying 3D GIS data information such as 3D buildings onto a 2D physical urban map. We propose a map recognition framework by analysis of distribution of local intersections in order to recognize the area of the physical map from a whole map. The retrieval of the geographical area described by the physical map is b...
This paper introduces a new method of Photomosaic. In this method, we propose to use tiled images that can be rotated in a restricted range. The tiled images are selected from a database. The selection of an image is done by a hashing method based on principal component analysis of a database. After computing the principal components of the databas...
This paper presents a method for retrieving a corresponding map of a captured map image from a map database. Our method is inspired from LLAH based Document Image Retrieval (DIR). LLAH is a method for recognizing a point by using a LLAH feature composed of its neighbor points. Since Map Image Retrieval (MIR) is achieved by analyzing distribution...
We propose a system for estimating a user's view direction with its location of a captured image by retrieving its corresponding region from panoramas in a database. In our database, 104 panoramas captured within a local area are stored. For retrieving a user's location, the query image captured by the user's planar-projection camera is compared w...
We propose a method to detect new objects in a scene by comparing an input query image and a movie database captured beforehand. Our method is based on both feature point matching and edge matching. First, we select the most matched movie from the movie database based on the number of matched feature points. In addition, we can get unmatched point...
This paper presents a method for retrieving a corresponding map of a captured map image from a map database. Our method is inspired from LLAH based Document Image Retrieval (DIR). LLAH is a method for recognizing a point by using a LLAH feature composed of its neighbor points. Since Map Image Retrieval (MIR) is achieved by analyzing distribution of...
We propose a photogrammetric system based on visible light communications using a light marker as a reference mark. Our system automatically matches the light markers captured from multiple viewpoints. Each light blinks as a signal based on visible light communications. Using the signal, the automatic matching of the light markers is achieved. In a...
We propose an automated photogrammetric system using visible light communication. Our system can be applied to the measurement of a variety of distances using a light as a reference point. In addition, the matching of the same lights across different viewpoints can be done automatically by using unique blinking patterns. A light area is extracted based on...
This paper presents a support system for pool games using computer vision based augmented reality technology. The main purpose of this system is to present visual aids drawn on a pool table through the LCD display of a camera-mounted handheld device without any artificial marker. Since a pool table is rectangular, a pool ball is spherical, and each has a specifi...
This paper presents a method for estimating positions of solid balls from images which are captured using a handy camera moving around the pool table. Since the camera moves around by hand in this method, the motion of the camera in 3D space should be estimated. For the camera motion estimation, a homography is calculated by extracting the gree...