Akira Utsumi

Advanced Telecommunications Research Institute | ATR

About

89
Publications
63,132
Reads
2,948
Citations

Publications

Article
Full-text available
In this paper, a deep learning method is proposed for human image processing that incorporates a mechanism to update target-specific parameters. The aim is to improve system performance in situations where the target can be continuously observed. Network-based algorithms typically rely on offline training processes that use large datasets, while tr...
Chapter
This paper proposes a deep learning method for face-pose estimation with an incremental personalization mechanism to update the face-shape parameters. Recent advances in machine learning technology have also led to outstanding performance in applications of computer vision. However, network-based algorithms generally rely on an off-line training pr...
Article
Automated driving reduces the burden on the driver; however, it also makes it difficult for the driver to understand the current situation and predict the future movement of the vehicle. To facilitate the driver's prediction of the vehicle's future behavior, this paper aims to design and evaluate a haptic interface that actuates the vehicle s...
Conference Paper
The performance of vehicle control tasks greatly depends on the driver's ability to acquire visual information about the vehicle's location on the road. Advances in sensory and display technologies are making it possible to effectively assist the driver in information acquisition by accurately detecting the vehicle's current location and dynamicall...
Conference Paper
Automated driving technology is being developed to reduce driver workloads and improve driving safety. However, how connected drivers feel to the control of their vehicles can be a critical factor in driving comfort. In this paper, we discuss a highly automated and human-machine cooperative steering system. Our prototype system regulates driver's s...
Article
In this research, we examine a gaze detection method based on eye-center and iris-center estimation using a CNN. In the proposed method, the eyeball center positions and the iris center positions are estimated from the head image captured by the camera, and estimated values of the gaze directions are obtained based on those coordinates. As a resul...
Article
Automatic driving and auxiliary driving functions are expected to improve safety and efficiency and to reduce the burden on drivers. These driving-assist functions change the driver's task from controlling the vehicle to monitoring the driving operation, which could lead to a new kind of stress. In this research, the driver stress during automated...
Article
In Japan, although the rapid aging of the population has caused serious traffic problems, only a few studies have investigated the behaviour of elderly drivers in real traffic conditions. The authors have been developing a system to automatically evaluate safe-driving skill through small wireless wearable sensors that directly measure the driver's...
Article
Full-text available
In this paper, we introduce an interactive guide plate system by adopting a gaze-communicative stuffed-toy robot and a gaze-interactive display board. A stuffed-toy robot attached to the system naturally shows anthropomorphic guidance corresponding to the user’s gaze orientation. The guidance is presented through gaze-communicative behaviors of the...
Conference Paper
We propose a scheme to evaluate driver behaviors by a vision-based head pose estimation method. In our project, we are investigating a collaborative safety mechanism based on the mutual sharing of driver information (attention and performance) in daily situations. The analysis of driver behaviors based on a non-contact type of measurement is one ke...
Conference Paper
In this paper, we describe an ongoing project to develop a collaborative safety mechanism based on mutual sharing of a driver's information (attention and performance). Most conventional driving safety approaches aim at assisting individual drivers by providing various types of safety information to the driver in the vehicle. However, the effect of...
Article
Applying the technologies of a network robot system, recommendation methods used in e-commerce are incorporated in a retail shop in the real world. We constructed a platform for ubiquitous networked robots that focuses on a shop environment where communication robots perform customer navigation. The platform observes customers’ purchasing behavior...
Article
We propose a multi-camera-based gaze tracking system that provides a wide observation area. In our system, multiple camera observations are used to expand the detection area by employing mosaic observations. Each facial feature and eye region image can be observed by different cameras, and in contrast to stereo-based systems, no shared observations...
Article
This paper introduces a daily-partner robot that is aware of the user's situation by using gaze and utterance detection. For appropriate anthropomorphic interaction, the robot should talk to the user at the proper timing without interrupting her/his task. Our proposed robot 1) estimates the user's context (the target of her/his speech) by detecting hi...
Article
In this paper, we propose an automatic evaluation system of safe-driving skill for personalized driving lectures. Our system uses small wireless wearable sensors to directly measure drivers' behavior without causing much stress. By using the sensors together with GPS and driving instructors' knowledge, our system automatically evaluates drivers' safe d...
Conference Paper
Full-text available
Applying the technologies of a network robot system, we incorporate the recommendation methods used in E-commerce in a retail shop in the real world. We constructed a platform for ubiquitous-networked robots that focuses on a shop environment where communication robots perform customer navigation. The platform estimates customer interests from thei...
Conference Paper
In this study, we proposed gaze-reactive brightness control of car onboard displays to ensure driver visibility and explored the method's effectiveness. In Experiment 1, we investigated the effect of an onboard display light for visibility and confirmed our proposed method's effectiveness. In Experiment 2, we developed the method presented in Exper...
Conference Paper
We analyze pedestrian behavior in a large shopping mall through observations using a laser range finder (LRF) and video cameras. The observed movements are classified into three categories, 'going straight,' 'finding the way,' and 'walking around,' based on each person's walking speed, variability of trajectory, stopping ratio, and head motions. P...
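The three-way classification described above lends itself to a simple rule-based sketch. The features mirror the abstract, but the function name and all thresholds below are illustrative assumptions, not values from the paper:

```python
# Hypothetical rule-based sketch of the three-way classification above
# ('going straight', 'finding the way', 'walking around'). All thresholds
# are illustrative assumptions, not values from the paper.

def classify_pedestrian(speed, heading_variability, stopping_ratio):
    """Classify a trajectory from simple features.

    speed               -- mean walking speed (m/s)
    heading_variability -- std. dev. of heading change (radians)
    stopping_ratio      -- fraction of time spent stationary (0..1)
    """
    if speed > 1.0 and heading_variability < 0.2 and stopping_ratio < 0.1:
        return "going straight"
    if stopping_ratio > 0.3 or heading_variability > 0.6:
        return "walking around"
    return "finding the way"

label = classify_pedestrian(1.3, 0.1, 0.05)   # a fast, steady walker
```

A real system would estimate these features from LRF trajectories and add the head-motion cue the abstract mentions.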
Conference Paper
By applying network robot technologies, recommendation methods from E-Commerce are incorporated in a retail shop in the real world. We constructed an experimental shop environment where communication robots recommend specific items to the customers according to their purchasing behavior as observed by networked sensors. A recommendation scenario is...
Conference Paper
In this paper, we evaluate a video communication system with coordination of the robot's behaviors and the video's control that compensates for user's uncongenial attitudes. The system enables comfortable video communications between elderly or disabled people by an assistant robot for each user that expresses a) active listening behaviors to compe...
Conference Paper
We are developing a system that remotely supports the daily living of people with dementia at home, using multimedia content to bring them peace of mind, prevent behavioural disturbances, and guide their daily activities. A major problem in providing good care at home to people with dementia is that it mu...
Conference Paper
This paper proposes a daily-partner robot that is aware of the user's situation or behavior through gaze and utterance detection. For appropriate and familiar anthropomorphic interaction, the robot should wait for the right moment to talk to the user, corresponding to her/his situation while she/he is doing a task or thinking. According to t...
Conference Paper
In this paper, we propose a guide system for daily life in semipublic spaces by adopting a gaze-communicative stuffed-toy robot and a gaze-interactive display board. The system provides naturally anthropomorphic guidance through a) gaze-communicative behaviors of the stuffed-toy robot (“joint attention” and “eye-contact reactions”)...
Conference Paper
We propose a gaze estimation method that substantially relaxes the practical constraints possessed by most conventional methods. Gaze estimation research has a long history, and many systems, including some commercial schemes, have been proposed. However, the application domain of gaze estimation is still limited (e.g., measurement devices for HCI i...
Conference Paper
We propose a real-time gaze estimation method based on facial-feature tracking using a single video camera that does not require any special user action for calibration. Many gaze estimation methods have already been proposed; however, most conventional gaze tracking algorithms can only be applied to experimental environments due to their complex c...
Conference Paper
Full-text available
In this paper, we introduce and evaluate an interactive guideboard with a communicative stuffed-toy guide-robot that behaves in correspondence to the user’s gazing direction. The proposed system adopts our remote gaze-tracking method, which estimates the user’s gaze angles based on image processing. The main purpose of this research is to provide i...
Conference Paper
We propose a method for modeling nonuniform illumination conditions using multiple-camera-based marker observations. In computer graphics applications, multiple-camera-based object reconstruction is becoming popular for modeling 3D objects. However, geometrical and photometrical calibrations among multiple cameras still require high computational c...
Article
In this paper, we propose a body-mounted system to capture user experience as audio/visual information. The proposed system consists of two cameras (head-detection and wide angle) and a microphone. The head-detection camera captures user head motions, while the wide angle color camera captures user frontal view images. An image region approximately...
Article
We propose a real-time gaze estimation method based on facial-feature tracking using a single video camera. In our method, gaze directions are determined as 3D vectors connecting both the eyeball and iris centers. Since the eyeball center cannot be directly observed from images, the geometrical relationship between the eyeball centers and the f...
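The core geometric step, taking the gaze direction as the unit 3D vector from the estimated eyeball center to the iris center, can be sketched in a few lines (a minimal illustration, not the authors' code):

```python
import numpy as np

# Minimal illustration (not the authors' code): the gaze direction as the
# unit 3D vector from the estimated eyeball center to the iris center.

def gaze_direction(eyeball_center, iris_center):
    v = np.asarray(iris_center, dtype=float) - np.asarray(eyeball_center, dtype=float)
    return v / np.linalg.norm(v)

# Example: iris directly in front of the eyeball center along +z
d = gaze_direction([0.0, 0.0, 0.0], [0.0, 0.0, 1.2])
```

The hard part the abstract points at is estimating the unobservable eyeball center from tracked facial features; this sketch assumes both 3D points are already available.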
Conference Paper
In this paper, we propose a method to estimate user attention to displayed content signals with temporal analysis of their exhibited behavior. Detecting user attention and controlling contents are key issues in our “networked interaction therapy system” that effectively attracts the attention of memory-impaired people. In our proposed method, user...
Article
In this paper, we propose a method to track human motions by using multiple-viewpoint images taken by both mobile cameras and fixed-viewpoint cameras. In a vision-based system, the variety of viewpoints is important for effective detection/tracking of target motions. As image sequences observed from various viewpoints contain rich information regar...
Conference Paper
This paper proposes a gaze-communicative stuffed-toy robot system with joint attention and eye-contact reactions based on ambient gaze-tracking. For free and natural interaction, we adopted our remote gaze-tracking method. Corresponding to the user's gaze, the gaze-reactive stuffed-toy robot is designed to gradually establish 1) joint attention usi...
Conference Paper
This research aims to naturally evoke human-robot communications in ambient space based on a hierarchical model of gazecommunication. The interactive ambient space is created with our remote gaze-tracking technology based on image analyses and our gaze-reactive robot system. A single remote camera detects the user's gaze in unrestricted situations...
Conference Paper
We propose a method to estimate gaze direction in real time with a single camera based on four reference points and three calibration images. First, the position at which the eyeball center is projected is calculated as a linear combination of those of the reference points. Then, the gaze direction is estimated as a vector connecting the calculated...
Conference Paper
Full-text available
This paper describes a method for estimating human distributions (quantities and locations) based on multiple-viewpoint image sequences. In the field of human image analysis, inter-human occlusion is a significant problem: when a scene includes a large number of occlusions, tracking of individual persons becomes difficult. Therefore, updating a tra...
Conference Paper
We propose a method of calibrating multiple camera systems that operates by adjusting the camera parameters and the 3D shapes of objects to silhouette observations. Our method employs frontier points, which are geometrically meaningful points on object surfaces, to determine the geometrical relations among multiple cameras. In contrast to conv...
Conference Paper
In this paper, we describe our methods of detecting human behavior in order to monitor and assist daily human tasks. To assist people in performing activities in daily life, we must be able to understand the system user's situation/state in the current task: what information is useful for the user now? Our posture-detection system using IR cameras...
Conference Paper
In this paper, we propose a body-mounted system to capture the user's experience as multiple-sensor information. The proposed system consists of three cameras (stereo cameras and one wide-angle camera) and other sensors (a microphone, a GPS (global positioning system), a digital compass, and an acceleration sensor). The stereo cameras are used for...
Conference Paper
In this paper, we discuss our system that estimates user attention to displayed content signals with temporal analysis of their exhibited behavior. Detecting user attention and controlling contents are key issues in our "networked interaction therapy system", which effectively attracts the attention of memory-impaired people. In our proposed sys...
Article
We propose an automatic camera calibration method to determine the position and orientation parameters of a newly installed camera in our hand gesture tracking system. In a multiple camera-based system, automatic camera calibration algorithms are required due to the increasing cost of camera calibration. In our method, both 3D position tracking res...
Article
Full-text available
An abstract is not available.
Article
This paper proposes a method of tracking a human object by using nonsynchronous multiple-viewpoint images. The proposed method tracks human forms efficiently by using a Kalman filter to integrate observed information which is obtained nonsynchronously from multiple viewpoints. The experimental system is composed of multiple observation nodes, which...
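As a rough illustration of fusing nonsynchronously arriving observations with a Kalman filter, here is a minimal 1-D constant-velocity tracker that accepts position measurements at arbitrary timestamps. The class name and all noise settings are assumptions for the sketch, not details from the paper:

```python
import numpy as np

# Illustrative 1-D constant-velocity Kalman filter fusing position
# measurements that arrive at arbitrary (nonsynchronous) timestamps.
# Noise values are assumed, not taken from the paper.

class KalmanTracker1D:
    def __init__(self, q=0.1, r=0.5):
        self.x = np.zeros(2)          # state: [position, velocity]
        self.P = np.eye(2) * 10.0     # state covariance (large = uncertain)
        self.q, self.r = q, r         # process / measurement noise
        self.t = 0.0

    def update(self, t, z):
        dt = t - self.t               # predict forward to the measurement time
        self.t = t
        F = np.array([[1.0, dt], [0.0, 1.0]])
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + self.q * np.eye(2)
        H = np.array([[1.0, 0.0]])    # we observe position only
        S = H @ self.P @ H.T + self.r
        K = self.P @ H.T / S          # Kalman gain
        self.x = self.x + (K * (z - H @ self.x)).ravel()
        self.P = (np.eye(2) - K @ H) @ self.P
        return self.x[0]              # filtered position estimate

kf = KalmanTracker1D()
# position measurements from different "viewpoints" at irregular times
for t, z in [(0.1, 0.1), (0.25, 0.24), (0.3, 0.31), (0.5, 0.49)]:
    est = kf.update(t, z)
```

Because each update first predicts the state to the measurement's own timestamp, observations from unsynchronized cameras can be integrated in arrival order.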
Article
We propose a distributed automatic method of calibrating cameras for multiple-camera-based vision systems, because manual calibration is a difficult and time-consuming task. However, the data size and computational costs of automatic calibration increase when the number of cameras is increased. We solved these problems by employing a distributed al...
Conference Paper
We propose a vision-based method to detect interactions between human hand(s) and real objects. Since humans perform various kinds of tasks with their hands, detection of hand-object interactions is useful for building intelligent systems that understand and support human activities. We use a statistical color model to detect hand regions in input...
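The statistical-color-model idea can be sketched as a per-pixel test against an assumed skin-color distribution. The mean and spread below are made-up placeholders, not the paper's trained model:

```python
import numpy as np

# Toy sketch of a statistical color model: score each pixel by its
# normalized distance to an assumed skin-color mean and threshold.
# SKIN_MEAN / SKIN_STD are made-up placeholders, not a trained model.

SKIN_MEAN = np.array([150.0, 110.0, 95.0])   # assumed mean skin RGB
SKIN_STD = np.array([30.0, 25.0, 25.0])      # assumed per-channel spread

def hand_mask(image, thresh=2.0):
    """Boolean mask of pixels within `thresh` normalized units of the mean."""
    d = (image.astype(float) - SKIN_MEAN) / SKIN_STD
    return np.sqrt((d ** 2).sum(axis=-1)) < thresh

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (150, 110, 95)                   # one skin-like pixel
mask = hand_mask(img)
```

A practical detector would fit the distribution from labeled skin samples and follow the mask with morphological cleanup before reasoning about hand-object contact.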
Conference Paper
We propose an appearance-based method for tracking motions of multiple persons using an asynchronous multiple-camera system. In the proposed method, the head appearance of each target person is dynamically modeled using multiple-camera-based observations. Observed color (texture) information and the related reliability values are stored in the head...
Conference Paper
In this paper, we propose a body attached system to capture the experience of a person in sequence as audio/visual information. The proposed system consists of two cameras (one IR (infra-red) camera and one wide-angle color camera) and a microphone. The IR camera image is used for capturing the user’s head motions. The wide-angle color camera is us...
Article
To relieve the stress in the lives of memory-impaired people and their family members, we propose the concept of Networked Interaction Therapy that connects them with community support group members via Internet by utilizing Internet communication, image understanding and sensory-interaction media technologies. To study their need and acceptance fo...
Conference Paper
In this paper, we propose a method to obtain the positions and poses of multiple cameras, and temporal synchronization among them, by using LED markers. In the proposed method, each IR marker transmits its own ID as a signal pattern. We can estimate camera positions and poses by using the 3D positions of multiple markers. In addition, these markers also t...
Conference Paper
We propose a statistical method to detect human(s) in images by using geometrical structures that are common to the appearances of the target objects (human figures). Most appearance-based methods focus on pixel values directly, because the same classes of objects usually have similar pixel value distributions. However, this is not true for some pa...
Book
Demand for capturing human motion has been increasing in many areas. For example, in entertainment industries that produce digital cinema and video games, human motions are measured and used for creating animations of the human body. In these applications, multiple sensing devices are attached to a human body so as to measure the movements in real-...
Conference Paper
We propose an automatic camera calibration method to determine the position and orientation of a newly installed camera in our human tracking system. Due to the increasing cost of camera calibration, automatic algorithms are required. In our method, both 3D position tracking results and 2D positions and sizes on camera image planes are used. Beca...
Conference Paper
We present a vision-based hand tracking system for gesture-based man-machine interactions and a statistical hand detection method. Our hand tracking system employs multiple cameras to reduce occlusion problems. Non-synchronous multiple observations enhance system scalability. In the system, users can manipulate a virtual scene by using predefined g...
Conference Paper
We propose a texture model construction method for human tracking based on a statistical texture-free shape model (distance map). In practical vision systems, a priori knowledge of the target object is often limited. In human tracking, since human figures involve a variety of images due to different clothes, the detected pixel values (color brightn...
Chapter
This chapter discusses a human tracking method using multiple non-synchronous camera observations. In vision-based human tracking, self-occlusions and human-human occlusions are significant problems. Employing multiple viewpoints reduces these problems. Furthermore, the use of the non-synchronous observation approach eliminates the scalability prob...
Conference Paper
We propose an adaptive human tracking system with non-synchronous multiple observations. Our system consists of three types of processes: discovering node for detecting newly appeared person; tracking node for tracking each target person; and observation node for processing one viewpoint (camera) images. We have multiple observation nodes and each...
Conference Paper
We propose a method of tracking the 3D position, posture, and shapes of human hands from multiple-viewpoint images. Self-occlusion and hand-hand occlusion are serious problems in vision-based hand tracking. Our system employs multiple viewpoints and a viewpoint selection mechanism to reduce these problems. Each hand position is tracked with a Kalman f...
Conference Paper
We propose a multiple-view-based tracking algorithm for multiple-human motions. In vision-based human tracking, self-occlusions and human-human occlusions are among the more significant problems. Employing multiple viewpoints and a viewpoint selection mechanism, however, can reduce these problems. In our system, human positions are tracked with...
Conference Paper
We propose a novel method of extracting a moving object region from each frame in a series of images, regardless of a complex, changing background, using statistical knowledge about the target. In vision systems for 'real worlds' like a human motion tracker, a priori knowledge about the target and environment is often limited (e.g., only the approximate...
Conference Paper
We propose a human motion detection method using multiple-viewpoint images. In vision-based human tracking, self-occlusions and human-human occlusions are among the more significant problems. Employing multiple viewpoints and a viewpoint selection mechanism, however, can reduce these problems. The vision system in this case should select the be...
Conference Paper
We propose a human motion detection method using multiple-viewpoint images. We employ a simple elliptic model and a small number of reliable image features detected in multiple-viewpoint images to estimate the pose (position and normal axis) of a human body, where feature extraction is employed based on distance transformation. The COG (center of g...
Conference Paper
We propose a method to detect hand position, posture and shapes from multiple viewpoint images. We employ a simple elliptic model and a small number of reliable image features detected in multiple viewpoint images to estimate the pose (position and normal axis) of a human hand, where feature extraction is employed based on distance transformation....
Conference Paper
Two methods to extract a moving target region from a series of images are presented. Pixel value distributions for both the target object and background region are estimated for each pixel with roughly extracted moving regions. Using the distributions, stable target extraction is performed. In the first method, the distributions are approximated wi...
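A simplified, background-only variant of the per-pixel distribution idea above can be sketched as follows: fit a Gaussian (mean and standard deviation) per pixel from background frames, then flag pixels that deviate strongly. Data and threshold are illustrative, not the paper's:

```python
import numpy as np

# Simplified per-pixel Gaussian sketch of the extraction idea above
# (background side only). Data and the k-sigma threshold are illustrative.

def fit_background(frames):
    """Per-pixel mean and std over a stack of background frames."""
    stack = np.stack([f.astype(float) for f in frames])
    return stack.mean(axis=0), stack.std(axis=0) + 1e-6  # avoid zero std

def extract_target(frame, mean, std, k=3.0):
    """Flag pixels more than k standard deviations from the background mean."""
    return np.abs(frame.astype(float) - mean) > k * std

# synthetic background frames with slight per-frame brightness variation
bg_frames = [np.full((2, 2), 100 + i % 3, dtype=np.uint8) for i in range(9)]
mean, std = fit_background(bg_frames)

frame = np.full((2, 2), 101, dtype=np.uint8)
frame[0, 0] = 200                            # a foreground pixel
mask = extract_target(frame, mean, std)
```

The paper's first method additionally models the target's own pixel-value distribution, which this background-only sketch omits.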
Article
We describe a method for detecting hand position, posture, and finger bendings using multiple camera images. Stable detection can be achieved using distance transformed images. We detect the maximum point in each distance transformed image as the center of gravity (COG) point of the hand region and calculate its 3D position by stereo matching. The...
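The distance-transform step, taking the maximum of the distance-transformed hand mask as the region's COG point, can be sketched with a basic two-pass (chamfer-style, 4-neighbour) transform. This illustrates the general technique, not the paper's implementation:

```python
import numpy as np

# Two-pass (chamfer-style, 4-neighbour) distance transform of a binary
# mask; its maximum is taken as the hand-region COG point, as in the
# technique described above. Illustration only, not the paper's code.

def distance_transform(mask):
    h, w = mask.shape
    d = np.where(mask, h + w, 0).astype(int)   # foreground starts "far"
    for y in range(h):                          # forward pass
        for x in range(w):
            if d[y, x]:
                if y: d[y, x] = min(d[y, x], d[y - 1, x] + 1)
                if x: d[y, x] = min(d[y, x], d[y, x - 1] + 1)
    for y in range(h - 1, -1, -1):              # backward pass
        for x in range(w - 1, -1, -1):
            if d[y, x]:
                if y < h - 1: d[y, x] = min(d[y, x], d[y + 1, x] + 1)
                if x < w - 1: d[y, x] = min(d[y, x], d[y, x + 1] + 1)
    return d

mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True                           # 3x3 "hand" blob
dt = distance_transform(mask)
cog = np.unravel_index(dt.argmax(), dt.shape)   # deepest interior pixel
```

Taking the distance-map maximum rather than the pixel centroid keeps the reference point inside the palm even when fingers are extended, which is what makes the detection stable.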
Conference Paper
We describe a method to detect hand position, posture and finger bendings using multiple camera images. Stable detection can be achieved by using skeleton images, and this is confirmed through experiments. This system can be used as a user interface device in a virtual environment, replacing glove-type devices and overcoming most of the disadvantag...
Article
A number of researchers have investigated vision-based navigation of autonomous mobile robots using passive sensing systems such as stereo or motion camera systems. In general, by using a passive sensing system, the robot can obtain three-dimensional data of objects. The 3-D data we have been using comes from an edge-based algorithm that yields a...
Conference Paper
We propose a vision-based hand pose recognition system that expresses a hand pose by a plane model consisting of the hand's center of gravity (COG) and fingertip points. These reference points can be detected more stably and easily than other points (e.g., finger base points). However, since it has been assumed in the...
Article
This paper describes studies on perception of virtual object locations. It explores the behavior of some factors related to depth perception, especially the effect of inter-pupillary distance (IPD) mismatch and the interplay of image blur and binocular disparity. IPD mismatch (which is caused by errors in estimation of the parameter) results in a c...
Article
Full-text available
In this paper we discuss Augmented Reality (AR) displays in a general sense, within the context of a Reality-Virtuality (RV) continuum, encompassing a large class of "Mixed Reality" (MR) displays, which also includes Augmented Virtuality (AV). MR displays are defined by means of seven examples of existing display concepts in which real objects and...
Article
This paper describes our studies on perception of virtual object locations. We explore the behavior of various factors related to depth perception, especially the interplay of fuzziness and binocular disparity. Experiments measure this interplay with the use of a three- dimensional display. Image fuzziness is ordinarily seen as an effect of the aer...