
Yoshio Iwai
- Ph.D.
- Professor (Full) at Tottori University
About
151 Publications
7,641 Reads
634 Citations
Additional affiliations
April 2011 - present
April 1997 - March 2010
Publications (151)
Micro-expressions are rapid and subtle facial movements that can reflect the most real emotional state hidden in the human heart. Classifying different micro-expressions is still challenging because of their short duration and low intensity. This paper proposes new neural network models, Simplified SE-DenseNet-cc and SE-ResNet-cc, incorporating Eul...
We propose a method for visualizing where an observer’s gaze focuses on a subject in a still image using a neutral human body model. Generally, two-dimensional (2D) heatmaps are superimposed on still images to visualize an observer’s gaze distribution, which indicates where an observer looks when observing a subject. To investigate gaze distributio...
Semantic segmentation plays a crucial role in understanding the surroundings of a vehicle in the context of autonomous driving. Nevertheless, segmentation networks are typically trained on a closed-set of inliers, leading to misclassification of anomalies as in-distribution objects. This is especially dangerous for obstacles on roads, such as stone...
We propose a method for temporally enhancing the high-frequency components of video sequences with no artifacts to observe the micromovement of the human body. The existing methods of video motion magnification cause severe artifacts in the enhanced video sequence because the temporal micromovement and the spatial appearance of a subject are not st...
We investigated how the observer’s gaze locations temporally shift over the body parts of a subject in an image when the observer is tasked to evaluate the subject’s characteristics. We also investigated how the temporal changes of the gaze locations vary when different characteristic words are contained in the tasks. Previous analytical studies di...
The clothing that coworkers and others wear in the workplace affects our evaluations of them, including what personality traits we ascribe to them. In this study, we sought to determine whether observers’ subjective impressions of a subject’s sincerity and nervousness vary depending on the subject’s clothing, and the extent to which observers look...
We propose an attention mechanism in deep learning networks for gender recognition using the gaze distribution of human observers when they judge the gender of people in pedestrian images. Prevalent attention mechanisms spatially compute the correlation among values of all cells in an input feature map to calculate attention weights. If a large bia...
We propose a method for classifying the weight of baggage carried by a person in an upright posture by finding temporal cues of body sway from depth image sequences. When a standing person is viewed from an overhead depth camera, body sway, which is a slight movement that naturally occurs in the human body, is observed. We consider body sway as d...
Two-dimensional (2D) image registration is a conventional technique for simultaneously performing object recognition and pose estimation tasks. Deep neural-based 2D image registration techniques recently emerged and achieved high performance in both tasks. However, these 2D image registration techniques are not designed to perform the segmentation...
We investigated how the observer's gaze locations temporally move over the subject's body parts in images while tackling the tasks of determining the subject's impression. We also investigated how the temporal changes of the gaze locations vary when different impression words are contained in the tasks. Existing analytical studies have not consid...
Accurate object recognition and pose estimation models are essential for practical applications of robot arms, such as picking products on a shelf. Training such a model often requires a large-scale dataset with qualified labels for both object classes and pose parameters, and collecting accurate pose labels is particularly costly. A recent paper [...
We investigate how to probabilistically describe the distribution of gaze with respect to body parts when an observer evaluates impression words for an individual in an image. In the field of cognitive science, analytical studies have reported how observers view a person in an image and form impressions about him or her. However, a probabilistic re...
We investigate a method for enhancing the motion video sequences of an image-based avatar so that the body motion can be perceived as natural on a small display. If some avatar motions are too small, then the users cannot perceive those motions when the avatar is viewed on a small display. In particular, the motion of an upright posture that the av...
In this paper, simple tasks mimicking visual label inspection are described to compare the accuracy of humans with that of deep learning techniques. The number of training samples required to obtain accuracy equal to or higher than that of human inspection is investigated using the simple task. In our method, letters printed on test labels are...
We investigate how to represent gaze distributions probabilistically to indicate which body parts observers frequently view when they judge impression words for the subjects in person images. In the field of cognitive science, analytical studies have reported how observers view person images and judge the impres...
We propose a method for identifying people using body sway measured from head regions under a condition that a top-view camera observes bodies disturbed by self-occlusion. In order to correctly represent identities of body sway, it is necessary that the appearances of people are accurately acquired from the camera. However, the deficits of the appe...
We propose a method of identifying people in case of self-occlusion by using body sway measured at the head using a top-view camera. To accurately represent the identities of people as reflected in body sway, it is important to acquire accurate appearances in images. However, such images frequently contain defects, especially self-occlusion, that d...
We propose a novel method for inferring state transitions from non-participants to participants in free-style conversational interaction using physical behaviors acquired from cameras and a microphone. The existing methods do not consider non-participants and bystanders who play important roles in the interaction. In the research field of cognitive...
This paper proposes a method for pixel-based background subtraction with improved gamma correction and a layered adaptive background model (IGLABM). The main problems of background subtraction are background oscillation and shadow. To solve these problems, we previously proposed the gamma-corrected layered adaptive background model (GLABM); however, the p...
In this paper, we investigate the number of training samples required for deep learning techniques to achieve better accuracy of inspection than a human on a simple visual inspection task. We also examine whether there are differences in terms of finding anomalies when deep learning techniques outperform human subjects. To this end, we design a sim...
We propose a method for classifying gender using training samples to which privacy protection has been applied. Recently, it has become necessary to protect the privacy of individuals who appear in training samples. The head regions of training samples are usually manipulated for privacy protection. However, the accuracy of gender classification is degraded when directly using the pr...
We investigate the visual effects of superimposing turning points and travel directions within the user’s field of view in a navigation system using a subjective assessment procedure. Existing methods were developed without conducting subjective assessments of the effects of superimposing the turning points and travel directions on the user’s displ...
We investigate whether an image-based avatar with motion can smoothly direct a target person who has requested guidance. Existing methods that use an avatar generally assume a one-to-one interaction with the user and do not fully consider how the avatar should direct a target person, which is important in interactions with multiple users. When the...
We propose a method for accurately identifying people using temporal and spatial changes in local movements measured from video sequences of body sway. Existing methods identify people using gait features that mainly represent the large swinging of the limbs. The use of gait features introduces a problem in that the identification performance decre...
We propose a novel method for synthesizing training samples to obtain high object detection accuracy when only a small number of images can be acquired. Convolutional neural networks for object detection require a large number of acquired images in which the pose angles of each object vary. Thus, it is very ti...
We discuss how to reveal and use the gaze locations of observers who view pedestrian images for personal attribute classification. Observers look at informative regions when attempting to classify the attributes of pedestrians in images. Thus, we hypothesize that the regions in which observers’ gaze locations are clustered will contain discriminati...
We propose a method for embedding the awareness state and response state in an image-based avatar to smoothly and automatically start an interaction with a user. When both states are not embedded, the image-based avatar can become non-responsive or slow to respond. To consider the beginning of an interaction, we observed the behaviors between a use...
We propose a novel navigation system using a virtual guide who walks in front of the user and induces the following effect. When using existing navigation systems, the user has a high physical burden because the user continuously moves their gaze and head to look at the map and landmarks on their journey. Employing our system, the user simply follo...
We propose a method for synthesizing body sway to give human-like movement to image-based avatars. This method is based on an analysis of body sway in real people. Existing methods mainly handle the action states of avatars without sufficiently considering the wait states that exist between them. The wait state is essential for filling the periods...
Digital signage often uses a large display with the camera placed on the outer frame and set with a wide angle to capture the face of the viewer. In some cases the viewer's iris is obstructed due to the gaze position of the viewer. In this study, we present an appearance-based gaze estimation method that can be used for digital signage even when th...
We propose an expression transmission system using a cellular-phone-type teleoperated robot called Elfoid. Elfoid has a soft exterior that provides the look and feel of human skin, and is designed to transmit the speaker's presence to their communication partner using a camera and microphone. To transmit the speaker's presence, Elfoid sends not onl...
Evaluation and comparison of methods, repeatability of experiments, and availability of data are the dynamics driving science forward. In computer vision, a database with ground-truth information enables fair comparison and facilitates rapid improvement of methods in a particular topic. Being a high-level discipline, Human-Computer Interaction (HCI...
We present local feature evaluation for a constrained local model (CLM) framework. We target facial images captured by a mobile camera such as a smartphone. When recognizing facial images captured by a mobile camera, changes in lighting conditions and image degradation from motion blur are considerable problems. CLM is effective for recognizing a f...
A multi-sensor-based ambient sensing system is proposed for estimating the user's comfort/discomfort in response to the lighting condition during desk work. The user's comfort/discomfort is estimated according to facial expression, body sway, writing motion and frequency of drinking measured by sensors embedded in the environment. The purpose of th...
In this study, we introduce a system for tracking multiple people using multiple active cameras. Our main objective is to surveil as many targets as possible, at any time, using a limited number of active cameras. In our context, an active camera is a statically located pan-tilt-zoom camera. In this research, we aim to optimize the camera configu...
A system for providing music employing electroencephalography for music therapy is described. Music therapy for the treatment of patients suffering from mental illness has been attempted over a period of 20 years. To reduce stress, it is preferable to listen to music that matches a person's emotions. However, it is difficult to know exactly the person's...
We propose an emotion transmission system using a cellular phone-type teleoperated robot with a mobile projector. Elfoid has a soft exterior that provides the look and feel of human skin and is designed to transmit a speaker’s presence to their communication partner using a camera and microphone. To transmit the speaker’s presence, Elfoid transmits...
This paper describes a navigation system that is guided by a CG avatar using augmented reality (AR) technology. Some conventional AR navigation systems use arrows for route guidance. However, the positions to which the arrows point can be unclear because the actual scale of the arrow is unknown. In contrast, a navigation process conducted...
In this study, we introduce a system for tracking multiple people using multiple active cameras. Our main objective is to capture as many targets as possible at any time, using a limited number of active cameras. In our context, an active camera is a statically located pan-tilt-zoom camera. The use of active cameras for tracking has not been thorou...
In this study, we investigated expressive facial reactions in response to changes in the visual environment and their automatic extraction from sensors, in order to construct a comfortable level of illumination in personal living spaces. We conducted an experiment that showed that expressive facial reactions occur when illumination in the visual en...
We propose a method for constructing an interpersonal interaction system using a real image-based avatar. Human-computer interaction is important when we communicate with computers. As media for interpersonal interaction, communication robots are commonly used in the real world and CG avatars are used in the virtual world. On behalf of the commun...
This paper presents a method for learning and predicting human motion in closed environments. Many surveillance, security, entertainment and smart-home systems require the localization of human subjects and the prediction of their future locations in the environment. Traditional tracking methods employ a linear motion model for human motion. Howeve...
In this paper, we introduce a novel method for tracking multiple people using multiple active cameras. The aim is to capture as many targets as possible at any time using a limited number of active cameras. In our context, an active camera is a statically located PTZ (pan-tilt-zoom) camera. Using active cameras for tracking is not researched thoroug...
We propose a method for generating facial expressions emphasized with cartoon techniques using a cellular-phone-type teleoperated android with a mobile projector. Elfoid is designed to transmit the speaker’s presence to their communication partner using a camera and microphone, and has a soft exterior that provides the look and feel of human skin....
We propose a method for generating facial expressions with a mobile projector built into a cellphone-type tele-operated android, called Elfoid. Elfoid is designed to transmit the presence of a speaker to a communication partner in a remote place using a camera and microphone and a soft exterior that provides the look and feel of human skin. To tran...
We have constructed systems that detect abnormal areas of lung X-ray images from one-dimensional numeric sequences using neural networks. In these systems, the neural network consists of neurons that use trigonometric polynomials as activation functions, or TPUnit neural networks. The TPUnit neural network has a high generalization ability in a sma...
In this study, we investigated expressive facial reactions in response to changes in the visual environment and their automatic extraction from sensors, in order to construct a comfortable level of illumination in personal living spaces. We conducted an experiment that showed that expressive facial reactions occur when illumination in the visual en...
The electroencephalogram (EEG) is necessary for the diagnosis of epilepsy. To diagnose epilepsy accurately, a full EEG recording over a long period is needed. Reviewing such a long record is a heavy burden for a doctor. To reduce this burden, computer assistance is important. This paper presents classifications of EEG patterns using...
Recently, activity support systems that enable dialogue with humans have been intensively studied owing to the development of various sensors and recognition technologies. In order to enable a smooth dialogue between a system and a human user, we need to clarify the rules of dialogue, including how utterances and motions are interpreted among human...
Much research and development on table-top interfaces has been reported in the last decade. In a table-top system, the difference in operation between digital media and physical media such as paper makes for an unsatisfactory experience and hinders comfortable simultaneous use of both. It is required for user satisfaction that the user...
A living environment should be comfortable for all residents. The thermal environment is one of the indices of comfort, but it is difficult to adopt a specific thermal environment suitable for all residents who share a thermal space but have different personal needs. In this research, we propose an air conditioning control method that satisfies the...
Recently, the research fields of augmented reality and robot navigation have been actively investigated. Estimating the relative posture between an object and a camera is an important task in these fields. Visual markers are frequently used to estimate the relative posture between an object and a camera, but the usage of visual markers spoils a scene. In this pap...
In recent years, the detection accuracy has significantly improved under various conditions using sophisticated methods. However, these methods require a great deal of computational cost, and have difficulty in real-time applications. In this paper, we propose a real-time system for object detection in outdoor environments using a graphics processi...
Recently, the research fields of augmented reality and robot navigation have been actively investigated. Estimating the relative posture between an object and a camera is an important task in these fields. In this paper, we propose a novel method for posture estimation by using high frequency markers and kernel regressions. The markers are embedded in an objec...
Camera resolution has improved drastically in response to the demand for high-quality digital images. For example, digital still cameras have several megapixels. Although a video camera has a higher frame rate, its resolution is lower than that of a still camera. Thus, high resolution is incompatible with a high frame...
We propose a method that tracks and recognizes faces simultaneously. In previous methods, features needed to be extracted twice for tracking and recognizing faces in image sequences because the features used for face recognition are different from those used for face tracking. To reduce the computational cost, we propose a probabilistic model for f...
In this paper, we propose a generic framework for detecting suspicious actions with mixture distributions of action primitives, a collection of which represents human actions. The framework is based on a Bayesian approach, and the calculation is performed by the sequential Monte Carlo method, also known as the particle filter. Sequential Monte Carlo is used to...
An omnidirectional vision system is an imaging system that captures the surrounding scene in all directions by using a hyperbolic mirror and a conventional CCD camera. This paper proposes a streaming server that can efficiently transfer movies captured by an omnidirectional vision system through the Internet. The proposed system uses multiple channels t...
There are two major problems with learning-based super-resolution algorithms. One is that they require a large amount of memory to store examples; the other is the high computational cost of finding the nearest neighbors in the database. In order to alleviate these problems, it is helpful to reduce the dimensionality of examples and to store...
Self-location capability is a very useful and informative attribute for wearable systems. This paper proposes a method for identifying a user's location from an omnidirectional image sensor, a GPS data source and wireless LAN data. Azimuth-invariant features are extracted from an omnidirectional image by integrating pixel information circumferentia...
In recent years, security camera systems have been installed in various public facilities. More intelligent processes are needed to track people in image sequences for security camera systems. In this paper, we propose a face tracking and recognition method based on a Bayesian framework. We assume that an observed space is three-dimensional, and we estima...
Probabilistic and statistical model analysis methods based on the Bayesian approach have recently been applied to face tracking. Here, we propose a face tracking method based on a Bayesian framework of image sequences. We assume that an observed space is three-dimensional (3D) and model facial shape, rotation and translation in 3D. A 3D positional...
In this paper, we propose a generic framework for detecting suspicious actions with mixture distributions of action primitives, a collection of which represents human actions. The framework is based on a Bayesian approach, and the calculation is performed by the sequential Monte Carlo method, also known as the particle filter. Sequential Monte Carlo is used to...
In this paper, we propose a novel learning-based video super-resolution algorithm with lower memory requirements and computational cost. To this end, we adopt discrete cosine transform (DCT) coefficients for feature vector components. Moreover, we design an example selection procedure to construct a compact database. We conducted evaluative experime...
Background subtraction is widely used in detecting moving objects; however, changing illumination conditions, color similarity, and real-time performance remain important problems. In this paper, we introduce a sequential method for adaptively estimating background components using Kalman filters, and a novel method for detecting objects using marg...
Recently, researchers have proposed many face recognition methods with the aim of improving the accuracy rate of face recognition. However, few face recognition methods focus on computational cost. To reduce the computational cost of face recognition, we propose an effective face recognition method using Haar wavelet features and a branch and bound...
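Various features are used for face recognition, and a representative one is the Gabor wavelet feature. The output characteristics of Gabor wavelet features resemble those of biological vision, and they perform well compared with other face recognition methods such as Eigenface. However, it is not clear whether Gabor wavelet features are optimal as features for face recognition. In this study, we therefore extracted features using various wavelets other than the Gabor wavelet (Haar, French hat, Mexican hat, Daubechies, Coiflet, Symlet, O-spline) and investigated which wavelet features are best suited to face recognition. Wavelets with a fixed scale and with a variable scale...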
In the automatic authentication of face images, the estimation of the facial position is the most important process. When the facial position is estimated on the basis of the facial shape model, the accuracy is not guaranteed unless the optimal model is used. In solving this problem, there is a trade-off between the estimation accu...