Masaaki Iiyama

Shiga University

PhD

101 Publications
14,985 Reads
425 Citations
Additional affiliations
April 2015 - present
Kyoto University
Position
  • Professor (Associate)

Publications (101)
Article
Full-text available
Gridded bathymetric data are often used to understand seafloor topography; however, high-resolution data are rare. To obtain high-resolution gridded bathymetric data, the observations from which the data are derived must be densely measured. However, this process is time consuming and expensive. In this study, we propose a method to obtain dense ba...
Article
Full-text available
Scene graph generation (SGG) aims to detect the relationships of objects in an image. Recently, it has been extended to open-set SGG, which also considers unknown objects unseen in a training phase and thereby enables various applications in complex real-world scenes. However, previous research on open-set SGG addressed unknown object detection sim...
Article
Full-text available
We propose deep depth from focal stack (DDFS), which takes a focal stack as input of a neural network for estimating scene depth. Defocus blur is a useful cue for depth estimation. However, the size of the blur depends on not only scene depth but also camera settings such as focus distance, focal length, and f-number. Current learning-based methods...
Chapter
In this study, we propose a method for potential risk estimation of road scenes from driving videos and investigate the relationship between the potential risk estimation and the risk perception of humans. We employ a frame prediction method and define scenes where the frame prediction accuracy decreases as risky scenes. We also use the scene depth...
Article
Full-text available
Prediction of sea surface temperature (SST) is a very challenging task, especially for regions of high SST variability. Such predictions are achieved either by physics-based models, which often yield poor predictions and are computationally intensive, or by data-driven methods, which are skillful and computationally less intensive. However, rece...
Preprint
We propose a learning-based depth from focus/defocus (DFF), which takes a focal stack as input for estimating scene depth. Defocus blur is a useful cue for depth estimation. However, the size of the blur depends on not only scene depth but also camera settings such as focus distance, focal length, and f-number. Current learning-based methods withou...
Article
Full-text available
Scene graph generation (SGG) aims to detect objects and their relationships in an image, thereby enabling a detailed understanding of a complex scene for various real-world applications. In SGG applications such as robot vision, it is important to correctly detect all objects without recognizing any object as another kind of object or ignoring it....
Article
Full-text available
Sea-surface temperature (SST) images obtained by satellites contain noise and missing SSTs due to cloud cover. We propose a method for reconstructing denoised cloud-free SST images via deep-learning-based image inpainting. For denoising, we use data-assimilation images to train a reconstruction network by considering the physical correctness of SS...
Article
We propose a learning-based multi-view stereo (MVS) method in scattering media, such as fog or smoke, with a novel cost volume, called the dehazing cost volume. Images captured in scattering media are degraded due to light scattering and attenuation caused by suspended particles. This degradation depends on scene depth; thus, it is difficult for tr...
Article
An encoder-decoder (Enc-Dec) model is one of the fundamental architectures in many computer vision applications. One desired property of a trained Enc-Dec model is to feasibly encode (and decode) diverse input patterns. Aiming to obtain such a model, in this paper, we propose a simple method called curiosity-guided fine-tuning (CurioFT), which puts...
Chapter
We propose a learning-based multi-view stereo (MVS) method in scattering media such as fog or smoke with a novel cost volume, called the dehazing cost volume. An image captured in scattering media degrades due to light scattering and attenuation caused by suspended particles. This degradation depends on scene depth; thus it is difficult for MVS to...
Article
Full-text available
In this paper, a two-stage method is proposed for predicting the catch of skipjack tuna (Katsuwonus pelamis) from a 2D sea environmental pattern. Following the assumption that sea water temperature and sea surface height (SSH), which fishermen often use for finding fishing spots, correlate with the skipjack tuna catch, a new approach of us...
Chapter
Sub-surface ocean temperature (SSOT) profiles are highly significant for tropical cyclone triggers and their progressive behavior; therefore, prediction of these profiles is of utmost importance. Various past studies mention the successful use of soft computing methods in estimating SSOT profiles with good accuracy. However, predictions of SSOT at signif...
Conference Paper
Full-text available
Sub-surface ocean temperature (SSOT) profiles are highly significant for tropical cyclone triggers and their progressive behavior; therefore, prediction of these profiles is of utmost importance. Various past studies mention the successful use of soft computing methods in estimating SSOT profiles with good accuracy. However, predictions of SSOT at sig...
Preprint
We propose a learning-based multi-view stereo (MVS) method in scattering media, such as fog or smoke, with a novel cost volume, called the dehazing cost volume. Images captured in scattering media are degraded due to light scattering and attenuation caused by suspended particles. This degradation depends on scene depth; thus, it is difficult for tr...
Chapter
Target shift, the different label distributions of source and target domains, is an important problem for practical use of unsupervised domain adaptation (UDA); as we do not know labels in target domain datasets, we cannot ensure an identical label distribution between the two domains. Despite this inaccessibility, modern UDA methods commonly try t...
Article
In this paper, a visual analytics system for analyzing the causal relationship between fishing catch data in adjacent sea areas is proposed. By using the proposed system, the distribution of sea surface temperature is visualized so that the sea areas can be divided accordingly. Next, the daily fishing efficiency in each sea area are...
Article
Full-text available
Seafloor mapping to create bathymetric charts of the oceans is important for various applications. However, making high-resolution bathymetric charts requires measuring underwater depths at many points in sea areas, and thus, is time-consuming and costly. In this work, treating gridded bathymetric data as digital images, we employ the image-process...
Article
Three-dimensional (3D) reconstruction and scene depth estimation from two-dimensional (2D) images are major tasks in computer vision. However, using conventional 3D reconstruction techniques becomes challenging in participating media such as murky water, fog, or smoke. We have developed a method that uses a continuous-wave time-of-flight (ToF) camera to...
Chapter
Full-text available
The Indian Ocean Dipole (IOD), one of the important climatic indices, is directly linked with floods and droughts occurring in countries on the Indian Ocean rim. The IOD is supposed to occur frequently because the warming rate of the Indian Ocean is slightly higher than that of the global ocean. It is therefore important to predict...
Chapter
In this paper, we propose a method to recover strokes from offline handwritten Chinese characters. The proposed method employs a fully convolutional network (FCN) to estimate the writing order of connected components in offline Chinese character images and a multi-task FCN to estimate the writing order and directions of strokes in each connected co...
Chapter
This paper proposes a new approach for meteorology: estimating sea surface temperatures (SSTs) by using deep learning. SSTs are essential information for ocean-related industries but are hard to measure. Although multi-spectral imaging sensors on meteorological satellites are used for measuring SSTs over a wide area, they cannot measure sea tempera...
Poster
Full-text available
Sea surface temperature (SST) is of utmost importance for locating fishing zones. SST prediction is largely based on numerical models, but their output often deviates substantially from ground truth due to many implicit assumptions. This study attempts SST forecasts using meteorological parameters as inputs. Further, it also improves the forecasts by residual pr...
Preprint
Full-text available
This paper proposes a novel approach for unsupervised domain adaptation (UDA) with target shift. Target shift is a problem of mismatch in label distribution between source and target domains. Typically, it appears as class imbalance in the target domain. In practice, this is an important problem in UDA; as we do not know labels in target domain datasets...
Chapter
This article discusses observation planning for identifying each person in an indoor living environment, such as an office space, using a mobile camera mounted on a drone. Since many people in the environment often keep their positions and postures during their work, it is difficult to observe the faces of all the people with a fixed camera installed in the e...
Article
Full-text available
In this study, stationary fronts are automatically detected from weather data using a U-Net deep convolutional neural network. The U-Net was trained on a 10-year data set to learn the transformation from single or multiple physical quantities of weather data to stationary fronts. As a result of applying the trained U-Net to a 1-year untrained dat...
Chapter
This paper discusses the possibility of identifying different situations related to the students during a lecture from its video by classifying the situations that happen in the lecture based on the similarity in the posture of each student. The recognized situations can be used as indexes for the instructor to watch the video to further improve th...
Preprint
Three-dimensional (3D) reconstruction and scene depth estimation from two-dimensional (2D) images are major tasks in computer vision. However, using conventional 3D reconstruction techniques becomes challenging in participating media such as murky water, fog, or smoke. We have developed a method that uses a time-of-flight (ToF) camera to estimate an obj...
Article
Images captured in participating media such as murky water, fog, or smoke are degraded by scattered light. Thus, the use of traditional three-dimensional (3D) reconstruction techniques in such environments is difficult. In this paper, we propose a photometric stereo method for participating media. The proposed method differs from previous studies wi...
Article
Full-text available
Vignetting is a common type of image degradation that makes peripheral parts of an image darker than the central part. Single-image devignetting aims to remove undesirable vignetting from an image without resorting to calibration, thereby providing high-quality images required for a wide range of applications. Previous studies into single-image dev...
Article
This article proposes a method to find intersections at which cars tend to deviate from the optimal route based on global positioning system (GPS) tracking data under the assumption that such deviations indicate that car navigation systems (CNSs) and road signage are not readily available. If the intended route is known, deviations can be enumerate...
Conference Paper
Full-text available
In recent years, congestion at famous sightseeing spots and on public transportation has become a problem in tourism areas. One way to solve this is for local governments to forecast congestion using a congestion simulation based on a tourist transition model and to adjust the operation of public transportation accordingly. In this research, we propose a method to constru...
Article
Full-text available
Images captured in participating media such as murky water, fog, or smoke are degraded by scattered light. Thus, the use of traditional three-dimensional (3D) reconstruction techniques in such environments is difficult. In this paper, we propose a photometric stereo method for participating media. The proposed method differs from previous studies w...
Article
Full-text available
Blind deconvolution (BD) is the problem of restoring sharp images from blurry images when convolution kernels are unknown. While it has a wide range of applications and has been extensively studied, traditional shift-invariant (SI) BD focuses on uniform blur caused by kernels that do not spatially vary. However, real blur caused by factors such as...
Conference Paper
We attempt to train a classifier to identify food items in images captured during food preparation processes. Food changes appearance significantly during the cooking process, and different foods can be mixed together. Thus, manually annotating individual food items during the preparation process is difficult. To train a classifier without manual a...
Article
In previous work on recognizing situations of lectures from their videos for reviewing those lectures to improve them, the behavior of looking ahead has been the focus as the situation to be considered for the students, whereas various behaviors, including pointing at slides, writing on the whiteboard, and speaking to the students, have been cons...
Article
Image restoration is a fundamental problem in the field of image processing. The key objective of image restoration is to recover clean images from images degraded by noise and blur. Recently, a family of new statistical techniques called variational Bayes (VB) has been introduced to image restoration, which enables us to automatically tune paramet...
Article
Full-text available
Outlier detection and cluster number estimation are important issues for clustering real data. This paper focuses on spectral clustering, a time-tested clustering method, and reveals its important properties related to outliers. The highlights of this paper are the following two mathematical observations: first, spectral clustering's intrinsic pro...
Conference Paper
Full-text available
With the recent rapid advances in machine learning, big data processing, and the Internet of Things, on-site and personalized tourist support services called smart tourism services (STS) are emerging. STS require regional data (RD) collected mainly in the destination from various data owners. Many regional parties that provide STS are so small that they c...
Conference Paper
Analysis of the transportation modes used by tourists in touristic destination areas provides basic information for tourism policy making by local governments and for marketing strategy making by the tourism industry. With the technical advances of tracking devices, GPS-supported smartphones sense the movement of tourists and generate large volumes of da...
Article
In this research, we aim to construct a tourist behavior model using human factors and regional environmental factors, and to implement a travel route recommendation system using the output of the tourist behavior model. Previous route recommendation systems recommend a travel route based on a tourist behavior model using a certain factor that affec...
Conference Paper
We propose to estimate the position of each student in a classroom by observing the classroom with a camera attached to the notebook or tablet PC of the lecturer. The position of each student in the classroom is useful for continuously observing his/her learning behavior, as well as for taking attendance, during the lecture. Although there are many...
Conference Paper
We propose to measure the 3D arrangement of multiple portable information devices operated by a single user from his/her facial images captured by the cameras installed on those devices. Since it becomes quite usual for us to use multiple information devices at the same time, previous works have proposed various styles of cooperation among the devi...
Article
Calibrating the response function of a line scanner is very important in many fields of computer vision. We propose a method to reduce nonequivalence present in response functions of pixels. Contrary to current state-of-the-art methods our method uses a linear light source which is usually attached to line scanners for pixel-wise calibration. We de...
Article
Previous work on virtual object manipulation by a real object in an augmented reality (AR) environment mainly focuses on a rigid object as the real object. If a human hand is considered as the real object, direct manipulation of virtual objects by the real human hand is realized. For virtual object manipulation by a human hand, its position and pos...
Conference Paper
Full-text available
In this paper, we propose a method to measure specular objects regardless of occlusion. The main contribution of this paper is to show that the scattering of incident and specular reflection enables us to measure occluded surfaces. We place objects in a tank filled with participating media, irradiate the objects with a laser beam, and observe...
Conference Paper
This article discusses the problem of recognizing groups of people in conversation with each other in an open space. Previous work on this problem takes an approach based on the knowledge that people in the same conversation group often make a circular formation called an F-formation, by referring to the work in social psychology. Since the F-form...
Conference Paper
Full-text available
Most vision-based 3D acquisition methods including both passive and active methods have a limitation in that cameras must be able to observe the surface to be measured. If this is not possible, that is to say, if the surface is occluded, most of the methods cannot acquire the surface shape. In this paper, we present a method that can acquire the 3D...
Article
In this paper, we apply the coded aperture technique to the alignment process for industrial machinery. As a special setting for assembly and inspection, such machinery uses illumination with a narrow wavelength range for imaging with less aberration. This leads to a significant influence of light diffraction on the image restoration. Although the dif...
Conference Paper
This paper discusses virtual object manipulation using a human hand in an AR environment. For this purpose, we need to measure the configuration, which includes the position, pose and posture, of the human hand. However, if we employ a data glove for the measurement, the glove appears in the video images synthesized by AR. If we otherwise employ a...
Conference Paper
Full-text available
Most shape and reflectance acquisition methods use the light reflected from objects' surfaces. When the reflection cannot be observed, e.g., when the object's surface is black matte or highly specular, its shape and reflectance are difficult to acquire. In this paper, we propose a method that can measure shape and reflectance with another approach....
Conference Paper
In this paper, we propose a novel method of skeleton estimation for the purpose of constructing and manipulating individualized hand models via a data glove. To reconstruct an actual hand accurately, we derive motion constraints in full 6-DOF at the joints without assuming either a center of rotation or a joint axis. The constraints are derived by re...
Conference Paper
In this article, we discuss how to detect occasional social interaction by a group of people in an open space, such as a hall, by observing the environment with cameras. Since it is known in the field of social psychology that some characteristic arrangement is maintained by each group of people during interaction, previous works have tried to detect s...
Conference Paper
In this paper, we propose a novel method for reconstructing the shape model of a non-rigid object. We represent the non-rigid object as the union of rigid components, and acquire range images of the object and motion of each component while the object varies its shape. We acquire the range images using one-shot scanning, and we use marker-based mot...
Article
In this article, we discuss how to recognize each group of humans and objects that interact with each other, for some activity such as a conversation by some people looking at the same screen, from observation by cameras. Although the previous work on recognizing human behaviors by cameras mainly discusses those of a single person, such as moving f...
Conference Paper
Full-text available
This paper presents an artifact-free super resolution texture mapping from multiple-view images. The multiple-view images are upscaled with a learning-based super resolution technique and are mapped onto a 3D mesh model. However, mapping multiple-view images onto a 3D model is not an easy task, because artifacts may appear when different upscaled im...
Article
Full-text available
We propose a method for accurately acquiring the 3D shape of human body segments. Using a light stripe triangulation range finder, we can acquire an accurate 3D shape of a motionless object in dozens of seconds. If the object moves during the scanning, the acquired shape is distorted. Naturally, humans move slightly to keep their balance while standi...
Conference Paper
Full-text available
3D shapes are reconstructed from silhouettes obtained by multiple cameras with the volume intersection method. In recent work, methods of integrating silhouettes in time sequences have been proposed. The number of silhouettes can be increased by integrating silhouettes in multiple frames. The silhouettes of a rigid object in multiple frames are int...
Conference Paper
Full-text available
Photometric stereo is a method of recovering surface normals (needle map) from images. The surface integral of surface normals is used to reconstruct a depth map; however, the depth edges, which are discontinuous boundaries of the depth map, pose a problem for photometric stereo. When the surface of objects includes depth edges, the reconstructed d...
Article
The authors propose a method to measure the smooth curved surface of an object and the depressed surface of an object which cannot be measured using the visual hull method by using a needle diagram obtained via photometric stereo. Errors occur in the distance map as a result of depth edges when recovering a distance map using a needle diagram. In o...
Conference Paper
Full-text available
In this paper, we present a novel approach for extracting silhouettes by using a particular pattern that we call the random pattern. The volume intersection method reconstructs the shapes of 3D objects from their silhouettes obtained with multiple cameras. With the method, if some parts of the silhouettes are missed, the corresponding parts of the...
Conference Paper
Full-text available
In this paper, we propose an extension of light stripe triangulation for multiple moving rigid objects. With traditional light stripe triangulation, the acquired shape of a moving object would be distorted. If the subject is a rigid object, we can correct the distortion in the acquired shape based on its motion. However, when the subject consists...
Conference Paper
Full-text available
In this article, we discuss 3D shape reconstruction of an object in a rigid motion with the volume intersection method. When the object moves rigidly, the cameras change their relative positions to the object at every moment. To estimate the motion correctly, we propose new feature points called outcrop points on the reconstructed 3D shape. Th...