
Tadahiro TaniguchiRitsumeikan University · Department of Human and Computer Intelligence
Tadahiro Taniguchi
PhD Eng.
About
225
Publications
23,483
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,897
Citations
Citations since 2017
Introduction
Additional affiliations
April 2010 - present
April 2008 - March 2010
April 2007 - March 2008
Publications
Publications (225)
Creating autonomous robots that can actively explore the environment, acquire knowledge and learn skills continuously is the ultimate achievement envisioned in cognitive and developmental robotics. Their learning processes should be based on interactions with their physical and social world in the manner of human learning and cognitive development....
In this paper, we present a method that picks up objects from a cells tray using “Finger-position Overlaid and Cropped Images". The proposed method enabled picking up objects from the cells tray to a small number of samples. The proposed method and a bassline method using regression were compared. Picking up experiments were conducted with 20 and 4...
Autonomous robots are required to actively and adaptively learn the categories and words of various places by exploring the surrounding environment and interacting with users. In semantic mapping and spatial language acquisition conducted using robots, it is costly and labor-intensive to prepare training datasets that contain linguistic instruction...
In recent years, the labor shortage has become a significant problem in Japan and other countries due to aging societies. However, service robots can play a decisive role in relieving human workers by performing various household and assistive tasks. Restroom cleaning is one of such challenging tasks that involve performing motion planning in a con...
The workforce shortage in the service industry, recently highlighted by the pandemic, has increased the need for automation. We propose an autonomous robot to fulfill this purpose. Our mobile manipulator includes an extendable and compliant end effector design, as well as a custom-made automated shelf, and it is capable of manipulating food product...
Robots employed in homes and offices need to adaptively learn spatial concepts using user utterances. To learn and represent spatial concepts, the robot must estimate the coordinate system used by humans. For example, to represent spatial concept “left,” which is one of the relative spatial concepts (defined as a spatial concept depending on the ob...
The human brain, among its several functions, analyzes the double articulation structure in spoken language, i.e., double articulation analysis (DAA). A hierarchical structure in which words are connected to form a sentence and words are composed of phonemes or syllables is called a double articulation structure. Where and how DAA is performed in t...
We propose a method that integrates probabilistic logic and spatial concept to enable a robot to acquire knowledge of the relationships between objects and places in a new environment with a few learning times. By combining logical inference with prior knowledge and cross-modal inference within spatial concept, the robot can infer the place of an o...
The paper describes the system development approach followed by researchers and engineers of the team NAIST-RITS-Panasonic (Japan) for their participation in the Future Convenience Store Challenge 2018, 2019 trials and 2020 (held in 2021). This international competition is about the development of robotic capabilities to execute complex tasks for r...
This paper proposes a new voice conversion (VC) task from human speech to dog-like speech while preserving linguistic information as an example of human to non-human creature voice conversion (H2NH-VC) tasks. Although most VC studies deal with human to human VC, H2NH-VC aims to convert human speech into non-human creature-like speech. Non-parallel...
Emergent communication, also known as symbol emergence, seeks to investigate computational models that can better explain human language evolution and the creation of symbol systems. This study aims to provide a new model for emergent communication, which is based on a probabilistic generative model. We define the Metropolis-Hastings (MH) naming ga...
In this study, we propose a head-to-head type (H2H-type) inter-personal multimodal Dirichlet mixture (Inter-MDM) by modifying the original Inter-MDM, which is a probabilistic generative model that represents the symbol emergence between two agents as multiagent multimodal categorization. A Metropolis--Hastings method-based naming game based on the...
Navigating to destinations using human speech instructions is an important task for autonomous mobile robots that operate in the real world. Spatial representations include a semantic level that represents an abstracted location category, a topological level that represents their connectivity, and a metric level that depends on the structure of the...
This paper proposes a probabilistic extension of SimSiam, a recent self-supervised learning (SSL) method. SimSiam trains a model by maximizing the similarity between image representations of different augmented views of the same image. Although uncertainty-aware machine learning has been getting general like deep variational inference, SimSiam and...
Domestic robots are often required to understand spoken commands in noisy environments, including service appliances' operating sounds. Most conventional domestic robots use electret condenser microphones (ECMs) to record the sound. However, the ECMs are known to be sensitive to the noise in the direction of sound arrival. The laser Doppler vibrome...
Using the spatial structure of various indoor environments as prior knowledge, the robot would construct the map more efficiently. Autonomous mobile robots generally apply simultaneous localization and mapping (SLAM) methods to understand the reachable area in newly visited environments. However, conventional mapping approaches are limited by only...
In this paper, we propose Multi-View Dreaming, a novel reinforcement learning agent for integrated recognition and control from multi-view observations by extending Dreaming. Most current reinforcement learning method assumes a single-view observation space, and this imposes limitations on the observed data, such as lack of spatial information and...
An industrial connector-socket insertion task requires sub-millimeter positioning and compensation of grasp pose of a connector. Thus high accurate estimation of relative pose between socket and connector is a key factor to achieve the task. World models are promising technology for visuo-motor control. They obtain appropriate state representation...
The present paper proposes a novel reinforcement learning method with world models, DreamingV2, a collaborative extension of DreamerV2 and Dreaming. DreamerV2 is a cutting-edge model-based reinforcement learning from pixels that uses discrete world models to represent latent states with categorical variables. Dreaming is also a form of reinforcemen...
Building a human-like integrative artificial cognitive system, that is, an artificial general intelligence (AGI), is the holy grail of the artificial intelligence (AI) field. Furthermore, a computational model that enables an artificial system to achieve cognitive development will be an excellent reference for brain and cognitive science. This pape...
This paper describes a computational model of multiagent multimodal categorization that realizes emergent communication. We clarify whether the computational model can reproduce the following functions in a symbol emergence system, comprising two agents with different sensory modalities playing a naming game. (1) Function for forming a shared lexic...
Human infants acquire their verbal lexicon from minimal prior knowledge of language based on the statistical properties of phonological distributions and the co-occurrence of other sensory stimuli. In this study, we propose a novel fully unsupervised learning method discovering speech units by utilizing phonological information as a distributional...
In this paper, we address the task of rearranging items with a robot. A rearrangement task is challenging because it requires us to solve the following issues: determine how to pick the items and plan how and where to place the items. In our previous work, we proposed to solve a rearrangement task by combining the symbolic and motion planners using...
We propose a system using a multimodal probabilistic approach to solve the human action recognition challenge. This is achieved by extracting the human pose from an ongoing activity from a single camera. This pose is used to capture additional body information using generalized features such as location, time, distances, and angles. A probabilistic...
This study aims to further develop the task of novel view synthesis by generative adversarial networks (GAN). The goal of novel view synthesis is to, given one or more input images, synthesize images of the same target content but from different viewpoints. Previous research showed that the unsupervised learning model HoloGAN achieved high performa...
Word and phone discovery are important tasks in the language development of human infants. Infants acquire words and phones from unsegmented speech signals using segmentation cues such as distributional, prosodic, and co-occurrence information. Many pre-existing computational models designed to represent this process tend to focus on distributional...
Social demand for robots to be our partners in daily life has been rapidly increasing. Cognitive robotics should play a major role in making robots our partners. To discuss the role of cognitive robotics, we organized the round table in December 2020. This review paper aimed at clarifying the role of cognitive robotics summarizing the discussion in...
We focus on the task of arranging objects using a robot. In our previous work, we proposed to use a Monte Carlo Tree Search (MCTS) and a Motion Feasibility Checker (MFC) to solve the task. However, the problem with the existing method is that is time-consuming. In this paper, we propose to use a Neural Network-based MFC (NN-MFC). This NN-MFC can qu...
This paper proposes methods for unsupervised lexical acquisition for relative spatial concepts using spoken user utterances. A robot with a flexible spoken dialog system must be able to acquire linguistic representation and its meaning specific to an environment through interactions with humans as children do. Specifically, relative spatial concept...
This paper proposes a hierarchical Bayesian model based on spatial concepts that enables a robot to transfer the knowledge of places from experienced environments to a new environment. The transfer of knowledge based on spatial concepts is modeled as the calculation process of the posterior distribution based on the observations obtained in each en...
This paper describes a computational model of multiagent multimodal categorization that realizes emergent communication. We clarify whether the computational model can reproduce the following functions in a symbol emergence system, comprising two agents with different sensory modalities playing a naming game. (1) Function for forming a shared lexic...
We propose a method for multimodal concept formation. In this method, unsupervised multimodal clustering and cross-modal inference, as well as unsupervised representation learning, can be performed by integrating the multimodal latent Dirichlet allocation (MLDA)-based concept formation and variational autoencoder (VAE)-based feature extraction. Mul...
Understanding information processing in the brain—and creating general-purpose artificial intelligence—are long-standing aspirations of scientists and engineers worldwide. The distinctive features of human intelligence are high-level cognition and control in various interactions with the world including the self, which are not defined in advance an...
Preserving the linguistic content of input speech is essential during voice conversion (VC). The star generative adversarial network-based VC method (StarGAN-VC) is a recently developed method that allows non-parallel many-to-many VC. Although this method is powerful, it can fail to preserve the linguistic content of input speech when the number of...
In this study, we argue that the development of semiotically adaptive cognition is indispensable for realizing remotely-operated service robots to enhance the quality of the new normal society. To enable a wide range of people to work from home in a pandemic like the current COVID-19 situation, the installation of remotely-operated service robots i...
This paper proposes methods for unsupervised lexical acquisition for relative spatial concepts using spoken user utterances. A robot with a flexible spoken dialog system must be able to acquire linguistic representation and its meaning specific to an environment through interactions with humans as children do. Specifically, relative spatial concept...
The installation of remotely-operated service robots in the environments of our daily life (including offices, homes, and hospitals) can improve work-from-home policies and enhance the quality of the so-called new normal. However, it is evident that remotely-operated robots must have partial autonomy and the capability to learn and use local semiot...
Tidy-up tasks by service robots in home environments are challenging in robotics applications because they involve various interactions with the environment. In particular, robots are required not only to grasp, move, and release various home objects but also to plan the order and positions for placing the objects. In this paper, we propose a novel...
This paper shows that StarGAN-VC, a spectral envelope transformation method for non-parallel many-to-many voice conversion (VC), is capable of emotional VC (EVC). Although StarGAN-VC has been shown to enable speaker identity conversion, its capability for EVC for Japanese phrases has not been clarified. In this paper, we describe the direct applica...
Using the spatial structure of various indoor environments as prior knowledge, the robot would construct the map more efficiently. Autonomous mobile robots generally apply simultaneous localization and mapping (SLAM) methods to understand the reachable area in newly visited environments. However, conventional mapping approaches are limited by only...
Infants acquire words and phonemes from unsegmented speech signals using segmentation cues, such as distributional, prosodic, and co-occurrence cues. Many pre-existing computational models that represent the process tend to focus on distributional or prosodic cues. This paper proposes a nonparametric Bayesian probabilistic generative model called t...
Building a humanlike integrative artificial cognitive system, that is, an artificial general intelligence, is one of the goals in artificial intelligence and developmental robotics. Furthermore, a computational model that enables an artificial cognitive system to achieve cognitive development will be an excellent reference for brain and cognitive s...
This paper proposes a hierarchical Bayesian model based on spatial concepts that enables a robot to transfer the knowledge of places from experienced environments to a new environment. The transfer of knowledge based on spatial concepts is modeled as the calculation process of the posterior distribution based on the observations obtained in each en...
As service robots are becoming essential to support aging societies, teaching them how to perform general service tasks is still a major challenge preventing their deployment in daily-life environments. In addition, developing an artificial intelligence for general service tasks requires bottom-up, unsupervised approaches to let the robots learn fr...
The introduction of systems and robots for automated services is important for reducing running costs and improving operational efficiency in the retail industry. To this aim, we develop a system that enables robot agents to display products in stores. The main problem in automating product display using common supervised methods with robot agents...
In semantic mapping, which connects semantic information to an environment map, it is a challenging task for robots to deal with both local and global information of environments. In addition, it is important to estimate semantic information of unobserved areas from already acquired partial observations in a newly visited environment. On the other...
Robots are required to not only learn spatial concepts autonomously but also utilize such knowledge for various tasks in a domestic environment. Spatial concept represents a multimodal place category acquired from the robot's spatial experience including vision, speech-language, and self-position. The aim of this study is to enable a mobile robot t...
In the present paper, we propose a decoder-free extension of Dreamer, a leading model-based reinforcement learning (MBRL) method from pixels. Dreamer is a sample- and cost-efficient solution to robot learning, as it is used to train latent state-space models based on a variational autoencoder and to conduct policy optimization by latent trajectory...
Whitelisting is considered an effective security monitoring method for networks used in industrial control systems, where the whitelists consist of observed tuples of the IP address of the server, the TCP/UDP port number, and IP address of the client (communication triplets). However, this method causes frequent false detections. To reduce false po...
We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. Howeve...
Autonomous service robots are required to adaptively learn the categories and names of various places through the exploration of the surrounding environment and interactions with users. In this study, we aim to realize the efficient learning of spatial concepts by autonomous active exploration with a mobile robot. Therefore, we propose an active le...
This paper proposes SpCoMapGAN, a method to generate the semantic map in a newly visited environment by training an inference model using previously estimated semantic maps. SpCoMapGAN uses generative adversarial networks (GANs) to transfer semantic information based on room arrangements to the newly visited environment. We experimentally show in s...
In the present paper, we propose an extension of the Deep Planning Network (PlaNet), also referred to as PlaNet of the Bayesians (PlaNet-Bayes). There has been a growing demand in model predictive control (MPC) in partially observable environments in which complete information is unavailable because of, for example, lack of expensive sensors. PlaNe...
Robots are required to not only learn spatial concepts autonomously but also utilize such knowledge for various tasks in a domestic environment. Spatial concept represents a multimodal place category acquired from the robot's spatial experience including vision, speech-language, and self-position. The aim of this study is to enable a mobile robot t...
Tidy-up tasks by service robots in home environments are challenging in the application of robotics because they involve various interactions with the environment. In particular, robots are required not only to grasp, move, and release various home objects, but also plan the order and positions where to put them away. In this paper, we propose a no...
State representation learning (SRL) in partially observable Markov decision processes has been studied to learn abstract features of data useful for robot control tasks. For SRL, acquiring domain-agnostic states is essential for achieving efficient imitation learning (IL). Without these states, IL is hampered by domain-dependent information useless...