Juxi Leitner

LYRO Robotics

PhD, MSc, MSc (Tech), BSc

About

92 Publications
41,893 Reads
2,119 Citations
Introduction
Roboticist / co-founder at LYRO #Robotics, #ComputerVision & #ArtificialIntelligence. Former Research Lead at the Australian Centre for Robotic Vision (ACRV). Before that I worked at the IDSIA Robotics Lab and received a PhD from the Università della Svizzera italiana (USI) in robotic learning for vision and actions on the iCub humanoid. Previously I worked at the Advanced Concepts Team of the European Space Agency. I studied Space Robotics in a Joint European Master Programme (SpaceMaster).
Additional affiliations
August 2019 - present
LYRO Robotics
Position
  • Managing Director
November 2014 - December 2019
Queensland University of Technology
Position
  • PostDoc Position
September 2012 - January 2013
University of Lugano
Position
  • TA for Systems Programming
Education
February 2011 - September 2014
University of Lugano
Field of study
  • Artificial Intelligence and Robotics
January 2009 - April 2009
The University of Tokyo
Field of study
  • Intelligent Space Systems
August 2008 - August 2009
Aalto University
Field of study
  • Joint European Master in Space Science and Technology

Publications (92)
Article
Full-text available
This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and...
Article
Full-text available
We describe our software system enabling a tight integration between vision and control modules on complex, high-DOF humanoid robots. This is demonstrated with the iCub humanoid robot performing visual object detection, reaching and grasping actions. A key capability of this system is reactive avoidance of obstacle objects detected from the video s...
Chapter
Full-text available
Combining domain knowledge about both imaging processing and machine learning techniques can expand the abilities of Genetic Programming when used for image processing. We successfully demonstrate our new approach on several different problem domains. We show that the approach is fast, scalable and robust. In addition, by virtue of using off-the-sh...
Article
Full-text available
Curiosity is an essential driving force for science as well as technology, and has led mankind to explore its surroundings, all the way to our current understanding of the universe. Space science and exploration is at the pinnacle of each of these developments, in that it requires the most advanced technology, explores our world and outer space, an...
Conference Paper
We describe a hopping science payload solution designed to exploit the Moon's lower gravity to leap up to 20 m above the surface. The entire solar-powered robot is compact enough to fit within a 10 cm cube, whilst providing unique observation and mission capabilities by creating imagery during the hop. The LunaRoo concept is a proposed payload to...
Preprint
Full-text available
We propose a novel iterative approach for crossing the reality gap that utilises live robot rollouts and differentiable physics. Our method, RealityGrad, demonstrates for the first time, an efficient sim2real transfer in combination with a real2sim model optimisation for closing the reality gap. Differentiable physics has become an alluring alterna...
Chapter
The Amazon Robotics Challenge enlisted sixteen teams to each design a pick-and-place robot for autonomous warehousing of everyday household items. Herein we present the design of our custom-built, Cartesian robot Cartman, which won first place in the competition finals. We highlight our integrated, experience-centred design methodology and the...
Preprint
Full-text available
We present the Evolved Grasping Analysis Dataset (EGAD), comprising over 2000 generated objects aimed at training and evaluating robotic visual grasp detection algorithms. The objects in EGAD are geometrically diverse, filling a space ranging from simple to complex shapes and from easy to difficult to grasp, compared to other datasets for robotic g...
Preprint
Full-text available
Deep reinforcement learning has been shown to solve challenging tasks where large amounts of training experience are available, usually obtained online while learning the task. Robotics is a significant potential application domain for many of these algorithms, but generating robot experience in the real world is expensive, especially when each task...
Preprint
Full-text available
We present a benchmark to facilitate simulated manipulation; an attempt to overcome the obstacles of physical benchmarks through the distribution of a real world, ground truth dataset. Users are given various simulated manipulation tasks with assigned protocols having the objective of replicating the real world results of a recorded dataset. The be...
Preprint
Full-text available
This contribution examines the interplay between a multi-modal variational autoencoder and an environment, yielding a perceived environment on which an agent can act. We conclude our work with a comparison to curiosity-driven learning.
Preprint
Full-text available
When learning behavior, training data is often generated by the learner itself; this can result in unstable training dynamics, and this problem has particularly important applications in safety-sensitive real-world control tasks such as robotics. In this work, we propose a principled and model-agnostic approach to mitigate the issue of unstable lea...
Article
Full-text available
Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labeling process is often expensive or even impractical in many robotic...
Article
Humans perform object manipulation in order to execute a specific task. Seldom is such action started with no goal in mind. In contrast, traditional robotic grasping (first stage for object manipulation) seems to focus purely on getting hold of the object—neglecting the goal of the manipulation. Most metrics used in robotic grasping do not account...
Article
Full-text available
We present a novel approach to perform object-independent grasp synthesis from depth images via deep neural networks. Our generative grasping convolutional neural network (GG-CNN) predicts a pixel-wise grasp quality that can be deployed in closed-loop grasping scenarios. GG-CNN overcomes shortcomings in existing techniques, namely discrete sampling...
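The pixel-wise formulation lends itself to a very simple decoding step: run the network once over the depth image, then execute the grasp at the pixel with the highest predicted quality. The sketch below illustrates only that selection step; `predict_grasp_maps` is a hypothetical stand-in for the actual GG-CNN forward pass (here faking quality as inverse depth purely for illustration), not the published model.

```python
import numpy as np

def predict_grasp_maps(depth):
    # Stand-in for the network forward pass: the real GG-CNN outputs
    # per-pixel grasp quality, gripper angle and gripper width maps.
    # Here quality is faked as inverse depth (closer surfaces score higher).
    quality = depth.max() - depth
    angle = np.zeros_like(depth)
    width = np.full_like(depth, 0.05)
    return quality, angle, width

def best_grasp(depth):
    # Pick the pixel with the highest predicted grasp quality and read
    # off the corresponding angle and width at that pixel.
    quality, angle, width = predict_grasp_maps(depth)
    v, u = np.unravel_index(np.argmax(quality), quality.shape)
    return (u, v), angle[v, u], width[v, u], quality[v, u]

depth = np.ones((4, 4))
depth[1, 2] = 0.5  # nearest point -> highest fake quality
(u, v), ang, w, q = best_grasp(depth)
print(u, v)  # -> 2 1
```

In a closed-loop setting, this selection would be repeated on every new depth frame, which is what makes the one-to-one pixel mapping attractive.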
Article
Juxi Leitner recounts how he and his team took part in — and won — the 2017 Amazon Robotics Challenge and reflects on the importance of solving big picture problems in robotics.
Preprint
Full-text available
We quantify the accuracy of various simulators compared to a real world robotic reaching and interaction task. Simulators are used in robotics to design solutions for real world hardware without the need for physical access. The 'reality gap' prevents solutions developed or learnt in simulation from performing well, or at all, when transferred t...
Article
Full-text available
Automated grasping has a long history of research that is increasing due to interest from industry. One grand challenge for robotics is Universal Picking: the ability to robustly grasp a broad variety of objects in diverse environments for applications from warehouses to assembly lines to homes. Although many researchers now openly share code and d...
Preprint
Full-text available
Camera viewpoint selection is an important aspect of visual grasp detection, especially in clutter where many occlusions are present. Where other approaches use a static camera position or fixed data collection routines, our Multi-View Picking (MVP) controller uses an active perception approach to choose informative viewpoints based directly on a d...
Preprint
Full-text available
Current end-to-end Reinforcement Learning (RL) approaches are severely limited by restrictively large search spaces and are prone to overfitting to their training environment. This is because in end-to-end RL perception, decision-making and low-level control are all being learned jointly from very sparse reward signals, with little capability of in...
Preprint
Full-text available
We investigate a reinforcement approach for distributed sensing based on the latent space derived from multi-modal deep generative models. Our contribution provides insights to the following benefits: Detections can be exchanged effectively between robots equipped with uni-modal sensors due to a shared latent representation of information that is t...
Conference Paper
Full-text available
This paper represents a step towards vision-based manipulation of plastic materials. Manipulating deformable objects is made challenging by: 1) the absence of a model for the object deformation, 2) the inherent difficulty of visual tracking of deformable objects, 3) the difficulty in defining a visual error and 4) the difficulty in generating contr...
Preprint
Full-text available
Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labelling process is often expensive or even impractical in many robotic...
Article
Full-text available
The application of deep learning in robotics leads to very specific problems and research questions that are typically not addressed by the computer vision and machine learning communities. In this paper we discuss a number of robotics-specific learning, reasoning, and embodiment challenges for deep learning. We explain the need for better evaluati...
Article
Full-text available
This paper presents a real-time, object-independent grasp synthesis method which can be used for closed-loop grasping. Our proposed Generative Grasping Convolutional Neural Network (GG-CNN) predicts the quality of grasps at every pixel. This one-to-one mapping from a depth image overcomes limitations of current deep learning grasping techniques, sp...
Article
The International Joint Conference on Neural Networks (IJCNN) was held in Anchorage (Alaska) in May 2017. This top conference in the field of neural networks included many tracks and special sessions. In particular, a special session on Machine Learning Methods Neural Networks applied to Vision and Robotics (MLMVR) was organized by the authors rece...
Conference Paper
Full-text available
While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models train...
Article
Full-text available
We present the grasping system and design approach behind Cartman, the winning entrant in the 2017 Amazon Robotics Challenge. We investigate the design processes leading up to the final iteration of the system and describe the emergent solution by comparing it with key robotics design aspects. Following our experience, we propose a new d...
Article
Full-text available
Robotic manipulation and grasping in cluttered and unstructured environments is a current challenge for robotics. Enabling robots to operate in these challenging environments has direct applications from automating warehouses to harvesting fruit in agriculture. One of the main challenges associated with these difficult robotic manipulation tasks i...
Article
Full-text available
We present our approach for robotic perception in cluttered scenes that led to winning the recent Amazon Robotics Challenge (ARC) 2017. Besides small objects with shiny and transparent surfaces, the biggest challenge of the 2017 competition was the introduction of unseen categories. In contrast to traditional approaches which require large collecti...
Article
Full-text available
The Amazon Robotics Challenge enlisted sixteen teams to each design a pick-and-place robot for autonomous warehousing, addressing development in robotic vision and manipulation. This paper presents the design of our custom-built, cost-effective robot system Cartman, which won first place in the competition finals by stowing 14 (out of 16) and picki...
Article
Full-text available
A modular method is proposed to learn and transfer visuo-motor policies from simulation to the real world in an efficient manner by combining domain randomization and adaptation. The feasibility of the approach is demonstrated in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter thro...
Conference Paper
Full-text available
A modular method is proposed to learn and transfer visuo-motor policies from simulation to the real world in an efficient manner by combining domain randomization and adaptation. The feasibility of the approach is demonstrated in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter thro...
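Domain randomization of the kind mentioned above is commonly implemented by perturbing the visual properties of the simulated scene on every episode, so a policy cannot overfit to one particular appearance. The following is a minimal, hypothetical sketch of such an episode-level randomizer; the scene dictionary and property names are illustrative, not taken from the paper.

```python
import random

def randomize_scene(scene, rng=random.Random(0)):
    # Return a copy of the scene with randomized visual properties.
    # Perturbing lighting, camera pose and distractor colours each episode
    # forces a simulation-trained policy to become appearance-invariant.
    scene = dict(scene)
    scene["light_intensity"] = rng.uniform(0.5, 1.5)
    scene["camera_jitter"] = [rng.uniform(-0.01, 0.01) for _ in range(3)]
    scene["distractor_colors"] = [
        [rng.random() for _ in range(3)]
        for _ in range(scene["num_distractors"])
    ]
    return scene

base = {"num_distractors": 3}
episode = randomize_scene(base)  # one freshly randomized training episode
```

Domain adaptation, the complementary half of the combined approach, would then align features between these randomized simulated images and real-world ones.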
Technical Report
Full-text available
In this paper we present our experience teaching Systems Programming in C to undergraduate students. In addition to the traditional Unix-like operating system approach, we employed a robotic platform – the e-puck mobile robot – to increase the students' motivation and improve their learning experience. A robotic platform provides high attraction for...
Article
Full-text available
In this paper we present our experience teaching Systems Programming in C to undergraduate students. In addition to the traditional Unix-like operating system approach, we employed a robotic platform, the e-puck mobile robot, to increase the students' motivation and improve their learning experience. A robotic platform provides high attraction for stude...
Conference Paper
Full-text available
We present a deep neural network-based method to perform high-precision, robust and real-time 6 DOF visual servoing. The paper describes how to create a dataset simulating various perturbations (occlusions and lighting conditions) from a single real-world image of the scene. A convolutional neural network is fine-tuned using this dataset to estimat...
Article
Full-text available
This paper introduces an end-to-end fine-tuning method to improve hand-eye coordination in modular deep visuo-motor policies (modular networks) where each module is trained independently. Benefiting from weighted losses, the fine-tuning method significantly improves the performance of the policies for a robotic planar reaching task.
Article
Full-text available
We propose to learn tasks directly from visual demonstrations by learning to predict the outcome of human and robot actions on an environment. We enable a robot to physically perform a human demonstrated task without knowledge of the thought processes or actions of the human, only their visually observable state transitions. We evaluate our approac...
Article
Full-text available
In this paper we describe a deep network architecture that maps visual input to control actions for a robotic planar reaching task with 100% reliability in real-world trials. Our network is trained in simulation and fine-tuned with a limited number of real-world images. The policy search is guided by a kinematics-based controller (K-GPS), which wor...
Article
Full-text available
Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress as they make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of co...
Article
Full-text available
We generalize Richardson-Lucy deblurring to 4-D light fields by replacing the convolution steps with light field rendering of motion blur. The method deals correctly with blur caused by 6-degree-of-freedom camera motion in complex 3-D scenes, without performing depth estimation. We include a novel regularization term that maintains parallax informa...
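For context, the classical 2-D Richardson-Lucy iteration that this work generalizes updates the latent image $u$ from the blurred observation $d$ and blur kernel $k$ (with $\hat{k}$ the flipped kernel) as:

```latex
u^{(t+1)} = u^{(t)} \odot \left( \frac{d}{u^{(t)} \ast k} \ast \hat{k} \right)
```

In the 4-D light-field extension described in the abstract, the convolutions with $k$ are replaced by light field rendering of the motion blur induced by the 6-degree-of-freedom camera trajectory.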