
Peter Ian Corke
Queensland University of Technology (QUT) · School of Electrical Engineering and Computer Science
PhD, University of Melbourne
About
593 Publications
212,018 Reads
29,704 Citations
Education
February 1981 - October 1983
February 1977 - November 1980
Publications (593)
Curved refractive objects are common in the human environment, and have a complex visual appearance that can cause robotic vision algorithms to fail. Light-field cameras allow us to address this challenge by capturing the view-dependent appearance of such objects in a single exposure. We propose a novel image feature for light fields that detects a...
Human navigation in built environments depends on symbolic spatial information which has unrealized potential to enhance robot navigation capabilities. Information sources, such as labels, signs, maps, planners, spoken directions, and navigational gestures communicate a wealth of spatial information to the navigators of built environments; a wealth...
Resolved-rate motion control of redundant serial-link manipulators is commonly achieved using the Moore-Penrose pseudoinverse in which the norm of the control input is minimized. However, as kinematic singularities are a significant issue for robotic manipulators, we propose a Manipulability Motion Controller which chooses joint velocities which wi...
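As a point of reference, the pseudoinverse scheme this work builds on fits in a few lines; the sketch below uses an illustrative planar 3-link arm (redundant for a 2-DOF translational task), with numpy's pinv providing the minimum-norm joint velocities. The link lengths and Jacobian are stand-ins, not taken from the paper.

    import numpy as np

    def jacobian_3r(q, l=(1.0, 0.8, 0.5)):
        # Translational Jacobian of an illustrative planar 3R arm:
        # 3 joints driving a 2-DOF task, hence kinematically redundant.
        a1, a2, a3 = q[0], q[0] + q[1], q[0] + q[1] + q[2]
        return np.array([
            [-l[0]*np.sin(a1) - l[1]*np.sin(a2) - l[2]*np.sin(a3),
             -l[1]*np.sin(a2) - l[2]*np.sin(a3),
             -l[2]*np.sin(a3)],
            [ l[0]*np.cos(a1) + l[1]*np.cos(a2) + l[2]*np.cos(a3),
              l[1]*np.cos(a2) + l[2]*np.cos(a3),
              l[2]*np.cos(a3)]])

    # Resolved-rate control: the Moore-Penrose pseudoinverse maps a
    # desired end-effector velocity to the minimum-norm joint velocities.
    q = np.array([0.3, 0.4, 0.2])              # joint angles (rad)
    v_desired = np.array([0.1, 0.0])           # m/s in the plane
    qd = np.linalg.pinv(jacobian_3r(q)) @ v_desired
    print(np.round(qd, 4))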
In this paper we consider the problem of the final approach stage of closed-loop grasping where RGB-D cameras are no longer able to provide valid depth information. This is essential for grasping non-stationary objects, a situation where current robotic grasping controllers fail. We predict the image-plane coordinates of observed image features at...
The computer vision and robotics research communities are each strong. However, progress in computer vision has become turbo-charged in recent years due to big data, GPU computing, novel learning algorithms and a very effective research methodology. By comparison, progress in robotics seems slower. It is true that robotics came later to exploring th...
To safely operate in the real world, robots need to evaluate how confident they are about what they see. A new competition challenges computer vision algorithms to not just detect and localize objects, but also report how certain they are.
Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labeling process is often expensive or even impractical in many robotic...
We present a novel approach to perform object-independent grasp synthesis from depth images via deep neural networks. Our generative grasping convolutional neural network (GG-CNN) predicts a pixel-wise grasp quality that can be deployed in closed-loop grasping scenarios. GG-CNN overcomes shortcomings in existing techniques, namely discrete sampling...
To be effective, robots will need to reliably operate in scenes with refractive objects in a variety of applications; however, refractive objects can cause many robotic vision algorithms, such as structure from motion, to become unreliable or even fail. We propose a novel method to distinguish between refracted and Lambertian image features using a...
We introduce Probabilistic Object Detection, the task of detecting objects in images and accurately quantifying the spatial and semantic uncertainties of the detections. Given the lack of methods capable of assessing such probabilistic object detections, we present the new Probability-based Detection Quality measure (PDQ). Unlike AP-based measures,...
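For flavour, the pairwise scoring style described here can be sketched as a geometric mean of a spatial quality and a semantic (label) quality per detection-ground-truth pair; this is a schematic simplification, not the published PDQ definition, and the helper names are hypothetical.

    import numpy as np

    # Schematic only: PDQ-style pairwise scoring combines how well a
    # detection localises an object with how much probability it assigns
    # to the correct class. The real spatial term is derived from the
    # detector's probabilistic localisation, not reproduced here.
    def label_quality(class_probs, true_class):
        return class_probs[true_class]

    def pairwise_quality(spatial_q, class_probs, true_class):
        return np.sqrt(spatial_q * label_quality(class_probs, true_class))

    probs = np.array([0.7, 0.2, 0.1])          # detector's class posterior
    print(round(pairwise_quality(0.9, probs, true_class=0), 3))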
Camera viewpoint selection is an important aspect of visual grasp detection, especially in clutter where many occlusions are present. Where other approaches use a static camera position or fixed data collection routines, our Multi-View Picking (MVP) controller uses an active perception approach to choose informative viewpoints based directly on a d...
This paper represents a step towards vision-based manipulation of plastic materials. Manipulating deformable objects is made challenging by: 1) the absence of a model for the object deformation, 2) the inherent difficulty of visual tracking of deformable objects, 3) the difficulty in defining a visual error and 4) the difficulty in generating contr...
Robots must reliably interact with refractive objects in many applications; however, refractive objects can cause many robotic vision algorithms to become unreliable or even fail, particularly feature-based matching applications, such as structure-from-motion. We propose a method to distinguish between refracted and Lambertian image features using...
The application of deep learning in robotics leads to very specific problems and research questions that are typically not addressed by the computer vision and machine learning communities. In this paper we discuss a number of robotics-specific learning, reasoning, and embodiment challenges for deep learning. We explain the need for better evaluati...
This paper presents a real-time, object-independent grasp synthesis method which can be used for closed-loop grasping. Our proposed Generative Grasping Convolutional Neural Network (GG-CNN) predicts the quality of grasps at every pixel. This one-to-one mapping from a depth image overcomes limitations of current deep learning grasping techniques, sp...
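As an illustration of what a one-to-one, per-pixel grasp map buys you, the sketch below reads off the best grasp from such maps; the arrays are random stand-ins for real network outputs, and the variable names are hypothetical.

    import numpy as np

    # Given per-pixel outputs of a generative grasping network
    # (quality, gripper angle, gripper width), pick the best pixel.
    H, W = 300, 300
    quality = np.random.rand(H, W)                    # grasp quality in [0, 1]
    angle   = np.random.uniform(-np.pi/2, np.pi/2, (H, W))
    width   = np.random.uniform(0.0, 0.15, (H, W))    # metres

    v, u = np.unravel_index(np.argmax(quality), quality.shape)
    grasp = dict(pixel=(u, v), angle=angle[v, u], width=width[v, u],
                 quality=quality[v, u])
    print(grasp)  # image-plane grasp, to be deprojected with the depth image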
This paper presents the design and implementation of an assisted control technology for a small multirotor platform for aerial inspection of fixed energy infrastructure. Sensor placement is supported by a theoretical analysis of expected sensor performance and constrained platform behaviour to speed up implementation. The optical sensors provide re...
While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models train...
We present the grasping system and design approach behind Cartman, the winning entrant in the 2017 Amazon Robotics Challenge. We investigate the design processes leading up to the final iteration of the system and describe the emergent solution by comparing it with key robotics design aspects. Following our experience, we propose a new d...
Robotic manipulation and grasping in cluttered and unstructured environments is a current challenge for robotics. Enabling robots to operate in these challenging environments has direct applications, from automating warehouses to harvesting fruit in agriculture. One of the main challenges associated with these difficult robotic manipulation tasks i...
We present our approach for robotic perception in cluttered scenes that led to winning the recent Amazon Robotics Challenge (ARC) 2017. Next to small objects with shiny and transparent surfaces, the biggest challenge of the 2017 competition was the introduction of unseen categories. In contrast to traditional approaches which require large collecti...
The Amazon Robotics Challenge enlisted sixteen teams to each design a pick-and-place robot for autonomous warehousing, addressing development in robotic vision and manipulation. This paper presents the design of our custom-built, cost-effective robot system Cartman, which won first place in the competition finals by stowing 14 (out of 16) and picki...
A modular method is proposed to learn and transfer visuo-motor policies from simulation to the real world in an efficient manner by combining domain randomization and adaptation. The feasibility of the approach is demonstrated in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter thro...
We present a deep neural network-based method to perform high-precision, robust and real-time 6 DOF visual servoing. The paper describes how to create a dataset simulating various perturbations (occlusions and lighting conditions) from a single real-world image of the scene. A convolutional neural network is fine-tuned using this dataset to estimat...
In the previous chapter we learned about corner detectors which find particularly distinctive points in a scene. These points can be reliably detected in different views of the same scene irrespective of viewpoint or lighting conditions. Such points are characterized by high image gradients in orthogonal directions and typically occur on the corner...
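Those orthogonal-gradient points are exactly what the Harris structure tensor measures; a minimal sketch, assuming a grayscale float image and using scipy's uniform_filter for the local averaging window:

    import numpy as np
    from scipy.ndimage import uniform_filter

    def harris_response(img, k=0.04, window=3):
        # Image gradients in orthogonal directions
        Iy, Ix = np.gradient(img.astype(float))
        # Structure-tensor entries, averaged over a local window
        Sxx = uniform_filter(Ix * Ix, size=window)
        Syy = uniform_filter(Iy * Iy, size=window)
        Sxy = uniform_filter(Ix * Iy, size=window)
        # Corner strength: large where gradients are strong in two directions
        return Sxx * Syy - Sxy**2 - k * (Sxx + Syy)**2

    img = np.zeros((64, 64)); img[20:40, 20:40] = 1.0   # a bright square
    R = harris_response(img)
    print(np.unravel_index(np.argmax(R), R.shape))      # near a square corner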
Image processing is a computational process that transforms one or more input images into an output image. Image processing is frequently used to enhance an image for human viewing or interpretation, for example to improve contrast. Alternatively, and of more interest to robotics, it is the foundation for the process of feature extraction which wil...
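A minimal example of an image-to-image transform of the kind described, here a linear contrast stretch (an illustration, not a specific method from the chapter):

    import numpy as np

    def stretch(img):
        # Linear contrast stretch: map the image's min..max range to 0..1
        lo, hi = img.min(), img.max()
        return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

    dim = np.random.uniform(0.4, 0.6, (8, 8))   # low-contrast stand-in image
    out = stretch(dim)
    print(out.min(), out.max())                 # 0.0 1.0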
This chapter builds on the previous one and introduces some advanced visual servo techniques and applications. Section 16.1 introduces a hybrid visual servo method that avoids some of the limitations of the IBVS and PBVS schemes described previously.
In this chapter we consider the dynamics and control of a serial-link manipulator arm. The motion of the end-effector is the composition of the motion of each link, and the links are ultimately moved by forces and torques exerted by the joints. Section 9.1 describes the key elements of a robot joint control system that enables a single joint to fol...
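A single-joint position loop of the kind this chapter describes can be sketched as a PD law on a simplified joint model; the unit inertia, frictionless dynamics, gains, and time step below are all illustrative assumptions:

    # One joint modelled as a double integrator, driven by a PD control
    # law tracking a step change in joint angle.
    dt, Kp, Kd = 0.001, 100.0, 20.0
    q, qd, q_ref = 0.0, 0.0, 1.0      # angle, velocity, setpoint (rad)
    for _ in range(5000):
        torque = Kp * (q_ref - q) - Kd * qd   # PD feedback
        qdd = torque                          # unit inertia, no friction
        qd += qdd * dt
        q += qd * dt
    print(round(q, 3))  # converges near 1.0 rad (critically damped gains)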
In ancient times it was believed that the eye radiated a cone of visual flux which mixed with visible objects in the world to create a sensation in the observer – like the sense of touch, but at a distance – this is the extromission theory. Today we consider that light from an illuminant falls on the scene, some of which is reflected into the eye o...
In this chapter we discuss how images are formed and captured, the first step in robot and human perception of the world. From images we can deduce the size, shape and position of objects in the world as well as other characteristics such as color and texture which ultimately lead to recognition.
In the last chapter we discussed the acquisition and processing of images. We learned that images are simply large arrays of pixel values but for robotic applications images have too much data and not enough information. We need to be able to answer pithy questions such as what is the pose of the object? what type of object is it? how fast is it mo...
A robot’s end-effector moves in Cartesian space with a translational and rotational velocity – a spatial velocity. However that velocity is a consequence of the velocities of the individual robot joints. In this chapter we introduce the relationship between the velocity of the joints and the spatial velocity of the end-effector.
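In code, that velocity relationship is a single matrix-vector product, v = J(q) qd; the sketch below uses an illustrative planar 2-link arm rather than any specific robot:

    import numpy as np

    def jacobian_2r(q, l1=1.0, l2=1.0):
        # Translational Jacobian of a planar 2R arm
        s1, c1 = np.sin(q[0]), np.cos(q[0])
        s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
        return np.array([[-l1*s1 - l2*s12, -l2*s12],
                         [ l1*c1 + l2*c12,  l2*c12]])

    q  = np.array([0.2, 0.4])    # joint angles (rad)
    qd = np.array([0.5, -0.1])   # joint velocities (rad/s)
    print(jacobian_2r(q) @ qd)   # translational velocity of the end effector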
In the previous chapter we learned how to describe the pose of objects in 2- or 3-dimensional space. This chapter extends those concepts to poses that change as a function of time. Section 3.1 introduces the derivative of time-varying position, orientation and pose and relates that to concepts from mechanics such as velocity and angular velocity. D...
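The core identity relating angular velocity to the rotation-matrix derivative, Rdot = skew(omega) @ R, can be checked numerically with small Euler steps; this is a sketch, with step size and duration chosen arbitrarily:

    import numpy as np

    def skew(w):
        # Skew-symmetric matrix such that skew(w) @ p == np.cross(w, p)
        return np.array([[0, -w[2], w[1]],
                         [w[2], 0, -w[0]],
                         [-w[1], w[0], 0]])

    # Integrate Rdot = skew(omega) @ R under constant angular velocity.
    R = np.eye(3)
    omega = np.array([0.0, 0.0, 1.0])   # rad/s about the z-axis
    dt = 1e-4
    for _ in range(int(np.pi / 2 / dt)):
        R = R + skew(omega) @ R * dt
    print(np.round(R, 3))  # approximately a 90-degree rotation about z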
In our discussion of map-based navigation we assumed that the robot had a means of knowing its position. In this chapter we discuss some of the common techniques used to estimate the location of a robot in the world – a process known as localization.
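The simplest instance of such an estimator is a one-dimensional Kalman filter fusing odometry with an absolute position measurement; the noise values and readings below are illustrative, not from the chapter:

    # 1-D Kalman-filter localiser: a robot moves along a line, predicting
    # with odometry and correcting with a noisy absolute beacon reading.
    x_est, P = 0.0, 1.0            # state estimate and its variance
    Q, R_meas = 0.01, 0.25         # process and measurement noise variances
    odometry = [1.0, 1.0, 1.0]     # measured displacements per step
    beacons  = [1.2, 1.9, 3.1]     # noisy absolute position measurements

    for u, z in zip(odometry, beacons):
        x_est, P = x_est + u, P + Q                        # predict
        K = P / (P + R_meas)                               # Kalman gain
        x_est, P = x_est + K * (z - x_est), (1 - K) * P    # update
        print(round(x_est, 2), round(P, 3))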
Numbers are an important part of mathematics. We use numbers for counting: there are 2 apples. We use denominate numbers, a number plus a unit, to specify distance: the object is 2 m away. We also call this single number a scalar. We use a vector, a denominate number plus a direction, to specify a location: the object is 2 m due north. We may also...
Kinematics is the branch of mechanics that studies the motion of a body, or a system of bodies, without considering its mass or the forces acting on it.
Robot navigation is the problem of guiding a robot towards a goal. The human approach to navigation is to make maps and erect signposts, and at first glance it seems obvious that robots should operate the same way. However many robotic tasks can be achieved without any map at all, using an approach referred to as reactive navigation. For example, n...
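A map-free, reactive controller can be only a few lines: steer at the goal, and add a repulsive nudge when an obstacle is near. The rule, gains, and geometry below are illustrative, not a method from the chapter:

    import numpy as np

    def reactive_step(pose, goal, obstacle, avoid_radius=1.0, gain=2.0):
        # Attractive pull toward the goal ...
        heading = goal - pose
        away = pose - obstacle
        d = np.linalg.norm(away)
        if d < avoid_radius:
            # ... plus a repulsive push that grows as the obstacle nears
            heading = heading + gain * (avoid_radius - d) * away / d
        return pose + 0.1 * heading / np.linalg.norm(heading)

    pose = np.array([0.0, 0.0])
    goal, obstacle = np.array([5.0, 0.0]), np.array([2.5, 0.1])
    for _ in range(100):
        if np.linalg.norm(goal - pose) < 0.1:
            break
        pose = reactive_step(pose, goal, obstacle)
    print(np.round(pose, 2))  # reaches the goal without a map or a plan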
This chapter discusses how a robot platform moves, that is, how its pose changes with time as a function of its control inputs. There are many different types of robot platform as shown on pages 95–97 but in this chapter we will consider only four important exemplars. Section 4.1 covers three different types of wheeled vehicle that operate in a 2-d...
The task in visual servoing is to control the pose of the robot’s end-effector, relative to the goal, using visual features extracted from an image of the goal object. As shown in Fig. 15.1 the camera may be carried by the robot or be fixed in the world. The configuration of Fig. 15.1a has the camera mounted on the robot’s end-effector observing th...
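The classical point-feature IBVS law referred to here is v = -lambda * pinv(L) @ e; a minimal sketch with the standard point interaction matrix, using made-up feature positions and depths:

    import numpy as np

    def interaction_matrix(x, y, Z):
        # Interaction matrix of a point feature at normalised image
        # coordinates (x, y) and depth Z
        return np.array([
            [-1/Z,    0,  x/Z,     x*y, -(1 + x*x),  y],
            [   0, -1/Z,  y/Z, 1 + y*y,      -x*y,  -x]])

    points  = [(0.1, 0.1, 2.0), (-0.1, 0.1, 2.0), (0.0, -0.1, 2.0)]
    desired = [(0.0, 0.0), (-0.2, 0.0), (0.1, -0.2)]

    L = np.vstack([interaction_matrix(x, y, Z) for x, y, Z in points])
    e = np.concatenate([np.array([x, y]) - np.array(d)
                        for (x, y, _), d in zip(points, desired)])
    v = -0.5 * np.linalg.pinv(L) @ e   # camera spatial velocity (v, omega)
    print(np.round(v, 3))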
This paper introduces an end-to-end fine-tuning method to improve hand-eye coordination in modular deep visuo-motor policies (modular networks) where each module is trained independently. Benefiting from weighted losses, the fine-tuning method significantly improves the performance of the policies for a robotic planar reaching task.
This paper presents the FLEXBOT project, a joint LIRMM-QUT effort to develop (in the near future) novel methodologies for robotic manipulation of flexible and deformable objects. To tackle this problem, and based on our past experiences, we propose to merge vision and force for manipulation control, and to rely on Model Predictive Control (MPC) and...
Farmers are under growing pressure to intensify production to feed a growing population while managing environmental impact. Robotics has the potential to address these challenges by replacing large complex farm machinery with fleets of small autonomous robots. This article presents our research toward the goal of developing teams of autonomous rob...
This paper presents a new image-based visual servoing approach that simultaneously solves the feature correspondence and control problem. Using a finite-time optimal control framework, feature correspondence is implicitly solved for each new image during the control selection, alleviating the need for additional image processing and feature trackin...
We investigate different strategies for active learning with Bayesian deep neural networks. We focus our analysis on scenarios where new, unlabeled data is obtained episodically, such as commonly encountered in mobile robotics applications. An evaluation of different strategies for acquisition, updating, and final training on the CIFAR-10 dataset s...
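One widely used acquisition rule in this setting ranks unlabeled samples by predictive entropy averaged over stochastic forward passes (e.g. MC dropout); the sketch below substitutes random probabilities for real network outputs:

    import numpy as np

    rng = np.random.default_rng(0)
    T, N, C = 20, 100, 10                            # passes, samples, classes
    probs = rng.dirichlet(np.ones(C), size=(T, N))   # per-pass predictions

    mean_p = probs.mean(axis=0)                      # average over passes
    entropy = -(mean_p * np.log(mean_p + 1e-12)).sum(axis=1)
    query = np.argsort(entropy)[-10:]                # most uncertain samples
    print(query)  # indices to send for labeling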
This paper proposes the first derivation, implementation, and experimental validation of light field image-based visual servoing. Light field image Jacobians are derived based on a compact light field feature representation that is close to the form measured directly by light field cameras. We also enhance feature detection and correspondence by en...
This paper proposes the design of a custom mirror-based light field camera adapter that is cheap, simple in construction, and accessible. Mirrors of different shape and orientation reflect the scene into an upwards-facing camera to create an array of virtual cameras with overlapping field of view at specified depths, and deliver video frame rate li...
This paper describes a vision-based obstacle detection and navigation system for use as part of a robotic solution for the sustainable intensification of broad-acre agriculture. To be cost-effective, the robotics solution must be competitive with current human-driven farm machinery. Significant costs are in high-end localization and obstacle detect...
Fine-grained classification is a relatively new field that has concentrated on using information from a single image, while ignoring the enormous potential of using video data to improve classification. In this work we present the novel task of video-based fine-grained object classification, propose a corresponding new video dataset, and perform a...
In this paper we describe a deep network architecture that maps visual input to control actions for a robotic planar reaching task with 100% reliability in real-world trials. Our network is trained in simulation and fine-tuned with a limited number of real-world images. The policy search is guided by a kinematics-based controller (K-GPS), which wor...
Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress as they make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of co...
In this article, we discuss our experience in developing and implementing two massive open online courses (MOOCs) at Queensland University of Technology (QUT), Brisbane, Australia. The MOOCs, titled Introduction to Robotics and Robotic Vision, each ran for six weeks and comprised online lectures, assessments, programming exercises, and an optional...
We describe our software system enabling a tight integration between vision and control modules on complex, high-DOF humanoid robots. This is demonstrated with the iCub humanoid robot performing visual object detection, reaching and grasping actions. A key capability of this system is reactive avoidance of obstacle objects detected from the video s...
Vision tasks are complicated by the nonuniform apparent motion associated with dynamic cameras in complex 3D environments. We present a framework for light field cameras that simplifies dynamic-camera problems, allowing stationary-camera approaches to be applied. No depth estimation or scene modelling is required – apparent motion is disregarded by...
A Delay Tolerant Network (DTN) is a dynamic, fragmented, and ephemeral network formed by a large number of highly mobile autonomous nodes, which requires distributed and self-organised approaches to trust management. Revocation and replacement of security credentials under adversarial influence b...
We present a novel deep convolutional neural network (DCNN) system for fine-grained image classification, called a mixture of DCNNs (MixDCNN). The fine-grained image classification problem is characterised by large intra-class variations and small inter-class variations. To overcome these problems, our proposed MixDCNN system partitions images into...