Detailed distribution of annotations (left) and general statistics (right) of the real-world collection.

Source publication
Conference Paper
Full-text available
We present a dataset specifically designed to be used as a benchmark to compare vision systems in the RoboCup Humanoid Soccer domain. The dataset is composed of a collection of images taken in various real-world locations as well as a collection of simulated images. It enables comparing vision approaches with a meaningful and expressive metric. The...

Contexts in source publication

Context 1
... real-world dataset contains 10,464 images and 101,432 annotations in eight different classes. In Table 2 we showcase metadata about the images and annotations present in the collection. Figure 2 shows exemplary annotations on images from the dataset. ...
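As a rough illustration of how such a per-class distribution can be tallied, the sketch below counts annotations per class from a COCO-style JSON file. The file layout (categories, annotations, category_id) is an assumption made for illustration; the TORSO-21 collection ships its own annotation format, so the keys would need to be adapted.

```python
import json
from collections import Counter

def annotation_distribution(path):
    """Count annotations per class in a COCO-style annotation file.

    Hypothetical format: adapt the keys to the dataset's actual layout.
    """
    with open(path) as f:
        data = json.load(f)
    id_to_name = {c["id"]: c["name"] for c in data["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in data["annotations"])

counts = annotation_distribution("annotations.json")
print(f"{sum(counts.values())} annotations in {len(counts)} classes")
for name, n in counts.most_common():
    print(f"{name:>12}: {n}")
```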

Similar publications

Article
Full-text available
Basic synthetic information processing structures, such as logic gates, oscillators and flip-flops, have already been implemented in living organisms. Current implementations of these structures have yet to be extended to more complex processing structures that would constitute a biological computer. We take a step forward towards the construction...

Citations

... The data-driven approaches need large amounts of training data. This is an issue for many domains, but in the RC domain large quantities of annotated data for supervised learning are available as part of open data projects (Bestmann et al. 2022). While very powerful data-driven approaches exist, real-time constraints are still a limiting factor on embedded platforms like the autonomous robots used in the RC domain. ...
... See Figure 2 for an example. (Bestmann et al., 2022) ...
Article
Full-text available
Robotics researchers have been focusing on developing autonomous and human-like intelligent robots that are able to plan, navigate, manipulate objects, and interact with humans in both static and dynamic environments. These capabilities, however, are usually developed for direct interactions with people in controlled environments, and evaluated primarily in terms of human safety. Consequently, human-robot interaction (HRI) in scenarios with no intervention of technical personnel is under-explored. However, in the future, robots will be deployed in unstructured and unsupervised environments where they will be expected to work unsupervised on tasks which require direct interaction with humans and may not necessarily be collaborative. Developing such robots requires comparing the effectiveness and efficiency of similar design approaches and techniques. Yet, issues regarding the reproducibility of results, comparing different approaches between research groups, and creating challenging milestones to measure performance and development over time make this difficult. Here we discuss the international robotics competition called RoboCup as a benchmark for the progress and open challenges in AI and robotics development. The long-term goal of RoboCup is developing a robot soccer team that can win against the world's best human soccer team by 2050. We selected RoboCup because it requires robots to be able to play with and against humans in unstructured environments, such as uneven fields and natural lighting conditions, and it challenges the accepted dynamics of HRI. Considering the current state of robotics technology, RoboCup's goal opens up several open research questions to be addressed by roboticists. In this paper, we (a) summarise the current challenges in robotics by using RoboCup development as an evaluation metric, (b) discuss the state-of-the-art approaches to these challenges and how they currently apply to RoboCup, and (c) present a path for future development in the given areas to meet RoboCup's goal of having robots play soccer against and with humans by 2050.
... We used the TORSO-21 Dataset [12] to train and evaluate different variants of the network. The dataset is used because it resembles the intended deployment domain of the architecture. ...
... The number of filters before the output layers (purple, pink) depends on the number of predicted classes; this figure shows the layout for the TORSO-21 dataset with three bounding box and three segmentation classes. ... matches our use case, we have used the TORSO-21 Dataset [12] from the RoboCup soccer domain to train, test, and select from the various approaches mentioned in section III. Additionally, we have used the Cityscapes Dataset [14] to evaluate our final model architecture and to quantitatively compare it with other architectures. ...
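How the filter counts follow from the class counts can be shown with a quick calculation. The detection formula anchors * (5 + classes) and the anchor count of three are YOLO-style assumptions for illustration, not the published configuration:

```python
# Back-of-the-envelope check of how output-layer filter counts follow from
# the class counts. The detection formula anchors * (5 + classes) is a
# YOLO-style assumption (4 box coordinates + 1 objectness score per anchor);
# the segmentation output needs one channel per class.
num_box_classes, num_seg_classes, num_anchors = 3, 3, 3

detection_filters = num_anchors * (5 + num_box_classes)
segmentation_filters = num_seg_classes

print(detection_filters, segmentation_filters)  # 24 3
```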
... While this change is relatively small, it is still significant for classes like the field, where the majority is easy to detect but the interesting edge cases are only solved by the deeper model. This includes cases where other intentionally unlabeled soccer fields are visible in the background, images with natural light (areas might be over- or underexposed), or cases where there are obstructions (people, robots, cables, laptops, ...) on the field that are part of the field class due to the simplified annotation method for field annotations described in the TORSO-21 paper [12]. Deeper encoder feature maps help in this regard because they are less spatially specific, allowing the model to focus on a larger context while also having more non-linearities, which allow it to approximate more complex data distributions. ...
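The claim that deeper feature maps cover a larger context can be made concrete with the standard receptive-field recurrence; the layer stack below is an illustrative VGG-like example, not the encoder of the cited model:

```python
# Receptive field of a stack of conv/pool layers, via the standard
# recurrence: rf += (kernel - 1) * jump, then jump *= stride.
def receptive_field(layers):
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

shallow = [(3, 1), (3, 1), (2, 2)]                    # one conv stage
deep = shallow + [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]

print(receptive_field(shallow))  # 6 input pixels per feature
print(receptive_field(deep))     # 36: each stage widens the visible context
```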
Conference Paper
Full-text available
Fast and accurate visual perception utilizing a robot's limited hardware resources is necessary for many mobile robot applications. We present YOEO, a novel hybrid CNN which unifies previous object detection and semantic segmentation approaches using one shared encoder backbone to increase performance and accuracy. We show that it outperforms previous approaches on the TORSO-21 and Cityscapes datasets.
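A minimal sketch of that shared-encoder idea, assuming placeholder module sizes rather than the published YOEO layout: one backbone feeds both a detection head and a segmentation decoder, so the expensive feature extraction runs once per frame.

```python
import torch
from torch import nn

class SharedEncoderNet(nn.Module):
    """Toy shared-encoder multi-task net; all sizes are placeholders."""

    def __init__(self, num_box_classes=3, num_seg_classes=3, anchors=3):
        super().__init__()
        self.encoder = nn.Sequential(           # shared backbone
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.detect = nn.Conv2d(64, anchors * (5 + num_box_classes), 1)
        self.segment = nn.Sequential(           # upsample back to input size
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, num_seg_classes, 2, stride=2),
        )

    def forward(self, x):
        features = self.encoder(x)              # computed once, used twice
        return self.detect(features), self.segment(features)

boxes, masks = SharedEncoderNet()(torch.randn(1, 3, 256, 256))
print(boxes.shape, masks.shape)  # [1, 24, 64, 64] [1, 3, 256, 256]
```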
Chapter
This paper proposes a method to calibrate the model used for inverse perspective mapping of humanoid robots. It aims to provide a reliable way to determine the robot's position given the known objects around it. The position of the objects can be calculated using coordinate transforms applied to the data from the robot's vision device. Those transforms depend on the robot's joint angles (such as knee, hip) and the length of some components (e.g. torso, thighs, calves). In practice, because of the sensitivity of the transforms with respect to the inaccuracies of the mechanical data, this calculation may yield errors that make it inadequate for the purpose of determining the objects' positions. The proposed method reduces those errors using an optimization algorithm that finds offsets compensating for those mechanical inaccuracies. Using this method, a kid-sized humanoid robot was able to determine the position of objects up to 2 m away from itself with an average error of 3.4 cm.
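The offset-search idea can be sketched on a toy two-link planar model: given recorded joint angles and observed object positions, a least-squares solver recovers the constant joint offsets. scipy.optimize.least_squares stands in for the paper's (unspecified) optimization algorithm, and the kinematics, link lengths, and data below are invented for the example:

```python
import numpy as np
from scipy.optimize import least_squares

L1, L2 = 0.3, 0.25  # assumed link lengths in meters

def forward(angles, offsets):
    """Planar two-link forward kinematics with constant joint offsets."""
    a1 = angles[:, 0] + offsets[0]
    a2 = angles[:, 1] + offsets[1]
    x = L1 * np.cos(a1) + L2 * np.cos(a1 + a2)
    y = L1 * np.sin(a1) + L2 * np.sin(a1 + a2)
    return np.stack([x, y], axis=1)

rng = np.random.default_rng(0)
angles = rng.uniform(-1.0, 1.0, size=(40, 2))   # recorded joint angles
true_offsets = np.array([0.05, -0.03])          # unknown mechanical error
observed = forward(angles, true_offsets)        # "measured" positions

def residuals(offsets):
    return (forward(angles, offsets) - observed).ravel()

result = least_squares(residuals, x0=np.zeros(2))
print(result.x)  # recovers approximately [0.05, -0.03]
```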
Conference Paper
Full-text available
We showcase a pipeline to train, evaluate, and deploy deep learning architectures for monocular depth estimation in the RoboCup Soccer Humanoid domain. In contrast to previous approaches, we apply the methods on embedded systems in highly dynamic but heavily constrained environments. The results indicate that our monocular depth estimation pipeline is usable in the RoboCup environment.
Thesis
Full-text available
In this work, a classifier for clothes was developed which relies solely on depth information. The task was approached using a neural network based on the PointNet architecture. The classification of clothes serves as a use case to investigate the usability of PointNet as a classifier of non-rigid objects. To train and evaluate the network, a new dataset was created which consists of samples of eight types of clothes grasped at a single random point. In the evaluation, diverse properties of the approach are shown and analyzed. The classifier was integrated into the ROS environment to allow its usage in various robot systems. Results of the evaluation indicate that sufficient classification accuracy can be reached when distinguishing general types of clothes. Furthermore, diverse tools were programmed which aid in the investigation of the recorded data and classification results.
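For orientation, a PointNet-style classifier combines per-point shared MLPs (1x1 convolutions), a symmetric max-pool over points for permutation invariance, and a small dense head. The sketch below follows that pattern with assumed layer sizes and omits the original PointNet's transform networks, so it is not the thesis's exact model:

```python
import torch
from torch import nn

class PointNetClassifier(nn.Module):
    """Minimal PointNet-style classifier; sizes are illustrative."""

    def __init__(self, num_classes=8):  # eight clothing types in the thesis
        super().__init__()
        self.point_mlp = nn.Sequential(  # shared per-point feature extractor
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):               # points: [batch, 3, num_points]
        features = self.point_mlp(points)    # [batch, 1024, num_points]
        pooled = features.max(dim=2).values  # order-invariant global feature
        return self.head(pooled)             # class logits

logits = PointNetClassifier()(torch.randn(2, 3, 1024))
print(logits.shape)  # [2, 8]
```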