Andreas Ess's research while affiliated with ETH Zurich and other places

Publications (21)

Chapter
Pedestrians do not walk randomly. While they move toward their desired destination, they avoid static obstacles and other pedestrians. At the same time they try not to slow down too much as well as not to speed up excessively. Studies coming from the field of social psychology show that pedestrians exhibit common behavioral patterns. For example th...
Article
Full-text available
We address the problem of vision-based navigation in busy inner-city locations, using a stereo rig mounted on a mobile platform. In this scenario semantic information becomes important: rather than modelling moving objects as arbitrary obstacles, they should be categorised and tracked in order to predict their future behaviour. To this end,...
Article
We report on a stereo system for 3D detection and tracking of pedestrians in urban traffic scenes. The system is built around a probabilistic environment model which fuses evidence from dense 3D reconstruction and image-based pedestrian detection into a consistent interpretation of the observed scene, and a multi-hypothesis tracker to reconstruct th...
Conference Paper
Full-text available
We consider the problem of data association in a multi-person tracking context. In semi-crowded environments, people are still discernible as individually moving entities, that undergo many interactions with other people in their direct surrounding. Finding the correct association is therefore difficult, but higher-order social factors, such as gro...
Conference Paper
This paper presents an integrated framework for mobile street-level tracking of multiple persons. In contrast to classic tracking-by-detection approaches, our framework employs an efficient level-set tracker in order to follow individual pedestrians over time. This low-level tracker is initialized and periodically updated by a pedestrian detector a...
Article
This paper addresses the use of social behavior models for the prediction of a pedestrian's future motion. Recently, such models have been shown to outperform simple constant velocity models in cases where data association becomes ambiguous, e.g. in case of occlusion, bad image quality, or low frame rates. However, to account for the multiple alter...
Article
In this paper, we address the problem of multiperson tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution that extracts as much visual information as possible and combines it through cognitive feedback cycles. We propose such an approach, which jointly esti...
Conference Paper
Full-text available
Object tracking typically relies on a dynamic model to predict the object's location from its past trajectory. In crowded scenarios a strong dynamic model is particularly important, because more accurate predictions allow for smaller search regions, which greatly simplifies data association. Traditional dynamic models predict the location for each...
Conference Paper
Full-text available
We present a wearable audio-visual capturing system, termed AWEAR 2.0, along with its underlying vision components that allow robust self-localization, multi-body pedestrian tracking, and dense scene reconstruction. Designed as a backpack, the system is aimed at supporting the cognitive abilities of the wearer. In this paper, we focus on the design...
Conference Paper
Full-text available
We address the problem of vision-based multi-person tracking in busy pedestrian zones using a stereo rig mounted on a mobile platform. Specifically, we are interested in the application of such a system for supporting path planning algorithms in the avoidance of dynamic obstacles. The complexity of the problem calls for an integrated solution, whic...
Conference Paper
Full-text available
In this paper, we address the problem of 3D articulated multi-person tracking in busy street scenes from a moving, human-level observer. In order to handle the complexity of multi-person interactions, we propose to pursue a two-stage strategy. A multi-body detection-based tracker first analyzes the scene and recovers individual pedestrian trajector...
Conference Paper
Full-text available
We present a mobile vision system for multi-person tracking in busy environments. Specifically, the system integrates continuous visual odometry computation with tracking-by-detection in order to track pedestrians in spite of frequent occlusions and egomotion of the camera rig. To achieve reliable performance under real-world conditions, it has lon...
Article
This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images...
Article
Full-text available
In this paper, we address the problem of multi-person tracking in busy pedestrian zones, using a stereo rig mounted on a mobile platform. The complexity of the problem calls for an integrated solution, which extracts as much visual information as possible and combines it through cognitive feedback. We propose such an approach, which jointly estimat...
Conference Paper
Full-text available
This paper investigates several aspects of 3D-2D camera pose estimation, aimed at robot navigation in poorly-textured scenes. The major contribution is a fast, linear algorithm for the general case with six or more points. We show how to specialise this to work with only four or five points, which is of utmost importance in a test and hypothesis fr...
Article
Full-text available
In this paper, we address the challenging problem of simultaneous pedestrian detection and ground-plane estimation from video while walking through a busy pedestrian zone. Our proposed system integrates robust stereo depth cues, ground-plane estimation, and appearance-based object detection in a principled fashion using a graphical model. Object...
Conference Paper
Full-text available
Accurate staging of nodal cancer still relies on surgical exploration because many primary malignancies spread via lymphatic dissemination. The purpose of this study was to utilize nanoparticle-enhanced lymphotropic magnetic resonance imaging (LN-MRI) to explore semi-automated noninvasive nodal cancer staging. We present a joint image segmentation...
Conference Paper
Full-text available
This paper addresses the problem of camera self-calibration, bundle adjustment and 3D reconstruction from line segments in two images of poorly-textured indoor scenes. First, we generate line segment correspondences, using an extended version of our previously proposed matching scheme. The first main contribution is a new method to identify polyhed...
Article
Full-text available
We address the problem of vision-based multi-person tracking in busy inner-city locations using a stereo rig mounted on a mobile platform. Specifically, we are interested in the application of such a system for autonomous navigation and path planning. In such a scenario, semantic information about the moving scene objects becomes important. In ord...

Citations

... In traffic-sign detection, several problems may be involved like variations in perspective, illumination, occlusion, motion blur, and weatherworn deterioration of signs [5]. Some of the challenges that drivers confront are during time day, overcast weather day, and poor visibility [6]. The majority of the time, drivers are unaware of traffic signs. ...
... Agricultural research uses machine learning techniques such as artificial neural networks (ANNs), decision trees, K-means, k-nearest neighbors, and support vector machines (SVMs) [12]. Traditional methods for classifying images rely on manually created features such as SIFT [22], HoG [23], and SURF [24], and then apply learning algorithms to these feature spaces. They found that the effectiveness of each of these techniques largely depended on the underlying preset characteristics [14]. ...
... Therefore, a human aware navigation algorithm should incorporate human cooperation while predicting and planning trajectories (Fig. 1). Usefulness of this cooperation (or joint collision avoidance) is shown in different studies [8][9][10]. ...
... Furthermore, Multi-Object Tracking in video sequences is also widely used in other military and civil applications, such as sports players tracking and analysis [21], biology [23], robot navigation [6], and autonomous driving vehicles [7]. ...
... Generally, object tracking encompasses two subtasks, target discrimination and state estimation, which are the fundamental steps for an agent to sense the surrounding environment and conduct motion planning [5]. Over the last few years, 2D single object tracking task has been explored extensively [6][7][8][9]. Inspired by that success, many RGB-D based methods refer to the pattern of 2D tracking to conduct 3D tracking [10][11][12][13]. Although working well in the conventional 2D domain, these methods rely heavily on the RGB modality. ...
... We used three datasets in this study, MOT17-09 (Milan et al., 2016; Leal-Taixé et al., 2015; Ess, Leibe & Van Gool, 2007), MOT20-02 (Dendorfer et al., 2020), and Room Human Counting (RHC) (Pardamean et al., 2021) datasets, with 525, 2,782, and 1,195 images respectively. Both MOT17-09 and MOT20-02 are subsets of the MOT17 and MOT20 datasets which are originally used for object detection of people. ...
... Pioneers have predicted the motion of dynamic objects with Kalman filter [1], linear trajectory avoidance model [2], and social force model [3]. Compared to these traditional modeling techniques based on handcrafted features, deep learning algorithms that learn features automatically via optimizing loss functions have recently attracted researchers' attention. ...
... Hence, 15 historical positions and 25 ground truth positions are obtained in each trajectory. ETH/UCY: This dataset includes a total of 5 videos (ETH, HOTEL, UNIV, ZARA1, and ZARA2) from 4 different scenes (ZARA1 and ZARA2 from the same camera, but at different times) [44]. Totally 1536 pedestrians are in the crowds with challenging social interactions. ...
... Since initial prototypes such as [13], we find plenty of proposed wearable sensor applications [15]. Recently we find wearable systems proposed for recognizing daily activities [27,26]; other recent works, closer to ours, are oriented to specific assistance such as visually impaired or elderly recognition and navigation aid systems [10,14]. ...
... Furthermore, these algorithms are generally applied for a nodal size of 8 mm or larger. For MRI data, some researchers have utilized T1-weighted imaging (T1WI) and/or T2-weighted imaging (T2WI) for LNs detection and segmentation [20,21]. However, T2WI and diffusion-weighted imaging (DWI) are the most important sequences for nodal identification in clinical practice [22]. ...