Hans-Hellmut Nagel's research while affiliated with Karlsruhe Institute of Technology and other places

Publications (177)

Article
Engaging with the past helps to shape the future. The existence of the periodical KI for a quarter of a century now offers occasion, focus, and resources for such an engagement. The following questions invite the currently active generation to reflect on how the field of Artificial Intelligence ...
Article
Cognitive visual tracking is the process of observing and understanding the behavior of a moving person. This paper presents an efficient solution to extract, in real-time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level...
Conference Paper
Some time ago, Computer Vision passed the stage where it detected changes in image sequences, estimated Optical Flow, or began to track people and vehicles in videos. Currently, research in Computer Vision has expanded to extract descriptions of single actions or concatenations of actions from videos, sometimes even the description of agent beh...
Conference Paper
The study of 3D-model-based tracking in videos frequently has to be concerned with details of algorithms or their parameterisation. Time-consuming experiments have to be performed in this context, which suggested automating (at least partially) the evaluation of such experimental runs. A logic-based approach has been developed which generates Natur...
Article
A fully automatic initialization approach for 3D-model-based vehicle tracking has been developed, based on Edge-Element and Optical-Flow association. An entire automatic initialization and tracking system incorporating this approach achieves results comparable to those obtained by earlier experiments based on semi-interactive initialization, provid...
Article
Our 3D-model-based Computer Vision subsystem extracts vehicle trajectories from monocular digitized videos recording road vehicles in inner-city traffic. Steps are documented which import these quantitative geometrical results into a conceptual representation based on a Fuzzy Metric-Temporal Horn Logic (FMTHL, see [K.H. Schäfer, Unscharfe zeitlogis...
Chapter
PRO-ART (PROmetheus ARTificial intelligence) denotes one of seven subprojects of the EUREKA project PROMETHEUS (PROgraM for a European Traffic with Highest Efficiency and Unprecedented Safety), which began more than twenty years ago and was completed thirteen years ago, which in our fast-moving times is almost an eternity ...
Article
An artificial cognitive vision system associates video signals with conceptual descriptions of the depicted time-varying scene. This linkage is mediated by knowledge representation formalisms. An experimental implementation of such an approach yielded initial results for the conceptual description of videos recorded at innercity traffic scenes, see...
Article
Motris, an integrated system for model-based tracking research, has been designed modularly to study the effects of algorithmic variations on tracking results. Motris attempts to avoid introducing bias into the relative assessment of alternative approaches. Such a bias may be caused by differences of implementation and parameterization if the compo...
Chapter
Advanced image sequence evaluation systems generate a voluminous amount of quantitative data which is increasingly difficult to assess. The challenge consists in abstracting from and reasoning with these data in order to create a more intuitive access to image evaluation results. This contribution reports about experiences and results gained by con...
Conference Paper
This contribution addresses the problem of detection and tracking of moving vehicles in image sequences from traffic scenes recorded by a stationary camera. In order to exploit the a priori knowledge about the shape and the physical motion of vehicles in traffic scenes, a parameterized vehicle model is used for an intraframe matching process and a...
Chapter
As a step towards a local analysis of local image features, the position, peak value, and covariance matrix of an isolated, noise-free multivariate Gaussian are determined in closed form from four ‘observables’, computed by Gaussian-weighted averaging of first and second powers of (up to second order) partial derivatives of a digitized greyvalue distr...
Chapter
The notion ‘cognitive vision system (CogVS)’ stimulates a wide spectrum of associations. In many cases, the attribute ‘cognitive’ is related to advanced abilities of living creatures, in particular of primates. In this context, a close association between the terms ‘cognitive’ and ‘vision’ appears natural, because it is well known that vision const...
Conference Paper
In a collection of scientific contributions like this one, it is usually left for the reader to create a coherent internal representation of the topical area discussed, based on the different aspects treated by the authors and on their points of view. The overall picture thus has to be constructed by integrating individual presentations with differ...
Book
During the last decade of the twentieth century, computer vision made considerable progress towards the consolidation of its foundations, in particular regarding the treatment of geometry for the evaluation of stereo image pairs and of multi-view image recordings. Scientists thus began to look at basic computer vision solutions - irrespective of the...
Conference Paper
The textual description of video sequences exploits conceptual knowledge about the behavior of depicted agents. An explicit representation of such behavioral knowledge facilitates not only the textual description of video evaluation results, but can also be used for the inverse task of generating synthetic image sequences from textual descrip...
Conference Paper
An experimental comparison of 'Edge-Element Association (EEA)' and 'Marginalized Contour (MCo)' approaches for 3D model-based vehicle tracking in traffic scenes is complicated by the different shape and motion models with which they have been implemented originally. It is shown that the steering-angle motion model originally associated with EE...
Article
Video surveillance systems usually have to operate on thousands or tens of thousands of frames. Interactive frame-by-frame assessment of results, therefore, is time consuming and expensive. We present a simple and fast approach which allows automated cross-checking of image segmentations obtained by different algorithms or two versions of the same algor...
Article
An adequate natural language description of developments in a real-world scene can be taken as proof of "understanding what is going on." An algorithmic system that generates natural language descriptions from video recordings of road traffic scenes can be said to "understand" its input to the extent that algorithmically generated text is acceptabl...
Conference Paper
Tracking vehicles in image sequences of innercity road traffic scenes must still be considered a challenging task. Even if a-priori knowledge about the 3D shape of vehicles, of the background structure, and about vehicle motion is provided, (partial) occlusion and dense vehicle queues can easily cause initialization and tracking failu...
Conference Paper
3D-model-based tracking offers one possibility to explicate the manner in which spatial coherence can be exploited for the analysis of image sequences. Two seemingly different approaches towards 3D-model-based tracking are compared using the same digitized video sequences of road traffic scenes. Both approaches rely on the evaluation of greyvalue d...
Conference Paper
This investigation studies the contrast across boundaries of shadows which are cast by moving vehicles or selected stationary objects in a traffic scene onto the road surface. If a sufficient fraction of this subset of hypothetical shadow contours is overlapped by strong edge elements, such a finding is usually taken as a cue that directed sunshine...
Conference Paper
A long list of buzzwords which percolated through the computer vision community during the past thirty years leads to the question: does ‘Cognitive Vision Systems’ just denote another such ‘fleeting fad’? Upon closer inspection, many apparent ‘buzzwords’ refer to aspects of computer vision systems which became a legitimate target of widespread rese...
Article
Within various projects at the IITB, two road vehicles have been equipped and commissioned with systems for automatic lateral and longitudinal guidance which are based on computer vision: a small truck (Mercedes Benz 609D) and a sedan (BMW 735iL). The navigation system Travelpilot® has been integrated into the MB 609D in cooperation with Robert Bos...
Article
'Occurrence' denotes a perceived element of a spatiotemporal development. In many languages, for example in English and German, occurrences correspond to verb phrases. We converted programmed recognition automata for German occurrence representations related to road traffic scenes into English occurrence specifications suitable for manipulation by a...
Conference Paper
An iterative adaptation process for the estimation of a Grayvalue Structure Tensor (GST) is studied experimentally: alternative adaptation rules, different parameterizations, and two convergence criteria are compared. The basic adaptation process converges for both synthetic and real image sequences in most cases towards essentially the same result...
Conference Paper
The complexity of inner city traffic areas presents a considerable challenge for driver support based on machine vision. This is due to a generally high density of different objects in the scene, sharing varying spatio-temporal relations, as well as difficult imaging conditions that complicate image evaluation. Road borders, vehicles, traffic signs...
Article
Based on geometric results obtained by an algorithmic video (sequence) evaluation, a generic conceptual representation will be instantiated into a representation of the specific temporal developments within the recorded scene. Such an instantiated conceptual representation will in turn provide the input for a text generation subsystem. This contrib...
Article
Currently available driver assistance systems (i) warn the driver based on vehicle state sensors (e.g., door open, outside temperature near or below the freezing point), (ii) offer route guidance information (navigation systems based on GPS and digital road maps), or—in some critical situations—(iii) even actively influence vehicle handling under c...
Article
The Fraunhofer-iitb group aims to contribute to Vigor by the design, implementation, and test of machine-vision-controlled robotic disassembly operations without the necessity to rely on calibrated cameras. We study in particular aspects which turn out to be relevant in a real-world application-oriented task domain, in the Fraunhofer-iitb case the...
Article
Exponential increases in the computing capacity of highly integrated processor chips have reached a state where real-time machine vision comes within reach. This expectation pertains not only to the use of expensive special purpose computers, but even in cases where merely standard workstations can be made available. The approach towar...
Conference Paper
Although a model-based vehicle tracking approach offers the promise to be more reliable than a purely data-driven one, based on the additional knowledge brought to bear during the tracking phase, a suitable initialisation of the tracking phase still presents considerable problems. Some of these difficulties are related to the appropriate choice of...
Conference Paper
Automatic disassembly of used products cannot assume that CAD databases will provide precise knowledge about the pose, size, and shape of components to be manipulated because some of the components may have been repaired, repositioned, or replaced, thereby possibly invalidating the original construction data. In principle, the missing information...
Conference Paper
Driver support in inner-city road traffic based on machine vision still represents a considerable challenge. Model-based machine vision exploits a-priori knowledge, for example about the lane structure of roads and intersections, to select relevant image structures. Infrastructural objects, such as lamp posts or masts with attached traffic signs, o...
Conference Paper
A systematic categorization of an adaptively determined local estimate for an OF-vector allows the detection of non-local 'walls of discontinuities' which establish an essentially closed boundary around images of objects whose motion differs from that of foreground and background. The estimation process has been refined to the point where occasionally obs...
Conference Paper
Experience has shown that changes in illumination conditions may influence the robustness of robot control based on machine vision even if such illumination changes appear practically negligible to a human. CMOS cameras have become available with a larger dynamic range than conventional (high quality) CCD cameras. The use of such higher dynamic ran...
Article
Automatic disassembly tasks in the engine compartment of a used car constitute a challenge for control of a disassembly robot by machine vision. Experience in exploratory experiments under such conditions forced us to abandon data-driven aggregation of edge elements into straight-line data segments in favor of a direct association of individual edg...
Article
The image sequence evaluation system Xtrack detects, initializes, and tracks images of moving road vehicles in video sequences recorded at innercity traffic scenes by a stationary camera. It recently has been supplemented, moreover, by subsystems which associate the geometric tracking results to conceptual representations of traffic situations at innerc...
Conference Paper
Over the past thirty years, technical means to record, digitize, store, process, and present video image sequences have multiplied in capacity by several orders of magnitude. This large quantitative improvement is about to turn into a qualitative change regarding the research to be pursued within this area in the future. Whereas up to now an explor...
Conference Paper
Driver support in inner city road traffic still presents a considerable challenge for machine vision. Model-based machine vision becomes attractive in this context since it allows one to exploit the knowledge provided by a model in order to select relevant image structures. It requires, however, that suitable models be made available to the computer vis...
Article
Our image evaluation system XTRACK tracks multiple-vehicle-configurations in image sequences. The resulting geometric state descriptions are associated with fuzzy attributes and relations and thereby form the basis for incremental characterization of traffic situations from the point of view of selected road users or observers. Knowledge representa...
Conference Paper
Segmentation of optical flow fields, estimated by spatio-temporally adaptive methods, is - under favourable conditions - reliable enough to track moving vehicles at intersections without using vehicle or road models. Already a single image plane trajectory per lane obtained in this manner offers valuable information about where lane markers should...
Article
A model-based vehicle tracking system for the evaluation of inner-city traffic video sequences has been systematically tested on about 15 minutes of real world video data. Methodological improvements during preparatory test phases affected—among other changes—the combination of edge element and optical flow estimates in the measurement process and...
Article
Quantitative geometric descriptions of the movements of persons are obtained by fitting the projection of a three-dimensional person model to consecutive frames of an image sequence. The kinematics of the person model is given by a homogeneous transformation tree and its body parts are modeled by right-elliptical cones. The values of a varying numbe...
Conference Paper
The interpretation of road traffic scenes recorded by a stationary video camera requires a sufficiently large field of view in order to capture non-trivial maneuvers. Vehicle images will thus be small. Their reliable tracking needs to combine edge-based approaches for precision with area-oriented approaches, for example based on a match between est...
Article
This contribution describes an approach to the determination of the structure of several objects of a non-stationary polyhedral scene as well as their velocities using real-time stereo vision. While the determination of the structure of an object is based on the association of extracted 2D edge segments of both images obtained from the two camera...
Article
The introduction of model-based machine vision into the feedback loop of a robot manipulator usually implies that edge elements extracted from the current digitized video frame are matched to segments of a workpiece model which has been projected into the image plane according to the current estimate of the relative pose between the recording video...
Article
During evaluation of real-world traffic scenes, we often encounter the situation that the vehicles under scrutiny are temporarily occluded by dynamic or stationary scene components. Vision-based tracking algorithms often fail in tracking vehicles under such conditions. Contextual knowledge about occlusions is expected to facilitate the vehicle...
Article
Recent progress in the detection and tracking of moving vehicles in image sequences facilitated systematic research regarding the automatic description of complex maneuver sequences at a level of abstraction corresponding to the concepts of situations and goals. We outline the design and implementation of a coherent system which links the evaluatio...
Article
This paper addresses the two main subtasks of vision-based disassembly, i.e. object recognition and visual tracking where the latter is part of a visual servoing approach. The specifics of these two subtasks and the relation between these specifics are presented and explained in a common framework. Experiments regarding the repeatability of visual s...
Article
This article describes an architectural and procedural redesign of an existing disassembly system that is able to automatically dismantle polyhedral workpieces. In order to cope with more complex tasks like the automatic disassembly of used cars, all system levels, namely the hardware level, the administration software level, and the control softwa...
Article
Annual increases in workstation capacity suggest that today between 5 and 15 times the computing power of 1994 should be available at roughly comparable costs. It thereby becomes possible for a normal laboratory to provide at least approximately the computing power required for machine-vision-based control of robots. Experience with an experimenta...
Conference Paper
Synthetic image sequences are generated based on conceptual descriptions which have been extracted automatically by model-based tracking from video sequences recording vehicle maneuvers in inner-city traffic scenes. A detailed comparison between original and synthesized image sequences offers clues as to which knowledge must be provided in additio...
Conference Paper
An image sequence evaluation process combines information from different information sources. One of these sources is a camera which records a scene and provides the acquired information as a digitized image sequence. A different source provides knowledge regarding signal processing and geometry, exploited in order to map the image sequence signal...
Conference Paper
An approach to model-based tracking of motor vehicles in video sequences of intersection scenes has been examined in detail on an extensive sample. As a consequence, further improvement efforts initially concentrated on the initialization phase. By exploiting the knowledge that has to be provided for the tracking phase anyway ...
Article
Temporal developments within a scene can be recorded by a video camera in the form of spatio-temporal grayvalue variations. The digitization and subsequent algorithmic evaluation of the resulting video sequence transforms, as a first step, the original signal into a geometric description which comprises the shape, position, and trajectory of bodies...
Conference Paper
The image sequence evaluation system Xtrack detects, initializes, and tracks images of moving road vehicles in video sequences recorded at innercity traffic scenes by a stationary camera. It recently has been supplemented, moreover, by subsystems which associate the geometric tracking results to conceptual representations of traffic situations at i...
Conference Paper
It is advantageous to use an independently mobile camera in visually servoed disassembly operations since its degrees of freedom can be used to bring the camera into an optimal view-pose. Apart from the vision-based closed-loop control that guides the disassembly robot, a second vision-based closed-loop control can be set up which upgrades the came...
Conference Paper
The abundance of geometric results from image sequence evaluation which is expected to shortly become available creates a new problem: how to present this material to a user without inundating him with unwanted details? A system design which attempts to cope not only with image sequence evaluation, but in addition with an increasing number of abstr...
Conference Paper
A grayvalue structure tensor provides knowledge about a local grayvalue variation. This knowledge can be used to devise a spatiotemporally adaptive optic flow estimation process. Such an adaptive estimation lowers the level at which the resulting optic flow (OF) field is disturbed by noise and estimation artefacts. This in turn substantially simpli...
Conference Paper
This contribution attempts to move beyond the status where single moving objects in video image sequences are tracked separately in the scene domain, based on individually adapted approaches and parameters. Instead, we investigate which performance can be achieved by a combination of approaches based on edge element orientation and on optical flow,...
Article
It has been shown in previous publications (Proceedings of the Fourth European Conference on Computer Vision 1996 (ECCV '96), 14–18 April 1996, Cambridge, UK, Lecture Notes in Computer Science 1065 (Vol. II), Springer-Verlag, Berlin, Heidelberg, 1996, pp. 388–399, pp. 485–494) that model-based tracking of partially occluded vehicles in image sequen...
Article
Vision-based automatic driving along innercity roads and across complex innercity intersections requires detecting and tracking road markings and lane boundaries in order to determine the position and orientation of the vehicle relative to the ground. The complexity of intersection scenes and the disturbances in the detected contours enforce the use o...
Article
The tracking of inner-city intersections under realistic conditions, i.e. dense traffic, increases the number of artefacts, like occluded or distorted objects, compared to the tracking of free country roads or highways. This requires the best possible evaluation of all visible parts of the intersection, as well as of all available sensor data, and...
Conference Paper
Model-based vehicle tracking in traffic image sequences can be made more robust by matching expected displacement rates of vehicle surface points to optical flow (OF) vectors computed from an image sequence. The capability to track vehicles uninterruptedly in this manner over extended image sequences results in the ability to investigate even small...
Conference Paper
A system for model-based tracking of road vehicles in digitized video sequences of traffic scenes has been generalized to handle “truck-and-trailer”-configurations. Whereas, previously, each vehicle had been modeled as a single rigid polyhedron, the generalisation handles two or more rigid components modeled as polyhedra with (one degree of freedom...
Conference Paper
Further development of the segmentation of an adaptively estimated optical flow vector field supports the association of segments between consecutive frames. In this way, even small images of road vehicles in image sequences from ordinary consumer video cameras, including slightly disturbed ones, can, even under partial occlusion by masts and...
Conference Paper
An image sequence evaluation system for tracking moving objects in road traffic scenes and for the conceptual characterization of their traffic situation is extended to handle objects that are temporarily completely occluded. To this end, typical traffic situations are modeled conceptually and, exploiting automatically extracted ...
Conference Paper
Model-based image sequence evaluation can by now, albeit still under a number of restrictive assumptions, automatically extract the type, position, and orientation as well as the speed of road users from video image sequences of traffic scenes. Further development of this approach leads to increasing robustness ...
Article
In this paper we propose a new vision based method for realizing automated disassembly tasks. We applied our method to identification and localization of parts of a car engine, but the method can be generalized to a broad range of assembly or disassembly tasks.
Conference Paper
Presents results of a model-based approach to visual tracking and pose estimation for a moving polyhedral tool in position-based visual servoing. This enables the control of a robot in look-and-move mode to achieve six degree of freedom goal configurations. Robust solutions of the correspondence problem, known as “matching” in the static case and “t...
Conference Paper
This contribution presents a position-based approach to visual servoing that allows the manipulation of quasi- or nonpolyhedral objects in a complex scene, namely the engine compartment of a used car. During a manipulation, the moving disassembly robot with its end-effector-mounted tool as well as the (possibly moving) object are observed by an indep...
Article
This contribution addresses the problem of pose estimation and tracking of vehicles in image sequences from traffic scenes recorded by a stationary camera. In a new algorithm, the vehicle pose is estimated by directly matching polyhedral vehicle models to image gradients without an edge segment extraction process. The new approach is significantly...
Conference Paper
The disassembly of engines of used cars based on visually servoed manipulators requires a state-of-the-art vision system which is able to evaluate images of dirty, partly occluded or possibly rearranged quasi- or non-polyhedral parts in real-time. As the pose of a stationary rigid part has, in general, six degrees of freedom, at least six degrees o...
Article
Model-based tracking of vehicles in real world image sequences of traffic may fail for various reasons. A careful analysis of failed tracking experiments brought to light that one of these phenomena consists in an incorrect match of parts of the vehicle model to image features belonging to other scene components. This effect appears in particu...
Article
Although image understanding and natural language processing constitute two major areas of AI, they have mostly been studied independently of each other. Only a few attempts have been concerned with the integration of computer vision and the generation of natural language expressions for the description of image sequences. The aim of our joint effo...
Conference Paper
In order to automatically run a dismantling task, first a workpiece has to be recognized and its pose has to be estimated. This step is solved by a model-based recognition process using the information from three stationary cameras. Our workpiece model could be extracted from CAD-data and consists of planar patches and circular features, where the...
Conference Paper
The temporal changes of gray value structures recorded in an image sequence contain significantly more information about the recorded scene than the gray value structures of a single image. By incorporating optical flow estimates into the measurement function, our 3D pose estimation process exploits interframe information from an image sequence in a...
Conference Paper
Vehicles on downtown roads can be occluded by other vehicles or by stationary scene components such as traffic lights or road signs. After having recorded such a scene by a video camera, we noticed that the occlusion may disturb the detection and tracking of vehicles by previous versions of our computer vision approach. In this contribution we demo...
Article
Image sequence evaluation denotes the process that derives statements about the depicted scene and its temporal development from a digitized image sequence. Such statements can be formulated at different levels of abstraction. Each level of abstraction can be assigned a subprocess whose implementation relates to a component in...
Conference Paper
Our image sequence evaluation system XTrack computes, from recorded video image sequences of road traffic scenes, descriptions of vehicle behavior in the form of motion verbs. Furthermore, occlusion situations in road traffic scenes are described in the form of occlusion predicates by means of occlusion modeling. In the approach discussed here ...